Apache Spark: 100 terabytes (TB) of data sorted in 23 minutes | Opensource.com
Gonzalo San Gil, PhD. on 16 Jan 15: "In October 2014, Databricks participated in the Sort Benchmark and set a new world record for sorting 100 terabytes (TB) of data, or 1 trillion 100-byte records. The team used Apache Spark on 207 EC2 virtual machines and sorted 100 TB of data in 23 minutes."
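
As a rough sanity check on the quoted figures, the arithmetic can be sketched as follows. This is only a back-of-the-envelope calculation, not anything taken from the benchmark itself: it assumes decimal units (1 TB = 10^12 bytes), treats the rounded "23 minutes" as exact, and the function names are made up for illustration.

```python
# Figures quoted in the excerpt above.
RECORDS = 10**12       # 1 trillion records
RECORD_BYTES = 100     # bytes per record
NODES = 207            # EC2 virtual machines
MINUTES = 23           # reported sort time (rounded, so results are approximate)

TB = 10**12            # assumption: decimal terabyte
GB = 10**9             # assumption: decimal gigabyte


def total_terabytes(records=RECORDS, record_bytes=RECORD_BYTES):
    """Total data volume in TB: record count times record size."""
    return records * record_bytes / TB


def cluster_tb_per_minute(minutes=MINUTES):
    """Average cluster-wide sort rate in TB per minute."""
    return total_terabytes() / minutes


def per_node_gb_per_second(nodes=NODES, minutes=MINUTES):
    """Average sort rate per machine in GB per second."""
    total_bytes = RECORDS * RECORD_BYTES
    return total_bytes / (minutes * 60) / nodes / GB


if __name__ == "__main__":
    print(f"total data:   {total_terabytes():.0f} TB")
    print(f"cluster rate: {cluster_tb_per_minute():.2f} TB/min")
    print(f"per-node:     {per_node_gb_per_second():.2f} GB/s")
```

Under these assumptions, 1 trillion 100-byte records do come to 100 TB, and the averages work out to a little over 4 TB per minute for the cluster, or roughly a third of a gigabyte per second per machine.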