Arquitectura: group items matching "spark" in title, tags, annotations or url

Pablo Lalloni

andypetrella/spark-notebook - 0 views

  •  
    "The main intent of this tool is to create reproducible analysis using Scala, Apache Spark and more. This is achieved through an interactive web-based editor that can combine Scala code, SQL queries, Markup or even JavaScript in a collaborative manner. The usage of Spark comes out of the box, and is simply enabled by the implicit variable named sparkContext. You should also check the website, http://Spark-notebook.io."
Pablo Lalloni

dnafrance/vagrant-hadoop-spark-cluster - 0 views

  •  
    "Vagrant project to spin up a cluster of 4 32-bit CentOS6.5 Linux virtual machines with Hadoop v2.6.0 and Spark v1.1.1"
Sebastián Zaffarano

Apache Spark: 100 terabytes (TB) of data sorted in 23 minutes | Opensource.com - 1 views

  •  
    "In October 2014, Databricks participated in the Sort Benchmark and set a new world record for sorting 100 terabytes (TB) of data, or 1 trillion 100-byte records. The team used Apache Spark on 207 EC2 virtual machines and sorted 100 TB of data in 23 minutes."
Pablo Lalloni

shark - 0 views

  •  
    "Shark is a large-scale data warehouse system for Spark designed to be compatible with Apache Hive. It can execute Hive QL queries up to 100 times faster than Hive without any modification to the existing data or queries. Shark supports Hive's query language, metastore, serialization formats, and user-defined functions, providing seamless integration with existing Hive deployments and a familiar, more powerful option for new ones."
Pablo Lalloni

Shark - Lightning Fast Data Warehouse System - 0 views

  •  
    "Shark is a large-scale data warehouse system for Spark designed to be compatible with Apache Hive. It can answer Hive QL queries up to 100 times faster than Hive without modification to the existing data nor queries. Shark supports Hive's query language, metastore, serialization formats, and user-defined functions."
Pablo Lalloni

Ferry | Big Data Development Environment Using Docker - 0 views

  •  
    "Ferry helps you create big data clusters on your local machine. Define your big data stack using YAML and share your application with Dockerfiles. Ferry supports Hadoop, Cassandra, Spark, GlusterFS, and Open MPI."