Arquitectura? · Group items tagged mapreduce

Pablo Lalloni

twitter/scalding · GitHub

  •  
    "Scalding is a Scala library that makes it easy to specify Hadoop MapReduce jobs. Scalding is built on top of Cascading, a Java library that abstracts away low-level Hadoop details. Scalding is comparable to Pig, but offers tight integration with Scala, bringing advantages of Scala to your MapReduce jobs."
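For readers new to the model, the kind of job Scalding specifies can be sketched as a map phase, a shuffle that groups by key, and a reduce phase. The sketch below is plain Python running in one process, not Scalding or Cascading code; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework does between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Collapse each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence (hypothetical helper)."""
    return reduce_phase(shuffle(map_phase(lines)))
```

Calling `word_count(["a b a", "B"])` groups the lower-cased words and counts two occurrences each of `a` and `b`. On a real cluster the three phases run on different machines; here they are ordinary function calls.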
Pablo Lalloni

GravityLabs/HPaste

  •  
    "HPaste unlocks the rich functionality of HBase for a Scala audience. In so doing, it attempts to achieve the following goals:
      • Provide a strong, clear syntax for querying and filtration
      • Perform as fast as possible while maintaining idiomatic Scala client code -- the abstractions should not show up in a profiler!
      • Re-articulate HBase's data structures rather than force it into an ORM-style atmosphere.
      • A rich set of base classes for writing MapReduce jobs in hadoop against HBase tables.
      • Provide a maximum amount of code re-use between general Hbase client usage, and operation from within a MapReduce job.
      • Use Scala's type system to its advantage--the compiler should verify the integrity of the schema.
      • Be a verbose DSL--minimize boilerplate code, but be human readable!"
Pablo Lalloni

Rosetta Code · twitter/scalding Wiki

  •  
    A collection of MapReduce tasks translated (from Pig, Hive, MapReduce streaming, etc.) into Scalding for comparison.
Pablo Lalloni

elasticsearch/elasticsearch-hadoop

  •  
    "Read and write data to/from Elasticsearch within Hadoop/MapReduce libraries. Automatically converts data to/from JSON. Supports MapReduce, Cascading, Hive and Pig."
Pablo Lalloni

snappy - A fast compressor/decompressor - Google Project Hosting

  •  
    "Snappy is a compression/decompression library. It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression. For instance, compared to the fastest mode of zlib, Snappy is an order of magnitude faster for most inputs, but the resulting compressed files are anywhere from 20% to 100% bigger. On a single core of a Core i7 processor in 64-bit mode, Snappy compresses at about 250 MB/sec or more and decompresses at about 500 MB/sec or more. Snappy is widely used inside Google, in everything from BigTable and MapReduce to our internal RPC systems. (Snappy has previously been referred to as "Zippy" in some presentations and the likes.)"
Pablo Lalloni

Introducing Scoobi and Scalding: Scala DSLs for Hadoop MapReduce · myNoSQL

  •  
    Good slides introducing Scalding and Scoobi.
Pablo Lalloni

thinkaurelius/faunus

  •  
    "Faunus is a property graph analytics engine based on Hadoop. A breadth-first version of the graph traversal language Gremlin operates on a vertex-centric property graph data structure. Faunus can be extended with new operations written using MapReduce and Blueprints."
Pablo Lalloni

Hama - a general BSP framework on top of Hadoop

  •  
    "Apache Hama is a pure BSP (Bulk Synchronous Parallel) computing framework on top of HDFS (Hadoop Distributed File System) for massive scientific computations such as matrix, graph and network algorithms. Today, many practical data processing applications require a more flexible programming abstraction model that is compatible to run on highly scalable and massive data systems (e.g., HDFS, HBase, etc). A message passing paradigm beyond Map-Reduce framework would increase its flexibility in its communication capability. Bulk Synchronous Parallel (BSP) model fills the bill appropriately. Some of its significant advantages over MapReduce and MPI are:
      • Supports message passing paradigm style of application development
      • Provides a flexible, simple, and easy-to-use small APIs
      • Enables to perform better than MPI for communication-intensive applications
      • Guarantees impossibility of deadlocks or collisions in the communication mechanisms"
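The BSP model named in the Hama description can be illustrated with a small single-process sketch: in each superstep every peer computes locally and sends messages, and a barrier then delivers those messages for the next superstep. This is plain Python and not Hama's API (Hama itself is a Java framework); `run_bsp`, `global_max`, and the `compute` callback signature are hypothetical names introduced only for this example.

```python
from typing import Any, Callable, List


def run_bsp(num_peers: int, supersteps: int, compute: Callable[..., Any]) -> List[Any]:
    """Simulate BSP supersteps sequentially (hypothetical helper, not Hama's API)."""
    inboxes: List[List[Any]] = [[] for _ in range(num_peers)]
    states: List[Any] = [None] * num_peers
    for step in range(supersteps):
        # Messages sent during this superstep are buffered here.
        outboxes: List[List[Any]] = [[] for _ in range(num_peers)]

        def send(dest: int, message: Any) -> None:
            outboxes[dest].append(message)

        # Local computation: each peer sees only its own state and inbox.
        for peer in range(num_peers):
            states[peer] = compute(peer, step, states[peer], inboxes[peer], send)

        # Barrier: buffered messages become visible only in the next superstep.
        inboxes = outboxes
    return states


def global_max(values: List[int]) -> List[int]:
    """Every peer learns the maximum of all peers' values in two supersteps."""
    n = len(values)

    def compute(peer, step, state, inbox, send):
        if step == 0:
            # Superstep 0: broadcast the local value to every peer, self included.
            for dest in range(n):
                send(dest, values[peer])
            return values[peer]
        # Superstep 1: reduce over the messages delivered at the barrier.
        return max(inbox)

    return run_bsp(n, 2, compute)
```

With `global_max([3, 9, 4])`, each of the three peers ends with the value 9. Because messages are exchanged only at the barrier, no peer ever blocks waiting on another peer mid-step, which is the intuition behind the deadlock-freedom claim in the quoted text.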