• The socket library that acts as a concurrency framework.
• Faster than TCP, for clustered products and supercomputing.
• Carries messages across inproc, IPC, TCP, and multicast.
• Connect N-to-N via fanout, pubsub, pipeline, request-reply.
• Asynch I/O for scalable multicore message-passing apps.
• Large and active open source community.
• 30+ languages including C, C++, Java, .NET, Python.
• Most OSes including Linux, Windows, OS X.
• LGPL free software with full commercial support from iMatix.
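To make the "pubsub" pattern in the list above concrete, here is a minimal in-process sketch using only the Python standard library. It is not the ZeroMQ API: the `Publisher` class, `subscribe`, `publish`, and `drain` are hypothetical names invented for illustration. The one behavior borrowed from ZeroMQ is that subscriptions match on a message prefix, and an empty prefix matches everything.

```python
import queue
import threading


class Publisher:
    """Toy in-process publisher (hypothetical, not a ZeroMQ socket)."""

    def __init__(self):
        self._subs = []                 # list of (prefix, queue) pairs
        self._lock = threading.Lock()

    def subscribe(self, prefix):
        """Register a subscriber that receives messages starting with prefix."""
        q = queue.Queue()
        with self._lock:
            self._subs.append((prefix, q))
        return q

    def publish(self, message):
        """Fan the message out to every subscriber whose prefix matches."""
        with self._lock:
            targets = [q for prefix, q in self._subs
                       if message.startswith(prefix)]
        for q in targets:
            q.put(message)


def drain(q):
    """Collect everything currently queued for one subscriber."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


pub = Publisher()
weather = pub.subscribe("weather.")
everything = pub.subscribe("")          # empty prefix matches every message

for msg in ["weather.nyc 21C", "stocks.acme 42", "weather.sfo 17C"]:
    pub.publish(msg)

weather_msgs = drain(weather)
all_msgs = drain(everything)
```

A real deployment would replace the in-memory queues with sockets over one of the transports listed above; the sketch only shows the one-to-many delivery and the prefix filtering.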
"Cascalog is a fully-featured data processing and querying library for Clojure or Java. The main use cases for Cascalog are processing "Big Data" on top of Hadoop or doing analysis on your local computer. Cascalog is a replacement for tools like Pig, Hive, and Cascading and operates at a significantly higher level of abstraction than those tools."
"Scalding is a Scala library that makes it easy to specify Hadoop MapReduce jobs. Scalding is built on top of Cascading, a Java library that abstracts away low-level Hadoop details. Scalding is comparable to Pig, but offers tight integration with Scala, bringing advantages of Scala to your MapReduce jobs."
"Faunus is a property graph analytics engine based on Hadoop. A breadth-first version of the graph traversal language Gremlin operates on a vertex-centric property graph data structure. Faunus can be extended with new operations written using MapReduce and Blueprints."
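As a rough illustration of what a breadth-first traversal over a vertex-centric property graph can look like, the sketch below advances all traversers one hop at a time and tracks how many paths have reached each vertex. It is plain Python, not Faunus or Gremlin: the `graph` layout, the `out_step` helper, and the sample data are assumptions made up for this example.

```python
from collections import Counter

# Hypothetical toy property graph: each vertex stores its own properties and
# its outgoing edges grouped by label (a "vertex-centric" layout).
graph = {
    "alice": {"props": {"age": 30}, "out": {"knows": ["bob", "carol"]}},
    "bob":   {"props": {"age": 25}, "out": {"knows": ["carol"]}},
    "carol": {"props": {"age": 41}, "out": {"knows": []}},
}


def out_step(graph, frontier, label):
    """Advance every traverser one hop along `label` edges, all at once.

    `frontier` maps vertex id -> number of traversers (paths) at that vertex.
    """
    nxt = Counter()
    for vertex, count in frontier.items():
        for neighbor in graph[vertex]["out"].get(label, []):
            nxt[neighbor] += count
    return nxt


# Start one traverser on every vertex, then take two "knows" hops.
frontier = Counter({v: 1 for v in graph})
one_hop = out_step(graph, frontier, "knows")
two_hops = out_step(graph, one_hop, "knows")

# A property filter applied to the frontier after the first hop.
older = {v: c for v, c in one_hop.items() if graph[v]["props"]["age"] > 30}
```

Because each step only needs a vertex's own edges plus the incoming path counts, every hop can be expressed as one pass over all vertices, which is the shape that fits a batch job.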
"Shark is a large-scale data warehouse system for Spark designed to be compatible with Apache Hive. It can execute Hive QL queries up to 100 times faster than Hive without any modification to the existing data or queries. Shark supports Hive's query language, metastore, serialization formats, and user-defined functions, providing seamless integration with existing Hive deployments and a familiar, more powerful option for new ones."
"Apache Phoenix is a SQL skin over HBase delivered as a client-embedded JDBC driver targeting low latency queries over HBase data. Apache Phoenix takes your SQL query, compiles it into a series of HBase scans, and orchestrates the running of those scans to produce regular JDBC result sets. The table metadata is stored in an HBase table and versioned, such that snapshot queries over prior versions will automatically use the correct schema. Direct use of the HBase API, along with coprocessors and custom filters, results in performance on the order of milliseconds for small queries, or seconds for tens of millions of rows."
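The idea of compiling a SQL predicate into a scan can be sketched in a few lines. The example below is a simplified stand-in, not Phoenix's or HBase's actual code: it models a table as a list sorted by row key and turns a prefix predicate such as `WHERE key LIKE 'host1|%'` into an inclusive start key and an exclusive stop key, which is the same kind of bound an HBase scan takes. The names `rows`, `scan`, and `compile_prefix_query` and the sample data are hypothetical.

```python
import bisect

# Hypothetical stand-in for a table stored sorted by row key.
rows = sorted([
    ("host1|2024-01-01", 0.42),
    ("host1|2024-01-02", 0.57),
    ("host2|2024-01-01", 0.13),
    ("host3|2024-01-01", 0.99),
])
keys = [k for k, _ in rows]


def scan(start_key, stop_key):
    """Return rows with start_key <= row key < stop_key via binary search."""
    lo = bisect.bisect_left(keys, start_key)
    hi = bisect.bisect_left(keys, stop_key)
    return rows[lo:hi]


def compile_prefix_query(prefix):
    """Turn a non-empty key prefix into (start, stop) scan bounds.

    The stop key is the prefix with its last character bumped by one, so the
    half-open range covers exactly the keys that start with the prefix.
    """
    stop = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, stop


start, stop = compile_prefix_query("host1|")
result = scan(start, stop)
result_keys = [k for k, _ in result]
```

The point of the sketch is that a predicate on the leading part of the row key touches only a contiguous slice of the sorted data, which is one reason small queries can return quickly.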
"In this article, we discuss the three-way relationship between three such desirable features - fairness, isolation, and throughput (FIT) - and argue that only two out of the three of them can be achieved simultaneously."
"Package mangos is an implementation in pure Go of the SP ("Scalability Protocols") messaging system. This makes heavy use of go channels, internally, but it can operate on systems that lack support for cgo."
"Comdb2 is a relational database built in-house at Bloomberg L.P. over the last 14 years or so. It started with a modest goal of replacing an older home-grown system to allow databases to stay in sync easier. SQL was added early in its development, and it quickly started replacing other relational databases in addition to its original goal. Comdb2 today holds a good chunk of Bloomberg's data, and is continually developed by a dedicated team."