Group items tagged hdfs - Arquitectura?

The HDF Group - Why use HDF? - 0 views

www.hdfgroup.org/why_hdf

development programming data-storage data-analysis data-manipulation library

shared by Pablo Lalloni on 06 Apr 13

Pablo Lalloni on 06 Apr 13

"HDF (Hierarchical Data Format) technologies are relevant when the data challenges being faced push the limits of what can be addressed by traditional database systems, XML documents, or in-house data formats. Leveraging the powerful HDF products and the expertise of The HDF Group, organizations realize substantial cost savings while solving challenges that seemed intractable using other data management technologies. Many HDF adopters have very large datasets, very fast access requirements, or very complex datasets. Others turn to HDF because it allows them to easily share data across a wide variety of computational platforms using applications written in different programming languages. Some use HDF to take advantage of the many open-source and commercial tools that understand HDF. Similar to XML documents, HDF files are self-describing and allow users to specify complex data relationships and dependencies. In contrast to XML documents, HDF files can contain binary data (in many representations) and allow direct access to parts of the file without first parsing the entire contents. HDF, not surprisingly, allows hierarchical data objects to be expressed in a very natural manner, in contrast to the tables of relational database. Whereas relational databases support tables, HDF supports n-dimensional datasets and each element in the dataset may itself be a complex object. Relational databases offer excellent support for queries based on field matching, but are not well-suited for sequentially processing all records in the database or for subsetting the data based on coordinate-style lookup."

lyda/hdfs-docker-registry Repository | Docker Hub Registry - Repositories of Docker Images - 3 views

registry.hub.docker.com/...hdfs-docker-registry

docker hdfs registry images infrastructure tools

shared by Pablo Lalloni on 13 Oct 14

Pablo Lalloni on 13 Oct 14

"This is an HDFS based docker-registry."

Hama - a general BSP framework on top of Hadoop - 0 views

hama.apache.org

development programming bsp hadoop cloud-computing distributed-computing bigdata big-data

shared by Pablo Lalloni on 05 Aug 13

Pablo Lalloni on 05 Aug 13

"Apache Hama is a pure BSP (Bulk Synchronous Parallel) computing framework on top of HDFS (Hadoop Distributed File System) for massive scientific computations such as matrix, graph and network algorithms. Today, many practical data processing applications require a more flexible programming abstraction model that is compatible to run on highly scalable and massive data systems (e.g., HDFS, HBase, etc). A message passing paradigm beyond Map-Reduce framework would increase its flexibility in its communication capability. Bulk Synchronous Parallel (BSP) model fills the bill appropriately. Some of its significant advantages over MapReduce and MPI are:

* Supports message passing paradigm style of application development
* Provides a flexible, simple, and easy-to-use small APIs
* Enables to perform better than MPI for communication-intensive applications
* Guarantees impossibility of deadlocks or collisions in the communication mechanisms"
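
As a note on the BSP model the quote describes: a computation proceeds in supersteps, each consisting of local computation, message sending, and a barrier after which the sent messages become visible. The following is only a minimal single-process sketch of that idea in Python. It is not Hama's API (Hama is a Java framework), and the names `run_bsp`, `compute`, and `global_sum` are hypothetical, chosen for illustration.

```python
from collections import defaultdict


def run_bsp(num_peers, compute, max_supersteps=10):
    """Toy BSP loop (hypothetical helper): compute, send, barrier, repeat."""
    inboxes = {p: [] for p in range(num_peers)}
    state = {p: None for p in range(num_peers)}
    for step in range(max_supersteps):
        outboxes = defaultdict(list)
        sent_any = False
        for peer in range(num_peers):
            # Local computation: a peer sees only its own state and inbox.
            state[peer], outgoing = compute(peer, step, state[peer], inboxes[peer])
            for dest, msg in outgoing:
                outboxes[dest].append(msg)
                sent_any = True
        # Barrier: messages sent in this superstep are delivered
        # only at the start of the next one.
        inboxes = {p: outboxes[p] for p in range(num_peers)}
        if not sent_any:
            break  # no peer sent anything, so the computation halts
    return state


def global_sum(values):
    """Each peer ends up holding the sum of all peers' local values."""
    n = len(values)

    def compute(peer, step, state, inbox):
        if step == 0:
            # Superstep 0: keep the local value and send it to every other peer.
            return values[peer], [(d, values[peer]) for d in range(n) if d != peer]
        # Superstep 1: add everything received; send nothing further.
        return state + sum(inbox), []

    return run_bsp(n, compute)


if __name__ == "__main__":
    print(global_sum([1, 2, 3, 4]))
```

Because peers exchange data only through messages that are delivered at the barrier, no peer ever waits on another in the middle of a superstep, which is the property behind the deadlock-freedom claim in the quote. A real framework would run the peers in parallel across machines rather than in a loop.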

Hadoop Distributed File System HDFS: A Cartoon Is Worth A * myNoSQL - 2 views

nosql.mypopescu.com/...system-hdfs-a-cartoon-is-worth

hadoop

shared by carlosmiranda on 23 Feb 12

Apache Flume - 0 views

flume.apache.org

distributed data-pipe aggregation log hdfs hbase

shared by Pablo Lalloni on 07 Feb 13

cloudera/cdk - 0 views

github.com/cdk

development hadoop library cloudera etl solr hdfs hbase

shared by Pablo Lalloni on 26 Jun 13

Pablo Lalloni on 26 Jun 13

"The Cloudera Development Kit, or CDK for short, is a set of libraries, tools, examples, and documentation focused on making it easier to build systems on top of the Hadoop ecosystem. The goals of the CDK are:

* Codify expert patterns and practices for building data-oriented systems and applications.
* Let developers focus on business logic, not plumbing or infrastructure.
* Provide smart defaults for platform choices.
* Support piecemeal adoption via loosely-coupled modules."

Juju Charms - Charm: hadoop - 1 views

jujucharms.com/...hadoop

juju ubuntu openstack hadoop operations infrastructure ganglia hdfs mapreduce mapred dfs cloud-computing cloud

shared by Pablo Lalloni on 26 Apr 12

InfoQ: Apache Hadoop 1.0.0 Supports Kerberos Authentication, Apache HBase and RESTful A... - 1 views

www.infoq.com/...apache-hadoop-1.0.0

hadoop apache releases whatsnew features kerberos authentication

shared by Pablo Lalloni on 02 Feb 12

InfoQ: Oozie by Example - 1 views

www.infoq.com/...oozieexample

articles cloud-computing oozie process distributed workflow bpm hadoop hdfs

shared by Pablo Lalloni on 23 Aug 11

Big Data is Scaling BI and Analytics - 2 views

www.information-management.com/...-and-analytics-10021093-1.html

hadoop hdfs avro hbase chukwa business-intelligence bigdata map-reduce big-data

shared by carlosmiranda on 23 Sep 11

Pablo Lalloni liked it

Pablo Lalloni on 24 Sep 11

Excellent article. It should be circulated around a few offices.

Cloudera Connector for Qlikview Download - Cloudera Support - 0 views

ccp.cloudera.com/...onnector+for+Qlikview+Download

apache hadoop connectors tools operations cloud-computing development qlikview hdfs hbase infrastructure

shared by Pablo Lalloni on 10 Apr 13

Pablo Lalloni on 10 Apr 13

"The Cloudera Connector for Qlikview enables your Enterprise's power users to access Hadoop data through the Qlikview 11.2. The driver achieves this by translating Open Database Connectivity (ODBC) calls from Qlikview into HiveQL queries. The driver supports CDH 4.1."
