Group items tagged big-data - Arquitectura?

Report on Big Data in the UK government - 1 view

www.gov.uk/...nologies_Big_Data_report_1.pdf

government big-data uk trends opportunities case-study policy

shared by Pablo Lalloni on 09 Jan 15 - No Cached

Pablo Lalloni on 09 Jan 15

"1. The Government has already made a commitment to Big Data by classifying it as one of the 'Eight Great Technologies' which will propel the UK to future growth and help it stay ahead in the global race. The 'Information Economy Strategy' reports on the increase in data being generated and the importance of new types of computing power in order to reap the economic value of the data. 2. This paper sets out to cover the following areas: a) Defining Big Data b) High-level trends in Big Data c) Opportunities for Big Data applications 3. In developing this paper, a 'community of interest' has been established comprising policy leads and analysts from across government with an interest in Big Data. This paper draws on their insights, insights from the private sector, academics, and the extensive open source literature on the Big Data topic."

BIG DATA APPLICATIONS Fast Data: Big Data Evolved - White Paper - 0 views

info.typesafe.com/ta-Big-Data-Evolved-WP_LP.html

white paper

shared by munyeco on 17 Sep 15 - No Cached

munyeco on 17 Sep 15

There is a fundamental shift occurring in Big Data, from data at rest to data in motion. In this white paper, Dean Wampler explores the ecosystem that is emerging around Fast Data and provides handy diagrams and code samples to help you:

Ferry | Big Data Development Environment Using Docker - 0 views

ferry.opencore.io/latest

development cloud-computing infrastructure devops big-data

shared by Pablo Lalloni on 05 Sep 14 - No Cached

Pablo Lalloni on 05 Sep 14

"Ferry helps you create big data clusters on your local machine. Define your big data stack using YAML and share your application with Dockerfiles. Ferry supports Hadoop, Cassandra, Spark, GlusterFS, and Open MPI."

Big Data Poster - Data Science Central - 0 views

www.datasciencecentral.com/...big-data-poster

big-data development architecture data-science operations cloud-computing infrastructure

shared by Pablo Lalloni on 05 Oct 14 - No Cached

Pablo Lalloni on 05 Oct 14

"A great resource (PDF document) about big data, originally posted on CTOvision.com."

Ferry | Big Data Development Environment Using Docker - 0 views

ferry.opencore.io/...index.html

development programming hadoop tools docker

shared by Pablo Lalloni on 05 Nov 14 - No Cached

Pablo Lalloni on 05 Nov 14

"Ferry helps you create big data clusters on your local machine. Define your big data stack using YAML and share your application with Dockerfiles. Ferry supports Hadoop, Cassandra, Spark, GlusterFS, and Open MPI."

Data Modeling for NoSQL - 0 views

www.infoq.com/...data-modeling-mongodb

data modeling nosql mongodb bigdata big-data development

shared by Pablo Lalloni on 14 May 13 - No Cached

Pablo Lalloni on 14 May 13

"Tony Tam shares tips for modeling data with MongoDB for a fast and scalable system based on his experience migrating billions of records from MySQL to MongoDB."

pachyderm/pachyderm - 0 views

github.com/...pachyderm

development big-data cloud-computing infrastructure docker data-analytics

shared by Pablo Lalloni on 14 Oct 15 - No Cached

Pablo Lalloni on 14 Oct 15

"Pachyderm is a complete data analytics solution that lets you efficiently store and analyze your data using containers. We offer the scalability and broad functionality of Hadoop, with the ease of use of Docker."

Big Data is Scaling BI and Analytics - 2 views

www.information-management.com/...-and-analytics-10021093-1.html

hadoop hdfs avro hbase chukwa business-intelligence bigdata map-reduce big-data

shared by carlosmiranda on 23 Sep 11 - No Cached

Pablo Lalloni liked it

Pablo Lalloni on 24 Sep 11

Excellent article. It should be circulated around quite a few offices.

nathanmarz/cascalog · GitHub - 0 views

github.com/cascalog

distributed-computing hadoop library programming development cloud-computing java clojure jvm

shared by Pablo Lalloni on 04 Apr 13 - No Cached

Pablo Lalloni on 04 Apr 13

"Cascalog is a fully-featured data processing and querying library for Clojure or Java. The main use cases for Cascalog are processing "Big Data" on top of Hadoop or doing analysis on your local computer. Cascalog is a replacement for tools like Pig, Hive, and Cascading and operates at a significantly higher level of abstraction than those tools."

Big Data Exploration, Visualization, Analytics - 0 views

www.zoomdata.com

data-visualization big-data real-time data-analysis tools spark hadoop elasticsearch mongodb oracle

shared by Pablo Lalloni on 11 Apr 15 - No Cached

elasticsearch/elasticsearch-hadoop - 0 views

github.com/...elasticsearch-hadoop

development programming scala library cloud-computing big-data map-reduce pig hive elasticsearch

shared by Pablo Lalloni on 04 Sep 13 - No Cached

Pablo Lalloni on 04 Sep 13

"Read and write data to/from Elasticsearch within Hadoop/MapReduce libraries. Automatically converts data to/from JSON. Supports MapReduce, Cascading, Hive and Pig."

shark - 0 views

github.com/wiki

development programming bigdata big-data distributed-computing cloud-computing spark hive

shared by Pablo Lalloni on 05 Aug 13 - No Cached

Pablo Lalloni on 05 Aug 13

"Shark is a large-scale data warehouse system for Spark designed to be compatible with Apache Hive. It can execute Hive QL queries up to 100 times faster than Hive without any modification to the existing data or queries. Shark supports Hive's query language, metastore, serialization formats, and user-defined functions, providing seamless integration with existing Hive deployments and a familiar, more powerful option for new ones."

Hama - a general BSP framework on top of Hadoop - 0 views

hama.apache.org

development programming bsp hadoop cloud-computing distributed-computing bigdata big-data

shared by Pablo Lalloni on 05 Aug 13 - No Cached

Pablo Lalloni on 05 Aug 13

"Apache Hama is a pure BSP (Bulk Synchronous Parallel) computing framework on top of HDFS (Hadoop Distributed File System) for massive scientific computations such as matrix, graph and network algorithms. Today, many practical data processing applications require a more flexible programming abstraction model that is compatible to run on highly scalable and massive data systems (e.g., HDFS, HBase, etc). A message passing paradigm beyond Map-Reduce framework would increase its flexibility in its communication capability. Bulk Synchronous Parallel (BSP) model fills the bill appropriately. Some of its significant advantages over MapReduce and MPI are:
* Supports message passing paradigm style of application development
* Provides a flexible, simple, and easy-to-use small APIs
* Enables to perform better than MPI for communication-intensive applications
* Guarantees impossibility of deadlocks or collisions in the communication mechanisms"
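
The BSP model named in the Hama summary above can be illustrated with a minimal, single-process sketch. This is not Hama's API (Hama is a Java framework); `run_bsp`, `sum_compute`, and `CHUNKS` are hypothetical names made up for this example. It only shows the core idea: peers compute locally, send messages, and those messages become visible only after a barrier, at the start of the next superstep.

```python
from collections import defaultdict

# Hypothetical input: each peer owns one chunk of numbers.
CHUNKS = [[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]]


def run_bsp(num_peers, num_supersteps, compute):
    """Simulate BSP supersteps sequentially in one process.

    Messages sent during superstep s are delivered at superstep s + 1,
    which models the barrier synchronization between supersteps.
    """
    state = [dict() for _ in range(num_peers)]
    inbox = defaultdict(list)
    for step in range(num_supersteps):
        outbox = defaultdict(list)

        def send(dest, msg, _outbox=outbox):
            # Queue a message for delivery after the barrier.
            _outbox[dest].append(msg)

        for peer in range(num_peers):
            compute(peer, step, state[peer], inbox[peer], send)
        # Barrier: every peer has finished this superstep, so the queued
        # messages become the next superstep's inboxes.
        inbox = outbox
    return state


def sum_compute(peer, step, state, messages, send):
    """Two-superstep distributed sum: local sums, then aggregation on peer 0."""
    if step == 0:
        send(0, sum(CHUNKS[peer]))
    elif step == 1 and peer == 0:
        state["total"] = sum(messages)


final_state = run_bsp(len(CHUNKS), 2, sum_compute)
print(final_state[0]["total"])
```

Because no peer reads a message before the barrier, the order in which peers run inside a superstep does not affect the result, which is the property behind the "no deadlocks or collisions" claim in the quoted summary.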

Presto | Distributed SQL Query Engine for Big Data - 0 views

prestodb.io

development programming opensource database presto bigdata sql facebook

shared by Pablo Lalloni on 22 May 14 - No Cached

Pablo Lalloni on 22 May 14

"Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes. Presto was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook."

Luna - 0 views

www.luna-lang.org

big-data development data-processing data-pipeline architecture programming

shared by Pablo Lalloni on 29 Sep 18 - No Cached

Pablo Lalloni on 29 Sep 18

Luna is a data processing and visualization environment built on a principle that people need an immediate connection to what they are building. It provides an ever-growing library of highly tailored, domain specific components and an extensible framework for building new ones.

http://res.infoq.com/downloads/pdfdownloads/presentations/QConSF2012-TonyTam-Datamodeli... - 0 views

res.infoq.com/...rdocumentorienteddatabases.pdf

data modeling nosql bigdata big-data development mongodb

shared by Pablo Lalloni on 14 May 13 - No Cached

Pablo Lalloni on 14 May 13

Data Modeling with NoSQL (slides)

Iteratees in Big Data at Klout « Klout Engineering - 0 views

engineering.klout.com/...iteratees-in-big-data-at-klout

development programming scala iteratees play! functional-programming bigdata

shared by Pablo Lalloni on 05 Feb 13 - No Cached

Do you know Big Data? - 0 views

DIIGO_FILE_HOME/9xbh/dxg6

development cloud-computing distributed-computing compute-cloud bigdata big-data

shared by Pablo Lalloni on 05 Oct 14 - No Cached

Log(Graph): A Near-Optimal High-Performance Graph Representation - 0 views

people.csail.mit.edu/...loggraph.pdf

shared by Pablo Lalloni on 29 Sep 18 - No Cached

Pablo Lalloni on 29 Sep 18

big-data graph graph-processing architecture development programming

GravityLabs/HPaste - 0 views

github.com/HPaste

development programming scala library hbase hadoop big-data bigdata mapreduce map-reduce

shared by Pablo Lalloni on 17 Oct 13 - No Cached

Pablo Lalloni on 17 Oct 13

"HPaste unlocks the rich functionality of HBase for a Scala audience. In so doing, it attempts to achieve the following goals:
* Provide a strong, clear syntax for querying and filtration
* Perform as fast as possible while maintaining idiomatic Scala client code -- the abstractions should not show up in a profiler!
* Re-articulate HBase's data structures rather than force it into an ORM-style atmosphere.
* A rich set of base classes for writing MapReduce jobs in hadoop against HBase tables.
* Provide a maximum amount of code re-use between general Hbase client usage, and operation from within a MapReduce job.
* Use Scala's type system to its advantage--the compiler should verify the integrity of the schema.
* Be a verbose DSL--minimize boilerplate code, but be human readable!"
