Skip to main content

Home/ Arquitectura?/ Group items tagged open-data

Rss Feed Group items tagged

Pablo Lalloni

The HDF Group - Why use HDF? - 0 views

  •  
    "HDF (Hierarchical Data Format) technologies are relevant when the data challenges being faced push the limits of what can be addressed by traditional database systems, XML documents, or in-house data formats. Leveraging the powerful HDF products and the expertise of The HDF Group, organizations realize substantial cost savings while solving challenges that seemed intractable using other data management technologies. Many HDF adopters have very large datasets, very fast access requirements, or very complex datasets. Others turn to HDF because it allows them to easily share data across a wide variety of computational platforms using applications written in different programming languages. Some use HDF to take advantage of the many open-source and commercial tools that understand HDF. Similar to XML documents, HDF files are self-describing and allow users to specify complex data relationships and dependencies. In contrast to XML documents, HDF files can contain binary data (in many representations) and allow direct access to parts of the file without first parsing the entire contents. HDF, not surprisingly, allows hierarchical data objects to be expressed in a very natural manner, in contrast to the tables of relational database. Whereas relational databases support tables, HDF supports n-dimensional datasets and each element in the dataset may itself be a complex object. Relational databases offer excellent support for queries based on field matching, but are not well-suited for sequentially processing all records in the database or for subsetting the data based on coordinate-style lookup."
Pablo Lalloni

Informe s/ BigData en el gobierno de UK - 1 views

  •  
    "1. The Government has already made a commitment to Big Data by classifying it as one of the 'Eight Great Technologies' which will propel the UK to future growth and help it stay ahead in the global race. The 'Information Economy Strategy' reports on the increase in data being generated and the importance of new types of computing power in order to reap the economic value of the data. 2. This paper sets out to cover the following areas: a) Defining Big Data b) High-level trends in Big Data c) Opportunities for Big Data applications 3. In developing this paper, a 'community of interest' has been established comprising policy leads and analysts from across government with an interest in Big Data. This paper draws on their insights, insights from the private sector, academics, and the extensive open source literature on the Big Data topic."
Pablo Lalloni

bandicoot - having fun with structured data - 0 views

  •  
    "Bandicoot is an open source programming system with a new set-based programming language, persistency capabilities, and run-time environment. The language is similar to general purpose programming languages where you write functions/methods and access data through variables. Though, in Bandicoot, you always manipulate data in sets using a small set-based algebra (the relational algebra)." "Here are the main features:   - functions are automatically exposed via HTTP using CSV for data, e.g. /List, /Append  - supports persistency via global variables (with transactions and ACID)  - can run on multiple computers to scale up the read throughput  - built in operators from the relational algebra with a simple syntax, e.g. "+" (union), "-" (minus)  - small binary ~100KB"
Pablo Lalloni

OASIS Open CSA | Advancing open standards that simplify SOA application development - 0 views

  •  
    "The OASIS Open Composite Services Architecture (CSA) Member Section advances open standards that simplify SOA application development. Open CSA brings together vendors and users from around the world to collaborate on the further development and adoption of the Service Component Architecture (SCA) and Service Data Objects (SDO) families of specifications."
Pablo Lalloni

Ferry | Big Data Development Environment Using Docker - 0 views

  •  
    "Ferry helps you create big data clusters on your local machine. Define your big data stack using YAML and share your application with Dockerfiles. Ferry supports Hadoop, Cassandra, Spark, GlusterFS, and Open MPI."
Pablo Lalloni

pandas - 0 views

  •  
    "pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language."
Pablo Lalloni

Presto | Distributed SQL Query Engine for Big Data - 0 views

  •  
    "Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes. Presto was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook."
Pablo Lalloni

Ferry | Big Data Development Environment Using Docker - 0 views

  •  
    "Ferry helps you create big data clusters on your local machine. Define your big data stack using YAML and share your application with Dockerfiles. Ferry supports Hadoop, Cassandra, Spark, GlusterFS, and Open MPI."
Pablo Lalloni

The ImageTerrier Master Project - - 0 views

  •  
    "ImageTerrier is an open-source, scalable, high-performance search engine platform for content-based image retrieval applications. The ImageTerrier platform provides a comprehensive test-bed for experimenting with image retrieval techniques. The platform incorporates a state-of-the-art implementation of the single-pass indexing technique for constructing inverted indexes and is capable of producing highly compressed index data structures. ImageTerrier is written as an extension to the open-source Terrier test-bed platform for textual information retrieval research."
Pablo Lalloni

Apache Flink: Scalable Batch and Stream Data Processing - 1 views

  •  
    "Apache Flink is an open source platform for distributed stream and batch data processing."
Pablo Lalloni

FreeIPA - 0 views

  •  
    "FreeIPA is an integrated security information management solution combining Linux (Fedora), 389 Directory Server, MIT Kerberos, NTP, DNS, Dogtag (Certificate System). It consists of a web interface and command-line administration tools. FreeIPA is an integrated Identity and Authentication solution for Linux/UNIX networked environments. A FreeIPA server provides centralized authentication, authorization and account information by storing data about user, groups, hosts and other objects necessary to manage the security aspects of a network of computers. FreeIPA is built on top of well known Open Source components and standard protocols with a very strong focus on ease of management and automation of installation and configuration tasks. Multiple FreeIPA servers can easily be configured in a FreeIPA Domain in order to provide redundancy and scalability. The 389 Directory Server is the main data store and provides a full multi-master LDAPv3 directory infrastructure. Single-Sign-on authentication is provided via the MIT Kerberos KDC. Authentication capabilities are augmented by an integrated Certificate Authority based on the Dogtag project. Optionally Domain Names can be managed using the integrated ISC Bind server. Security aspects related to access control, delegation of administration tasks and other network administration tasks can be fully centralized and managed via the Web UI or the ipa Command Line tool."
Pablo Lalloni

Google Open Sources Container Management Tool -- Virtualization Review - 1 views

  •  
    "Everything at Google, from Search to Gmail, is packaged and run in a Linux container," explained Eric Brewer, vice president of infrastructure at the Internet search giant, in announcing the open sourcing of Kubernetes. "Each week we launch more than 2 billion container instances across our global data centers, and the power of containers has enabled both more reliable services and higher, more-efficient scalability."
Pablo Lalloni

Druid | What is Druid? - 1 views

  •  
    An open-source, real-time data store designed to power interactive applications at scale.
Pablo Lalloni

Nux - Overview - 0 views

  •  
    Nux is an open-source Java toolkit making efficient and powerful XML processing easy. It is geared towards embedded use in high-throughput XML messaging middleware such as large-scale Peer-to-Peer infrastructures, message queues, publish-subscribe and matchmaking systems for Blogs/newsfeeds, text chat, data acquisition and distribution systems, application level routers, firewalls, classifiers, etc.
Pablo Lalloni

Cloudera Connector for Qlikview Download - Cloudera Support - 0 views

  •  
    "The Cloudera Connector for Qlikview enables your Enterprise's power users to access Hadoop data through the Qlikview 11.2. The driver achieves this by translating Open Database Connectivity (ODBC) calls from Qlikview into HiveQL queries. The driver supports CDH 4.1."
Pablo Lalloni

InfluxDB - 0 views

  •  
    "An open-source distributed time series database with no external dependencies. InfluxDB is the new home for all of your metrics, events, and analytics."
1 - 20 of 27 Next ›
Showing 20 items per page