BI-TAGS: Group items tagged hadoop

cezarovidiu

Installing Hadoop for Fedora & Oracle Linux (Single Node Cluster) | accretion infinity

  • Hadoop is a framework written in Java for running applications on large clusters of commodity hardware. It incorporates features similar to those of the Google File System (GFS) and of the MapReduce computing paradigm. Hadoop’s HDFS is a highly fault-tolerant distributed file system and, like Hadoop in general, is designed to be deployed on low-cost hardware. It provides high-throughput access to application data and is suitable for applications that have large data sets.
  • Some of the Hadoop projects we will talk about are:
      • HDFS: A distributed filesystem that runs on large clusters of commodity machines.
      • MapReduce: A distributed data processing model and execution environment that runs on large clusters of commodity machines.
      • Pig: A data flow language and execution environment for exploring very large datasets. Pig runs on HDFS and MapReduce clusters.
      • HBase: A distributed, column-oriented database. HBase uses HDFS for its underlying storage, and supports both batch-style computations using MapReduce and point queries (random reads).
      • ZooKeeper: A distributed, highly available coordination service. ZooKeeper provides primitives such as distributed locks that can be used for building distributed applications.
      • Oozie: A workflow scheduler system to manage Apache Hadoop jobs.
  • The setup uses Oracle Linux as the operating system and Hadoop 1.1.2 or 1.2.0.
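
The MapReduce model mentioned in the annotations above can be illustrated with a minimal, single-process sketch. This is not Hadoop code and does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and exist only to show the three stages that a real cluster would distribute across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key (done by the framework on a cluster)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence inside one process."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input; on a cluster these lines would be read from HDFS.
    sample = ["Pig runs on HDFS", "HBase uses HDFS"]
    print(word_count(sample))
```

On a real cluster the map and reduce functions run in parallel on different nodes and the shuffle moves data over the network; the sketch only shows how the stages fit together.
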
cezarovidiu

Hadoop Tutorial - YDN

  • Hadoop is designed to efficiently process large volumes of information by connecting many commodity computers together to work in parallel. The theoretical 1,000-CPU machine described earlier would cost a very large amount of money, far more than 1,000 single-CPU or 250 quad-core machines. Hadoop will tie these smaller and more reasonably priced machines together into a single cost-effective compute cluster.
cezarovidiu

Rittman Mead Consulting - The Changing World of Business Intelligence

  • Schema on write: This is the traditional approach for Business Intelligence. A model, often dimensional, is built as part of the design process. This model is an abstraction of the complexity of the underlying systems, put in business terms. The purpose of the model is to allow the business users to interrogate the data in a way they understand.
  • The model is instantiated through physical database tables, and the data is loaded through an ETL (extract, transform and load) process that takes data from one or more source systems, transforms it to fit the model, then loads it into the model.
  • The key thing is that the model is determined before the data is finally written and the users are very much guided or driven by the model in how they query the data and what results they can get from the system. The designer must anticipate the queries and requests in advance of the user asking the questions.
  • Schema on read: Schema on read works on a different principle and is more common in the Big Data world. The data is not transformed in any way when it is stored; the data store acts as a big bucket. The modelling of the data only occurs when the data is read. Map/Reduce is the clearest example: the mapping is the understanding of the data structure. Hadoop provides a large distributed file system that is very good at storing large volumes of data, but stored data is only potential. It is the mapping of this data that provides value, and this is done when the data is read, not written.
  • New World Order: Whereas Business Intelligence used to always be driven by the model, the ETL process to populate the model and the reporting tool to query the model, there is now an approach where the data is collected in its raw form and advanced statistical or analytical tools are used to interrogate the data. An example of one such tool is R.
  • The choice of approach is often driven by what the user wants to find out. If the question is clearly formed and the sources of data required to answer it are well understood (for example, how many units of a product have we sold?), then the traditional schema on write approach is best.
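
The contrast between schema on write and schema on read can be made concrete with a small sketch. Everything here is an assumption made for illustration: the sample records, the two-column model `SALES_COLUMNS`, and the function names are hypothetical and are not taken from the article.

```python
import json
from typing import Iterable, List, Set, Tuple

# Hypothetical raw events as they might arrive from a source system.
RAW_EVENTS = [
    '{"product": "widget", "units": 3, "region": "EU"}',
    '{"product": "gadget", "units": 5}',
    '{"product": "widget", "units": 2, "note": "promo"}',
]

# Schema on write: the model is fixed before any data is loaded.
SALES_COLUMNS = ("product", "units")


def load_with_schema(raw_lines: Iterable[str]) -> List[Tuple[str, int]]:
    """ETL step: transform each record to fit the model, dropping everything else."""
    table = []
    for line in raw_lines:
        record = json.loads(line)
        table.append(tuple(record[column] for column in SALES_COLUMNS))
    return table


def units_sold_on_write(table: List[Tuple[str, int]], product: str) -> int:
    """Query the pre-modelled table; only anticipated questions can be asked."""
    return sum(units for name, units in table if name == product)


def units_sold_on_read(raw_lines: Iterable[str], product: str) -> int:
    """Schema on read: interpret the raw records only at query time."""
    total = 0
    for line in raw_lines:
        record = json.loads(line)
        if record.get("product") == product:
            total += int(record.get("units", 0))
    return total


def regions_on_read(raw_lines: Iterable[str]) -> Set[str]:
    """A question the fixed model did not anticipate; the raw store can still answer it."""
    records = (json.loads(line) for line in raw_lines)
    return {record["region"] for record in records if "region" in record}


if __name__ == "__main__":
    table = load_with_schema(RAW_EVENTS)
    print(units_sold_on_write(table, "widget"))
    print(units_sold_on_read(RAW_EVENTS, "widget"))
    print(regions_on_read(RAW_EVENTS))
```

Both approaches answer the clearly formed "units sold" question, but only the raw store can answer the later question about regions, because the ETL step discarded that field when fitting the data to the model.
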
cezarovidiu

Hadoop HBase 1.0 debuts amid stiff NoSQL competition | InfoWorld

  • Databases consisting of billions of rows and columns can be stored in HBase and retrieved through its key-based get and scan API, or via conventional SQL queries when a SQL layer such as Apache Phoenix is placed on top. An HBase database can scale out by simply adding nodes to an existing cluster.
cezarovidiu

What is business intelligence (BI)? - Definition from WhatIs.com

  • Business intelligence is a data analysis process aimed at boosting business performance by helping corporate executives and other end users make more informed decisions.
  • Business intelligence (BI) is a technology-driven process for analyzing data and presenting actionable information to help corporate executives, business managers and other end users make more informed business decisions.
  • BI encompasses a variety of tools, applications and methodologies that enable organizations to collect data from internal systems and external sources, prepare it for analysis, develop and run queries against the data, and create reports, dashboards and data visualizations to make the analytical results available to corporate decision makers as well as operational workers.
  • The potential benefits of business intelligence programs include accelerating and improving decision making; optimizing internal business processes; increasing operational efficiency; driving new revenues; and gaining competitive advantages over business rivals. BI systems can also help companies identify market trends and spot business problems that need to be addressed.
  • BI data can include historical information, as well as new data gathered from source systems as it is generated, enabling BI analysis to support both strategic and tactical decision-making processes.
  • BI programs can also incorporate forms of advanced analytics, such as data mining, predictive analytics, text mining, statistical analysis and big data analytics.
  • In many cases though, advanced analytics projects are conducted and managed by separate teams of data scientists, statisticians, predictive modelers and other skilled analytics professionals, while BI teams oversee more straightforward querying and analysis of business data.
  • Business intelligence data typically is stored in a data warehouse or smaller data marts that hold subsets of a company's information. In addition, Hadoop systems are increasingly being used within BI architectures as repositories or landing pads for BI and analytics data, especially for unstructured data, log files, sensor data and other types of big data. Before it's used in BI applications, raw data from different source systems must be integrated, consolidated and cleansed using data integration and data quality tools to ensure that users are analyzing accurate and consistent information.
  • In addition to BI managers, business intelligence teams generally include a mix of BI architects, BI developers, business analysts and data management professionals; business users often are also included to represent the business side and make sure its needs are met in the BI development process.
  • To help with that, a growing number of organizations are replacing traditional waterfall development with Agile BI and data warehousing approaches that use Agile software development techniques to break up BI projects into small chunks and deliver new functionality to end users on an incremental and iterative basis.
  • Consultant Howard Dresner is credited with first proposing the term "business intelligence" in 1989 as an umbrella category for applying data analysis techniques to support business decision-making processes.
  • Business intelligence is sometimes used interchangeably with business analytics; in other cases, business analytics is used either more narrowly to refer to advanced data analytics or more broadly to include both BI and advanced analytics.
cezarovidiu

Big Data is a Solution Looking for a Problem: Gartner - CIO India News on | CIO.in

  • Big Data is forecast to drive $34 billion of IT spending in 2013 and create 4.4 million IT jobs by 2015, but it is currently still a solution looking for a problem, according to analyst firm Gartner.
  • While businesses are keen to start mining their data stores for useful insights, and many are already experimenting with technologies like Hadoop, the biggest challenge is working out what question you are trying to answer.
cezarovidiu

What Skills Does an Oracle BI Developer Need in 2011?

  • OBIEE 11g skills, both in terms of new functionality (mapping, analyses, KPIs and Scorecards etc.) and new infrastructure (WebLogic, EM, OPSS etc.)
  • A smattering of Essbase skills, focused mainly on the integration between OBIEE and Essbase (and the many workarounds and gotchas)
  • Good ODI skills, covering the basics but also being able to write knowledge modules, integrate with OBIEE, and handle deployment and migration
  • Solid database skills: OBIEE gave the illusion through aggregates etc. that database tuning was redundant, but time has shown it’s by far the biggest success factor in a project. Get the database design and optimisation wrong, and your project is toast. You need to know partitioning, materialized views and index types, and increasingly you need to get yourself on an Exadata project, as customers are buying the technology but you can’t teach it to yourself at home
  • BI Apps skills, but watch out for everything changing when BI Apps 11g comes out, and be prepared to learn the Fusion Apps and JDeveloper if you want to stay in the game
  • Looking to the future, keep an eye on technologies such as in-memory (TimesTen), mid-tier caching (Coherence), plus technologies such as Business Activity Monitoring (BAM), “big data” (Hadoop, large data sets, NoSQL), complex event processing and maybe products such as Qlikview, just in case Oracle buys them, or at least to know what the competition are up to or, more importantly, pitching to your boss
  • The other thing to bear in mind of course, if you’re an Oracle BI developer, is that you need to have great business, communication and data modeling skills.