
BI-TAGS: Group items tagged scheduler



BI Brief - Four Legs of a Successful Business Intelligence (BI) Project Team

  • The four legs: 1. Project Sponsorship and Governance; 2. Project Management; 3. Development Team (Core Team); 4. Extended Project Team.
  • 1. Project Sponsorship and Governance: IT and the business should form a BI steering committee to sponsor and govern design, development, deployment, and ongoing support. The committee needs both the CIO and a business executive, such as the CFO, COO, or a senior VP of marketing/sales, to commit budget, time, and resources. The business sponsor needs the project to succeed; the CIO is committed to what is being built and how.
  • 2. Project Management: Project management includes managing daily tasks, reporting status, and communicating to the extended project team, steering committee, and affected business users. The project management team needs extensive business knowledge, BI expertise, DW architecture background, and people management, project management, and communications skills. It includes three functions or members:
    - Project development manager: Responsible for deliverables, managing team resources, monitoring tasks, reporting status, and communications. Requires a hands-on IT manager with a background in iterative development, who understands the changes caused by this approach and the impact on the business, project resources, schedule, and the trade-offs.
    - Business advisor: Works within the sponsoring business organization and is responsible for the deliverables of the business resources on the project's extended team. Serves as the business advocate on the project team and the project advocate within the business community. Often, the business advisor is a project co-manager who defers daily IT tasks to the IT project manager but oversees the budget and business deliverables.
    - BI/DW project advisor: Has enough expertise with architectures and technologies to guide the project team on their use. Ensures that architecture, data models, databases, ETL code, and BI tools are all being used effectively and conform to best practices and standards.
  • 3. Development Team (Core Team): The core project team is divided into four sub-teams:
    - Business requirements: This sub-team may have business people who understand IT systems, or IT people who understand the business. In either case, the team represents the business and its interests. It is responsible for gathering and prioritizing business needs; translating them into IT systems requirements; interacting with the business on data quality and completeness; and ensuring the business provides feedback on how well the solutions generated meet their needs.
    - BI architecture: Develops the overall BI architecture, selects the appropriate technology, creates the data models, maps the overall data workflow from source systems to BI analytics, and oversees the ETL and BI development teams from a technical perspective.
    - ETL development: Receives the business and data requirements, as well as the target data models to be used by BI analytics, and develops the ETL code needed to gather data from the appropriate source systems into the BI databases. Often, a system analyst who is an expert in the source systems, such as SAP, is part of the team to provide knowledge of the data sources, customizations, and data quality.
    - BI development: Creates the reports or analytics that the business users will interact with to do their jobs. This is often a very iterative process and requires much interaction with the business users.
  • 4. Extended Project Team: Several functions required by the project team are often accomplished through an "extended" team:
    - Players: A group of business users signed up to "play with" or test the BI analytics and reports as they are developed and to provide feedback to the core development team. This is a virtual team that gets together at specific periods of the project, but they are committed to this role during those periods.
    - Testers: A group of resources gathered, similarly to the virtual team above, to perform more extensive QA testing of the BI analytics, ETL processes, and overall systems. You may have project members test other members' work, for example having the ETL team test the BI analytics and vice versa.
    - Operators: IT operations is often separated from the development team, but it is critical that they are involved from the beginning of the project to ensure that the systems are developed and deployed within your company's infrastructure. Key functions are database administration, systems administration, and networks. This extended team may also include help desk and training resources if those are usually provided outside of development.

Top Mistakes to Avoid in Analytics Implementations | StatSlice Business Intelligence an...

  • Mistake 1. Not putting a strong interdisciplinary team together. It is impossible to put together an analytics platform without understanding the needs of the customers who will use it. Sounds simple, right? Who wouldn't do that? You'd be surprised how many analytics projects are wrapped up by IT because "they think" they know the customer needs. Not assembling the right team is clearly the biggest mistake companies make. Many times what is on your mind (if you're an IT person willing to admit it) is converting all those favorite company reports. That should not be your goal. Your goal is to create a system, human-engineered with customers, financial people, IT folks, analysts, and others, that gives people new and exciting ways to look at information. It should give you new insights and new competitive information. If you don't get the right team put together, you'll find someone longing for the good old days and their old dusty reports, or, worse yet, still finding ways to generate those old dusty reports.
    Mistake 2. Not having the right talent to design, build, run, and update your analytics system. There is undeniably high demand for business analytics specialists, and few of them really know what to do unless they have been burned a few times, survived, and then built successful BA systems. This is reflected in the fact that so many analytics vendors offer, or often recommend, third-party consulting and training to help organizations develop their business analytics skills. Work hard to build a three-way partnership between the vendor, your own team, and an implementation partner. If you develop those relationships, the risk of failure goes way down.
  • Mistake 3. Putting the wrong kind of analyst or designer on the project. This is related to Mistake 2, with some subtle differences. People have different skill sets, so you need to make sure the person you're considering putting on the project is the right "kind." For example, when you put the design together you need both drill-down and summary models, and each has different types of users. Does this person know how to do both? Likewise, an inexperienced analyst might believe vendor claims without being able to verify them for functionality or time to implement.
    Mistake 4. Not understanding how clean the data you are getting is, and the time frame needed to get it clean. Profile your data to understand the quality of your source data (a minimal profiling sketch follows this list). This lets you adjust your system to compensate for some of those issues or, more importantly, push data fixes back to your source systems. Ensure high-quality data or you risk upsetting your customers. If you don't have a good understanding of the quality of your data, you could easily find yourself way behind schedule even though the actual analytics and business intelligence framework you are building is coming along fine.
    Mistake 5. Picking the wrong tools. How often do organizations buy software tools that just sit on the shelf? This often comes from management rushing into a quick decision based on a few demos. Picking the right analytics tools requires an in-depth understanding of your requirements as well as the strengths and weaknesses of the tools you are evaluating. The best way to achieve this understanding is to have an unbiased implementation partner build a proof of concept with a subset of your own data and prove out the functionality of the tools you are considering.
    Bottom Line. Think things through carefully. Make sure you put the right team together. Have a data cleansing plan. If the hype sounds too good to be true, have someone prove it to you.
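As a concrete illustration of the data profiling recommended in Mistake 4, here is a minimal sketch in Python using pandas. The file name and the order_amount column are hypothetical placeholders, not anything from the article:

    import pandas as pd

    # Load a hypothetical source extract; file and column names are placeholders.
    df = pd.read_csv("source_extract.csv")

    # Basic profile: volume, nulls, duplicates, and cardinality per column.
    print("rows:", len(df))
    print(df.isna().sum().sort_values(ascending=False))  # nulls per column
    print("duplicate rows:", df.duplicated().sum())
    print(df.nunique().sort_values())                    # distinct values per column

    # Spot-check obvious rule violations before loading, e.g. negative amounts.
    if "order_amount" in df.columns:
        print("negative amounts:", (df["order_amount"] < 0).sum())

Even a profile this crude usually surfaces the null-ridden and duplicate-heavy columns that drive the schedule risk the article warns about.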

Google Reader (250)

  • What this means in practice is that when the BI Server component starts up, it creates and reserves a number of threads in advance, determined by parameters including SERVER_THREAD_RANGE (a minimal sketch of this pre-created worker-pool pattern follows this list).
  • You can see these threads running and ready to perform tasks for the BI Server component by using a tool such as Process Explorer for Windows.
  • Thinking it through a bit, any given single query is, to a certain extent, only really going to use a small fraction of the CPUs available on a server, because it's not the BI Server that runs queries in parallel, it's the underlying database. For example, a single analysis against a single Oracle Database data source would only really need a single BI Server thread to handle the query request, but when the underlying database receives the query, it might use a large number of its CPUs to process it, returning results back to the BI Server to pass back to the Presentation Server for display to the user.
  • The BI Server wouldn’t have any use for any more query threads, as it can’t really do anything with them – the exception to this being queries that generate multiple physical SQLs, for example to join data from multiple sources together and return a single set of data to the user, for which the BI Server could benefit from a higher CPU count if each of these queries in turn led to lots of threads being used – but two queries, in themselves, don’t neccessarily require two CPUs, because of course the BI Server, and the underlying CPUs, are themselves multi-threaded.
  • To conclude then: all things being equal, the BI Server should make use of all of the CPUs that the underlying operating system presents to it, with the OS itself deciding which threads are scheduled against which CPUs. In theory, all CPUs on the server are available to each BI Server component, but each OS is different and it might be worth experimenting if you're sure that certain CPUs aren't being used. This is most probably unlikely, though, and the main reason you'd really consider vertical scale-out of BI Server components is fault tolerance, or a 32-bit OS where each process can only see a subset of the total overall memory. Bear in mind that however many CPUs the BI Server has available to it, for queries that send just a single SQL statement down to the underlying database server, adding more or faster CPUs isn't going to help: only a single (or so) thread is needed to send the query from the BI Server to the database, and it's the database that's doing all of the work. More CPU would only help with compilation and post-aggregation work, and with handling a higher number of concurrent users. Invest in a better underlying database instead, sort out your data model, and make sure your data source back-end is as optimised as possible.
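To make the pre-created worker-pool idea above concrete, here is a minimal Python sketch. It illustrates the general pattern only; the pool size and the fake query are invented, and this is not OBIEE's actual implementation:

    import queue
    import threading
    import time

    POOL_SIZE = 40  # invented; OBIEE expresses this as a range (SERVER_THREAD_RANGE)
    work_queue = queue.Queue()

    def worker():
        # Each worker blocks waiting for a request; during a query it mostly
        # waits again, because the parallel work happens in the database.
        while True:
            sql = work_queue.get()
            if sql is None:
                break
            time.sleep(0.1)  # stand-in for a blocking database round trip
            work_queue.task_done()

    # Threads are created up front and sit idle, which is what you would see
    # in a tool such as Process Explorer.
    threads = [threading.Thread(target=worker, daemon=True) for _ in range(POOL_SIZE)]
    for t in threads:
        t.start()

    work_queue.put("SELECT ... FROM sales")  # one analysis ~ one busy thread
    work_queue.join()

The post's point falls out of the sketch: adding workers (or CPUs) on this side helps little when each request spends its life blocked on the database.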

Magic Quadrant for Business Intelligence and Analytics Platforms

  • Integration
    - BI infrastructure: All tools in the platform use the same security, metadata, administration, portal integration, object model and query engine, and should share the same look and feel.
    - Metadata management: Tools should leverage the same metadata, and should provide a robust way to search, capture, store, reuse and publish metadata objects, such as dimensions, hierarchies, measures, performance metrics and report layout objects.
    - Development tools: The platform should provide a set of programmatic and visual tools, coupled with a software developer's kit, for creating analytic applications, integrating them into a business process, and/or embedding them in another application.
    - Collaboration: Enables users to share and discuss information and analytic content, and/or to manage hierarchies and metrics via discussion threads, chat and annotations.
  • Information Delivery
    - Reporting: Provides the ability to create formatted and interactive reports, with or without parameters, with highly scalable distribution and scheduling capabilities.
    - Dashboards: Includes the ability to publish Web-based or mobile reports with intuitive interactive displays that indicate the state of a performance metric compared with a goal or target value. Increasingly, dashboards are used to disseminate real-time data from operational applications, or in conjunction with a complex-event processing engine.
    - Ad hoc query: Enables users to ask their own questions of the data, without relying on IT to create a report. In particular, the tools must have a robust semantic layer to enable users to navigate available data sources.
    - Microsoft Office integration: Sometimes, Microsoft Office (particularly Excel) acts as the reporting or analytics client. In these cases, it is vital that the tool provides integration with Microsoft Office, including support for document and presentation formats, formulas, data "refreshes" and pivot tables. Advanced integration includes cell locking and write-back.
    - Search-based BI: Applies a search index to structured and unstructured data sources and maps them into a classification structure of dimensions and measures that users can easily navigate and explore using a search interface.
    - Mobile BI: Enables organizations to deliver analytic content to mobile devices in a publishing and/or interactive mode, and takes advantage of the mobile client's location awareness.
  • Analysis
    - Online analytical processing (OLAP): Enables users to analyze data with fast query and calculation performance, enabling a style of analysis known as "slicing and dicing" (a small sketch of this idea follows this list). Users are able to navigate multidimensional drill paths. They also have the ability to write back values to a proprietary database for planning and "what if" modeling purposes. This capability could span a variety of data architectures (such as relational or multidimensional) and storage architectures (such as disk-based or in-memory).
    - Interactive visualization: Gives users the ability to display numerous aspects of the data more efficiently by using interactive pictures and charts, instead of rows and columns.
    - Predictive modeling and data mining: Enables organizations to classify categorical variables, and to estimate continuous variables, using mathematical algorithms.
    - Scorecards: Take the metrics displayed in a dashboard a step further by applying them to a strategy map that aligns key performance indicators (KPIs) with a strategic objective.
    - Prescriptive modeling, simulation and optimization: Supports decision making by enabling organizations to select the correct value of a variable based on a set of constraints for deterministic processes, and by modeling outcomes for stochastic processes.
  • These capabilities enable organizations to build precise systems of classification and measurement to support decision making and improve performance. BI and analytic platforms enable companies to measure and improve the metrics that matter most to their businesses, such as sales, profits, costs, quality defects, safety incidents, customer satisfaction, on-time delivery and so on. BI and analytic platforms also enable organizations to classify the dimensions of their businesses — such as their customers, products and employees — with more granular precision. With these capabilities, marketers can better understand which customers are most likely to churn. HR managers can better understand which attributes to look for when recruiting top performers. Supply chain managers can better understand which inventory allocation levels will keep costs low without increasing out-of-stock incidents.
  • descriptive, diagnostic, predictive and prescriptive analytics
  • data discovery vendors — such as QlikTech, Salient Management Company, Tableau Software and Tibco Spotfire — received more positive feedback than vendors offering OLAP cube and semantic-layer-based architectures.
  • Microsoft Excel users are often disaffected business BI users who are unable to conduct the analysis they want using enterprise, IT-centric tools. Since these users are the typical target users of data discovery tool vendors, Microsoft's aggressive plans to enhance Excel will likely pose an additional competitive threat beyond the mainstreaming and integration of data discovery features as part of the other leading, IT-centric enterprise platforms.
  • Building on the in-memory capabilities of PowerPivot in SQL Server 2012, Microsoft introduced a fully in-memory version of Microsoft Analysis Services cubes, based on the same data structure as PowerPivot, to address the needs of organizations that are turning to newer in-memory OLAP architectures over traditional, multidimensional OLAP architectures to support dynamic and interactive analysis of large datasets. Above-average performance ratings suggest that customers are happy with the in-memory improvements in SQL Server 2012 compared with SQL Server 2008 R2, which ranks below the survey average.
  • "Gartner defines the business intelligence (BI) and analytics platform market as a software platform that delivers 15 capabilities across three categories: integration, information delivery and analysis."

Installing Hadoop for Fedora & Oracle Linux (Single Node Cluster) | accretion infinity

  • Hadoop is a framework written in Java for running applications on large clusters of commodity hardware; it incorporates features similar to those of the Google File System (GFS) and of the MapReduce computing paradigm. Hadoop's HDFS is a highly fault-tolerant distributed file system and, like Hadoop in general, is designed to be deployed on low-cost hardware. It provides high-throughput access to application data and is suitable for applications that have large data sets.
  • Some of the Hadoop projects covered:
    - HDFS: A distributed filesystem that runs on large clusters of commodity machines.
    - MapReduce: A distributed data processing model and execution environment that runs on large clusters of commodity machines (a minimal word-count sketch follows this list).
    - Pig: A data flow language and execution environment for exploring very large datasets. Pig runs on HDFS and MapReduce clusters.
    - HBase: A distributed, column-oriented database. HBase uses HDFS for its underlying storage, and supports both batch-style computations using MapReduce and point queries (random reads).
    - ZooKeeper: A distributed, highly available coordination service. ZooKeeper provides primitives such as distributed locks that can be used for building distributed applications.
    - Oozie: A workflow scheduler system to manage Apache Hadoop jobs.
  • The walkthrough uses Oracle Linux as the operating system and Hadoop 1.1.2 or 1.2.0.
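To give a feel for the MapReduce model listed above, here is the classic word-count example as a pair of Hadoop Streaming scripts in Python. This is a generic illustration, not taken from the linked guide:

    # ---- mapper.py: emit "word<TAB>1" for every word read from stdin ----
    import sys

    for line in sys.stdin:
        for word in line.split():
            print(word + "\t1")

    # ---- reducer.py: sum the counts per word. Hadoop sorts mapper output
    # ---- by key, so all lines for a given word arrive consecutively.
    import sys

    current_word, current_count = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t", 1)
        if word == current_word:
            current_count += int(count)
        else:
            if current_word is not None:
                print(current_word + "\t" + str(current_count))
            current_word, current_count = word, int(count)
    if current_word is not None:
        print(current_word + "\t" + str(current_count))

On a Hadoop 1.x install the job would be launched through the streaming jar, roughly: hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar -input /in -output /out -mapper "python mapper.py" -reducer "python reducer.py" -file mapper.py -file reducer.py. The exact jar path varies by version, so treat that invocation as an assumption.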