
Home/ Groups/ SCSU Analytics

How to Ensure Data Lakes Success | SmartData Collective - 0 views

  • it enables businesses to have a broader, less constrained view of their data
  • Data lakes are defined as "a massive, easily accessible, centralized repository of large volumes of structured and unstructured data".
  • businesses must have some use cases in mind before constructing a data lake.
  • ...5 more annotations...
  • Oliver likewise suggests that businesses work with data scientists. Data scientists and engineers provide the necessary expertise required to make the data lake a successful data and analytics tool.
  • Configurable Ingestion Workflows: New sources of external information will continuously become available. Make sure to have an easy, secure, and trackable content-ingestion workflow mechanism that can rapidly add this new information into the data lake.
  • Knowledgent states that "without a high-degree of automated and mandatory metadata management, a Data Lake will rapidly become a Data Swamp" and that "attributes like data lineage, data quality, and usage history are vital to usability".
  • Data lakes must be industry-specific to cater to the industry's unique needs.
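Knowledgent's warning above, that a lake without mandatory metadata management becomes a "Data Swamp", can be made concrete with a minimal sketch. The `ingest` wrapper and field names below are hypothetical, not from any particular data-lake product; the point is that lineage, quality, and usage attributes are attached automatically at ingestion time.

```python
import hashlib
import json
from datetime import datetime, timezone

def ingest(lake, name, payload, source):
    """Store a payload in the (in-memory) lake with mandatory metadata."""
    record = {
        "data": payload,
        "metadata": {
            "source": source,  # data lineage: where the content came from
            "ingested_at": datetime.now(timezone.utc).isoformat(),
            # a checksum gives a basic data-quality / integrity attribute
            "checksum": hashlib.sha256(payload.encode()).hexdigest(),
            "access_count": 0,  # seed for usage history
        },
    }
    lake[name] = record
    return record["metadata"]

lake = {}
meta = ingest(lake, "sales_2015.csv", "region,amount\nwest,100", source="crm-export")
print(json.dumps(meta, indent=2))
```

Because every entry carries the same mandatory attributes, downstream users can always answer "where did this come from and can I trust it?" without asking the original owner.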

How to create effective data visualizations - 0 views

  • 5. Enough with the text, already
  • 1. Present data that matter to your audience (and not just you)
  • Data scientists by definition are naturally inquisitive and love to quantify things. That makes them a good fit for the job. The bad news is that they sometimes become a little too enthusiastic about data for data’s sake and will overwhelm their audience with irrelevant information.
  • ...5 more annotations...
  • 2. Tell a story, simply
  • 3. Choose appropriate visualizations
  • 4. Make sure graphics accurately reflect the data
  • "Try to pick a visualization that depicts not only the level of a variable, but puts it in context for how important it is,"
  • Heat maps and bubble charts are a good example of this. You can see how important a particular region or customer or division is because it takes up more space on the map. You can show other attributes of the variables with color -- e.g., red for underperforming, green for doing well. With a visual like this, managers can quickly see where the problem is, and at the same time they can see how important it is."
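The bubble-chart advice above, encoding importance as area and performance as color, reduces to a small mapping from data to visual attributes. This is an illustrative sketch with made-up regional numbers; note that area, not radius, should be proportional to importance, so radius scales with the square root of the share.

```python
import math

# Hypothetical regional results: (revenue share, performance vs. target)
regions = {"West": (0.45, 1.10), "East": (0.30, 0.80), "South": (0.25, 1.02)}

def bubble_encoding(share, performance):
    """Map a data point to visual attributes: size shows importance,
    color shows whether the region is over or under target."""
    radius = math.sqrt(share)  # area proportional to share
    color = "green" if performance >= 1.0 else "red"
    return radius, color

encoded = {name: bubble_encoding(*vals) for name, vals in regions.items()}
for name, (radius, color) in encoded.items():
    print(f"{name}: radius={radius:.2f}, color={color}")
```

A manager scanning this encoding sees in one glance both where the problem is (red) and how much it matters (bubble size), which is exactly the dual message the quote describes.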

Partners Say New Azure Machine Learning Service Could Be Microsoft's Secret Weapon In T... - 0 views

  • Azure Machine Learning is a public cloud-based service that lets developers embed predictive analytics into their applications
  • Machine learning software has been around for years but isn't easy to use or deploy, and it's also expensive, Sirosh said. Packaging up machine-learning-as-a-cloud service solves these problems, and by being first to bring it to market, Microsoft has a head start on the likes of Google, Amazon and IBM, he said. "I think, on this particular front, that we are the leaders," Sirosh told CRN.   
  • Hiring Sirosh was something of a coup for Microsoft. He joined last July from Amazon, where he spent close to nine years as a vice president in various machine-learning-related roles.

Azure Machine Learning "Hello World" using R - Concurrency, Inc. - 0 views

  • I am going to show you a very simple experiment where I will create a basic “Hello World” program using an R script. Yes! This is another advantage of Azure ML Studio: you can incorporate your R scripts and create a knowledge management system for your company.

Valley heavyweight Vinod Khosla says replacing doctors with data crunchers is good medi... - 0 views

  • “Humans are not good when 500 variables affect a disease. We can handle three to five to seven, maybe,” he said. “We are guided too much by opinions, not by statistical science.”

"No More Excuses": Michael M. Crow on Analytics (EDUCAUSE Review) | EDUCAUSE.edu - 0 views

  • Combining the highest levels of academic excellence, inclusiveness to a broad demographic, and maximum societal impact.
  • the number of first-time, full-time, low-income Arizona freshmen increased 647 percent from FY2003 through FY2011
  • President Crow attributes much of this success to the use of analytics.
  • ...20 more annotations...
  • If you are instead trying to educate a broader spectrum of the population, including elite students, and you aren't using analytics, you won't know what's going on.
  • use of analytics is being driven by the objective of student success
  • at ASU, you've created a culture of innovation using analytics.
  • We've had distinguished professors in the hard sciences, such as physics, say they feel ashamed that for thirty years they didn't know why certain students were learning or weren't learning. They had no idea of the reasons
  • we have infinitely more information to allow us to help students be successful. Analytics are not the end. They are the means to the end: the successful world-class university graduate who has come to us from any family, any background, any income level.
  • Crow: For us, to be a public university means engaging the demographic complexity of our society as a whole. It means understanding that demographic complexity. It means designing the institution to deal with that demographic complexity. And it means accepting highly differentiated types of intelligence: analytical intelligence, emotional intelligence. Students are not of one type but are of many, many types.
  • At ASU, I could see that we would not be able to innovate fast enough without analytics. Without analytics, we can't understand what's going on, we can't understand the complexity of what we're trying to do, and we can't measure our progress. We needed tools to help us make better decisions—about everything. How should we design academic advising? How should we design individual courses? How should we design the overall pedagogical structure of the institution? Every facet of the institution requires robust analytics.
  • Crow: Our biggest problem has been that launching an analytical tool that is not 100 percent reliable creates tension and frustration and anger in the institution.
  • We wanted an academic advisor or a student affairs dean or an associate dean to be able to have a 360-degree analytical view of a student
  • FERPA is concerned with the university releasing information outside of the institution
  • And so, with the right training and the right controls and the right discipline, we were able to build this 360 analytical tool, which has been remarkably helpful for us in terms of creating a new way to advise students.
  • partly motivated by the Virginia Tech shooting incident,
  • At ASU, we are all responsible for the care, well-being, and success of our students.
  • It can't be just the English department or just the football team that is aware of unusual behavior, with no other campus department knowing.
  • We need to be graduating 90 percent, but we can't do that without enhanced analytics,
  • when the federal government calculates graduation rates, it calculates rates only for first-time, full-time freshmen. It does not calculate whether or not students actually graduate from somewhere else. It does not count transfer students who ultimately graduate. So we have a grossly underreported performance from some institutions, creating bad data that then goes into bad policymaking. We need more flexibility and more adaptability, and we need more recognition that students are moving around.
  • There are no more excuses. If you use these analytical tools, you will know where you are, you will know what you're doing, you will know if what you are doing is working or not, and therefore you will know whether or not you need to be doing new things customized to fit your particular school or your particular demographic to be successful. We are underutilizing these tools.
  • We're trying to change from the old agricultural cycle—or whatever it is that semesters are currently based on, because nobody really knows—to cycles based on learning outcomes. That might mean a course could take two years and other courses could take three weeks. How can we allow students to individualize their learning in a structured institution?
  • Crow: We're next headed away from hard, confined definitions of learning timeframes.
  • Allow them to game and simulate their academic careers, and allow them to engage 24/7 academic advice without having to speak to an academic advisor. That's where we've found the most positive impact.
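Crow's point about federal graduation rates can be checked with simple arithmetic: the official method counts only first-time, full-time freshmen who graduate at the same institution, ignoring those who transfer out and graduate elsewhere. The cohort numbers below are invented purely to show the size of the distortion.

```python
def federal_rate(entering, graduated_here):
    """Official method: share of first-time, full-time freshmen
    who graduate from the same institution."""
    return graduated_here / entering

def outcome_rate(entering, graduated_here, graduated_elsewhere):
    """Counts students who ultimately graduate anywhere."""
    return (graduated_here + graduated_elsewhere) / entering

# Illustrative cohort of 1,000 entering freshmen
entering, grad_here, grad_elsewhere = 1000, 550, 150

print(f"federal rate: {federal_rate(entering, grad_here):.0%}")
print(f"outcome rate: {outcome_rate(entering, grad_here, grad_elsewhere):.0%}")
```

With these numbers the institution reports 55% while 70% of its students actually earn a degree somewhere, which is the "grossly underreported performance" the quote warns feeds into bad policymaking.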

This Algorithm Can Predict Your Success At University ⚙ Co.Labs ⚙ code + comm... - 0 views

  • Administrators call it a “student success algorithm,” but its official name is Course Signals--and if it works, it could change the way modern universities are run.
  • six-year graduation rates are up 21.48%
  • How do students feel about having their academic careers predicted for them? How would you feel if your next student advisor was an algorithm?
  • ...4 more annotations...
  • the idea is that the more feedback students receive about their current standing, the higher the grade they will ultimately achieve.
  • getting the technology accepted by students came down to two things: understandability and access. To achieve the former, Pistilli decided to adopt the familiar metaphor of traffic-light signals to help contextualize a student’s success as they continue along a particular route.
  • a personalized message is then generated using their name, lecturer, and specific topical references (the latter to stop the message looking too automated) and sent out by email. These messages don’t simply offer students predictions about their likelihood of eventual success (or failure), but also give strategic, tactical directions so that students can work to either maintain or improve their overall grades.
  • Oddly enough, students love the system. “Overwhelmingly students tell us that they want Course Signals in every single class,” Pistilli says. “They crave the feedback.”
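The traffic-light metaphor described above can be sketched as a simple threshold mapping from a predicted success score to a signal, plus the personalized message that gets emailed out. The thresholds, names, and message text here are illustrative assumptions, not Purdue's actual model.

```python
def course_signal(score, red_below=0.4, yellow_below=0.7):
    """Map a predicted probability of success (0-1) to a traffic-light signal."""
    if score < red_below:
        return "red"     # at risk: intervene now
    if score < yellow_below:
        return "yellow"  # borderline: nudge toward resources
    return "green"       # on track

def feedback_message(name, course, score):
    """Personalized note with a specific, actionable direction per signal."""
    signal = course_signal(score)
    tips = {
        "red": "please meet with your instructor this week",
        "yellow": "review the recent problem sets before the midterm",
        "green": "keep up your current study routine",
    }
    return f"Hi {name}, your signal in {course} is {signal}: {tips[signal]}."

print(feedback_message("Ada", "STAT 101", 0.35))
```

The familiar red/yellow/green encoding is what makes the prediction understandable at a glance, which the article identifies as one of the two keys to student acceptance.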

The Popularity of Data Analysis Software | r4stats.com - 0 views

  • R resides in an interestingly large gap between the other domain-specific languages, SAS and SPSS. R has not only caught up with SPSS, but surpassed it with around 50% more job postings. MATLAB has many similarities to R so it’s interesting to see that it has only around half the job postings. Note that these are specific to analytics and MATLAB has many engineering jobs that are not counted in this total.
  • SAS is still far ahead of R in analytics job postings
  • Figure 2a shows the number of articles found for each software package for all the years that Google Scholar can search. SPSS is by far the most dominant package, likely due to its balance between power and ease-of-use. SAS has around half as many, followed by MATLAB and R.
  • ...2 more annotations...
  • Minitab, Systat and JMP are all growing but at a much lower rate than either R or Stata.
  • R still dominates the discussions on the more statistically-oriented forums

Moving Your SQL Databases to Azure - Things to Know - The Microsoft MVP Award Program B... - 0 views

  • COMPARISON WITH ON-PREM: Azure SQL is SQL Server behind the scenes, so most of the functionality is already there, including tables, views, stored procedures, triggers, functions, primary and foreign keys, and clustered indexes.
  • Of course there is no Windows authentication, and it currently uses SQL authentication only.
  • There is no need to maintain, balance, upgrade, or patch the server, as this is all done by Microsoft.
  • ...14 more annotations...
  • You also can't reboot the server, so if you end up with a runaway query you may have to open a support ticket.
  • There are always 3 copies of the database for high availability during disaster recovery.
  • There is a requirement for tables in a SQL Azure database to have a clustered index. This is necessary to keep the 3 copies of the database in sync.
  • The maximum SQL Azure database size is currently 500GB, but you can get around this using SQL federations and partitioning your data across multiple nodes.
  • There are a number of partially supported and unsupported features. A few of the ones I run into regularly are:
    • You cannot use the USE [databasename] sql statement. You must physically switch between databases in your application.
    • Remove from indexes: NOT FOR REPLICATION
    • Remove from your tables: WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    You can review a full list of unsupported features here: https://azure.microsoft.com/en-us/documentation/articles/sql-database-transact-sql-information/
  • When I migrate a database from SQL to SQL Azure, I typically follow this process using SSMS:
    • Create a blank database on the SQL Azure database server
    • Generate the scripts from the original database to create the database objects, excluding users
    • Do a find and replace to remove any unsupported features such as the two mentioned above
    • Run the create database object scripts against the new SQL Azure database
    • Create the users and apply permissions for the new database
    • Use SSMS or SSIS to copy the data over to the new database.
  • The SQL Database Management Portal is a web-based, scaled-down version of SSMS. You can create objects, and run queries and execution plans. But there is no GUI for some of the security features, like creating users and logins. I find that it's a friendlier experience to create the database server in the portal and do everything else using SSMS.
  • SQL Azure databases are protected by an automatic backup system.
  • The length of time the backups are retained depends on what tier you buy – 7 days for Basic, 14 days for Standard and 35 days for Premium.
  • The point-in-time restore is a self-service feature that costs you nothing unless you use it. If you use it, you pay regular rates for the new database that gets restored. You get all of the protection without any additional cost.
  • SECURITY: You are in complete control of IP-specific access to SQL Azure Database, at both the server AND database level. No one has access by default.
  • every time your IP changes, you have to update your firewall rules.
  • SERVICE TIERS AND PERFORMANCE LEVELS: There are three tiers, with several levels of performance within them. To summarize the Microsoft definitions:
    • Basic: Best suited for a small database, typically supporting one single active operation at a given time.
    • Standard: The go-to option for most cloud applications, supporting multiple concurrent queries.
    • Premium: Designed for high transactional volume, supporting a large number of concurrent users and requiring the highest level of business continuity capabilities.
  • Costs can range anywhere from $7 per month for the Basic tier, through $19 to $183 per month for a 250GB database in the Standard tier, to $566 to $8,500 per month in the Premium tier.
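The find-and-replace step in the migration process above can be sketched with regular expressions. This is a minimal illustration covering only the two unsupported features the post calls out (NOT FOR REPLICATION and the WITH (...) ON [PRIMARY] index options); a real migration script would need to handle the full list linked above.

```python
import re

def strip_unsupported(script: str) -> str:
    """Remove two T-SQL constructs the post lists as unsupported in SQL Azure."""
    # Drop the replication hint from column/index definitions
    script = re.sub(r"\s*NOT FOR REPLICATION", "", script)
    # Drop WITH (...) index options together with the ON [PRIMARY] filegroup clause
    script = re.sub(r"WITH\s*\([^)]*\)\s*ON\s*\[PRIMARY\]", "", script)
    return script

original = (
    "CREATE TABLE dbo.Orders (Id INT IDENTITY(1,1) NOT FOR REPLICATION, "
    "CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (Id) "
    "WITH (PAD_INDEX = OFF, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY])"
)
print(strip_unsupported(original))
```

Running this over the object-creation scripts generated by SSMS automates the manual find-and-replace before the scripts are executed against the new SQL Azure database.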