VirgoLab: Group items tagged "statistics"

Roger Chen

The End of Theory: The Data Deluge Makes the Scientific Method Obsolete

  • Sixty years ago, digital computers made information readable. Twenty years ago, the Internet made it reachable. Ten years ago, the first search engine crawlers made it a single database.
  • Google's founding philosophy is that we don't know why this page is better than that one: If the statistics of incoming links say it is, that's good enough.
  • The scientific method is built around testable hypotheses. These models, for the most part, are systems visualized in the minds of scientists. The models are then tested, and experiments confirm or falsify theoretical models of how the world works. This is the way science has worked for hundreds of years.
  • ...6 more annotations...
  • Peter Norvig, Google's research director, offered an update to George Box's maxim: "All models are wrong, and increasingly you can succeed without them."
  • Once you have a model, you can connect the data sets with confidence. Data without a model is just noise.
    • Roger Chen
       
      That's what Chris Anderson thinks is old-school.
  • But faced with massive data, this approach to science — hypothesize, model, test — is becoming obsolete.
    • Roger Chen
       
      Coming to that conclusion? I don't think so.
  • There is now a better way. Petabytes allow us to say: "Correlation is enough." We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.
  • What can science learn from Google?
  • This kind of thinking is poised to go mainstream.
    • Roger Chen
       
      ???
Roger Chen

How to Make Your Blog Posts More Readable | Blogging Tips from Blogsessive

  • Write a short introductory paragraph and tell me exactly what I’m about to read.
  • Make use of paragraphs and line breaks. Huge blocks of text are very hard to scan.
  • ...4 more annotations...
  • Whenever you talk about numbers and statistics, use graphics.
  • Give me more things to read. In your posts, sometimes you mention things that are related to the topic, but you don’t want to develop more. Link them to a place where I can read more about them.
  • Whenever you link to another blog post or website, use strong, descriptive text anchors.
  • Create blog post sections and use subheadings so that I can jump between paragraphs of interest.
Roger Chen

Data mining is not just a data recovery tool | Styx online

  • Data Mining is a process of discovering meaningful new correlations, patterns and trends by sifting through large amounts of data stored in repositories, using statistical, data analysis and mathematical techniques.
  • Data mining is the crucial process that helps companies better comprehend their customers. Data mining can be defined as ‘the nontrivial extraction of implicit, previously unknown, and potentially useful information from data’ and also as ‘the science of extracting useful information from large sets or databases’.
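As a toy illustration of that definition, the sketch below sifts a small transaction log for item pairs that are bought together unusually often, the kind of "implicit, previously unknown, and potentially useful" pattern the quote refers to. The basket contents and the support threshold are made-up assumptions for the example:

```python
from collections import Counter
from itertools import combinations

# Toy transaction log standing in for a large customer-purchase repository.
baskets = [
    {"bread", "milk", "eggs"},
    {"bread", "milk"},
    {"milk", "eggs", "butter"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    pair_counts.update(combinations(sorted(basket), 2))

# Report pairs whose support (share of baskets containing both) crosses a cutoff.
min_support = 0.4  # illustrative threshold: pair appears in at least 40% of baskets
for pair, count in pair_counts.most_common():
    support = count / len(baskets)
    if support >= min_support:
        print(f"{pair[0]} & {pair[1]}: support = {support:.0%}")
```

Real data mining systems apply the same co-occurrence idea at far larger scale and with statistical safeguards, but the goal is the one the quote states: surfacing useful regularities that nobody asked about in advance.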