
VirgoLab: Group items tagged "data mining"


Roger Chen

Data mining is not just a data recovery tool | Styx online

  • Data Mining is a process of discovering meaningful new correlations, patterns and trends by sifting through large amounts of data stored in repositories, using statistical, data analysis and mathematical techniques.
  • Data mining is the crucial process that helps companies better comprehend their customers. Data mining can be defined as ‘the nontrivial extraction of implicit, previously unknown, and potentially useful information from data’ and also as ‘the science of extracting useful information from large sets or databases’.
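To make the quoted definitions concrete, here is a minimal sketch of the kind of pattern discovery they describe: a toy frequent-pair scan that sifts a pile of transaction records for co-occurring items. The baskets and the support threshold are invented for illustration.

```python
# Sift stored records for a previously unknown, potentially useful pattern:
# which pairs of items appear together often? (Toy frequent-itemset scan;
# the transactions and threshold are made-up illustrative values.)
from collections import Counter
from itertools import combinations

transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"beer", "chips"},
    {"bread", "milk"},
    {"beer", "chips", "bread"},
]

pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

min_support = 0.4  # report pairs present in at least 40% of baskets
for pair, count in pair_counts.most_common():
    support = count / len(transactions)
    if support >= min_support:
        print(pair, f"support={support:.2f}")
```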
Roger Chen

Analysis: data mining doesn't work for spotting terrorists

  • Automated identification of terrorists through data mining (or any other known methodology) is neither feasible as an objective nor desirable as a goal of technology development efforts.
  • criminal prosecutors and judges are concerned with determining the guilt or innocence of a suspect in the wake of an already-committed crime; counter-terror officials are concerned with preventing crimes from occurring by identifying suspects before they've done anything wrong.
  • The problem: preventing a crime by someone with no criminal record
  • In fact, most terrorists have no criminal record of any kind that could bring them to the attention of authorities or work against them in court.
  • As the NRC report points out, not only is the training data lacking, but the input data that you'd actually be mining has been purposely corrupted by the terrorists themselves.
  • So this application of data mining bumps up against the classic GIGO (garbage in, garbage out) problem in computing, with the terrorists deliberately feeding the system garbage.
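The arithmetic behind that conclusion is worth spelling out. The numbers below are illustrative assumptions, not figures from the NRC report: even granting the classifier implausibly good accuracy, the tiny base rate of actual terrorists means nearly everyone flagged is innocent, and that is before the adversary corrupts the input data.

```python
# Base-rate sketch: why rare-event detection drowns in false positives.
# All numbers are illustrative assumptions.
population = 300_000_000          # people screened
actual_terrorists = 3_000         # assumed true positives in the population

sensitivity = 0.99                # fraction of real terrorists flagged
false_positive_rate = 0.01       # fraction of innocents flagged

flagged_terrorists = actual_terrorists * sensitivity
flagged_innocents = (population - actual_terrorists) * false_positive_rate

precision = flagged_terrorists / (flagged_terrorists + flagged_innocents)
print(f"innocents flagged: {flagged_innocents:,.0f}")               # ~3 million
print(f"chance a flagged person is a terrorist: {precision:.2%}")   # ~0.1%
```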
Roger Chen

Data Randomization

  • Attacks that exploit memory errors are still a serious problem. We present data randomization, a new technique that provides probabilistic protection against these attacks by xoring data with random masks. Data randomization uses static analysis to partition instruction operands into equivalence classes: it places two operands in the same class if they may refer to the same object in an execution that does not violate memory safety. Then it assigns a random mask to each class and it generates code instrumented to xor data read from or written to memory with the mask of the memory operand's class. Therefore, attacks that violate the results of the static analysis have unpredictable results. We implemented a data randomization prototype that compiles programs without modifications and can prevent many attacks with low overhead. Our prototype prevents all the attacks in our benchmarks while introducing an average runtime overhead of 11% (0% to 27%) and an average space overhead below 1%.
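The masking scheme in the abstract can be modeled in a few lines. This is a conceptual Python sketch, not the paper's compiler-based instrumentation: one random mask per equivalence class, instrumented stores and loads that XOR with the class's mask, and a direct overwrite (standing in for a memory-safety violation) that decodes to an unpredictable value.

```python
# Toy model of data randomization: instrumented accesses XOR with a
# per-class random mask, so writes that bypass the instrumentation
# (e.g., an out-of-bounds overwrite) read back as garbage.
import secrets

class RandomizedMemory:
    def __init__(self, classes):
        # One random mask per static equivalence class of operands.
        self.masks = {c: secrets.randbits(32) for c in classes}
        self.cells = {}

    def store(self, cls, addr, value):
        # Instrumented write: XOR with the class mask, then store.
        self.cells[addr] = value ^ self.masks[cls]

    def load(self, cls, addr):
        # Instrumented read: load, then XOR with the class mask.
        return self.cells[addr] ^ self.masks[cls]

mem = RandomizedMemory(classes=["user_buf", "return_addr"])
mem.store("return_addr", 0x1000, 0xDEADBEEF)

# An attacker overwrites the cell directly, without knowing the mask:
mem.cells[0x1000] = 0x41414141

# The instrumented load yields a value the attacker cannot predict.
print(hex(mem.load("return_addr", 0x1000)))
```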
Roger Chen

Semantic Library » Zotero and semantic principles

  • Our Zotero Server, connected to the client, will enable all kinds of new collaboration opportunities and data-mining of aggregated collections. We also plan to provide hooks into high-performance computing projects like the SEASR text-mining project based at UIUC.
  • Data mining is becoming a major trend in eResearch as computing power increases and more and more researchers have direct access to open data sets. In the future, we won’t just be citing articles, figures, images, movies, and books, we’ll also be citing specific data points.
Roger Chen

The End Of The Scientific Method… Wha….? « Life as a Physicist

  • His basic thesis is that when you have so much data you can map out every connection, every correlation, then the data becomes the model. No need to derive or understand what is actually happening — you have so much data that you can already make all the predictions that a model would let you do in the first place. In short — you no longer need to develop a theory or hypothesis - just map the data!
  • First, in order for this to work you need to have millions and millions and millions of data points. You need, basically, every single outcome possible, with all possible other factors. Huge amounts of data. That does not apply to all branches of science.
  • The second problem with this approach is you will never discover anything new. The problem with new things is there is no data on them!
  • Correlations are a way of catching a scientist’s attention, but the models and mechanisms that explain them are how we make the predictions that not only advance science, but generate practical applications. One only needs to look at a promising field that lacks a strong theoretical foundation—high-temperature superconductivity springs to mind—to see how badly the lack of a theory can impact progress.
  • Anderson is right — we are entering a new age where the ability to mine these large amounts of data is going to open up whole new levels of understanding.
  • This is a new tool, and it will open up all sorts of doors for us. But the end of the scientific method? No — because that implies an end of discovery. An end of new things.
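Both halves of that argument fit in a tiny sketch: over densely sampled data, a pure nearest-neighbor lookup predicts as well as the underlying law with no hypothesis at all (the data becomes the model), yet it is useless outside the data, which is exactly the "you will never discover anything new" objection. The quadratic law here is an assumption invented for the demo.

```python
# "The data becomes the model": predict by looking up the nearest
# recorded case, with no theory of the underlying mechanism.
def law(x):
    return 3 * x * x + 2               # hidden mechanism we never model

# Dense observations on [0, 1]: the huge-amounts-of-data regime.
data = [(i / 1000, law(i / 1000)) for i in range(1001)]

def predict(x):
    # No hypothesis, no fit: return the outcome of the closest data point.
    return min(data, key=lambda p: abs(p[0] - x))[1]

print(predict(0.5678), law(0.5678))   # inside the data: lookup ~= truth
print(predict(2.0), law(2.0))         # outside the data: lookup fails badly
```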
Roger Chen

Current Approaches to Data Mining Blogs - ESIWiki

  • Summary of the current direction of blog research using data mining.
Roger Chen

Data Mining Source Code Newsletter - Blogs

  • Download free data mining source code in C/C++, C#, Visual Basic, Visual Basic.NET, Java, and other programming languages.
Roger Chen

Data Mining in Course Management Systems: The Case of Moodle

  • Cristobal Romero, Sebastian Ventura, Enrique Garcia, "Data mining in course management systems: Moodle case study and tutorial", Computers & Education (2007, in press, corrected proof).
Roger Chen

ACM SIGKDD

  • ACM Special Interest Group on Knowledge Discovery and Data Mining.
Roger Chen

PAKDD 2009: 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining

Roger Chen

CRISP-DM - Home

  • CRISP = Cross Industry Standard Process for Data Mining; its six phases are sketched below.
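For reference, CRISP-DM organizes a project into six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment, with a loop from evaluation back to the start when the model does not meet the business objectives. The skeleton below just encodes that published phase order; the evaluation check is a placeholder, not part of the standard.

```python
# Skeleton of the CRISP-DM phase sequence, with the standard loop from
# evaluation back to business understanding. Step logic is a placeholder.
PHASES = [
    "business understanding",
    "data understanding",
    "data preparation",
    "modeling",
    "evaluation",
]

def evaluation_ok(iteration):
    # Placeholder: a real project checks model quality against the
    # business objectives fixed in the first phase.
    return iteration >= 2

def run_project(max_iterations=3):
    for iteration in range(1, max_iterations + 1):
        for phase in PHASES:
            print(f"iteration {iteration}: {phase}")
        if evaluation_ok(iteration):
            print("deployment")
            return
    print("stopped without deploying")

run_project()
```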
Roger Chen

KNIME - Konstanz Information Miner

shared by Roger Chen on 01 Aug 08
  • KNIME, pronounced [naim], is a modular data exploration platform that enables the user to visually create data flows (often referred to as pipelines), selectively execute some or all analysis steps, and later investigate the results through interactive views on data and models.
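KNIME builds these flows visually, but the underlying idea (named nodes in a pipeline, with the option to execute only some steps and inspect intermediate results) can be sketched in code. The node names and data below are invented for illustration.

```python
# A miniature dataflow: run the whole pipeline, or stop after a named
# node to inspect the intermediate result. Nodes and data are invented.
nodes = [
    ("read",      lambda _: [3, 1, 4, 1, 5, 9, 2, 6]),
    ("filter",    lambda rows: [r for r in rows if r > 2]),
    ("transform", lambda rows: [r * 10 for r in rows]),
    ("aggregate", lambda rows: sum(rows) / len(rows)),
]

def run(until=None):
    """Execute pipeline steps up to (and including) the named node."""
    result = None
    for name, step in nodes:
        result = step(result)
        if name == until:
            break
    return result

print(run(until="filter"))   # selectively execute: peek at filtered rows
print(run())                 # execute the full flow
```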
Roger Chen

The End of Theory: The Data Deluge Makes the Scientific Method Obsolete

  • Sixty years ago, digital computers made information readable. Twenty years ago, the Internet made it reachable. Ten years ago, the first search engine crawlers made it a single database.
  • Google's founding philosophy is that we don't know why this page is better than that one: If the statistics of incoming links say it is, that's good enough.
  • The scientific method is built around testable hypotheses. These models, for the most part, are systems visualized in the minds of scientists. The models are then tested, and experiments confirm or falsify theoretical models of how the world works. This is the way science has worked for hundreds of years.
  • Peter Norvig, Google's research director, offered an update to George Box's maxim: "All models are wrong, and increasingly you can succeed without them."
  • Once you have a model, you can connect the data sets with confidence. Data without a model is just noise.
    • Roger Chen: That's what Chris Anderson thinks is old-school.
  • But faced with massive data, this approach to science — hypothesize, model, test — is becoming obsolete.
    • Roger Chen: Come to a conclusion? I don't think so.
  • There is now a better way. Petabytes allow us to say: "Correlation is enough." We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.
  • What can science learn from Google?
  • This kind of thinking is poised to go mainstream.
    • Roger Chen: ???
Roger Chen

Data Mining, Analytics and Artificial Intelligence: Financial Services Business Analyti...

  • A universal evidence-based business analytics model for the financial services industry.