The Right to Read Is the Right to Mine | Open Knowledge Foundation Blog
simonmart on 18 Jun 12

Researchers can find and read papers online, rather than having to manually track down print copies. Machines (computers) can index the papers and extract details (titles, keywords, etc.) in order to alert scientists to relevant material. In addition, computers can extract factual data and meaning by "mining" the content, opening up the possibility that machines could be used to make connections (and even scientific discoveries) that might otherwise remain invisible to researchers.

However, it is not generally possible today for computers to mine the content of papers, due to constraints imposed by publishers. While Open Access (OA) is improving researchers' ability to read papers (by removing access barriers), still only around 20% of scholarly papers are OA. The remainder are locked behind paywalls. Under the vast majority of subscription contracts, subscribers may read paywalled papers, but they may not mine them.

Content mining is the way that modern technology locates digital information. Because digitized scientific information comes from hundreds of thousands of different sources in today's globally connected scientific community,[2] and because current data sets can be measured in terabytes,[1] it is often no longer possible to simply read a scholarly summary in order to make scientifically significant use of such information.[3] A researcher must be able to copy information, recombine it with other data and otherwise "re-use" it so as to produce truly helpful results. Content mining is not only a deductive tool for analyzing research data; it is also how search engines operate to allow discovery of content. To prevent mining is therefore to force scientists into blind alleys and silos where only limited knowledge is accessible. Science does not progress if it cannot incorporate the most recent findings and move forward from there.