DSHR's Blog: Ithaka's Perspective on Digital Preservation - 0 views
-
Second, there is very little coverage of Web archiving, which is clearly by far the largest and most important digital preservation initiative both for current and future readers. The Internet Archive rates only two mentions, in the middle of a list of activities and in a footnote. This is despite the fact that archive.org is currently the 211th most visited site in the US (272nd globally) with over 5.5M registered users, adding over 500 per day, and serving nearly 4M unique IPs per day. For comparison, the Library of Congress currently ranks 1439th in the US (5441st globally). The Internet Archive's Web collection alone probably dwarfs all other digital preservation efforts combined both in size and in usage. Not to mention their vast collections of software, digitized books, audio, video and TV news.. Rieger writes: There is a lack of understanding about how archived websites are discovered, used, and referenced. “Researchers prefer to cite the original live-web as it is easier and shorter,” pointed out one of the experts. “There is limited awareness of the existence of web archives and lack of community consensus on how to treat them in scholarly work. The problems are not about technology any more, it is about usability, awareness, and scholarly practices.” The interviewee referred to a recent CRL study based on an analysis of referrals to archived content from papers that concluded that the citations were mainly to articles about web archiving projects. It is surprising that the report doesn't point out that the responsibility for educating scholars in the use of resources lies with the "experts and thought leaders" from institutions such as the University of California, Michigan State, Cornell, MIT, NYU and Virginia Tech. That these "experts and thought leaders" don't consider the Internet Archive to be a resource worth mentioning might have something to do with the fact that their scholars don't know that they should be using it. A report whose first major section, entitled "What's Working Well", totally fails to acknowledge the single most important digital preservation effort of the last two decades clearly lacks credibility
-
Finally, there is no acknowledgement that the most serious challenge facing the field is economic. Except for a few corner cases, we know how to do digital preservation, we just don't want to pay enough to have it done. Thus the key challenge is to achieve some mixture of significant increase in funding for, and significant cost reduction in the processes of, digital preservation. Information technology processes naturally have very strong economies of scale, which result in winner-take-all markets (as W. Brian Arthur pointed out in 1985). It is notable that the report doesn't mention the winners we already have, in Web and source code archiving, and in emulation. All are at the point where a competitor is unlikely to be viable. To be affordable, digital preservation needs to be done at scale. The report's orientation is very much "let a thousand flowers bloom", which in IT markets only happens at a very early stage. This is likely the result of talking only to people nurturing a small-scale flower, not to people who have already dominated their market niche. It is certainly a risk that each area will have a single point of failure, but trying to fight against the inherent economics of IT pretty much guarantees ineffectiveness.
-
1) The big successes in the field haven't come from consensus building around a roadmap, they have come from idiosyncratic individuals such as Brewster Kahle, Roberto di Cosmo and Jason Scott identifying a need and building a system to address it no matter what "the community" thinks. We have a couple of decades of experience showing that "the community" is incapable of coming to a coherent consensus that leads to action on a scale appropriate to the problem. In any case, describing road-mapping as "research" is a stretch. 2) Under severe funding pressure, almost all libraries have de-emphasized their custodial role of building collections in favor of responding to immediate client needs. Rieger writes: As one interviewee stated, library leaders have “shifted their attention from seeing preservation as a moral imperative to catering to the university’s immediate needs.” Regrettably, but inevitably given the economics of IT markets, this provides a market opportunity for outsourcing. Ithaka has exploited one such opportunity with Portico. This bullet does describe "research" in the sense of "market research". Success is, however, much more likely to come from the success of an individual effort than from a consensus about what should be done among people who can't actually do it. 3) In the current climate, increased funding for libraries and archives simply isn't going to happen. These institutions have shown a marked reluctance to divert their shrinking funds from legacy to digital media. Thus the research topic with the greatest leverage in turning funds into preserved digital content is into increasing the cost-effectiveness of the tools, processes and infrastructure of digital preservation.