Bucknell Digital Pedagogy & Scholarship: Group items tagged "amazon"

Todd Suomela

DSHR's Blog: Ithaka's Perspective on Digital Preservation

  • Second, there is very little coverage of Web archiving, which is clearly by far the largest and most important digital preservation initiative for both current and future readers. The Internet Archive rates only two mentions, in the middle of a list of activities and in a footnote. This is despite the fact that archive.org is currently the 211th most visited site in the US (272nd globally), with over 5.5M registered users, adding over 500 per day, and serving nearly 4M unique IPs per day. For comparison, the Library of Congress currently ranks 1439th in the US (5441st globally). The Internet Archive's Web collection alone probably dwarfs all other digital preservation efforts combined, both in size and in usage, not to mention its vast collections of software, digitized books, audio, video and TV news.

    Rieger writes: "There is a lack of understanding about how archived websites are discovered, used, and referenced. 'Researchers prefer to cite the original live web as it is easier and shorter,' pointed out one of the experts. 'There is limited awareness of the existence of web archives and lack of community consensus on how to treat them in scholarly work. The problems are not about technology any more; it is about usability, awareness, and scholarly practices.'" The interviewee referred to a recent CRL study, based on an analysis of referrals to archived content from papers, which concluded that the citations were mainly to articles about web archiving projects.

    It is surprising that the report doesn't point out that the responsibility for educating scholars in the use of resources lies with the "experts and thought leaders" from institutions such as the University of California, Michigan State, Cornell, MIT, NYU and Virginia Tech. That their scholars don't know they should be using the Internet Archive might have something to do with the fact that these "experts and thought leaders" don't consider it a resource worth mentioning. A report whose first major section, entitled "What's Working Well", totally fails to acknowledge the single most important digital preservation effort of the last two decades clearly lacks credibility. [A sketch of resolving a citable archived snapshot follows these notes.]
  • Finally, there is no acknowledgement that the most serious challenge facing the field is economic. Except for a few corner cases, we know how to do digital preservation; we just don't want to pay enough to have it done. Thus the key challenge is to achieve some mixture of a significant increase in funding for, and a significant cost reduction in the processes of, digital preservation. Information technology processes naturally have very strong economies of scale, which result in winner-take-all markets (as W. Brian Arthur pointed out in 1985). It is notable that the report doesn't mention the winners we already have, in Web and source code archiving, and in emulation. All are at the point where a competitor is unlikely to be viable. To be affordable, digital preservation needs to be done at scale. The report's orientation is very much "let a thousand flowers bloom", which in IT markets only happens at a very early stage. This is likely the result of talking only to people nurturing a small-scale flower, not to people who have already dominated their market niche. It is certainly a risk that each area will have a single point of failure, but trying to fight against the inherent economics of IT pretty much guarantees ineffectiveness. [A toy simulation of Arthur's increasing-returns dynamic follows these notes.]
  • 1) The big successes in the field haven't come from consensus-building around a roadmap; they have come from idiosyncratic individuals such as Brewster Kahle, Roberto Di Cosmo and Jason Scott identifying a need and building a system to address it, no matter what "the community" thinks. We have a couple of decades of experience showing that "the community" is incapable of coming to a coherent consensus that leads to action on a scale appropriate to the problem. In any case, describing road-mapping as "research" is a stretch.

    2) Under severe funding pressure, almost all libraries have de-emphasized their custodial role of building collections in favor of responding to immediate client needs. Rieger writes: "As one interviewee stated, library leaders have 'shifted their attention from seeing preservation as a moral imperative to catering to the university’s immediate needs.'" Regrettably, but inevitably given the economics of IT markets, this provides a market opportunity for outsourcing; Ithaka has exploited one such opportunity with Portico. This bullet does describe "research", but in the sense of "market research". Success is, however, much more likely to come from the success of an individual effort than from a consensus about what should be done among people who can't actually do it.

    3) In the current climate, increased funding for libraries and archives simply isn't going to happen, and these institutions have shown a marked reluctance to divert their shrinking funds from legacy to digital media. Thus the research topic with the greatest leverage in turning funds into preserved digital content is increasing the cost-effectiveness of the tools, processes and infrastructure of digital preservation.
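The web-archiving annotation above notes that researchers cite the live web because it is "easier and shorter." As a minimal sketch of the alternative, the snippet below queries the Internet Archive's public Wayback Machine availability API for the archived snapshot closest to a given date; the endpoint and JSON shape are as publicly documented, but the error handling is deliberately minimal.

```python
import json
import urllib.parse
import urllib.request

def closest_snapshot(url, timestamp=None):
    """Return the Wayback Machine snapshot URL closest to `timestamp`
    (YYYYMMDDhhmmss), or None if the page was never archived."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    query = urllib.parse.urlencode(params)
    with urllib.request.urlopen(
            "https://archive.org/wayback/available?" + query) as resp:
        data = json.load(resp)
    # "archived_snapshots" is an empty dict when nothing is archived.
    snap = data.get("archived_snapshots", {}).get("closest")
    return snap["url"] if snap else None

if __name__ == "__main__":
    print(closest_snapshot("example.com", "20080101"))
```

Citing the returned web.archive.org URL rather than the live page gives readers a fixed version of the source, which is exactly the scholarly practice the interviewees say is missing.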
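The economics annotation invokes W. Brian Arthur's increasing-returns argument. The toy simulation below (parameters invented for illustration, not taken from Arthur's work) shows how winner-take-all outcomes emerge once adoption feeds back into payoff:

```python
import random

def adopt(n_agents=10_000, base=1.0, returns=0.01, seed=1):
    """Arthur-style adoption: each agent picks the technology with the
    higher payoff, and payoff grows with the number of prior adopters."""
    rng = random.Random(seed)
    counts = {"A": 0, "B": 0}
    for _ in range(n_agents):
        # base payoff + increasing returns + a small private preference
        payoff = {t: base + returns * counts[t] + 0.5 * rng.random()
                  for t in counts}
        counts[max(payoff, key=payoff.get)] += 1
    return counts

print(adopt(seed=1))   # one technology ends up with nearly all adopters
print(adopt(seed=2))   # which one wins varies from run to run
```

Runs differ in which technology wins but almost never in whether one wins: early random fluctuations compound until the laggard's payoff can no longer catch up, which is the lock-in the annotation describes.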
jatolbert

The Digital-Humanities Bust - The Chronicle of Higher Education

  • To ask about the field is really to ask how or what DH knows, and what it allows us to know. The answer, it turns out, is not much. Let’s begin with the tension between promise and product. Any neophyte to digital-humanities literature notices its extravagant rhetoric of exuberance. The field may be "transforming long-established disciplines like history or literary criticism," according to a Stanford Literary Lab email likely unread or disregarded by a majority in those disciplines. Laura Mandell, director of the Initiative for Digital Humanities, Media, and Culture at Texas A&M University, promises to break "the book format" without explaining why one might want to — even as books, against all predictions, doggedly persist, filling the airplane-hangar-sized warehouses of Amazon.com.
  • A similar shortfall is evident when digital humanists turn to straight literary criticism. "Distant reading," a method of studying novels without reading them, uses computer scanning to search for "units that are much smaller or much larger than the text" (in Franco Moretti’s words) — tropes, at one end, genres or systems, at the other. One of the most intelligent examples of the technique is Richard Jean So and Andrew Piper’s 2016 Atlantic article, "How Has the MFA Changed the American Novel?" (based on their research for articles published in academic journals). The authors set out to quantify "how similar authors were across a range of literary aspects, including diction, style, theme, setting." But they never specify exactly what the computers were asked to quantify. In the real world of novels, after all, style, theme, and character are often achieved relationally — that is, without leaving a trace in words or phrases recognizable as patterns by a program. [A generic sketch of the kind of similarity computation at issue follows these notes.]
  • Perhaps toward that end, So, an assistant professor of English at the University of Chicago, wrote an elaborate article in Critical Inquiry with Hoyt Long (also of Chicago) on the uses of machine learning and "literary pattern recognition" in the study of modernist haiku poetry. Here they actually do specify what they instructed programmers to look for, and what computers actually counted. But the explanation introduces new problems that somehow escape the authors. By their own admission, some of their interpretations derive from what they knew "in advance"; hence the findings do not need the data and, as a result, are somewhat pointless. After 30 pages of highly technical discussion, the payoff is to tell us that haikus have formal features different from those of other short poems. We already knew that. [A toy stand-in for such a classification pipeline follows these notes.]
  • The outsized promises of big-data mining (which have been a fixture in big-figure grant proposals) seem curiously stuck at the level of confident assertion. In a 2011 New Left Review article, "Network Theory, Plot Analysis," Moretti gives us a promissory note that characterizes a lot of DH writing: "One day, after we add to these skeletons the layers of direction, weight and semantics, those richer images will perhaps make us see different genres — tragedies and comedies; picaresque, gothic, Bildungsroman … — as different shapes; ideally, they may even make visible the micro-patterns out of which these larger network shapes emerge." But what are the semantics of a shape when measured against the tragedy to which it corresponds? If "shape" is only a place-holder meant to allow for more-complex calculations of literary meaning (disburdened of their annoyingly human baggage), by what synesthetic principle do we reconvert it into its original, now reconfigured, genre-form? It is not simply that no answers are provided; it is that DH never asks the questions. And without them, how can Moretti’s "one day" ever arrive? [A minimal sketch of a directed, weighted, labeled plot network follows these notes.]
  • For all its resources, the digital humanities makes a rookie mistake: It confuses more information with more knowledge. DH doesn’t know why it thinks it knows what it does not know. And that is an odd place for a science to be.
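The So and Piper annotation above asks what, exactly, "the computers were asked to quantify." Their actual feature set is not given here, so the following is only a generic illustration of the kind of computation usually meant: represent each text as a word-frequency vector and measure cosine similarity, a crude proxy for "diction":

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words frequency vector: the crudest notion of 'diction'."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Invented one-sentence "texts", purely to show the mechanics.
a = vectorize("the lake lay flat and grey beneath the cabin window")
b = vectorize("he grabbed the gun and ran hard for the open door")
print(cosine(a, b))
```

The bullet's objection survives the sketch: whatever such vectors capture, effects achieved relationally leave no trace in them.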
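Similarly, the Long and So annotation describes "machine learning and literary pattern recognition" without details reproducible here. As a hedged stand-in, the sketch below trains an ordinary bag-of-words logistic regression (scikit-learn; toy corpus and labels invented) to separate haiku from other short poems, and it makes the circularity visible: the model can only recover the distinction the hand-assigned labels already encode.

```python
# Toy corpus and labels are invented; nothing here reproduces the
# features Long and So actually used.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

poems = [
    "an old silent pond a frog jumps in the sound of water",
    "the light of a candle is transferred to another candle spring twilight",
    "shall i compare thee to a summers day thou art more lovely",
    "i wandered lonely as a cloud that floats on high oer vales and hills",
]
labels = [1, 1, 0, 0]  # 1 = haiku, 0 = other short poem -- assigned by hand

vec = CountVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(poems), labels)

# The model can only re-find the distinction the labels already encode.
print(clf.predict(vec.transform(["a summer river being crossed how pleasing"])))
```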
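Finally, Moretti's "skeletons" with "layers of direction, weight and semantics" map directly onto a labeled, weighted, directed graph. A minimal sketch with networkx (characters and all edge values invented for illustration):

```python
import networkx as nx

# A plot "skeleton" in Moretti's sense: characters as nodes, speech acts
# as directed edges. weight = how often one addresses the other;
# semantics = a crude label for the exchange. All values are invented.
G = nx.DiGraph()
G.add_edge("Hamlet", "Claudius", weight=12, semantics="accusation")
G.add_edge("Claudius", "Hamlet", weight=9, semantics="surveillance")
G.add_edge("Hamlet", "Ophelia", weight=7, semantics="courtship")
G.add_edge("Horatio", "Hamlet", weight=15, semantics="counsel")

# One measurable "shape": who sits at the network's center.
print(nx.degree_centrality(G))
```

The annotation's question stands: centrality and shape are computable, but nothing in the graph says how to convert them back into the genre they came from.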