
Bucknell Digital Pedagogy & Scholarship: Group items tagged "analysis"


Leslie Harris

Parsing Ronald Reagan's Words for Early Signs of Alzheimer's - NYTimes.com - 0 views

  •  
    Interesting article about an analysis of Ronald Reagan's news conferences in an attempt to detect early signs of dementia. The "digital humanities" aspect is that the same linguistic analysis has been used to study word-use patterns in novelists.
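The kind of word-use analysis described, tracking vocabulary richness across transcripts over time, can be illustrated with a minimal sketch. This is not the study's actual method (its measures, such as filler and non-specific words, were more elaborate); the sample sentences are invented for illustration:

```python
# Illustrative sketch only: a simple type-token ratio (unique words /
# total words) as one crude measure of vocabulary richness. The actual
# study used more elaborate linguistic measures.
import re

def type_token_ratio(text):
    """Return the ratio of distinct words to total words in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0

# Hypothetical sample transcripts, not real quotations.
early = "we will work to restore the economy and restore confidence"
late = "well we we will try to to do the the thing the thing there"

# A declining ratio across transcripts would flag growing repetition.
print(type_token_ratio(early) > type_token_ratio(late))  # True
```

A real analysis would compute such measures per press conference and look for a downward trend across years, not compare two isolated sentences.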
jatolbert

The "Digital" Scholarship Disconnect | EDUCAUSE - 0 views

  • Digital scholarship is an incredibly awkward term that people have come up with to describe a complex group of developments. The phrase is really, at some basic level, nonsensical. After all, scholarship is scholarship. Doing science is doing science. We don't find the Department of Digital Physics arguing with the Department of Non-Digital Physics about who's doing "real" physics.
  • Soon, people wanted to start talking more broadly about newly technology-enabled scholarly work, not just in science; in part this was because of some very dramatic and high-visibility developments in using digital technology in various humanistic investigations. To do so, they came up with the neologisms we enjoy today—awful phrases like e-scholarship and digital scholarship. Having said that, I do view the term digital scholarship basically as shorthand for the entire body of changing scholarly practice, a reminder and recognition of the fact that most areas of scholarly work today have been transformed, to a lesser or greater extent, by a series of information technologies:
    • High-performance computing, which allows us to build simulation models and to conduct very-large-scale data analysis
    • Visualization technologies, including interactive visualizations
    • Technologies for creating, curating, and sharing large databases and large collections of data
    • High-performance networking, which allows us to share resources across the network and to gain access to experimental or observational equipment, and which allows geographically dispersed individuals to communicate and collaborate; implicit here are ideas such as the rise of lightweight challenge-focused virtual organizations
  • We now have enormous curated databases serving various disciplines: GenBank for gene sequences; the Worldwide Protein Data Bank for protein structures; and the Sloan Digital Sky Survey and planned successors for (synoptic) astronomical observations. All of these are relied upon by large numbers of working scientists. Yet the people who compiled these databases are often not regarded by their colleagues as "real" scientists but, rather, as "once-scientists" who got off-track and started doing resource-building for the community. And it's true: many resource-builders don't have the time to be actively doing science (i.e., analysis and discovery); instead, they are building and enabling the tools that will advance the collective scientific enterprise in other, less traditional ways. The academic and research community faces a fundamental challenge in developing norms and practices that recognize and reward these essential contributions. This idea—of people not doing "real" research, even though they are building up resources that can enable others to do research—has played out as well in the humanities. The humanists have often tried to make a careful distinction between the work of building a base of evidence and the work of interpreting that evidence to support some particular analysis, thesis, and/or set of conclusions; this is a little easier in the humanities because the scale of collaboration surrounding emerging digital resources and their exploitation for scholarship is smaller (contrast this to the literal "cast of thousands" at CERN), and it's common here to see the leading participants play both roles: resource-builder and "working" scholar.
  • ...2 more annotations...
  • Still, in all of these examples of digital scholarship, a key challenge remains: How can we curate and manage data now that so much of it is being produced and collected in digital form? How can we ensure that it will be discovered, shared, and reused to advance scholarship?
  • On a final note, I have talked above mostly about changes in the practice of scholarship. But changes in the practice of scholarship need to go hand-in-hand with changes in the communication and documentation of scholarship.
  •  
    Interesting short piece on challenges of digital scholarship
Jennifer Parrott

Voyant Tools: Reveal Your Texts - 1 views

  •  
    Web-based reading and analysis environment for digital texts
Leslie Harris

Computing Crime and Punishment - NYTimes.com - 0 views

  •  
    The article discusses a computer-based analysis of word use in the court reports of trials at the Old Bailey from 1674 through 1913.
Leslie Harris

For Big-Data Scientists, 'Janitor Work' Is Key Hurdle to Insights - NYTimes.com - 0 views

  •  
    New York Times article about some of the challenges of "big data" analysis - particularly the data cleanup needed to make useful inferences.
jatolbert

The Digital-Humanities Bust - The Chronicle of Higher Education - 0 views

  • To ask about the field is really to ask how or what DH knows, and what it allows us to know. The answer, it turns out, is not much. Let’s begin with the tension between promise and product. Any neophyte to digital-humanities literature notices its extravagant rhetoric of exuberance. The field may be "transforming long-established disciplines like history or literary criticism," according to a Stanford Literary Lab email likely unread or disregarded by a majority in those disciplines. Laura Mandell, director of the Initiative for Digital Humanities, Media, and Culture at Texas A&M University, promises to break "the book format" without explaining why one might want to — even as books, against all predictions, doggedly persist, filling the airplane-hangar-sized warehouses of Amazon.com.
  • A similar shortfall is evident when digital humanists turn to straight literary criticism. "Distant reading," a method of studying novels without reading them, uses computer scanning to search for "units that are much smaller or much larger than the text" (in Franco Moretti’s words) — tropes, at one end, genres or systems, at the other. One of the most intelligent examples of the technique is Richard Jean So and Andrew Piper’s 2016 Atlantic article, "How Has the MFA Changed the American Novel?" (based on their research for articles published in academic journals). The authors set out to quantify "how similar authors were across a range of literary aspects, including diction, style, theme, setting." But they never cite exactly what the computers were asked to quantify. In the real world of novels, after all, style, theme, and character are often achieved relationally — that is, without leaving a trace in words or phrases recognizable as patterns by a program.
  • Perhaps toward that end, So, an assistant professor of English at the University of Chicago, wrote an elaborate article in Critical Inquiry with Hoyt Long (also of Chicago) on the uses of machine learning and "literary pattern recognition" in the study of modernist haiku poetry. Here they actually do specify what they instructed programmers to look for, and what computers actually counted. But the explanation introduces new problems that somehow escape the authors. By their own admission, some of their interpretations derive from what they knew "in advance"; hence the findings do not need the data and, as a result, are somewhat pointless. After 30 pages of highly technical discussion, the payoff is to tell us that haikus have formal features different from other short poems. We already knew that.
  • ...2 more annotations...
  • The outsized promises of big-data mining (which have been a fixture in big-figure grant proposals) seem curiously stuck at the level of confident assertion. In a 2011 New Left Review article, "Network Theory, Plot Analysis," Moretti gives us a promissory note that characterizes a lot of DH writing: "One day, after we add to these skeletons the layers of direction, weight and semantics, those richer images will perhaps make us see different genres — tragedies and comedies; picaresque, gothic, Bildungsroman … — as different shapes; ideally, they may even make visible the micro-patterns out of which these larger network shapes emerge." But what are the semantics of a shape when measured against the tragedy to which it corresponds? If "shape" is only a place-holder meant to allow for more-complex calculations of literary meaning (disburdened of their annoyingly human baggage), by what synesthetic principle do we reconvert it into its original, now reconfigured, genre-form? It is not simply that no answers are provided; it is that DH never asks the questions. And without them, how can Moretti’s "one day" ever arrive?
  • For all its resources, the digital humanities makes a rookie mistake: It confuses more information for more knowledge. DH doesn’t know why it thinks it knows what it does not know. And that is an odd place for a science to be.
Todd Suomela

Home - OpenMinTeD - 0 views

  •  
    "OpenMinTeD sets out to create an open, service-oriented e-Infrastructure for Text and Data Mining (TDM) of scientific and scholarly content. Researchers can collaboratively create, discover, share and re-use Knowledge from a wide range of text-based scientific related sources in a seamless way."
Todd Suomela

DSHR's Blog: Ithaka's Perspective on Digital Preservation - 0 views

  • Second, there is very little coverage of Web archiving, which is clearly by far the largest and most important digital preservation initiative for both current and future readers. The Internet Archive rates only two mentions, in the middle of a list of activities and in a footnote. This is despite the fact that archive.org is currently the 211th most visited site in the US (272nd globally), with over 5.5M registered users, adding over 500 per day, and serving nearly 4M unique IPs per day. For comparison, the Library of Congress currently ranks 1439th in the US (5441st globally). The Internet Archive's Web collection alone probably dwarfs all other digital preservation efforts combined, both in size and in usage. Not to mention their vast collections of software, digitized books, audio, video, and TV news. Rieger writes:
    There is a lack of understanding about how archived websites are discovered, used, and referenced. “Researchers prefer to cite the original live-web as it is easier and shorter,” pointed out one of the experts. “There is limited awareness of the existence of web archives and lack of community consensus on how to treat them in scholarly work. The problems are not about technology any more, it is about usability, awareness, and scholarly practices.”
    The interviewee referred to a recent CRL study, based on an analysis of referrals to archived content from papers, that concluded that the citations were mainly to articles about web archiving projects. It is surprising that the report doesn't point out that the responsibility for educating scholars in the use of resources lies with the "experts and thought leaders" from institutions such as the University of California, Michigan State, Cornell, MIT, NYU, and Virginia Tech. That these "experts and thought leaders" don't consider the Internet Archive to be a resource worth mentioning might have something to do with the fact that their scholars don't know that they should be using it.
A report whose first major section, entitled "What's Working Well", totally fails to acknowledge the single most important digital preservation effort of the last two decades clearly lacks credibility.
  • Finally, there is no acknowledgement that the most serious challenge facing the field is economic. Except for a few corner cases, we know how to do digital preservation; we just don't want to pay enough to have it done. Thus the key challenge is to achieve some mixture of significant increase in funding for, and significant cost reduction in the processes of, digital preservation. Information technology processes naturally have very strong economies of scale, which result in winner-take-all markets (as W. Brian Arthur pointed out in 1985). It is notable that the report doesn't mention the winners we already have, in Web and source code archiving, and in emulation. All are at the point where a competitor is unlikely to be viable. To be affordable, digital preservation needs to be done at scale. The report's orientation is very much "let a thousand flowers bloom", which in IT markets only happens at a very early stage. This is likely the result of talking only to people nurturing a small-scale flower, not to people who have already dominated their market niche. It is certainly a risk that each area will have a single point of failure, but trying to fight against the inherent economics of IT pretty much guarantees ineffectiveness.
  • 1) The big successes in the field haven't come from consensus building around a roadmap; they have come from idiosyncratic individuals such as Brewster Kahle, Roberto di Cosmo, and Jason Scott identifying a need and building a system to address it no matter what "the community" thinks. We have a couple of decades of experience showing that "the community" is incapable of coming to a coherent consensus that leads to action on a scale appropriate to the problem. In any case, describing road-mapping as "research" is a stretch.
    2) Under severe funding pressure, almost all libraries have de-emphasized their custodial role of building collections in favor of responding to immediate client needs. Rieger writes: As one interviewee stated, library leaders have “shifted their attention from seeing preservation as a moral imperative to catering to the university’s immediate needs.” Regrettably, but inevitably given the economics of IT markets, this provides a market opportunity for outsourcing. Ithaka has exploited one such opportunity with Portico. This bullet does describe "research" in the sense of "market research". Success is, however, much more likely to come from the success of an individual effort than from a consensus about what should be done among people who can't actually do it.
    3) In the current climate, increased funding for libraries and archives simply isn't going to happen. These institutions have shown a marked reluctance to divert their shrinking funds from legacy to digital media. Thus the research topic with the greatest leverage in turning funds into preserved digital content is research into increasing the cost-effectiveness of the tools, processes, and infrastructure of digital preservation.
Todd Suomela

MOOCs Find Their Audience: Professional Learners and Universities | EdSurge News - 0 views

  • In my last year’s analysis of the MOOC space, I concluded that there’s been a decisive shift by MOOC providers to focus on “professional” learners who are taking these courses for career-related outcomes. At the recently concluded EMOOCs conference, the then CEO of Coursera, Rick Levin, shared his thoughts on this shift. He thinks that MOOCs may not have disrupted the education market, but they are disrupting the labor market. The real audience is not the traditional university student but what he calls the “lifelong career learner,” someone who might be well beyond their college years and takes these online courses with the goal of achieving professional and career growth.
  • One of the lessons I learned from running Class Central is that to make money, you need to make others money. By targeting professional learners, MOOC providers are trying to do exactly that. To better serve this audience, every MOOC provider has launched products that range from tens of dollars to tens of thousands of dollars. As a professional learner, I feel a certain amount of comfort knowing that high-quality educational material exists for skills that I might want to learn in the future. But if you are a true lifelong learner—the kind that helped start all the hype in the first place—the MOOC experience has largely been reduced to basically a YouTube playlist with a cumbersome user interface. Unless, of course, you are willing to pay.
jatolbert

DHQ: Digital Humanities Quarterly: A Genealogy of Distant Reading - 0 views

  • Because Radway’s voice is candid and engaging, the book may not always sound like social science.
    • jatolbert
       
      I wonder what social science he's been reading.
  • In calling this approach minimally "scientific," I don’t mean to imply that we must suddenly adopt all the mores of chemists, or even psychologists
    • jatolbert
       
      And yet the effect is the same: scientizing processes and products which by their very natures as human works resist scientific analysis.
  • social science
    • jatolbert
       
      Again, this is a very different social science from that in which I received my own training, which has long held to the notion that objectivity is not only unobtainable, but undesirable.
  • ...4 more annotations...
  • But computational methods now matter deeply for literary history, because they can be applied to large digital libraries, guided by a theoretical framework that tells us how to pose meaningful questions on a social scale.
    • jatolbert
       
      I wonder about this. Is he suggesting that examining a large corpus of published works is the same as examining an entire society? This would seem to ignore issues of access and audience, literacy, representation, etc.
  • The term digital humanities stages intellectual life as a dialogue between humanists and machines. Instead of explicitly foregrounding experimental methods, it underlines a boundary between the humanities and social science.
  • Conflations of that kind could begin to create an unproductive debate, where parties to the debate fail to grasp the reason for disagreement, because they misunderstand each other’s real positions and commitments.
    • jatolbert
       
      Similar to the conflation of sociology with all of the social sciences.
  • the past
    • jatolbert
       
      Is it appropriate to conflate the -literary- past with -the past-? That is, can any study based wholly on texts claim to be in any way representative of things outside the sphere of what we call "literature"?
Todd Suomela

For Google, Everything Is a Popularity Contest - The Atlantic - 0 views

  • PageRank and Classic Papers reveal Google’s theory of knowledge: What is worth knowing is what best relates to what is already known to be worth knowing. Given a system that construes value by something’s visibility, be it academic paper or web page, the valuable resources are always the ones closest to those that already proved their value. Google enjoys the benefits of this reasoning as much as anyone. When Google tells people that it has found the most lasting scholarly articles on a subject, for example, the public is likely to believe that story because they also believe Google tends to find the right answers.
  • It’s as if Google, the company that promised to organize and make accessible the world’s information, has done the opposite. Almost anything can be posted, published, or sold online today, but most of it cannot be seen. Instead, information remains hidden, penalized for having failed to be sufficiently connected to other, more popular information. But to think differently is so uncommon, the idea of doing so might not even arise—for shoppers and citizens as much as for scholars. All information is universally accessible, but some information is more universally accessible than others.
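The circular logic the article describes, where value flows to pages that are linked from already-valuable pages, is the core of PageRank itself. A minimal power-iteration sketch makes it concrete; the toy link graph, damping factor, and iteration count here are illustrative assumptions, not Google's actual system:

```python
# Minimal PageRank power iteration over a toy link graph.
# Illustrative only: graph, damping factor, and iteration count are
# assumptions for the sketch, not Google's production algorithm.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:  # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

# "b" is linked from everywhere, so rank concentrates on it: value
# accrues to what is already visible, exactly the circularity at issue.
graph = {"a": ["b"], "b": ["c"], "c": ["b"], "d": ["b", "c"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # b
```

Note that pages "a" and "d", which nothing links to, end up near the minimum rank regardless of what they contain, which is the article's point about information "penalized for having failed to be sufficiently connected."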