Home/ Data Working Group/ Group items tagged Web

Amy West

PLoS Computational Biology: Defrosting the Digital Library: Bibliographic Tools for the... - 0 views

  • Presently, the number of abstracts considerably exceeds the number of full-text papers,
  • full papers that are available electronically are likely to be much more widely read and cited
  • Since all of these libraries are available on the Web, increasing numbers of tools for managing digital libraries are also Web-based. They rely on Uniform Resource Identifiers (URIs [25] or “links”) to identify, name, and locate resources such as publications and their authors.
  • We often take URIs for granted, but these humble strings are fundamental to the way the Web works [58] and how libraries can exploit it, so they are a crucial part of the cyberinfrastructure [59] required for e-science on the Web.
  • link to data (the full-text of a given article),
  • To begin with, a user selects a paper, which will have come proximately from one of four sources: 1) searching some digital library, “SEARCH” in Figure 4; 2) browsing some digital library (“BROWSE”); 3) a personal recommendation, word-of-mouth from colleague, etc., (“RECOMMEND”); 4) referred to by reading another paper, and thus cited in its reference list (“READ”)
  • There is no universal method to retrieve a given paper, because there is no single way of identifying publications across all digital libraries on the Web
  • Publication metadata often gets “divorced” from the data it is about, and this forces users to manage each independently, a cumbersome and error-prone process.
  • There is no single way of representing metadata, and without adherence to common standards (which largely already exist, but in a plurality) there never will be.
  • Where DOIs exist, they are supposed to be the definitive URI. This kind of automated disambiguation, of publications and authors, is a common requirement for building better digital libraries
  • Publication metadata are essential for machines and humans in many tasks, not just the disambiguation described above. Despite their importance, metadata can be frustratingly difficult to obtain.
  • So, given an arbitrary URI, there are only two guaranteed options for getting any metadata associated with it. Using http [135], it is possible for a human (or machine) to do the following.
  • This technique works, but is not particularly robust or scalable because every time the style of a particular Web site changes, the screen-scraper will probably break as well
  • This returns metadata only, not the whole resource. These metadata will not include the author, journal, title, date, etc., of
  • As it stands, it is not possible to perform mundane and seemingly simple tasks such as, “get me all publications that fulfill some criteria and for which I have licensed access as PDF” to save locally, or “get me a specific publication and all those it immediately references”.
  • Having all these different metadata standards would not be a problem if they could easily be converted to and from each other, a process known as “round-tripping”.
  • many of these mappings are non-trivial, e.g., XML to RDF and back again
  • more complex metadata such as the inbound and outbound citations, related articles, and “supplementary” information.
  • Personalization allows users to say this is my library, the sources I am interested in, my collection of references, as well as literature I have authored or co-authored. Socialization allows users to share their personal collections and see who else is reading the same publications, including added information such as related papers with the same keyword (or “tag”) and what notes other people have written about a given publication.
  • CiteULike normalizes bookmarks before adding them to its database, which means it calculates whether each URI bookmarked identifies an identical publication added by another user, with an equivalent URI. This is important for social tagging applications, because part of their value is the ability to see how many people (and who) have bookmarked a given publication. CiteULike also captures another important bibliometric, viz how many users have potentially read a publication, not just cited it.
  • Connotea uses MD5 hashes [157] to store URIs that users bookmark, and normalizes them after adding them to its database, rather than before.
  • The source code for Connotea [159] is available, and there is an API that allows software engineers to build extra functionality around Connotea, for example the Entity Describer [160].
  • Personalization and socialization of information will increasingly blur the distinction between databases and journals [175], and this is especially true in computational biology where contributions are particularly of a digital nature.
  • This is usually because they are either too “small” or too “big” to fit into journals.
  • As we move in biology from a focus on hypothesis-driven to data-driven science [1],[181],[182], it is increasingly recognized that databases, software models, and instrumentation are the scientific output, rather than the conventional and more discursive descriptions of experiments and their results.
  • In the digital library, these size differences are becoming increasingly meaningless as data, information, and knowledge become more integrated, socialized, personalized, and accessible. Take Postgenomic [183], for example, which aggregates scientific blog posts from a wide variety of sources. These posts can contain commentary on peer-reviewed literature and links into primary database sources. Ultimately, this means that the boundaries between the different types of information and knowledge are continually blurring, and future tools seem likely to continue this trend.
  • The identity of people is a twofold problem because applications need to identify people as users in a system and as authors of publications.
  • Passing valuable data and metadata onto a third party requires that users trust the organization providing the service. For large publishers such as Nature Publishing Group, responsible for Connotea, this is not necessarily a problem.
  • business models may unilaterally change their data model, making the tools for accessing their data backwards incompatible, a common occurrence in bioinformatics.
  • Although the practice of sharing raw data immediately, as with Open Notebook Science [190], is gaining ground, many users are understandably cautious about sharing information online before peer-reviewed publication.
  •  
    Yes, but Alexandria was also a lot smaller; not totally persuaded by analogy here...
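
    The highlights above on CiteULike and Connotea describe normalizing bookmarked URIs and storing MD5 hashes of them so that equivalent bookmarks can be matched. A minimal Python sketch of that idea follows. The normalization rules and the names normalize_uri and uri_key are assumptions made for illustration; the article does not say which rules either service applies, and this is not their code.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit

# Assumed default ports; used only to drop a redundant ":80" or ":443".
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_uri(uri: str) -> str:
    """Return a simplified canonical form of a bookmark URI (hypothetical rules).

    Lowercases the scheme and host, drops a default port and the fragment,
    and uses "/" for an empty path. Userinfo is dropped and IPv6 hosts are
    not handled, to keep the sketch short.
    """
    parts = urlsplit(uri.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host
    else:
        netloc = f"{host}:{port}"
    path = parts.path or "/"
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def uri_key(uri: str) -> str:
    """MD5 hex digest of the normalized URI, usable as a lookup key."""
    return hashlib.md5(normalize_uri(uri).encode("utf-8")).hexdigest()


# Two spellings of the same bookmark produce the same key.
first = uri_key("HTTP://Example.org:80/paper#abstract")
second = uri_key("http://example.org/paper")
print(first == second)
```

    Hashing after normalization means two users who bookmark the same publication through slightly different links share one key, which is what lets a service count how many people have bookmarked it. Hashing the raw URI first and normalizing later, as the Connotea highlight describes, would need a second pass to merge duplicate entries.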
Lisa Johnston

Open Science Data Initiative (OSDI) - 0 views

  •  
    The Open Science Data Initiative is an initiative led by Oak Ridge National Laboratory in partnership with Microsoft's Public Sector Developer Evangelism team. OSDI is based on OGDI which in turn uses the Azure Services Platform to make it easier to publish and use a wide variety of scientific data from government agencies. OSDI is a sample of OGDI's open source 'starter kit' (coming soon) with code that can be used to publish data on the Internet in a Web-friendly format with easy-to-use, open APIs. OSDI-based web APIs can be accessed from a variety of client technologies such as Silverlight, Flash, JavaScript, PHP, Python, Ruby, mapping web sites, etc. Whether you are a researcher wishing to use scientific data, a hobbyist developer, or a "budding scientist", these open APIs will enable you to build innovative applications, visualizations and mash-ups that empower people through access to scientific information. This site is built using the OGDI starter kit software assets and provides interactive access to some publicly-available data sets along with sample code and resources for writing applications using the OSDI APIs.
David Govoni

The Sensor Web: Bringing Information to Life | ERCIM News - EN76 | ERCIM - 0 views

  •  
    ERCIM News special theme, downloadable as PDF.
David Govoni

JISC-PoWR » Handbook - 0 views

  •  
    JISC-PoWR handbook on the preservation of Web resources.
Lisa Johnston

Got big data? Crunch it with Google's BigQuery | VentureBeat - 2 views

  •  
    BigQuery??  "the service is designed for large-scale internal data analytics, to companies of all sizes, and it's adding a web interface so you can do it all in the cloud."
Lisa Johnston

Data Management - Dana Library's Data Support - Research Guides at Rutgers University - 1 views

  •  
    Lots of new materials here...we should add some of this to our web site!
David Govoni

JISC-PoWR Preservation of Web Resources Project (UK) - 0 views

  •  
    Blog set up to support the work of JISC-PoWR, a project initiated by the JISC Integrated Information Environment Committee (UK).
David Govoni

Science & Social Media | Tamara Zemlo, BioInformatics, LLC | SciVee - 1 views

  •  
    "On Jan. 6, 2009, in Arlington, Virginia, the National Science Foundation, The Ballston Science and Technology Alliance, and BioInformatics, LLC, hosted a Cafe Scientifique on Science and Social Media. In part 1 of this 4 part video, Dr. Tamara Zemlo from
David Govoni

Science & Social Media | Chris Condayan, ASM/MicrobeWorld | SciVee - 0 views

  •  
    "On Jan. 6, 2009, in Arlington, Virginia, the National Science Foundation, The Ballston Science and Technology Alliance, and BioInformatics, LLC, hosted a Cafe Scientifique on Science and Social Media. In part 2 of this 4 part video, Chris Condayan, Manag
David Govoni

Science & Social Media | Stephanie Stockman, NASA | SciVee - 0 views

  •  
    "On Jan. 6, 2009, in Arlington, Virginia, the National Science Foundation, The Ballston Science and Technology Alliance, and BioInformatics, LLC, hosted a Cafe Scientifique on Science and Social Media. In part 3 of this 4 part video, Stephanie Stockman, a
David Govoni

Science & Social Media | Nancy Shute, US News & World Report | SciVee - 0 views

  •  
    "On Jan. 6, 2009, in Arlington, Virginia, the National Science Foundation, The Ballston Science and Technology Alliance, and BioInformatics, LLC, hosted a Cafe Scientifique on Science and Social Media. In the final segment of this 4 part video, Nancy Shut
David Govoni

OSTI - Widgets - 0 views

  •  
    Embeddable (e.g., into iGoogle) widgets to access a variety of scientific and technical data and reference resources. U.S. Department of Energy (DOE), Office of Scientific & Technical Information (OSTI).
David Govoni

SciTechNet(sm): Science and Technology Social Networking Services - 0 views

  •  
    SciTechNet(sm) is devoted to describing and documenting online social networking services in the Sciences and Technology.
David Govoni

Scratchpads | Biodiversity Online - 0 views

  •  
    "Scratchpads are an easy to use, social networking application that enable communities of researchers to manage, share and publish taxonomic data online. Sites are hosted at the Natural History Museum London, and offered free to any scientist that complet
David Govoni

SciLink - 0 views

  •  
    A specialized science-oriented community on the Facebook and LinkedIn models. SciLink™ is an online community with the goal of helping you to discover scientists, authors, and relationships.
David Govoni

Biodiversity Information Standards | TDWG - 0 views

  •  
    "Biodiversity Information Standards (TDWG) is an international not-for-profit group that develops standards and protocols for sharing biodiversity data."