
Data Working Group: Group items tagged "data"


Amy West

Interagency Data Stewardship/Citations/provider guidelines - Federation of Earth Scienc... - 0 views

    • Amy West
       
      Little confused by what's meant by "data sets should be cited like books" since they go on to provide really good reasons why data aren't like books, e.g. need subsetting information, access date for dynamic databases.
  • The guidelines build from the IPY Guidelines and are compatible with the DataCite Metadata Scheme for the Publication and Citation of Research Data, Version 2.2, July 2011.
  • In some cases, the data set authors may have also published a paper describing the data in great detail. These sorts of data papers should be encouraged, and both the paper and the data set should be cited when the data are used.
  • ...27 more annotations...
  • Ongoing updates to a time series do change the content of the data set, but they do not typically constitute a new version or edition of a data set. New versions typically reflect changes in sampling protocols, algorithms, quality control processes, etc. Both a new version and an update may be reflected in the release date.
  • Locator, Identifier, or Distribution Medium
  • Then it is necessary to include a persistent reference to the location of the data.
  • This may be the most challenging aspect of data citation. It is necessary to enable "micro-citation" or the ability to refer to the specific data used--the exact files, granules, records, etc.
  • Data stewards should suggest how to reference subsets of their data. With Earth science data, subsets can often be identified by referring to a temporal and spatial range.
  • A particular data set may be part of a compilation, in which case it is appropriate to cite the data set somewhat like a chapter in an edited volume.
  • Increasingly, publishers are allowing data supplements to be published along with peer-reviewed research papers. When using the data supplement, one need only cite the parent reference.
  • Confusingly, a Digital Object Identifier is a locator. It is a Handle based scheme whereby the steward of the digital object registers a location (typically a URL) for the object. There is no guarantee that the object at the registered location will remain unchanged. Consider a continually updated data time series, for example.
  • While it is desirable to uniquely identify the cited object, it has proven extremely challenging to identify whether two data sets or data files are scientifically identical.
  • At this point, we must rely on location information combined with other information such as author, title, and version to uniquely identify data used in a study.
  • The key to making registered locators, such as DOIs, ARKS, or Handles, work unambiguously to identify and locate data sets is through careful tracking and documentation of versions.
  • how to handle different data set versions relative to an assigned locator.
  • Track major_version.minor_version.[archive_version].
  • Typically, something that affects the whole data set like a reprocessing would be considered a major version.
  • Assign unique locators to major versions.
  • Old locators for retired versions should be maintained and point to some appropriate web site that explains what happened to the old data if they were not archived.
  • A new major version leads to the creation of a new collection-level metadata record that is distributed to appropriate registries. The older metadata record should remain with a pointer to the new version and with explanation of the status of the older version data.
  • Major and minor version should be listed in the recommended citation.
  • Minor versions should be explained in documentation.
  • Ongoing additions to an existing time series need not constitute a new version. This is one reason for capturing the date accessed when citing the data.
  • we believe it is currently impossible to fully satisfy the requirement of scientific reproducibility in all situations
  • To aid scientific reproducibility through direct, unambiguous reference to the precise data used in a particular study (this is the paramount purpose and also the hardest to achieve). To provide fair credit for data creators or authors, data stewards, and other critical people in the data production and curation process. To ensure scientific transparency and reasonable accountability for authors and stewards. To aid in tracking the impact of a data set and the associated data center through reference in the scientific literature. To help data authors verify how their data are being used. To help future data users identify how others have used the data.
  • The ESIP Preservation and Stewardship cluster has examined these and other current approaches and has found that they are generally compatible and useful, but they do not entirely meet all the purposes of Earth science data citation.
  • In general, data sets should be cited like books.
  • They need to use the style dictated by their publishers, but by providing an example, data stewards can give users all the important elements that should be included in their citations of data sets.
  • Access Date and Time--because data can be dynamic and changeable in ways that are not always reflected in release dates and versions, it is important to indicate when on-line data were accessed.
  • Additionally, it is important to provide a scheme for users to indicate the precise subset of data that were used. This could be the temporal and spatial range of the data, the types of files used, a specific query id, or other ways of describing how the data were subsetted. (A minimal sketch of a citation carrying these elements follows below.)
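
The citation elements highlighted above (authors, version, locator, access date, subset) can be made concrete with a small formatting sketch. This is purely an editorial illustration, not code from the ESIP guidelines; the function name and all example values below are hypothetical.

```python
from datetime import date

def format_data_citation(authors, year, title, version, archive, locator,
                         accessed, subset=None):
    """Assemble a recommended data set citation.

    Carries the elements discussed above: major.minor version, a registered
    locator (e.g. a DOI), the access date, and, optionally, a description of
    the precise subset of the data that was used.
    """
    parts = [
        f"{authors} ({year}). {title}, Version {version}.",
        f"{archive}.",
        f"{locator}.",
        f"Accessed {accessed.isoformat()}.",
    ]
    if subset:
        parts.append(f"Subset used: {subset}.")
    return " ".join(parts)

# Hypothetical values, for illustration only.
print(format_data_citation(
    authors="Doe, J., and R. Roe",
    year=2011,
    title="Example Sea Surface Temperature Time Series",
    version="2.1",                    # major_version.minor_version
    archive="Example Earth Science Data Center",
    locator="doi:10.9999/example",    # fabricated DOI-style locator
    accessed=date(2011, 7, 15),
    subset="2001-01-01 to 2005-12-31, 40N-50N, 80W-90W",
))
```

The exact punctuation and ordering will follow whatever style a publisher dictates; the point of the sketch is only which elements get carried through to the citation.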
Amy West

Developing the Capability and Skills to Support eResearch - 0 views

  • Of particular concern to this article is the need for improved levels of data stewardship to enable good data management for long-term sustainability, both at national and institutional levels.
  • researchers, particularly those engaged in data-intensive research; systems developers, data scientists and other technical staff; data managers of institutional repositories, data archives and discipline-based data centres and their support staff, with those who liaise between depositors and the repository as being of particular interest; and those who are engaged in high-level policy formulation, either in government or research institutions.
  • Interviews were conducted with twelve key established researchers in six Australian institutions, with a focus on academics engaged in data-intensive research. Interviews were conducted also with the manager of a large data centre, and a repository administrator. The institutions concerned were the Australian National University, the University of Melbourne, the University of Tasmania, the University of Queensland, the University of Sydney and one area of the Commonwealth Scientific and Industrial Research Organisation (CSIRO).
  • ...18 more annotations...
  • There was wide agreement that there are three types of skills required for practitioners of eResearch, their support staff and repository staff. Not surprisingly, there was a strong need for technical skills. Perhaps not as obvious was the identification of a wide range of non-technical skills. Less obvious again was mention of an assortment of personal qualities, which, while not skills in the formal sense of the term, were singled out as being important.
  • The surveys indicated that not everyone needs the same level of technical skills to conduct or support eResearch.
  • So you need a basic literacy level to look after your computers where you’re storing your data, and then in order to access, like a remote repository, you need to know something about how to connect to that remote repository, what the format of the data should be to go in it, how to convert your data to that required format
  • included skills related to high-performance computing (HPC) and the access grid, data (and database) management, data curation, information engineering, information modelling, software development, remote communications, distributed processing, informatics, portal design, computational fluid dynamics, database integration, visualisation and programming of all kinds.
  • Some of these skills are tightly connected to specific disciplines, especially informatics.
  • The need for technical skills is allied to the ability to understand end-to-end workflows, especially for repository managers and developers who need to be able to think like the researcher and to apply that understanding to developing the repository. By workflows, I mean the many software applications, processing operations and interactions required for research tasks to be carried through to completion.
  • The group of librarians at ‘The Researcher Librarian Nexus’ workshop identified a need for further development of their technical skills, mentioning in particular metadata, something which did not feature among any of the other responses, other than by implication.
  • These vary from skills in data analysis (including the use of statistical packages and other techniques such as data mining) through information seeking to a broader range of general skills. Project management, business analysis, communications, negotiation, intellectual property, team building and train the trainer were mentioned specifically. Another was generic problem solving, because, as one researcher aptly put it, the kinds of problems which arise when undertaking eResearch mean that ‘There’s never going to be someone who has done it before.’
  • The librarians involved with the Researcher/Librarian Nexus workshop also identified it as being of high priority for repository managers, along with marketing, advocacy, copyright, metadata, educational outreach and grant submission writing. They also singled out the intriguing skill of ‘researcher management’ while not specifying precisely what this might entail.
  • A good grasp of copyright and intellectual property issues was seen as essential,
  • These were listed as: open-mindedness, patience and an ‘ability to cooperate and collaborate rather than compete’
  • For example, one researcher, in the field of finance told me of his need for programmers who have a high level of expertise in economics, econometrics, statistics, maths and programming; ‘otherwise all the programming expertise doesn’t really help because then they make strange assumptions in their coding that just result in nonsense output.’
  • One solution to the need to bridge the disciplinary gap is to use graduate students to help with the technical aspects, where those students have an interest and aptitude for this kind of work. In some cases this might be done by providing scholarships, the students then graduating with a PhD on the basis that their contribution to the research project has been of sufficient originality to warrant the degree.
  • The barrier to research most often mentioned was the difficulty in assembling all the skills required to conduct a project, particularly in relation to data management and stewardship. In some cases the gap is organisational, as happens for example when the researcher is either unaware of or unable to tap into the skills of a central IT unit. More often the gap was in a lack of understanding of what each group needs, what each has to offer and where responsibilities lie. Examples of this can be seen in comments like the following:
  • For instance, if you've got data in, say, NetCDF file format and the repository wants it in TIFF format, well, you need to know something about the technicality of getting your data from NetCDF format into TIFF format. (A rough sketch of such a conversion appears after this list.)
  • The humanities and social sciences are notable areas where the take-up rate of eResearch has been slower than, for example, in the hard sciences, and where there have been calls for exemplars to be publicised. Many practitioners in the humanities and social sciences find it difficult to envisage where their work might fit into the concept of eResearch.
  • Few researchers are aware that there are such things as repositories, so it is important that the repository is seen as (and indeed is) ‘a good repository – that it’s good in the sense of its high quality but also good in that it adds value for [the researcher].’
  • If research institutions are to minimise the gap between the ideals and realities of eResearch, there is some way to go in providing both institutional capacity and appropriately qualified individuals. While eResearch is dependent on good ICT infrastructure, this is not sufficient in itself. The results of the survey outlined here show that capacity in information technology skills is important but must be accompanied by a range of non-technical skills in such areas as project management. Equally important is the creation of research environments which are covered by well-propagated and understood policies, which are appropriately organised into structures with clearly delineated roles and responsibilities and which minimise the current barriers experienced by many researchers.
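
The NetCDF-to-TIFF conversion mentioned in the interview excerpt above is a typical example of the technical chores researchers described. The following is a rough sketch only, assuming a single 2-D variable on a regularly spaced latitude/longitude grid and the availability of the netCDF4, NumPy, and rasterio packages; the file and variable names are invented for illustration.

```python
import numpy as np
from netCDF4 import Dataset
import rasterio
from rasterio.transform import from_origin

# Hypothetical input file and variable name.
nc = Dataset("surface_temp.nc")
data = np.asarray(nc.variables["temperature"][:], dtype="float32")  # assumed 2-D (lat, lon)
lats = nc.variables["lat"][:]
lons = nc.variables["lon"][:]
nc.close()

# Assume regularly spaced cell centres; build an affine transform from the
# upper-left corner and the cell size.
xres = abs(float(lons[1] - lons[0]))
yres = abs(float(lats[1] - lats[0]))
transform = from_origin(lons.min() - xres / 2, lats.max() + yres / 2, xres, yres)

# GeoTIFF rows run north to south, so flip if latitudes are ascending.
if lats[0] < lats[-1]:
    data = np.flipud(data)

with rasterio.open(
    "surface_temp.tif", "w",
    driver="GTiff",
    height=data.shape[0],
    width=data.shape[1],
    count=1,
    dtype="float32",
    crs="EPSG:4326",  # assumed geographic lat/lon
    transform=transform,
) as dst:
    dst.write(data, 1)
```

Even this simple case involves decisions (coordinate reference system, cell registration, handling of missing values) that illustrate why the surveyed researchers wanted access to staff with both technical and domain skills.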
umgeoglib

Project MUSE - Library Trends - Volume 57, Number 2, Fall 2008 - 0 views

  •  
    4 articles related to data curation/data management: "At the Watershed: Preparing for Research Data Management and Stewardship at the University of Minnesota Libraries"; "Case Study in Data Curation at Johns Hopkins University"; "Shedding Light on the Dark Data in the Long Tail of Science"; and "Institutional Repositories and Research Data Curation in a Distributed Environment".
Lisa Johnston

Open Science Data Initiative (OSDI) - 0 views

  •  
    The Open Science Data Initiative is an initiative led by Oak Ridge National Laboratory in partnership with Microsoft's Public Sector Developer Evangelism team. OSDI is based on OGDI, which in turn uses the Azure Services Platform to make it easier to publish and use a wide variety of scientific data from government agencies. OSDI is a sample of OGDI's open source 'starter kit' (coming soon), with code that can be used to publish data on the Internet in a Web-friendly format with easy-to-use, open APIs. OSDI-based web APIs can be accessed from a variety of client technologies such as Silverlight, Flash, JavaScript, PHP, Python, Ruby, mapping web sites, etc. Whether you are a researcher wishing to use scientific data, a hobbyist developer, or a "budding scientist", these open APIs will enable you to build innovative applications, visualizations and mash-ups that empower people through access to scientific information. This site is built using the OGDI starter kit software assets and provides interactive access to some publicly available data sets along with sample code and resources for writing applications using the OSDI APIs.
Amy West

In case you can't read…. | Prof-Like Substance - 1 views

  • When I am putting a talk together it would never occur to me not to include a healthy dose of unpublished data. The only times in my career that I have talked about mostly published data have been when I first started as a postdoc and in the early days of being a PI, when I didn't have enough new data to even make a coherent story, but that accounts for maybe three professional talks out of many.
  • Is it a fear of being scooped or a penchant for keeping one's ideas close to the chest that promotes the Summary Talk?
  • I think it's field dependent. Personally, I can rarely get enough information from a talk to know whether to believe a result or not. This means that unpublished data usually ends up with me thinking "maybe, maybe not".
  • ...10 more annotations...
  • (A good talk like this has enough of a citation on the slide that I can jot down where to go if I want to know details on any particular result.)
  • I'm in a highly competitive biomed field, and I was taught never to present something unless it was either submitted or ready to be submitted.
  • I don't really spend any time worrying about being scooped because I collect my own data.
  • Why look at a poster or talk of 100% published work? I've already seen the stuff in a journal to start with.
  • Final year materials chemist = keeping cards close to my chest. Once bitten, never again.
  • In neuro, I'd say that at smaller conferences and less high-profile talks at big conferences (i.e. not keynotes or featured lectures), the bulk of what you're hearing is unpublished. ALL posters are unpublished--in fact, I think (?) it's a rule at SfN that the content of posters can't be published already.
  • In my field I'd guess that most talks include data that is in press or at some close-to-publication stage.
  • A big name should be more generous, but then again they do have to safeguard the career of the student/postdoc who generated the data. Also, the star or keynote speaker is expected to address a wider audience, and make their talk relevant to the overall theme of the conference.
  • In my (experimental) social science, most conferences explicitly say that you cannot submit to present already published or even accepted work.
  • In my field (Astronomy), I'd say 95% of the talks are about unpublished data.
  •  
    A blog post & comments on what's preferred in conference presentations: published or unpublished data. Interesting.
Amy West

WHAT EXPLAINS THE GERMAN LABOR MARKET MIRACLE IN THE GREAT RECESSION? - 0 views

  •  
    This paper uses, among other sources, the US Bureau of Labor Statistics CPS data that covers 1960-2009 to analyze just 2 years of data. The authors do cite the whole CPS, but you have to read the paper to see which bits of that set matter to this paper. The bulk of the paper itself is their explanation of the various statistical methods they used to support their conclusions. The data is neither novel nor unique to them. Their analysis, however, may be novel and is certainly unique to them. They also provide some technical documentation, e.g. we did x with SPSS. So, ideally, it would be nice to have a citation to the paper, to the 2-year subset of data relevant to it, and a citation to the entire BLS CPS data. This is not agricultural economics, but I think that pretty similar patterns will be found there too.
Amy West

Open access to research data a lot tougher than you think - 2 views

  • It means that researchers need to deal with the formatting and deposition of data, an annoying step when they would rather be focusing on their next project. Given the time lag, it's also difficult to associate the correct metadata with the material that's being archived.
  • According to the commentary, scientists view data deposition as a burden due to the extra work it involves. Research data is usually not in the correct format for submission to repositories when the project is completed, and so the scientist must take the time to convert it.
  • The authors here propose a new approach to data management, where each research institution should employ data managers to work with scientists and administer local, structured data storage. Local storage and support is the preference of most scientists, who would rather not hand off control of their data to remote strangers.
Lisa Johnston

Data Archiving - The American Naturalist - 2 views

  •  
    Science depends on good data. Data are central to our understanding of the natural world, yet most data in ecology and evolution are lost to science-except perhaps in summary form-very quickly after they are collected. ... Yet these data, even after the main results for which they were collected are published, are invaluable to science, for meta‐analysis, new uses, and quality control.
Amy West

PLoS Computational Biology: Defrosting the Digital Library: Bibliographic Tools for the... - 0 views

  • Presently, the number of abstracts considerably exceeds the number of full-text papers,
  • full papers that are available electronically are likely to be much more widely read and cited
  • Since all of these libraries are available on the Web, increasing numbers of tools for managing digital libraries are also Web-based. They rely on Uniform Resource Identifiers (URIs [25] or “links”) to identify, name, and locate resources such as publications and their authors.
  • ...27 more annotations...
  • We often take URIs for granted, but these humble strings are fundamental to the way the Web works [58] and how libraries can exploit it, so they are a crucial part of the cyberinfrastructure [59] required for e-science on the Web.
  • link to data (the full-text of a given article),
  • To begin with, a user selects a paper, which will have come proximately from one of four sources: 1) searching some digital library, “SEARCH” in Figure 4; 2) browsing some digital library (“BROWSE”); 3) a personal recommendation, word-of-mouth from colleague, etc., (“RECOMMEND”); 4) referred to by reading another paper, and thus cited in its reference list (“READ”)
  • There is no universal method to retrieve a given paper, because there is no single way of identifying publications across all digital libraries on the Web
  • Publication metadata often gets “divorced” from the data it is about, and this forces users to manage each independently, a cumbersome and error-prone process.
  • There is no single way of representing metadata, and without adherence to common standards (which largely already exist, but in a plurality) there never will be.
  • Where DOIs exist, they are supposed to be the definitive URI. This kind of automated disambiguation, of publications and authors, is a common requirement for building better digital libraries
  • Publication metadata are essential for machines and humans in many tasks, not just the disambiguation described above. Despite their importance, metadata can be frustratingly difficult to obtain.
  • So, given an arbitrary URI, there are only two guaranteed options for getting any metadata associated with it. Using http [135], it is possible for a human (or machine) to do the following.
  • This technique works, but is not particularly robust or scalable because every time the style of a particular Web site changes, the screen-scraper will probably break as well
  • This returns metadata only, not the whole resource. These metadata will not include the author, journal, title, date, etc., of the publication.
  • As it stands, it is not possible to perform mundane and seemingly simple tasks such as, “get me all publications that fulfill some criteria and for which I have licensed access as PDF” to save locally, or “get me a specific publication and all those it immediately references”.
  • Having all these different metadata standards would not be a problem if they could easily be converted to and from each other, a process known as “round-tripping”.
  • many of these mappings are non-trivial, e.g., XML to RDF and back again
  • more complex metadata such as the inbound and outbound citations, related articles, and “supplementary” information.
  • Personalization allows users to say this is my library, the sources I am interested in, my collection of references, as well as literature I have authored or co-authored. Socialization allows users to share their personal collections and see who else is reading the same publications, including added information such as related papers with the same keyword (or “tag”) and what notes other people have written about a given publication.
  • CiteULike normalizes bookmarks before adding them to its database, which means it calculates whether each URI bookmarked identifies an identical publication added by another user, with an equivalent URI. This is important for social tagging applications, because part of their value is the ability to see how many people (and who) have bookmarked a given publication. CiteULike also captures another important bibliometric, viz how many users have potentially read a publication, not just cited it.
  • Connotea uses MD5 hashes [157] to store URIs that users bookmark, and normalizes them after adding them to its database, rather than before. (A toy illustration of this normalize-and-hash idea appears after this list.)
  • The source code for Connotea [159] is available, and there is an API that allows software engineers to build extra functionality around Connotea, for example the Entity Describer [160].
  • Personalization and socialization of information will increasingly blur the distinction between databases and journals [175], and this is especially true in computational biology where contributions are particularly of a digital nature.
  • This is usually because they are either too “small” or too “big” to fit into journals.
  • As we move in biology from a focus on hypothesis-driven to data-driven science [1],[181],[182], it is increasingly recognized that databases, software models, and instrumentation are the scientific output, rather than the conventional and more discursive descriptions of experiments and their results.
  • In the digital library, these size differences are becoming increasingly meaningless as data, information, and knowledge become more integrated, socialized, personalized, and accessible. Take Postgenomic [183], for example, which aggregates scientific blog posts from a wide variety of sources. These posts can contain commentary on peer-reviewed literature and links into primary database sources. Ultimately, this means that the boundaries between the different types of information and knowledge are continually blurring, and future tools seem likely to continue this trend.
  • The identity of people is a twofold problem because applications need to identify people as users in a system and as authors of publications.
  • Passing valuable data and metadata onto a third party requires that users trust the organization providing the service. For large publishers such as Nature Publishing Group, responsible for Connotea, this is not necessarily a problem.
  • business models may unilaterally change their data model, making the tools for accessing their data backwards incompatible, a common occurrence in bioinformatics.
  • Although the practice of sharing raw data immediately, as with Open Notebook Science [190], is gaining ground, many users are understandably cautious about sharing information online before peer-reviewed publication.
  •  
    Yes, but Alexandria was also a lot smaller; not totally persuaded by analogy here...
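
The normalize-then-hash approach described for CiteULike and Connotea can be illustrated with a toy sketch. This is not either service's actual code, and the normalization rules below are deliberately simplistic assumptions; real services apply much richer equivalence rules (resolving DOIs, stripping tracking parameters, and so on).

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit

def normalize_uri(uri: str) -> str:
    """Collapse trivially different forms of the same URI.

    Lower-cases the scheme and host, drops the fragment, and strips a
    trailing slash from the path.
    """
    scheme, netloc, path, query, _fragment = urlsplit(uri)
    path = path.rstrip("/") or "/"
    return urlunsplit((scheme.lower(), netloc.lower(), path, query, ""))

def bookmark_key(uri: str) -> str:
    """MD5 digest of the normalized URI, usable as a deduplication key."""
    return hashlib.md5(normalize_uri(uri).encode("utf-8")).hexdigest()

# Two superficially different bookmarks map to the same key.
a = bookmark_key("HTTP://Example.org/article/123/")
b = bookmark_key("http://example.org/article/123#abstract")
assert a == b
```

The highlighted difference between the two services is only about when normalization happens relative to storing the bookmark: CiteULike normalizes before adding to its database, Connotea afterwards.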
Lisa Johnston

Chronopolis -- Digital Preservation Program -- Long-Term Mass-Scale Federated Digital P... - 0 views

  •  
    The Chronopolis Digital Preservation Demonstration Project, one of the Library of Congress' latest efforts to collect and preserve at-risk digital information, has been officially launched as a multi-member partnership to meet the archival needs of a wide range of cultural and social domains. Chronopolis is a digital preservation data grid framework being developed by the San Diego Supercomputer Center (SDSC) at UC San Diego, the UC San Diego Libraries (UCSDL), and their partners at the National Center for Atmospheric Research (NCAR) in Colorado and the University of Maryland's Institute for Advanced Computer Studies (UMIACS). A key goal of the Chronopolis project is to provide cross-domain collection sharing for long-term preservation. Using existing high-speed educational and research networks and mass-scale storage infrastructure investments, the partnership is designed to leverage the data storage capabilities at SDSC, NCAR, and UMIACS to provide a preservation data grid that emphasizes heterogeneous and highly redundant data storage systems.
Amy West

DOE DATA EXPLORER - 0 views

  •  
    Use the DOE Data Explorer (DDE) to find scientific research data - such as computer simulations, numeric data files, figures and plots, interactive maps, multimedia, and scientific images - generated in the course of DOE-sponsored research in various science disciplines.
Lisa Johnston

Shedding Light on the Dark Data in the Long Tail of Science - 0 views

shared by Lisa Johnston on 03 Dec 08
  •  
    This paper focuses on a particularly troublesome class of data, termed "dark data". "Dark data" is not carefully indexed and stored, so it becomes nearly invisible to scientists and other potential users and is therefore more likely to remain underutilized and eventually lost.
Amy West

Democratic Dividends: Stockholding, wealth and politics in New York - 1 views

  •  
    An interesting and frustrating paper: it has a "data appendix" which talks about the data and methodology (good), but doesn't include the data files that had to have been created in order to generate the tables.
Lisa Johnston

DigitalKoans » Blog Archive » Planets Project Deposits "Digital Genome" Ti... - 0 views

  •  
    Over the last decade the digital age has seen an explosion in the rate of data creation. Estimates from 2009 suggest that over 100 GB of data has already been created for every single individual on the planet, ranging from holiday snaps to health records; that's over 1 trillion CDs' worth of data, equivalent to 24 tons of books per person!
Lisa Johnston

Scientific Data Sharing Project - 0 views

  •  
    The Data Sharing Project proposes to further this goal initially in the field of medicine by working to create a raw data sharing program that will serve as a model to other disciplines attempting to make their own way in this arena.
Lisa Johnston

Digital Curation Centre: DCC SCARP Project - 0 views

  •  
    18 January 2010 | Key Perspectives | Type: report
    The Digital Curation Centre is pleased to announce the report "Data Dimensions: Disciplinary Differences in Research Data Sharing, Reuse and Long-term Viability" by Key Perspectives, as one of the final outputs of the DCC SCARP project. The project investigated attitudes and approaches to data deposit, sharing and reuse, curation and preservation, over a range of research fields in differing disciplines. The synthesis report (which drew on the SCARP case studies plus a number of others, identified in the Appendix) identifies factors that help understand how curation practices in research groups differ in disciplinary terms. This provides a backdrop to different digital curation approaches.
Lisa Johnston

Where is the cloud? Geography, economics, environment, and jurisdiction in cloud computing - 0 views

  •  
    Cloud computing - the creation of large data centers that can be dynamically provisioned, configured, and reconfigured to deliver services in a scalable manner - places enormous capacity and power in the hands of users. As an emerging new technology, however, cloud computing also raises significant questions about resources, economics, the environment, and the law. Many of these questions relate to geographical considerations related to the data centers that underlie the clouds: physical location, available resources, and jurisdiction. While the metaphor of the cloud evokes images of dispersion, cloud computing actually represents centralization of information and computing resources in data centers, raising the specter of the potential for corporate or government control over information if there is insufficient consideration of these geographical issues, especially jurisdiction. This paper explores the interrelationships between the geography of cloud computing, its users, its providers, and governments.
Lisa Johnston

The Current Status of Scientific Data Sharing and Spatial Data Services in China | Moun... - 2 views

  •  
    There are a lot of scientific data repositories in China that have been growing since the 1990s. Interesting sea change of ethics and sharing practices.
David Govoni

EarthScope - 0 views

  •  
    "An earth science program to explore the structure and evolution of the North American Continent and understand processes controlling earthquakes and volcanoes. EarthScope provides freely accessible data and data products from thousands of geophysical in
Lisa Johnston

NSF to Ask Every Grant Applicant for Data Management Plan - ScienceInsider - 0 views

  •  
    Scientists seeking funding from the National Science Foundation (NSF) will soon need to spell out how they plan to manage the data they hope to collect. It's part of a broader move by NSF and other federal agencies to emphasize the importance of community access to data.