Group items tagged


Protocol for Implementing Open Access Data - 0 views

sciencecommons.org/...open-access-data-protocol

science commons open access data public domain

shared by Mike Chelen on 07 Sep 08 - Cached

information for the Internet community
distributing data or databases
“open” and “open access”
requirements for gaining and using the Science Commons Open Access Data Mark and metadata
interoperability of scientific data
terms and conditions around data make integration difficult to legally perform
single license
data with this license can be integrated with any other data under this license
too many databases under too many terms already
unlikely that any one license or suite of licenses will have the correct mix of terms
principles for open access data and a protocol for implementing those principles
Open Access Data Mark and metadata
databases and data
the foundation to legally integrate a database or data product
another database or data product
no mechanisms to manage transfer or negotiations of rights unrelated to integration
submitted to Science Commons for certification as a conforming implementation
Open Access Data trademarks (icons and phrases) and metadata on databases
protocol must promote legal predictability and certainty
easy to use and understand
lowest possible transaction costs on users
Science Commons’ experience in distributing a database licensing Frequently Asked Questions (FAQ) file
hard to apply the distinction between what is copyrightable and what is not copyrightable
lack of simplicity restricts usage
reducing or eliminating the need to make the distinction between copyrightable and non-copyrightable elements
satisfy the norms and expectations of the disciplines providing the database
norms for citation will differ
norms must be attached
Converge on the public domain by waiving all rights based on intellectual property
reconstruction of the public domain
scientific norms to express the wishes of the data provider
public domain
waiving the relevant rights on data and asserting that the provider makes no claims on the data
Requesting behavior, such as citation, through norms rather than as a legal requirement based on copyright or contracts, allows for different scientific disciplines to develop different norms for citation.
waive all rights necessary for data extraction and re-use
copyright
sui generis database rights
claims of unfair competition
implied contracts
and other legal rights
any obligations on the user of the data or database such as “copyleft” or “share alike”, or even the legal requirement to provide attribution
non-legally binding set of citation norms
waiving other statutory or intellectual property rights
there are other rights, in addition to copyright, that may apply
uncopyrightable databases may be protected in some countries
sui generis rights apply in the European Union
waivers of sui generis and other legal grounds for database protection
no contractual controls
using contract, rather than intellectual property or statutory rights, to apply terms to databases
affirmatively declare that contractual constraints do not apply to the database
interoperation with databases and data not available under the Science Commons Open Access Data Protocol through metadata
data that is not or cannot be made available under this protocol
owner provides metadata (as data) under this protocol so that the existence of the non-open access data is discoverable
digital identifiers and metadata describing non-open access data
“Licensing” a database typically means that the “copyrightable elements” of a database are made available under a copyright license
Database FAQ, in its first iteration, recommended this method
recommendation is now withdrawn
copyright begins in and ends in many databases
database divided into copyrightable and non copyrightable elements
user tends to assume that all is under copyright or none is under copyright
share-alike license on the copyrightable elements may be falsely assumed to operate on the factual contents of a database
copyright in situations where it is not necessary
query across tens of thousands of data records across the web might return a result which itself populates a new database
selective waiving of intellectual property rights fail to provide a high degree of legal certainty and ease of use
problem of false expectations
apply a “copyleft” term to the copyrightable elements of a database, in hopes that those elements result in additional open access database elements coming online
uncopyrightable factual content
republish those contents without observing the copyleft or share-alike terms
cascading attribution if attribution is required as part of a license approach
Would a scientist need to attribute 40,000 data depositors in the event of a query across 40,000 data sets?
conflict with accepted norms in some disciplines
imposes a significant transaction cost

FTP Download - 0 views

www.ensembl.org/...index.html

shared by Mike Chelen on 11 Dec 08 - Cached

Mike Chelen on 11 Dec 08

If required, entire databases can be downloaded from our FTP site in a variety of formats, from flat files to MySQL dumps. Please be aware that these files can run to many gigabytes of data. To facilitate storage and download all databases are GNU Zip (gzip, *.gz) compressed. Please note: Ensembl supports downloading of many correlation tables via the highly customisable BioMart data mining tool. You may find exploring this web-based data mining tool easier than extracting information from our database dumps.
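
Since the dumps are gzip-compressed and can run to many gigabytes, streaming them row by row may be easier than decompressing them to disk first. The following is a minimal Python sketch, not Ensembl's own tooling. It assumes a tab-separated flat file, and the function names `iter_rows` and `count_rows` are hypothetical helpers introduced only for illustration.

```python
import gzip
from typing import Iterator, List


def iter_rows(path: str) -> Iterator[List[str]]:
    """Yield tab-separated rows from a gzip-compressed text file, one at a time.

    Assumption: the dump is a UTF-8, tab-separated flat file.
    """
    # "rt" opens the archive in text mode; lines are decompressed lazily,
    # so the whole dump never has to fit in memory.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            yield line.rstrip("\n").split("\t")


def count_rows(path: str) -> int:
    """Count rows while holding at most one row in memory."""
    return sum(1 for _ in iter_rows(path))
```

Calling `count_rows` on a downloaded `*.gz` file is a cheap sanity check before loading the data anywhere else.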


BioLit Project - 0 views

biolit.ucsd.edu/index.html

open access

shared by Mike Chelen on 05 Jan 09 - Cached

Mike Chelen on 05 Jan 09

The establishment of open access literature makes it possible for knowledge to be extracted from scholarly articles and included in other resources. BioLit aims to extract database identifiers and rich meta-data from open access articles in the life sciences and integrate that information with existing biological databases. We have begun prototyping this effort using a clone of the RCSB Protein Data Bank, a database of macromolecular structures.


de.bezier.mysql - 0 views

www.bezier.de/mysql

my sql processing java data base

shared by Mike Chelen on 22 Nov 08 - Cached

Mike Chelen on 22 Nov 08

Processing (BETA) library to communicate with MySQL (or any other SQL) databases. Note that, due to Java security restrictions, this will not work with applets "out of the box", and that many remote MySQL servers will only allow local access ("localhost") or connections from trusted hosts (see notes). Also note that you should have some experience with SQL to put, change and retrieve data from the database.


A pitfall of wiki solution for biological database...[Brief Bioinform. 2008] - PubMed R... - 0 views

www.ncbi.nlm.nih.gov/...19060305

wiki biology

shared by Mike Chelen on 12 Dec 08 - Cached

Mike Chelen on 12 Dec 08

Not a few biologists tend to consider wiki as a solution to manage and reorganize data by a community. However, in its basic functionality, wiki lacks a measure to check data consistency and is not suitable for a database. To circumvent this pitfall, installation of page dependency through in-line page searches is necessary. We also introduce two existing approaches that support in-line queries.


Public MySQL Server - 0 views

www.ensembl.org/...mysql.html

shared by Mike Chelen on 11 Dec 08 - Cached


Mike Chelen on 11 Dec 08

For large amounts of data and more detailed analysis, we recommend you use our publicly-accessible MySQL server, ensembldb.ensembl.org, which you can access as user 'anonymous'. A second server, martdb.ensembl.org, provides public access to the BioMart databases.
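
As a rough illustration of the anonymous access described above, the Python sketch below only assembles a command line for the standard `mysql` client; it does not open a connection itself. The host and user come from the description. The helper name `mysql_command` is hypothetical, and the server port is not stated here, so the sketch leaves it at the client default, which may need adjusting.

```python
from typing import List, Optional

ENSEMBL_HOST = "ensembldb.ensembl.org"  # host named in the description
ENSEMBL_USER = "anonymous"              # user named in the description


def mysql_command(
    host: str = ENSEMBL_HOST,
    user: str = ENSEMBL_USER,
    statement: Optional[str] = None,
) -> List[str]:
    """Build an argument list for the `mysql` command-line client.

    Assumption: the client is installed locally and no password is needed.
    """
    argv = ["mysql", "--host", host, "--user", user]
    if statement is not None:
        # --execute runs one statement and exits instead of opening a shell.
        argv += ["--execute", statement]
    return argv
```

The resulting list can be passed to `subprocess.run`, for example with `statement="SHOW DATABASES"`, to list what the server exposes.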


genome.gov | A Catalog of Published Genome-Wide Association Studies - 0 views

www.genome.gov/26525384

genetics snp pubmed

shared by Mike Chelen on 28 Jan 09 - Cached

Mike Chelen on 28 Jan 09

The genome-wide association study (GWAS) publications listed here include only those attempting to assay at least 100,000 single nucleotide polymorphisms (SNPs) in the initial stage. Publications are organized from most to least recent date of publication, indexing from online publication if available. Studies focusing only on candidate genes are excluded from this catalog. Studies are identified through weekly PubMed literature searches, daily NIH-distributed compilations of news and media reports, and occasional comparisons with an existing database of GWAS literature (HuGE Navigator). SNP-trait associations listed here are limited to those with p-values < 1.0 x 10^-5. Note that we are now including all identified SNP-trait associations meeting this p-value threshold. Multipliers of powers of 10 in p-values are rounded to the nearest single digit; odds ratios and allele frequencies are rounded to two decimals. Standard errors are converted to 95 percent confidence intervals where applicable. Allele frequencies, p-values, and odds ratios derived from the largest sample size, typically a combined analysis (initial plus replication studies), are recorded below if reported; otherwise statistics from the initial study sample are recorded. Odds ratios < 1 in the original paper are converted to OR > 1 for the alternate allele. Where results from multiple genetic models are available, we prioritized effect sizes (ORs or beta-coefficients) as follows: 1) genotypic model, per-allele estimate; 2) genotypic model, heterozygote estimate; 3) allelic model, allelic estimate. Gene regions corresponding to SNPs were identified from the UCSC Genome Browser. Gene names are those reported by the authors in the original paper. Only one SNP within a gene or region of high linkage disequilibrium is recorded unless there was evidence of independent association.
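
Three of the conventions above (the p-value cutoff, the conversion of standard errors to 95 percent confidence intervals, and the flipping of odds ratios below 1) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the catalog's actual pipeline: it assumes the standard error is reported on the log-odds scale and uses the usual normal approximation with z = 1.96. All function names are hypothetical.

```python
import math
from typing import Tuple

P_VALUE_THRESHOLD = 1.0e-5  # inclusion cutoff stated in the catalog text
Z_95 = 1.96                 # normal quantile for a two-sided 95% interval


def passes_threshold(p_value: float) -> bool:
    """True if an association meets the stated p-value cutoff."""
    return p_value < P_VALUE_THRESHOLD


def ci95_from_se(odds_ratio: float, se_log_or: float) -> Tuple[float, float]:
    """95% CI for an odds ratio, assuming the SE is on the log-odds scale."""
    log_or = math.log(odds_ratio)
    return (
        math.exp(log_or - Z_95 * se_log_or),
        math.exp(log_or + Z_95 * se_log_or),
    )


def flip_to_alternate_allele(
    odds_ratio: float, ci: Tuple[float, float]
) -> Tuple[float, Tuple[float, float]]:
    """Report OR > 1 by taking reciprocals when the original OR is below 1."""
    if odds_ratio >= 1.0:
        return odds_ratio, ci
    lower, upper = ci
    # Reciprocals reverse the ordering, so the old upper bound becomes
    # the new lower bound.
    return 1.0 / odds_ratio, (1.0 / upper, 1.0 / lower)
```

Taking reciprocals of both the estimate and the interval keeps the reported effect internally consistent when the reference allele is switched.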


Neuroscience Information Framework (Main.WebHome) - Neuroscience Information Framework ... - 0 views

wiki.neuinfo.org/...Main

neuroscience wiki

shared by Mike Chelen on 15 Dec 08 - No Cached

Mike Chelen on 15 Dec 08

The advent of the World Wide Web has led to an explosion in the number of diverse resources available to neuroscientists. Despite the availability of powerful search engines, locating these diverse resources has become increasingly difficult and time consuming. The NIF project utilizes both advanced machine-based search technologies and old-fashioned human legwork to provide access to neuroscience-relevant resources on the Web. Resources include research materials, Web pages, software tools, data sets, literature and general information. The NIF has developed technologies that allow a user to search across these different types of resources, all from a single interface. A unique feature of the NIF is the ability to issue direct queries against multiple databases simultaneously, retrieving content that is largely hidden from traditional search engines. A second unique feature is an extensive vocabulary covering major neuroscience domains for describing and searching these resources. The NIF takes advantage of advances in knowledge engineering to broaden and refine searches based on related concepts. The NIF beta test site was developed to gain feedback on the NIF search interface and content. Users will be asked to search the NIF, explore the vocabularies, and answer a questionnaire about their experience.


ChemSpider Blog » Blog Archive » Adding Publications to ChemSpider via Digita... - 0 views

www.chemspider.com/...l-object-identifiers-dois.html

chemistry doi chemspider

shared by Mike Chelen on 13 Dec 08 - Cached

Mike Chelen on 13 Dec 08

We are focused on providing tools to our users to ensure that they can add information of interest to structure-based records in ChemSpider. We have introduced DOI-based associations recently allowing users to connect publications of interest to chemical compounds on our database. The process is simple. Find the structure record of interest, use the Add DOI function and Publish. The process is outlined graphically below.


Qualifying Online Information Resources for Chemists | SciVee - 0 views

scivee.tv/9267

shared by Mike Chelen on 11 Dec 08 - Cached

Mike Chelen on 11 Dec 08

This meeting was about "Making the Web Work for Science and the Impact of e-Science and the Cyberinfrastructure." I provided an overview of how access to information has changed over the past 20 years for me. I talked about the challenges for publishers serving the chemistry community and how their business models are being challenged and how I empathize with the struggle to figure out how to deal with it. I talked about quality and how care must be taken when using information online. We are ALL challenged with errors - whether you consider PubChem, ChemSpider, Wikipedia or any of the other online databases they all have errors - how do you find them? Some of them are obvious and I pointed to obvious examples in the talk. I hoped to educate the attendees in regards to the value of InChI which, while not a perfect fit yet, is a great start to structure-based communication of chemistry. I publicly blessed the efforts of publishers such as the RSC and Nature Publishing group for the efforts they are making to support InChI and improve the quality of document presentation online. I blessed CAS as a treasure trove of information and the gold standard of curated chemistry. We need them all to be successful for the sake of our science. The challenge is how to fit into the ongoing proliferation of free access to information without modifying the business models.


Open Knowledge Foundation Blog » Blog Archive » Comments on the Science Commo... - 0 views

blog.okfn.org/...-implementing-open-access-data

open access data okfn science commons

shared by Mike Chelen on 01 Mar 09 - Cached

the protocol does not discuss any of the possible attractions of allowing such provisions
Protocol gives 3 basic reasons for preferring the ‘PD’ approach
Science Commons Protocol for Implementing Open Access Data
I am not really convinced by any of these points that attribution or share-alike provisions should not be included in open data licenses
application of obligations based on copyright in situations where it is not necessary
non-copyrightable elements extends to the entire database and inadvertently infringe
If intellectual property rights are involved
requirements carrying a stiff penalty for failure
selective waiving of intellectual property rights
interpretative problems

Geeking with Greg: Column versus row stores - 0 views

glinden.blogspot.com/...column-versus-row-stores.html

analysis database

shared by Mike Chelen on 09 Sep 08 - Cached

ChemSpider - Database of Chemical Structures and Property Predictions - 0 views

www.chemspider.com

chemistry science chemspider research

shared by Mike Chelen on 13 Dec 08 - Cached

Mike Chelen on 13 Dec 08

ChemSpider is a free access service providing a structure-centric community for chemists. Providing access to millions of chemical structures and integration to a multitude of other online services, ChemSpider is the richest single source of structure-based chemistry information.


Freebase: an open, shared database of the world's knowledge - 0 views

www.freebase.com

base data free

shared by Mike Chelen on 07 Sep 08 - Cached

Mike Chelen liked it

Welcome to the Mulgara Project! - 0 views

www.mulgara.org

database java open rdf semantic source

shared by Mike Chelen on 10 Sep 08 - Cached
