Home/ OpenSciInfo/ Group items tagged distributed

Mike Chelen

PS3Cluster Guide: By The Cluster Workshop - 0 views

  •  
    Our community guide allows you to set up your own MPI (Message Passing Interface) based supercomputer cluster with the PlayStation 3. This guide was co-written by Gaurav Khanna, based on his previous work on the Gravity Grid, and is the current run-time environment for the research of co-author Chris Poulin, based on his current work in distributed pattern recognition. As such, we currently use Fedora Core for this infrastructure and illustrate a "how-to" below. NOTE: We focus on the Fedora 8 distribution, due to the prevalence of Fedora and its compatibility with the Cell SDK (3.0). Finally, this content should be considered open source, and here is the license.
Mike Chelen

genome.gov | A Catalog of Published Genome-Wide Association Studies - 0 views

  •  
    The genome-wide association study (GWAS) publications listed here include only those attempting to assay at least 100,000 single nucleotide polymorphisms (SNPs) in the initial stage. Publications are organized from most to least recent date of publication, indexing from online publication if available. Studies focusing only on candidate genes are excluded from this catalog. Studies are identified through weekly PubMed literature searches, daily NIH-distributed compilations of news and media reports, and occasional comparisons with an existing database of GWAS literature (HuGE Navigator). SNP-trait associations listed here are limited to those with p-values < 1.0 x 10^-5. Note that we are now including all identified SNP-trait associations meeting this p-value threshold. Multipliers of powers of 10 in p-values are rounded to the nearest single digit; odds ratios and allele frequencies are rounded to two decimals. Standard errors are converted to 95 percent confidence intervals where applicable. Allele frequencies, p-values, and odds ratios derived from the largest sample size, typically a combined analysis (initial plus replication studies), are recorded below if reported; otherwise statistics from the initial study sample are recorded. Odds ratios < 1 in the original paper are converted to OR > 1 for the alternate allele. Where results from multiple genetic models are available, we prioritized effect sizes (ORs or beta-coefficients) as follows: 1) genotypic model, per-allele estimate; 2) genotypic model, heterozygote estimate; 3) allelic model, allelic estimate. Gene regions corresponding to SNPs were identified from the UCSC Genome Browser. Gene names are those reported by the authors in the original paper. Only one SNP within a gene or region of high linkage disequilibrium is recorded unless there was evidence of independent association.
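    The conversions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the catalog's actual pipeline: the function names are hypothetical, the standard error is assumed to be on the log-odds scale, and the 95 percent interval is assumed to be the usual normal-approximation interval.

```python
import math

P_THRESHOLD = 1.0e-5          # inclusion threshold quoted in the description
Z_95 = 1.959963984540054      # standard normal quantile for a two-sided 95% interval


def ci95_from_log_or(log_or, se):
    """Return (OR, lower, upper) from a log odds ratio and its standard error.

    Assumes the standard error is reported on the log-odds scale.
    """
    return (
        math.exp(log_or),
        math.exp(log_or - Z_95 * se),
        math.exp(log_or + Z_95 * se),
    )


def flip_to_alternate_allele(odds_ratio, lower, upper):
    """Re-express OR < 1 as OR > 1 for the alternate allele.

    Taking the reciprocal reverses the order of the interval bounds,
    so the old upper bound becomes the new lower bound.
    """
    if odds_ratio >= 1.0:
        return odds_ratio, lower, upper
    return 1.0 / odds_ratio, 1.0 / upper, 1.0 / lower


def passes_threshold(p_value):
    """True when the association meets the p < 1.0 x 10^-5 cutoff."""
    return p_value < P_THRESHOLD
```

    For instance, an odds ratio of 0.80 with interval 0.70 to 0.90 would be reported for the alternate allele as the reciprocal of each value, with the bounds swapped, and then rounded to two decimals as the description states.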
Mike Chelen

EcoliHub - a comprehensive K-12 information resource - Home - 0 views

  •  
    Sixty years of study have made Escherichia coli K-12 the most deeply understood organism at the molecular level. Much of what we know about cellular processes can be traced to fundamental discoveries in E. coli. In spite of its great importance as a model organism, information about E. coli is distributed among many online resources. EcoliHub uses web services that are being developed to make seamless bidirectional connections between E. coli resources, thereby enabling the full use of existing knowledge and supporting cutting-edge research into the molecular basis of life. EcoliHub is being developed to serve the user community. Users can help teach us what is desirable in future releases by taking our User Survey.
Mike Chelen

Eggheads.org - Main Index - 0 views

shared by Mike Chelen on 17 Dec 08
  •  
    Eggdrop is the world's most popular Open Source IRC bot, designed for flexibility and ease of use, and is freely distributable under the GNU General Public License (GPL). Eggdrop was originally developed by Robey Pointer; however, he no longer works on Eggdrop, so please do not contact him for help solving a problem or bug.
    Some features of Eggdrop:
    * Designed to run on Linux, *BSD, SunOS, Windows, Mac OS X, etc.
    * Extendable with Tcl scripts and/or C modules
    * Support for the big five IRC networks (Undernet, DALnet, EFnet, IRCnet, and QuakeNet)
    * The ability to form botnets and share partylines and userfiles between bots
    Some benefits of Eggdrop:
    * The oldest IRC bot still in active development (Eggdrop was created in 1993)
    * Established IRC help channels and web sites dedicated to Eggdrop
    * Thousands of premade Tcl scripts and C modules
    * Best of all ... It's FREE!
Mike Chelen

Protocol for Implementing Open Access Data - 0 views

  • information for the Internet community
  • distributing data or databases
  • “open” and “open access”
  • requirements for gaining and using the Science Commons Open Access Data Mark and metadata
  • interoperability of scientific data
  • terms and conditions around data make integration difficult to legally perform
  • single license
  • data with this license can be integrated with any other data under this license
  • too many databases under too many terms already
  • unlikely that any one license or suite of licenses will have the correct mix of terms
  • principles for open access data and a protocol for implementing those principles
  • Open Access Data Mark and metadata
  • databases and data
  • the foundation to legally integrate a database or data product
  • another database or data product
  • no mechanisms to manage transfer or negotiations of rights unrelated to integration
  • submitted to Science Commons for certification as a conforming implementation
  • Open Access Data trademarks (icons and phrases) and metadata on databases
  • protocol must promote legal predictability and certainty
  • easy to use and understand
  • lowest possible transaction costs on users
  • Science Commons’ experience in distributing a database licensing Frequently Asked Questions (FAQ) file
  • hard to apply the distinction between what is copyrightable and what is not copyrightable
  • lack of simplicity restricts usage
  • reducing or eliminating the need to make the distinction between copyrightable and non-copyrightable elements
  • satisfy the norms and expectations of the disciplines providing the database
  • norms for citation will differ
  • norms must be attached
  • Converge on the public domain by waiving all rights based on intellectual property
  • reconstruction of the public domain
  • scientific norms to express the wishes of the data provider
  • public domain
  • waiving the relevant rights on data and asserting that the provider makes no claims on the data
  • Requesting behavior, such as citation, through norms rather than as a legal requirement based on copyright or contracts, allows for different scientific disciplines to develop different norms for citation.
  • waive all rights necessary for data extraction and re-use
  • copyright
  • sui generis database rights
  • claims of unfair competition
  • implied contracts
  • and other legal rights
  • any obligations on the user of the data or database such as “copyleft” or “share alike”, or even the legal requirement to provide attribution
  • non-legally binding set of citation norms
  • waiving other statutory or intellectual property rights
  • there are other rights, in addition to copyright, that may apply
  • uncopyrightable databases may be protected in some countries
  • sui generis rights apply in the European Union
  • waivers of sui generis and other legal grounds for database protection
  • no contractual controls
  • using contract, rather than intellectual property or statutory rights, to apply terms to databases
  • affirmatively declare that contractual constraints do not apply to the database
  • interoperation with databases and data not available under the Science Commons Open Access Data Protocol through metadata
  • data that is not or cannot be made available under this protocol
  • owner provides metadata (as data) under this protocol so that the existence of the non-open access data is discoverable
  • digital identifiers and metadata describing non-open access data
  • “Licensing” a database typically means that the “copyrightable elements” of a database are made available under a copyright license
  • Database FAQ, in its first iteration, recommended this method
  • recommendation is now withdrawn
  • copyright begins in and ends in many databases
  • database divided into copyrightable and non copyrightable elements
  • user tends to assume that all is under copyright or none is under copyright
  • share-alike license on the copyrightable elements may be falsely assumed to operate on the factual contents of a database
  • copyright in situations where it is not necessary
  • query across tens of thousands of data records across the web might return a result which itself populates a new database
  • selective waiving of intellectual property rights fail to provide a high degree of legal certainty and ease of use
  • problem of false expectations
  • apply a “copyleft” term to the copyrightable elements of a database, in hopes that those elements result in additional open access database elements coming online
  • uncopyrightable factual content
  • republish those contents without observing the copyleft or share-alike terms
  • cascading attribution if attribution is required as part of a license approach
  • Would a scientist need to attribute 40,000 data depositors in the event of a query across 40,000 data sets?
  • conflict with accepted norms in some disciplines
  • imposes a significant transaction cost
Mike Chelen

Home :::Academic Journals - 1 views

  •  
    ACADEMIC JOURNALS provides free access to research information to the international community without financial, legal or technical barriers. All the journals from this organization will be freely distributed and available from multiple websites. ACADEMIC JOURNALS, breaking new frontiers in the world of journals.
Mike Chelen

Open Knowledge Foundation Blog » Blog Archive » Open Data: Openness and Licen... - 0 views

  • Why bother about openness and licensing for data
  • It’s crucial because open data is so much easier to break-up and recombine, to use and reuse.
  • want people to have incentives to make their data open and for open data to be easily usable and reusable
  • good definition of openness acts as a standard that ensures different open datasets are ‘interoperable’
  • Licensing is important because it reduces uncertainty. Without a license you don’t know where you, as a user, stand: when are you allowed to use this data? Are you allowed to give it to others? To distribute your own changes, etc.?
  • licensing and definitions are important even though they are only a small part of the overall picture
  • If we get them wrong they will keep on getting in the way of everything else.
  • Everyone agrees that requiring attribution is OK
    • Mike Chelen
       
      My opinion is that there should be no requirements, including attribution, and that standards should be community-based instead of legal.
  • Even if a basic license is used it can be argued that any ‘requirements’ for attribution or share-alike should not be in a license but in ‘community norms’.
    • Mike Chelen
       
      Licenses and community norms are not exclusive. It's recommended to adopt a Public Domain license, and encourage attribution through community standards.
  • A license is likely to elicit at least as much, and almost certainly more, conformity with its provisions than community norms.
    • Mike Chelen
       
      Ease of access should be the goal, not conformity.
  • (even to a user it is easy to comply with the open license)
    • Mike Chelen
       
      It is important to specifically publish using a Public Domain dedication.
  •  
    Why bother about openness and licensing for data? After all they don't matter in themselves: what we really care about are things like the progress of human knowledge or the freedom to understand and share.
Mike Chelen

How to set up Disco on Amazon EC2 - Disco v0.1.2 documentation - 0 views

  •  
    With the following three steps, you can set up a Disco cluster in Amazon's Elastic Compute Cloud. This will cost you a few dollars (or more, depending on your needs) but requires no resources on your side besides a single machine that you use to set up the cluster. In this setup, the Disco master and nodes run on EC2. Your Disco client can run either on the master node on EC2 or on a local machine.
Mike Chelen

Creative Commons Attribution Noncommercial ShareAlike Legal Code - 0 views

  • must include a copy of, or the Uniform Resource Identifier for, this License with every copy
  • Derivative Work
  • the terms of this License
  • distribute
  • may not exercise
  • rights granted to You in Section 3 above
  • primarily intended for or directed toward commercial advantage or private monetary compensation
  • primarily
  • exchange of the Work for other copyrighted works by means of digital file-sharing or otherwise shall not be considered to be intended for or directed toward commercial advantage or private monetary compensation
  • provided there is no payment of any monetary compensation in connection with the exchange of copyrighted works
  • name of the Original Author
  • keep intact all copyright notices for the Work
  • "Attribution Parties"