
OpenSciInfo: Group items tagged "news"


Mike Chelen

Science 2.0 - introduction and perspectives for Poland « Freelancing science - 0 views

  • transcript of Science 2.0 based on a presentation I gave at a conference on open science organized in Warsaw earlier this month
  • prepared for a mixed audience and focused on perspectives for Poland
  • new forms of communication between scientists
  • ...44 more annotations...
  • research becomes meaningful only after confronting results with the scientific community
  • peer-reviewed publication is the best communication channel we have had so far
  • new communication channels complement peer-reviewed publication
  • two important attributes in which they differ from traditional models: openness and communication time
  • increased openness and shorter communication time happens already in publishing industry (via Open Access movement and experiments with alternative/shorter ways of peer-review)
  • say a few words about experiments that go a little or quite a lot beyond publication
  • My Experiment as an example of an important step towards openness
  • least radical idea you can find in modern Science 2.0 world
  • virtual research environment
  • focus is put on sharing scientific workflows
  • use case
  • diagram of the “methods” sections from experimental (including bioinformatics analyses) publications
  • make it easier for others to understand what we did
  • as we can open towards other scientists, we can also open towards non-experts
  • people from all over the world compete in improving structural models of proteins
  • helps in improving protein structure prediction software and in understanding protein folding
  • combine teaching and data annotation
  • metagenome sequences in first case and chemistry spectra in the second
  • interactive visualizations of chemical structures, genomes, proteins or multidimensional data
  • communicate some difficult concepts faster
  • new approaches in conference reporting
  • report in real time from the conference
  • followed by a number of people, including even the ones that were already at the conference
  • “open notebook science” which means conducting research using publicly available, immediately updated laboratory notebook
  • The reason I did a model for Cameron’s grant was that I subscribed to his feed before
  • I didn’t subscribe to Cameron because I knew his professional profile
  • I read his blog, I commented on it and he commented on mine, etc.
  • participation in online communities
  • important part of Science 2.0 is the fact that it has a human face
  • PhDs about the same time
  • first was from a major Polish institute, the second from a major European one
  • what the head of a lab to which both would apply will see
  • the gap we must fill is between current research and the lectures we give today
  • access to real-time scientific conversation
  • follow current research and decide what is important to learn
  • synthetic biology
  • not all universities in the world have synthetic biology courses
  • didn’t stop these students, and they plan to participate in IGEM again
  • not only scientists – there are librarians, science communicators, editors from scientific journals, people working in biotech industry
  • community of life scientists
  • even people without direct connection to science
  • diverse skills and background
  • online conference
  • interact with them and to learn from them
Mike Chelen

Wikipedia:WikiProject NIH - Wikipedia, the free encyclopedia - 0 views

  •  
    Welcome to the NIH WikiProject, a collaboration area and group of editors dedicated to improving Wikipedia's coverage of the National Institutes of Health. This is a new WikiProject, so please join! (For more information on WikiProjects, please see Wikipedia:WikiProject and the Guide to WikiProjects.)
    Goals: Improve Wikipedia's current coverage of the NIH and deepen the coverage with more pages.
    Scope: Cover all of the Institutes, all the way down to individual laboratories/units.
Mike Chelen

genome.gov | A Catalog of Published Genome-Wide Association Studies - 0 views

  •  
    The genome-wide association study (GWAS) publications listed here include only those attempting to assay at least 100,000 single nucleotide polymorphisms (SNPs) in the initial stage. Publications are organized from most to least recent date of publication, indexing from online publication if available. Studies focusing only on candidate genes are excluded from this catalog. Studies are identified through weekly PubMed literature searches, daily NIH-distributed compilations of news and media reports, and occasional comparisons with an existing database of GWAS literature (HuGE Navigator). SNP-trait associations listed here are limited to those with p-values < 1.0 x 10^-5. Note that we are now including all identified SNP-trait associations meeting this p-value threshold. Multipliers of powers of 10 in p-values are rounded to the nearest single digit; odds ratios and allele frequencies are rounded to two decimals. Standard errors are converted to 95 percent confidence intervals where applicable. Allele frequencies, p-values, and odds ratios derived from the largest sample size, typically a combined analysis (initial plus replication studies), are recorded below if reported; otherwise statistics from the initial study sample are recorded. Odds ratios < 1 in the original paper are converted to OR > 1 for the alternate allele. Where results from multiple genetic models are available, we prioritized effect sizes (ORs or beta-coefficients) as follows: 1) genotypic model, per-allele estimate; 2) genotypic model, heterozygote estimate; 3) allelic model, allelic estimate. Gene regions corresponding to SNPs were identified from the UCSC Genome Browser. Gene names are those reported by the authors in the original paper. Only one SNP within a gene or region of high linkage disequilibrium is recorded unless there was evidence of independent association.
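
As an illustration of the curation rules described above, here is a minimal Python sketch (not the catalog's own code) of how a single SNP-trait association might be filtered and rounded; the function name, the example SNP, and the flipping of allele frequency when the odds ratio is inverted are assumptions made for the example.

```python
import math

P_THRESHOLD = 1.0e-5  # catalog inclusion rule: p-value must be below 1.0 x 10^-5

def curate_association(snp, p_value, odds_ratio, allele_freq):
    """Apply the catalog's stated rounding/normalisation rules to one SNP-trait association."""
    if p_value >= P_THRESHOLD:
        return None  # excluded from the catalog

    # Round the multiplier of the power of 10 to a single digit, e.g. 3.4e-8 -> 3e-8.
    exponent = math.floor(math.log10(p_value))
    mantissa = round(p_value / 10 ** exponent)
    rounded_p = mantissa * 10 ** exponent

    # Report OR > 1: an OR below 1 is inverted so it refers to the alternate allele.
    if odds_ratio < 1:
        odds_ratio = 1 / odds_ratio
        allele_freq = 1 - allele_freq  # assumption: report the alternate allele's frequency

    return {
        "snp": snp,
        "p_value": rounded_p,
        "odds_ratio": round(odds_ratio, 2),
        "allele_frequency": round(allele_freq, 2),
    }

print(curate_association("rs0000001", 3.4e-8, 0.83, 0.27))
```
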
Mike Chelen

EST clusters - 0 views

  •  
    We build here a repository of assembled transcript sequences, produced by assembling Expressed Sequence Tag (EST) and mRNA sequences into contigs, in order to discover new genes from already existing data. Publicly available EST and mRNA sequences are clustered and then assembled into contigs with specific bioinformatic tools (see technology).
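
A toy Python sketch of the clustering step, assuming a simple shared-k-mer criterion; this is not the repository's actual pipeline, and the function names, parameters, and example sequences are invented for illustration.

```python
def kmers(seq, k=8):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def cluster_ests(sequences, k=8, min_shared=3):
    """Greedily group sequences that share at least `min_shared` k-mers with an existing cluster."""
    clusters = []  # each cluster: {"seed_kmers": set, "members": [ids]}
    for seq_id, seq in sequences.items():
        seq_kmers = kmers(seq, k)
        for cluster in clusters:
            if len(seq_kmers & cluster["seed_kmers"]) >= min_shared:
                cluster["members"].append(seq_id)
                cluster["seed_kmers"] |= seq_kmers  # extend the cluster's k-mer profile
                break
        else:
            clusters.append({"seed_kmers": seq_kmers, "members": [seq_id]})
    return [c["members"] for c in clusters]

ests = {
    "est1": "ATGGCGTACGTTAGCATCGA",
    "est2": "GCGTACGTTAGCATCGATTT",   # overlaps est1
    "est3": "TTTTCCCCGGGGAAAATTTT",   # unrelated
}
print(cluster_ests(ests))  # -> [['est1', 'est2'], ['est3']]
```
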
Mike Chelen

SourceForge.net: CloudBurst - cloudburst-bio - 0 views

  •  
    CloudBurst: Highly Sensitive Short Read Mapping with MapReduce Michael Schatz Center for Bioinformatics and Computational Biology, University of Maryland Next-generation DNA sequencing machines are generating an enormous amount of sequence data, placing unprecedented demands on traditional single-processor read mapping algorithms. CloudBurst is a new parallel read-mapping algorithm optimized for mapping next-generation sequence data to the human genome and other reference genomes, for use in a variety of biological analyses including SNP discovery, genotyping, and personal genomics. It is modeled after the short read mapping program RMAP, and reports either all alignments or the unambiguous best alignment for each read with any number of mismatches or differences. This level of sensitivity could be prohibitively time consuming, but CloudBurst uses the open-source Hadoop implementation of MapReduce to parallelize execution using multiple compute nodes. CloudBurst's running time scales linearly with the number of reads mapped, and with near linear speedup as the number of processors increases. In a 24-processor core configuration, CloudBurst is up to 30 times faster than RMAP executing on a single core, while computing an identical set of alignments. In a large remote compute cloud with 96 cores, CloudBurst reduces the running time from hours to mere minutes for typical jobs involving mapping of millions of short reads to the human genome. CloudBurst is available open-source as a model for parallelizing other bioinformatics algorithms with MapReduce.
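
To make the seed-and-extend, map/reduce structure described in the abstract concrete, here is a toy single-process Python sketch; it is not CloudBurst's Hadoop implementation, and the function names, parameters, and example sequences are invented.

```python
from collections import defaultdict

def map_phase(reference, reads, seed_len=8):
    """Map step: collect, per seed, where it occurs in the reference and in each read."""
    emitted = defaultdict(lambda: {"ref": [], "reads": []})
    for pos in range(len(reference) - seed_len + 1):
        emitted[reference[pos:pos + seed_len]]["ref"].append(pos)
    for read_id, read in reads.items():
        for offset in range(len(read) - seed_len + 1):
            emitted[read[offset:offset + seed_len]]["reads"].append((read_id, offset))
    return emitted

def reduce_phase(reference, reads, emitted, max_mismatches=2):
    """Reduce step: for each shared seed, extend the alignment and count mismatches."""
    hits = set()
    for group in emitted.values():
        for ref_pos in group["ref"]:
            for read_id, offset in group["reads"]:
                start = ref_pos - offset
                read = reads[read_id]
                if start < 0 or start + len(read) > len(reference):
                    continue
                mismatches = sum(a != b for a, b in zip(read, reference[start:start + len(read)]))
                if mismatches <= max_mismatches:
                    hits.add((read_id, start, mismatches))
    return sorted(hits)

reference = "ACGTACGTTAGCATCGATTTACGGCTAGCTA"
reads = {"r1": "TAGCATCGATTT", "r2": "ACGGCTAGCTA", "r3": "GGGGGGGGGGGG"}
print(reduce_phase(reference, reads, map_phase(reference, reads)))
# -> [('r1', 8, 0), ('r2', 20, 0)]; r3 has no seed match and is not reported
```

In the real system the map and reduce steps run as Hadoop tasks across many nodes, which is where the reported speedups come from.
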
Mike Chelen

WWMM Web Services - 0 views

  •  
    These Web Services can be used to create applications on Molecules and their Properties. Details of the Web Services can be found here. If you are new to using these Web Services, please take a few minutes to read the instructions here.
Mike Chelen

USENIX IMC '05 Technical Paper - 0 views

  •  
    Existing studies on BitTorrent systems are single-torrent based, while more than 85% of all peers participate in multiple torrents according to our trace analysis. In addition, these studies are not sufficiently insightful and accurate even for single-torrent models, due to some unrealistic assumptions. Our analysis of representative BitTorrent traffic provides several new findings regarding the limitations of BitTorrent systems: (1) Due to the exponentially decreasing peer arrival rate in reality, service availability in such systems becomes poor quickly, after which it is difficult for the file to be located and downloaded. (2) Client performance in the BitTorrent-like systems is unstable, and fluctuates widely with the peer population. (3) Existing systems could provide unfair services to peers, where peers with high downloading speed tend to download more and upload less. In this paper, we study these limitations on torrent evolution in realistic environments. Motivated by the analysis and modeling results, we further build a graph based multi-torrent model to study inter-torrent collaboration. Our model quantitatively provides strong motivation for inter-torrent collaboration instead of directly stimulating seeds to stay longer. We also discuss a system design to show the feasibility of multi-torrent collaboration.
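
A toy Python simulation of finding (1): with an exponentially decreasing peer arrival rate, the swarm empties quickly and the file becomes unavailable. The model and all parameter values are assumptions made for illustration, not the paper's own model.

```python
import math
import random

def simulate_torrent(lambda0=50.0, tau=5.0, seed_lifetime=3.0, leech_time=1.0, horizon=40, rng_seed=1):
    """Peers arrive at rate lambda0 * exp(-t / tau), leech for `leech_time`,
    then seed for `seed_lifetime` before leaving the swarm."""
    random.seed(rng_seed)
    departures = []  # scheduled departure times of peers currently in the swarm
    for t in range(horizon):
        rate = lambda0 * math.exp(-t / tau)
        # crude Poisson(rate) approximation via 1000 Bernoulli trials
        arrivals = sum(1 for _ in range(1000) if random.random() < rate / 1000)
        departures += [t + leech_time + seed_lifetime] * arrivals
        alive = sum(1 for d in departures if d > t)
        print(f"t={t:2d}  arrival rate={rate:6.2f}  peers in swarm={alive}")
        if alive == 0 and t > tau:
            print("swarm has died: the file can no longer be located and downloaded")
            break

simulate_torrent()
```
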
Mike Chelen

YouTube - Hans Rosling: No more boring data: TEDTalks - 0 views

  •  
    With the drama and urgency of a sportscaster, statistics guru Hans Rosling uses an amazing new presentation tool, Gapminder, to debunk several myths about world development. Rosling is professor of international health at Sweden's Karolinska Institute, and founder of Gapminder, a nonprofit that brings vital global data to life. (Recorded February 2006 in Monterey, CA.)
Mike Chelen

Using the Google Plugin for Eclipse - Google App Engine - Google Code - 0 views

  • Eclipse 3.4 (Ganymede)
  • Help menu > Software Updates...
  • Guestbook
  • ...18 more annotations...
  • http://dl.google.com/eclipse/plugin/3.4
  • Google Plugin for Eclipse 3.4
  • Google Web Toolkit SDK
  • Google App Engine Java SDK
  • Install...
  • restart
  • File menu > New > Web Application Project
  • Add Site...
  • Project name
  • Package
  • guestbook
  • Verify that "Use Google App Engine" is checked.
  • Finish
  • Run menu, Debug As > Web Application
  • App Engine deploy button uploads your application to App Engine:
  • register an application ID with App Engine using the Admin Console
  • edit the appengine-web.xml file and change the <application>...</application> element to contain the new ID (see the snippet after this list)
  • administrator account username (your email address) and password
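
For reference, the appengine-web.xml edit in the step above looks roughly like this; a minimal sketch, where `your-registered-app-id` is a placeholder for the application ID registered in the Admin Console.

```xml
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <!-- Replace with the application ID registered in the Admin Console -->
  <application>your-registered-app-id</application>
  <version>1</version>
</appengine-web-app>
```
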
Mike Chelen

Home :::Academic Journals - 1 views

  •  
    ACADEMIC JOURNALS provides free access to research information to the international community without financial, legal or technical barriers. All the journals from this organization will be freely distributed and available from multiple websites. ACADEMIC JOURNALS, breaking new frontiers in the world of journals.
irina Popusoi

The New Discovery. Astronomy. Physics. Alternative energy - 0 views

  •  
    A site about the inventions and discoveries in astronomy and physics of Leonid Popusoi, an engineer from Moldova. It contains descriptions of his inventions, books and videos.
Mike Chelen

Peter Suber, Open Access News - 0 views

  •  
    Law professors defend NIH policy against copyright objections
    Forty-six law professors and specialists in copyright law wrote to the House Judiciary Committee on September 8 to show that the publishing lobby's objections to the NIH policy misrepresent US copyright law. The Committee had the letter in hand when it convened the September 11 hearing on the Conyers bill. The letter is now online. Excerpt:
Mike Chelen

Protocol for Implementing Open Access Data - 0 views

  • information for the Internet community
  • distributing data or databases
  • “open” and “open access”
  • ...69 more annotations...
  • requirements for gaining and using the Science Commons Open Access Data Mark and metadata
  • interoperability of scientific data
  • terms and conditions around data make integration difficult to legally perform
  • single license
  • data with this license can be integrated with any other data under this license
  • too many databases under too many terms already
  • unlikely that any one license or suite of licenses will have the correct mix of terms
  • principles for open access data and a protocol for implementing those principles
  • Open Access Data Mark and metadata
  • databases and data
  • the foundation to legally integrate a database or data product
  • another database or data product
  • no mechanisms to manage transfer or negotiations of rights unrelated to integration
  • submitted to Science Commons for certification as a conforming implementation
  • Open Access Data trademarks (icons and phrases) and metadata on databases
  • protocol must promote legal predictability and certainty
  • easy to use and understand
  • lowest possible transaction costs on users
  • Science Commons’ experience in distributing a database licensing Frequently Asked Questions (FAQ) file
  • hard to apply the distinction between what is copyrightable and what is not copyrightable
  • lack of simplicity restricts usage
  • reducing or eliminating the need to make the distinction between copyrightable and non-copyrightable elements
  • satisfy the norms and expectations of the disciplines providing the database
  • norms for citation will differ
  • norms must be attached
  • Converge on the public domain by waiving all rights based on intellectual property
  • reconstruction of the public domain
  • scientific norms to express the wishes of the data provider
  • public domain
  • waiving the relevant rights on data and asserting that the provider makes no claims on the data (a sketch of such a waiver expressed as metadata follows this list)
  • Requesting behavior, such as citation, through norms rather than as a legal requirement based on copyright or contracts, allows for different scientific disciplines to develop different norms for citation.
  • waive all rights necessary for data extraction and re-use
  • copyright
  • sui generis database rights
  • claims of unfair competition
  • implied contracts
  • and other legal rights
  • any obligations on the user of the data or database such as “copyleft” or “share alike”, or even the legal requirement to provide attribution
  • non-legally binding set of citation norms
  • waiving other statutory or intellectual property rights
  • there are other rights, in addition to copyright, that may apply
  • uncopyrightable databases may be protected in some countries
  • sui generis rights apply in the European Union
  • waivers of sui generis and other legal grounds for database protection
  • no contractual controls
  • using contract, rather than intellectual property or statutory rights, to apply terms to databases
  • affirmatively declare that contractual constraints do not apply to the database
  • interoperation with databases and data not available under the Science Commons Open Access Data Protocol through metadata
  • data that is not or cannot be made available under this protocol
  • owner provides metadata (as data) under this protocol so that the existence of the non-open access data is discoverable
  • digital identifiers and metadata describing non-open access data
  • “Licensing” a database typically means that the “copyrightable elements” of a database are made available under a copyright license
  • Database FAQ, in its first iteration, recommended this method
  • recommendation is now withdrawn
  • where copyright begins and ends in many databases
  • database divided into copyrightable and non-copyrightable elements
  • user tends to assume that all is under copyright or none is under copyright
  • share-alike license on the copyrightable elements may be falsely assumed to operate on the factual contents of a database
  • copyright in situations where it is not necessary
  • query across tens of thousands of data records across the web might return a result which itself populates a new database
  • selective waiving of intellectual property rights fails to provide a high degree of legal certainty and ease of use
  • problem of false expectations
  • apply a “copyleft” term to the copyrightable elements of a database, in hopes that those elements result in additional open access database elements coming online
  • uncopyrightable factual content
  • republish those contents without observing the copyleft or share-alike terms
  • cascading attribution if attribution is required as part of a license approach
  • Would a scientist need to attribute 40,000 data depositors in the event of a query across 40,000 data sets?
  • conflict with accepted norms in some disciplines
  • imposes a significant transaction cost
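
As a rough illustration of how a provider might combine a public-domain waiver with non-binding citation norms in machine-readable metadata, here is a small Python sketch; the field names and example identifier are hypothetical and not part of the Science Commons protocol, while the waiver URL is the standard CC0 1.0 dedication.

```python
import json

# Hypothetical metadata record: field names are illustrative, not defined by the protocol.
dataset_metadata = {
    "title": "Example protein interaction dataset",
    "provider": "Example Lab",
    "identifier": "doi:10.xxxx/example",  # placeholder identifier
    "rights": {
        # Public-domain waiver, as the protocol recommends (e.g. CC0 or PDDL):
        "waiver": "https://creativecommons.org/publicdomain/zero/1.0/",
        "statement": "The provider waives copyright, sui generis database rights and "
                     "other claims, and makes no legal claims on the data.",
    },
    # Non-binding community norms travel alongside the waiver, not as license terms:
    "citation_norm": "Please cite the originating publication when reusing this dataset.",
}

print(json.dumps(dataset_metadata, indent=2))
```
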
Mike Chelen

Science in the open » A breakthrough on data licensing for public science? - 0 views

  • Peter Murray-Rust and others at the Unilever Centre for Molecular Informatics at Cambridge
  • conversation we had over lunch with Peter, Jim Downing, Nico Adams, Nick Day and Rufus Pollock
  • appropriate way to license published scientific data
  • ...27 more annotations...
  • value of share-alike or copyleft provisions of GPL and similar licenses
  • spreading the message and use of Open Content
  • prevent “freeloaders” from being able to use Open material and not contribute back to the open community
  • presumption in this view is that a license is a good, or at least acceptable, way of achieving both these goals
  • allow people the freedom to address their concerns through copyleft approaches
  • Rufus
  • concerned more centrally with enabling re-use and re-purposing of data as far as is possible
  • make it easy for researchers to deliver on their obligations
  • worried by the potential for licensing to make it harder to re-use and re-mix disparate sets of data and content into new digital objects
  • “license” will have scientists running screaming in the opposite direction
  • we focused on what we could agree on
  • common position statement
  • area of best practice for the publication of data that arises from public science
  • there is a window of opportunity to influence funder positions
  • data sharing policies
  • “following best practice”
  • don’t tend to be concerned about freeloading
  • providing clear guidance and tools
  • if it is widely accepted by their research communities
  • “best practice is X”
  • enable re-use and re-purposing of that data
  • share-alike approaches as a community expectation
  • Explicit statements of the status of data are required and we need effective technical and legal infrastructure to make this easy for researchers.
  • “Where a decision has been taken to publish data deriving from public science research, best practice to enable the re-use and re-purposing of that data, is to place it explicitly in the public domain via {one of a small set of protocols e.g. cc0 or PDDL}.”
  • focuses purely on what should be done once a decision to publish has been made
  • data generated by public science
  • by describing this as best practice, it also allows deviations that may, for whatever reason, be justified by specific people in specific circumstances