ODI Data Infrastructure, Institutions & Data Access Initiatives
Ben Snaith

345725803-The-state-of-weather-data-infrastructure-white-paper.pdf - 1 views

  • From its early beginnings over 150 years ago, weather forecasting at the Met Office has been driven by data. Simple observations recorded and used to hand-plot synoptic charts have been exchanged for the billions of observations received and handled every day, mainly from satellites but also from weather stations, radar, ocean buoys, planes, shipping and the public.
  • The key stages of the weather data value chain are as follows:
    • Monitoring and observation of the weather and environment, e.g. by national meteorological services (NMSs).
    • Numerical weather prediction (NWP) and climate modelling carried out by NMSs to create global, regional and limited area weather forecasts. Private companies are growing their presence in the market and challenging the traditional role of NMSs to provide forecasts to the public, by statistically blending data from NMS models to create their own forecast models, for example. Other companies providing data via online channels and/or apps include The Weather Company, AccuWeather and The Climate Corporation.
    • Communication and dissemination of forecasts by news, NMS and media organisations like the BBC, Yahoo and Google, or within consumer-targeted mobile and web applications.
    • Decision making by individuals and businesses across a variety of sectors, which draws on weather data and reporting.
  • The core data asset of our global weather data infrastructure is observation data that captures a continuous record of weather and climate data around the world. This includes temperature, rainfall, wind speed and details of a host of other atmospheric, surface and marine conditions.
  • The collection of observation data is a global effort. The Global Observing System consists of around 11,000 ground-based monitoring stations supplemented with thousands of sensors installed on weather balloons, aircraft and ships. Observations are also collected from a network of radar installations and satellite-based sensors. As we see later, the ‘official’ observation system is increasingly being supplemented with new sources of observation data from the Internet of Things (IoT).
  • Ensemble model forecasts aim to give an indication of the range of possible future states of the atmosphere and oceans (which are a significant driver of the weather in the atmosphere). This overcomes errors introduced by using imperfect measurement of initial starting conditions that are then amplified by the chaotic nature of the atmosphere. Increasing the number of forecast members over a global scale and at higher resolutions results in data volumes increasing exponentially.
  • Created in 1950, the World Meteorological Organization (WMO) is made up of 191 member states and territories around the world. The WMO was founded on the principle that global coordination was necessary to reap the benefits of weather and climate data. This includes a commitment to weather data and products being freely exchanged around the world (Obasi, 1995).
  • While the WMO has a global outlook, its work is supplemented by regional meteorological organisations like the European Centre for Medium-Range Weather Forecasts (ECMWF) and NMSs, such as the Met Office in the UK.
  • There are increasing new sources of weather observation data. In recent years, services like Weather Underground and the Met Office’s Weather Observation Website have demonstrated the potential for people around the world to contribute weather observations about their local areas – using low-cost home weather stations and sensors, for example. But there is now potential for sensors in cities, homes, cars, cell towers and even mobile phones to contribute observational data that could also be fed into forecast models.
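The ensemble annotation above says that small errors in measured starting conditions are amplified by a chaotic system, and that running many forecast members shows the range of possible outcomes. The following is a minimal sketch of that idea only. It assumes a toy chaotic system (the logistic map with r = 3.9) in place of a real atmospheric model, and the function names, member count and noise level are all hypothetical choices made for illustration.

```python
import random
import statistics


def logistic_step(x, r=3.9):
    """Advance the logistic map one step (a toy chaotic system, not a weather model)."""
    return r * x * (1.0 - x)


def run_ensemble(x0, n_members=20, n_steps=50, noise=1e-6, seed=0):
    """Run perturbed members forward and return the ensemble spread at each step."""
    rng = random.Random(seed)
    # Each member starts from the same "observation" plus a tiny measurement error.
    members = [x0 + rng.uniform(-noise, noise) for _ in range(n_members)]
    spreads = [statistics.pstdev(members)]
    for _ in range(n_steps):
        members = [logistic_step(x) for x in members]
        spreads.append(statistics.pstdev(members))
    return spreads


spreads = run_ensemble(x0=0.4)
print(f"initial spread: {spreads[0]:.2e}")
print(f"final spread:   {spreads[-1]:.2e}")
```

The spread between members starts at roughly the size of the measurement error and grows as the steps proceed, which is the behaviour an ensemble is meant to expose. In this sketch the stored output scales with members × steps; a real forecast also multiplies by grid points and variables.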
Ben Snaith

Sharing tools and data globally will help us beat COVID-19 | World Economic Forum - 0 views

  • Second, we need to create open-source structures that allow national and sub-national level health systems to collect and share this precious data in a timely, privacy-preserving manner. Fragile health systems around the world have already been overwhelmed with the tsunami of demand that has arisen from the spread of COVID-19. Everyone racing to create their own solutions to problems negates the need for speed we have in this pandemic. An epidemic somewhere has the potential to become a pandemic everywhere. We need to share tools – both hardware and software – openly and understand that short term gains in one area of the world are meaningless if not shared with other areas that are battling this virus.
Ben Snaith

Patterns of data institution that support people to steward data themselves, or become ... - 0 views

  • it enables people to contribute data about them to it and, on a case-by-case basis, people can choose to permit third parties to access that data. This is the pattern that many personal data stores and personal data management systems adopt in holding data and enabling users to unlock new apps and services that can plug into it. Health Bank enables people to upload their medical records and other information like wearable readings and scans to share with doctors or ‘loved ones’ to help manage their care; Japan’s accredited information banks might undertake a similar role. Other examples — such as Savvy and Datacoup — seem to be focused on sharing data with market research companies willing to offer a form of payment. Some digital identity services may also conform to this pattern.
  • it enables people to contribute data about them to it and, on a case-by-case basis, people can choose whether that data is shared with third parties as part of aggregate datasets. OpenHumans is an example that enables communities of people to share data for group studies and other activities. Owners of a MIDATA account can “actively contribute to medical research and clinical studies by granting selective access to their personal data”. The approach put forward by the European DECODE project would seem to support this type of individual buy-in to collective data sharing, in that case with a civic purpose. The concept of data unions advocated by Streamr seeks to create financial value for individuals by creating aggregate collections of data in this way. Although Salus Coop asks its users to “share and govern [their] data together.. to put it at the service of collective return”, it looks as though individuals can choose which uses to put it to.
  • it enables people to contribute data about them to it and decisions about what third parties can access aggregate datasets are taken collectively. As an example, The Good Data seeks to sell browsing data generated by its users “entirely on their members’ terms… [where] any member can participate in deciding these rules”. The members of the Holland Health Data Cooperative would similarly appear to “determine what happens to their data” collectively, as would drivers and other workers who contribute data about them to Workers Info Exchange.
  • it enables people to contribute data about them and defer authority to it to decide who can access the data. A high-profile proposal of this pattern comes in the form of ‘bottom-up data trusts’ — Mozilla Fellow Anouk Ruhaak has described scenarios where multiple people “hand over their data assets or data rights to a trustee”. Some personal data stores and personal information management systems will also operate under this kind of delegated authority within particular parameters or settings.
  • people entrust it to mediate their relationships with services that collect data about them. This is more related to decisions about data collection rather than decisions about access to existing data, but involves the stewardship of data nonetheless. For example, Tom Steinberg has described a scenario whereby “you would nominate a Personal Data Representative to make choices for you about which apps can do what with your data.. [it] could be a big internet company, it could be a church, it could be a trade union, or it could be a dedicated rights group like the Electronic Frontier Foundation”. Companies like Disconnect.Me and Jumbo are newer examples of this type of approach in practice.
  • it enables people to collect or create new data. Again, this pattern describes the collection rather than the re-use of existing data. For example, OpenBenches enables volunteers to contribute information about memorial benches, and OpenStreetMap does similar at much larger scale to collaboratively create and maintain a free map of the world. The ODI has published research into well-known collaboratively maintained datasets, including Wikidata, Wikipedia and MusicBrainz, and a library of related design patterns. I’ve included this pattern here as to me it represents a way for people to be directly involved in the stewardship of data, personal or not.
  • it collects data in providing a service to users and, on a case-by-case basis, users can share that data directly with third parties. This pattern enables users to unlock new services by sharing data about them (such as via Open Banking and other initiatives labelled as ‘data portability’), or to donate data for broader notions of good (such as Strava’s settings that enable its users to contribute data about them to aggregate datasets shared with cities for planning). I like IF’s catalogue of approaches for enabling people to permit access to data in this way, and its work to show how services can design for the fact that data is often about multiple people.
  • it collects data by providing a service to users and shares that data directly with third parties as provisioned for in its Terms and Conditions. This typically happens when we agree to Ts&Cs that allow data about us to be shared with third parties of an organisation’s choice, such as for advertising, and so might be considered a ‘dark’ pattern. However, some data collectors are beginning to do this for more public, educational or charitable purposes — such as Uber’s sharing of aggregations of data with cities via the SharedStreets initiative. Although the only real involvement we have here in stewarding data is in choosing to use the service, might we not begin to choose between services, in part, based on how well they act as data institutions?
  • I echo the point that Nesta recently made in their paper on ‘citizen-led data governance’, that “while it can be useful to assign labels to different approaches, in reality no clear-cut boundary exists between each of the models, and many of the models may overlap”.
Ben Snaith

Business models for sustainable research data repositories | OECD - 3 views

shared by Ben Snaith on 01 Jun 20
  • However, for the benefits of open science and open research data to be realised, these data need to be carefully and sustainably managed so that they can be understood and used by both present and future generations of researchers. Data repositories - based in local and national research institutions and international bodies - are where the long-term stewardship of research data takes place and hence they are the foundation of open science. Yet good data stewardship is costly and research budgets are limited. So, the development of sustainable business models for research data repositories needs to be a high priority in all countries.
  • The 47 data repositories analysed reported 95 revenue sources. Typically, repository business models combine structural or host funding with various forms of research and other contract-for-services funding, or funding from charges for access to related value-added services or facilities. A second popular combination is deposit-side funding combined with a mix of structural or host institutional funding, or with revenue from the provision of research, value-added, and other services.
  • Research data repositories themselves can take advantage of the underlying economic differences between research data, which exhibit public good characteristics, and value-adding services and facilities, which typically do not, to develop business models that support free and open data while charging some or all users for access to value-adding services or related facilities.
  • Over the centuries, libraries, archives, and museums have shown the practical and policy advantages of preserving sources of knowledge for society. Research and other types of data constitute a relatively new subject that requires our serious attention. Although some research data repositories were founded in the 1960s and even earlier, the data that are now being generated have resulted in the establishment of many new repositories and related infrastructure. Societies need such repositories to ensure that the most useful or unique data are preserved over the long term.
  • First, there are substantial and positive efficiency impacts, not only reducing the cost of conducting research, but also enabling more research to be done, to the benefit of researchers, research organisations, their funders, and society more widely.
  • substantial additional reuse of the stored data, with between 44% and 58% of surveyed users across the studies saying they could neither have created the data for themselves nor obtained them elsewhere.
  • While these studies tend to provide a snapshot of the repository's value, which can be affected by the scale, age and prominence of the data repository concerned, it is important to note that in most cases, data archives are appreciating rather than depreciating assets. Most of the economic impact is cumulative and it grows in value over time, whereas most infrastructure (such as ships or buildings) has a declining value as it ages. Like libraries, data collections become more valuable as they grow and the longer one invests in them, provided that the data remain accessible, usable, and used.
  • Openness of public information strengthens freedom and democratic institutions by empowering citizens, and supporting transparency of political decision-making and trust in governance. It is no coincidence that the most repressive regimes have the most secretive institutions and activities (Uhlir, 2004). Open factual datasets also enhance public decision-making from the national to the local levels (Nelson, 2011), and open data policies demonstrate confidence of leadership and generally can broaden the influence of governments (Uhlir and Schröder, 2007). Countries that may be lagging behind socioeconomically frequently can benefit even more from access to public data resources (NRC, 2012b, 2002).
  • The survey of repositories undertaken for this and the previous RDA-WDS study classified the principal research data repository revenue sources as follows:
    • Structural funding (i.e. central funding or contract from a research or infrastructure funder that is in the form of a longer-term, multi-year contract). We use the term “structural” to underline the difference between this and project funding. The research data repository is considered as a form of research infrastructure or as providing an ongoing service. Although the funding may be regularly reviewed, it is a form of funding that is substantively different to project funding.
    • Host institution funding and support (i.e. direct or indirect support from a host institution). Some research data repositories are hosted by a research performing institution, e.g. a university, and receive direct funding or indirect (but costed) support from their host.
    • Data deposit fees (i.e. in the form of annual contracts with depositing institutions or per-deposit fees). As indicated, this can take the form of a period contract or a charge per deposit. In either case, the cost is borne by the entity that wishes to ensure that the data are preserved and curated for the long term.
    • Access charges (i.e. charging for access to standard data or to value-added services and facilities). This covers charges of various sorts (e.g. contract or per-access charges) and can be levied either for standard data or value-added services. In all cases, the cost is borne by the entity that wishes to access and use the data.
    • Contract services or project funding (i.e. charges for contract services to other parties or for research contracts). This covers short-term contracts and projects for various activities not covered above (i.e. these are not contracts to deposit or access data, but cover other services that may be provided). Similarly, this category of funding is distinct from structural funding because, although it may come from a research or infrastructure funder, it is for specific, time- and objective-limited projects, rather than for ongoing services or infrastructure.
  • The 47 data repositories analysed reported 95 revenue sources, an average of two per repository. Twenty-four repositories reported funding from more than one source, and seven reported more than three revenue sources. Combining revenue sources is an important element in developing a sustainable research data infrastructure.
  • A large majority (more than 80%) said they would not be considering any revenue sources that are incompatible with the open data principle.
  • The stage of development of a repository, its institutional or disciplinary context, its scale, and level of federation are also important determinants of what might be a sustainable business model. Referring to the dynamic of the evolution of firms, some economists draw a human parallel, talking of the phases as births, deaths, and marriages (and sometimes divorces). All phases are needed and should be accommodated. Indeed, sometimes it may not be desirable, effective, or efficient for a repository to be sustainable - provided that the data can continue to be hosted elsewhere.
  • This is the situation facing research data repositories. To be sustainable, data repositories need to generate sufficient revenue to cover their costs, but setting a price above the marginal cost of copying and distribution will reduce net welfare.
  • Actions needed to develop a successful research data repository business model include:
    • Understanding the lifecycle phase of the repository's development (e.g. the need for investment funding, development funding, ongoing operational funding, or transitional funding)
    • Identifying who the stakeholders are (e.g. data depositors, data users, research institutions, research funders, and policy makers)
    • Developing the product/service mix (e.g. basic data, value-added data, value-added services and related facilities, and contract and research services)
    • Understanding the cost drivers and matching revenue sources (e.g. scaling with demand for data ingest, data use, the development and provision of value-adding services or related facilities, research priorities, and policy mandates)
    • Identifying revenue sources (e.g. structural funding, host institutional funding, deposit-side charges, access charges, and value-added services or facilities charges)
    • Making the value proposition to stakeholders (e.g. measuring impacts and making the research case, measuring value and making the economic case, informing, and educating) (Figure 6).
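The annotation above on pricing states that charging more than the marginal cost of copying and distribution reduces net welfare. A small worked sketch can make that concrete. It assumes a hypothetical linear demand curve and a marginal cost of zero; none of the numbers come from the OECD report, and the function names are illustrative only.

```python
def quantity_demanded(price, a=1000.0, b=10.0):
    """Hypothetical linear demand: dataset accesses requested at a given price."""
    return max(a - b * price, 0.0)


def deadweight_loss(price, marginal_cost=0.0, a=1000.0, b=10.0):
    """Welfare lost when price exceeds marginal cost (triangle under linear demand)."""
    q_efficient = quantity_demanded(marginal_cost, a, b)  # use at price = marginal cost
    q_actual = quantity_demanded(price, a, b)             # use at the charged price
    return 0.5 * (price - marginal_cost) * (q_efficient - q_actual)


for price in (0.0, 10.0, 20.0):
    print(price, quantity_demanded(price), deadweight_loss(price))
```

Under these assumed numbers, doubling the access charge from 10 to 20 quadruples the lost welfare (500 to 2,000), because both the price gap and the number of users priced out grow together. That is one way to read the report's emphasis on charging for value-adding services rather than for the data itself.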
Ben Snaith

AnnualReport2018LandPortal.pdf - 0 views

shared by Ben Snaith on 01 Jun 20
  • We are proud to say that the Land Portal has become far and wide the world’s leading source of data and information on land, with more than 30,000 visits a month, a majority of which are from the Global South. The Land Portal now counts more than 760 land-related indicators from 45 datasets aggregated from trusted sources around the world. This great diversity of information feeds into a dozen thematic portfolios and more than 60 country portfolios that combine data with the latest news, relevant publications, organizations and more.
  • Land Portal is leading the way on the adoption of open data and making land governance information more accessible. We brought together the land governance community for the Partnership for Action workshop to set the stage for building an information ecosystem on land governance, resulting in an action plan on data collection, management and dissemination. The Land Portal’s approach to capacity building was further refined at a workshop in Pretoria, South Africa, which created a great deal of momentum to adopt open data practices in this country.
  • We are grateful for the steadfast support of the United Kingdom’s Department for International Development (DFID), as well as the support of the Omidyar Network. We are also thankful for the support of the Food and Agriculture Organization of the United Nations (FAO), GIZ - German Cooperation and the collaboration of all of our partners, without which our work would not be possible.
Ben Snaith

TheDiamondReport_TheSecondCut_2018-FINAL.pdf - 0 views

  • CDN will seek partnership with others in the immediate sector and beyond who have an interest in, and access to, other relevant data sets, so that we might collaborate and find ways to integrate data, to build a bigger picture about diversity in the industry.
Ben Snaith

Every day, we rely on digital infrastructure built by volunteers. What happens when it ... - 0 views

  • Free and public code grew in direct response to the perceived failings of expensive, proprietary commercial software. As a result, the heart of the problem with digital infrastructure is also part of what makes it so rich with potential: It is not centralized. There is no one person or entity deciding what’s needed and what’s not. There is also no one overseeing how digital infrastructure is implemented. And because the community of volunteers developing this infrastructure has a complicated relationship with what might be seen as a more traditional, or official, way of doing things, few digital infrastructure projects have a clear business model or source of revenue. Even projects that have grown to be used by millions of people tend to lack a cohesive structure and plan for sustaining the technology’s long-term development.
  • We need to start by educating people who are in positions to provide support. Many of them—from start-up engineers to government officials—don’t know enough about how digital infrastructure functions and what it requires, or are under the perception that public software doesn’t need support.
Ben Snaith

Diamond_theFirstCut_pdf.pdf - 0 views

  • Currently, we are also unable to ascertain the extent to which our data sample is representative of the workforce it is trying to capture. Although we are reporting on 80,804 contributions from 5,904 contributors, the response rate is relatively low (24.3% of those invited to submit data). The low response rate and self-selecting nature of Diamond means there is the possibility of bias in the data we present here. We are taking this into account and will consider it as we undertake an equality analysis of the system one year on.
  • Diamond collects:
    • Actual diversity data (across six protected characteristic groups) from individuals (contributors) who have a role in making television, whether on- or off-screen; and
    • Perceived diversity data (across the six protected characteristics) of the on-screen contributors (i.e. diversity characteristics as viewers might perceive them).
  • Diamond is collecting:
    • Actual diversity data (from those making and appearing on television, including freelancers) and Perceived diversity data (how the viewer might perceive those they see on television)
    • Data across six protected characteristic groups: gender, gender identity, age, ethnicity, sexual orientation, and disability
    • Data from those making a significant contribution to a programme
    • Data from original programmes only, commissioned by the current five Diamond broadcasters for UK transmission
    • Data from programmes across all genres (although we do not currently report on news and sport) broadcast on a total of 30 channels across the five Diamond broadcasters.
  • Diamond diversity data is collected via the online platform Silvermouse, which is already used by many broadcasters and production companies to collect and manage other kinds of information about the programmes they make.
  • Diamond does not collect:
    • Data from programmes which have not been commissioned by the five Diamond broadcasters
    • Data on people working across broadcasting more generally, outside of production (in other words, our data are not overall workforce statistics)
    • Data where it is impractical to do so and where relevant privacy notices cannot be given. (Diamond does not collect data from every person appearing on television as part of a crowd scene, for example.)
  • This report provides an initial view of the data that has been collected and made available to CDN since Diamond went live on 15 August 2016.
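The response-rate figures quoted in the first Diamond annotation above can be unpacked with a little arithmetic. The sketch below assumes that the 24.3% response rate is the 5,904 contributors divided by the number of people invited; the report excerpt does not confirm that definition, so the implied invitee count is an estimate, not a reported figure.

```python
contributions = 80_804   # reported contributions
contributors = 5_904     # reported contributors
response_rate = 0.243    # reported: 24.3% of those invited to submit data

# Assumption: response rate = contributors / people invited.
implied_invited = contributors / response_rate
contributions_per_contributor = contributions / contributors

print(round(implied_invited))                   # about 24,296 invited
print(round(contributions_per_contributor, 1))  # about 13.7 contributions each
```

On that assumption, roughly three in four invitees did not respond, which is the scale of the self-selection risk the report flags.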
Ben Snaith

Why we're calling for a data collective - The Catalyst - 0 views

  • We propose forming a data collective: a conscious, coordinated effort by a group of organisations with expertise in gathering and using data in the charity sector. We want to make sure that people in charities, on the front line and in leadership positions have access to the information they need, in a timely fashion, in the easiest possible format to understand, with the clearest possible analysis of what it means for them.
  • "Social Economy Data Lab"
Ben Snaith

Privacy not a blocker for 'meaningful' research access to platform data, says report | ... - 0 views

  • The report, which the authors are aiming at European Commission lawmakers as they ponder how to shape an effective platform governance framework, proposes mandatory data sharing frameworks with an independent EU-institution acting as an intermediary between disclosing corporations and data recipients.
  • “Such an institution would maintain relevant access infrastructures including virtual secure operating environments, public databases, websites and forums. It would also play an important role in verifying and pre-processing corporate data in order to ensure it is suitable for disclosure,” they write in a report summary.
  • A market research purpose might only get access to very high level data, he suggests. Whereas medical research by academic institutions could be given more granular access — subject, of course, to strict requirements (such as a research plan, ethical board review approval and so on).
Ben Snaith

Actually, nonprofits don't spend enough money on overhead - Quartz - 0 views

  • Successful organizations require financial systems, information technology, volunteer management and sustainable revenue streams. Part of the myth of the nonprofit world is that somehow righteousness will ultimately triumph over limited planning, crappy systems and a general scarcity of resources. But that is not the way the world works.
Ben Snaith

How the Coronavirus Crisis May Upend Grant Making for Good - The Chronicle of Philanthropy - 0 views

  • Our effort, known as Building Institutions and Networks, or Build, provides long-term, flexible funding and a deep sense of partnership with grantees, which leads to impressive outcomes for social-change organizations around the world. More than 80 percent of Build grantees report that because of Build support, their work is more effective, their networks and fields are stronger, and they have been better able to take advantage of strategic opportunities and counter external threats.
  • Flexible funding requires foundations to be flexible in their own grant-making strategies — and to listen deeply to their nonprofit partners in developing strategy in the first place.
  • For grant makers willing to take the leap, many funding colleagues can show you the way. The Trust-Based Philanthropy Project, Grantmakers for Effective Organizations, the Full Cost Project, and many others offer tools and resources for funders on how to make larger, longer, more flexible grants.
Ben Snaith

Flexibility for Grantees Is Not Enough. Let Them Decide Where the Money Goes (Letter to... - 0 views

  • To meaningfully support the nonprofit sector at this time, she argues that funders must transform their practices to be more flexible and less bureaucratic. She also says that they should offer unrestricted and easy access to grant money and provide long-term support.
Ben Snaith

HowToFundTech2020.pdf - 0 views

shared by Ben Snaith on 04 Aug 20