ODI Data Infrastructure, Institutions & Data Access Initiatives: group items tagged "citizen"

Ben Snaith

Patterns of data institution that support people to steward data themselves, or become ...

  • it enables people to contribute data about them to it and, on a case-by-case basis, people can choose to permit third parties to access that data. This is the pattern that many personal data stores and personal data management systems adopt in holding data and enabling users to unlock new apps and services that can plug into it. Health Bank enables people to upload their medical records and other information like wearable readings and scans to share with doctors or ‘loved ones’ to help manage their care; Japan’s accredited information banks might undertake a similar role. Other examples — such as Savvy and Datacoup — seem to be focused on sharing data with market research companies willing to offer a form of payment. Some digital identity services may also conform to this pattern.
  • it enables people to contribute data about them to it and, on a case-by-case basis, people can choose whether that data is shared with third parties as part of aggregate datasets. OpenHumans is an example that enables communities of people to share data for group studies and other activities. Owners of a MIDATA account can “actively contribute to medical research and clinical studies by granting selective access to their personal data”. The approach put forward by the European DECODE project would seem to support this type of individual buy-in to collective data sharing, in that case with a civic purpose. The concept of data unions advocated by Streamr seeks to create financial value for individuals by creating aggregate collections of data in this way. Although Salus Coop asks its users to “share and govern [their] data together… to put it at the service of collective return”, it looks as though individuals can choose which uses to put it to.
  • it enables people to contribute data about them to it and decisions about what third parties can access aggregate datasets are taken collectively. As an example, The Good Data seeks to sell browsing data generated by its users “entirely on their members’ terms… [where] any member can participate in deciding these rules”. The members of the Holland Health Data Cooperative would similarly appear to “determine what happens to their data” collectively, as would drivers and other workers who contribute data about them to Workers Info Exchange.
  • it enables people to contribute data about them and defer authority to it to decide who can access the data. A high-profile proposal of this pattern comes in the form of ‘bottom-up data trusts’ — Mozilla Fellow Anouk Ruhaak has described scenarios where multiple people “hand over their data assets or data rights to a trustee”. Some personal data stores and personal information management systems will also operate under this kind of delegated authority within particular parameters or settings.
  • people entrust it to mediate their relationships with services that collect data about them. This relates more to decisions about data collection than to decisions about access to existing data, but involves the stewardship of data nonetheless. For example, Tom Steinberg has described a scenario whereby “you would nominate a Personal Data Representative to make choices for you about which apps can do what with your data… [it] could be a big internet company, it could be a church, it could be a trade union, or it could be a dedicated rights group like the Electronic Frontier Foundation”. Companies like Disconnect.Me and Jumbo are newer examples of this type of approach in practice.
  • it enables people to collect or create new data. Again, this pattern describes the collection rather than the re-use of existing data. For example, OpenBenches enables volunteers to contribute information about memorial benches, and OpenStreetMap does something similar at a much larger scale to collaboratively create and maintain a free map of the world. The ODI has published research into well-known collaboratively maintained datasets, including Wikidata, Wikipedia and MusicBrainz, and a library of related design patterns. I’ve included this pattern here because, to me, it represents a way for people to be directly involved in the stewardship of data, personal or not.
  • it collects data in providing a service to users and, on a case-by-case basis, users can share that data directly with third parties. This pattern enables users to unlock new services by sharing data about them (such as via Open Banking and other initiatives labelled as ‘data portability’), or to donate data for broader notions of good (such as Strava’s settings that enable its users to contribute data about them to aggregate datasets shared with cities for planning). I like IF’s catalogue of approaches for enabling people to permit access to data in this way, and its work to show how services can design for the fact that data is often about multiple people.
  • it collects data by providing a service to users and shares that data directly with third parties as provisioned for in its Terms and Conditions. This typically happens when we agree to Ts&Cs that allow data about us to be shared with third parties of an organisation’s choice, such as for advertising, and so might be considered a ‘dark’ pattern. However, some data collectors are beginning to do this for more public, educational or charitable purposes — such as Uber’s sharing of aggregations of data with cities via the SharedStreets initiative. Although the only real involvement we have here in stewarding data is in choosing to use the service, might we not begin to choose between services, in part, based on how well they act as data institutions?
  • I echo the point that Nesta recently made in their paper on ‘citizen-led data governance’, that “while it can be useful to assign labels to different approaches, in reality no clear-cut boundary exists between each of the models, and many of the models may overlap”.
Ben Snaith

Business models for sustainable research data repositories | OECD

shared by Ben Snaith on 01 Jun 20
  • However, for the benefits of open science and open research data to be realised, these data need to be carefully and sustainably managed so that they can be understood and used by both present and future generations of researchers. Data repositories - based in local and national research institutions and international bodies - are where the long-term stewardship of research data takes place and hence they are the foundation of open science. Yet good data stewardship is costly and research budgets are limited. So, the development of sustainable business models for research data repositories needs to be a high priority in all countries.
  • The 47 data repositories analysed reported 95 revenue sources. Typically, repository business models combine structural or host funding with various forms of research and other contract-for-services funding, or funding from charges for access to related value-added services or facilities. A second popular combination is deposit-side funding combined with a mix of structural or host institutional funding, or with revenue from the provision of research, value-added, and other services.
  • Research data repositories themselves can take advantage of the underlying economic differences between research data, which exhibit public good characteristics, and value-adding services and facilities, which typically do not, to develop business models that support free and open data while charging some or all users for access to value-adding services or related facilities.
  • Over the centuries, libraries, archives, and museums have shown the practical and policy advantages of preserving sources of knowledge for society. Research and other types of data constitute a relatively new subject that requires our serious attention. Although some research data repositories were founded in the 1960s and even earlier, the data that are now being generated have resulted in the establishment of many new repositories and related infrastructure. Societies need such repositories to ensure that the most useful or unique data are preserved over the long term.
  • First, there are substantial and positive efficiency impacts, not only reducing the cost of conducting research, but also enabling more research to be done, to the benefit of researchers, research organisations, their funders, and society more widely.
  • substantial additional reuse of the stored data, with between 44% and 58% of surveyed users across the studies saying they could neither have created the data for themselves nor obtained them elsewhere.
  • While these studies tend to provide a snapshot of the repository's value, which can be affected by the scale, age and prominence of the data repository concerned, it is important to note that in most cases, data archives are appreciating rather than depreciating assets. Most of the economic impact is cumulative and it grows in value over time, whereas most infrastructure (such as ships or buildings) has a declining value as it ages. Like libraries, data collections become more valuable as they grow and the longer one invests in them, provided that the data remain accessible, usable, and used.
  • Openness of public information strengthens freedom and democratic institutions by empowering citizens, and supporting transparency of political decision-making and trust in governance. It is no coincidence that the most repressive regimes have the most secretive institutions and activities (Uhlir, 2004). Open factual datasets also enhance public decision-making from the national to the local levels (Nelson, 2011), and open data policies demonstrate confidence of leadership and generally can broaden the influence of governments (Uhlir and Schröder, 2007). Countries that may be lagging behind socioeconomically frequently can benefit even more from access to public data resources (NRC, 2012b, 2002).
  • The survey of repositories undertaken for this and the previous RDA-WDS study classified the principal research data repository revenue sources as follows:
      • Structural funding (i.e. central funding or contract from a research or infrastructure funder that is in the form of a longer-term, multi-year contract). We use the term “structural” to underline the difference between this and project funding. The research data repository is considered as a form of research infrastructure or as providing an ongoing service. Although the funding may be regularly reviewed, it is a form of funding that is substantively different to project funding.
      • Host institution funding and support (i.e. direct or indirect support from a host institution). Some research data repositories are hosted by a research performing institution, e.g. a university, and receive direct funding or indirect (but costed) support from their host.
      • Data deposit fees (i.e. in the form of annual contracts with depositing institutions or per-deposit fees). As indicated, this can take the form of a period contract or a charge per deposit. In either case, the cost is borne by the entity that wishes to ensure that the data are preserved and curated for the long term.
      • Access charges (i.e. charging for access to standard data or to value-added services and facilities). This covers charges of various sorts (e.g. contract or per-access charges) and can be levied either for standard data or value-added services. In all cases, the cost is borne by the entity that wishes to access and use the data.
      • Contract services or project funding (i.e. charges for contract services to other parties or for research contracts). This covers short-term contracts and projects for various activities not covered above (i.e. these are not contracts to deposit or access data, but cover other services that may be provided). Similarly, this category of funding is distinct from structural funding because, although it may come from a research or infrastructure funder, it is for specific, time- and objective-limited projects, rather than for ongoing services or infrastructure.
  • The 47 data repositories analysed reported 95 revenue sources, an average of two per repository. Twenty-four repositories reported funding from more than one source, and seven reported more than three revenue sources. Combining revenue sources is an important element in developing a sustainable research data infrastructure.
  • A large majority (more than 80%) said they would not consider any revenue sources that are incompatible with the open data principle.
  • The stage of development of a repository, its institutional or disciplinary context, its scale, and level of federation are also important determinants of what might be a sustainable business model. Referring to the dynamic of the evolution of firms, some economists draw a human parallel, talking of the phases as births, deaths, and marriages (and sometimes divorces). All phases are needed and should be accommodated. Indeed, sometimes it may not be desirable, effective, or efficient for a repository to be sustainable - provided that the data can continue to be hosted elsewhere.
  • This is the situation facing research data repositories. To be sustainable, data repositories need to generate sufficient revenue to cover their costs, but setting a price above the marginal cost of copying and distribution will reduce net welfare.
  • Actions needed to develop a successful research data repository business model include:
      • Understanding the lifecycle phase of the repository's development (e.g. the need for investment funding, development funding, ongoing operational funding, or transitional funding)
      • Identifying who the stakeholders are (e.g. data depositors, data users, research institutions, research funders, and policy makers)
      • Developing the product/service mix (e.g. basic data, value-added data, value-added services and related facilities, and contract and research services)
      • Understanding the cost drivers and matching revenue sources (e.g. scaling with demand for data ingest, data use, the development and provision of value-adding services or related facilities, research priorities, and policy mandates)
      • Identifying revenue sources (e.g. structural funding, host institutional funding, deposit-side charges, access charges, and value-added services or facilities charges)
      • Making the value proposition to stakeholders (e.g. measuring impacts and making the research case, measuring value and making the economic case, informing, and educating) (Figure 6).
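
The survey figures quoted in the annotations above (47 repositories, 95 revenue sources, 24 repositories with more than one source) can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the variable names are made up for this example, and the three input numbers are the only values taken from the report.

```python
# Figures quoted in the OECD annotations; variable names are illustrative only.
repositories = 47
revenue_sources = 95
multi_source_repositories = 24  # reported funding from more than one source

# Average number of revenue sources per repository.
mean_sources = revenue_sources / repositories

# Share of repositories that combine at least two revenue sources.
multi_source_share = multi_source_repositories / repositories

print(f"Mean revenue sources per repository: {mean_sources:.2f}")
print(f"Share with more than one source: {multi_source_share:.1%}")
```

The mean comes out just above two, which matches the report's "an average of two per repository", and roughly half of the repositories combine sources.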