MODULE 2 OF 4 · PRACTICE AND STRATEGY

Data Sharing, Trust Frameworks, and Standards

60 min read · 3 outcomes · Interactive quiz

By the end of this module you will be able to:

  • Identify the six lawful bases for data processing under UK GDPR and explain when each is appropriate
  • Distinguish between data trusts, data cooperatives, and data clean rooms, with examples of each
  • Explain how federated data analysis and privacy-enhancing technologies enable data sharing without data disclosure

Real-world innovation · OpenSAFELY · COVID-19 vaccine safety (2021)

What if the most privacy-preserving way to share data is to never share it at all?

In January 2021, within weeks of the UK COVID-19 vaccine rollout beginning, a team of researchers at the University of Oxford published the first population-scale vaccine safety analysis. They had analysed 57 million NHS patient records. Not a single patient record had left an NHS server.

OpenSAFELY, built by Ben Goldacre's team, inverted the conventional model of research data access. Instead of giving researchers a data extract they could query on their own machines, OpenSAFELY gave researchers a secure coding environment running inside the NHS data infrastructure. Researchers wrote code; the code ran against the data; only aggregated, anonymised results could be exported. The data never moved to the researcher. The researcher's question moved to the data.

This is federated data analysis: computation happens at the data source, not at a central analyst's location. It allowed research that would otherwise have required years of data sharing negotiations and regulatory approvals to be completed in weeks, with stronger privacy protections than any conventional data sharing arrangement could provide.

With the learning outcomes established, this module begins by examining the UK data governance landscape in depth.

13.1 UK Data Governance Landscape

Following the United Kingdom's departure from the European Union, UK GDPR came into force on 1 January 2021, incorporating the EU GDPR's substantive requirements into UK domestic law alongside the Data Protection Act 2018. The two instruments work together: the Data Protection Act provides the UK-specific provisions that the EU GDPR permitted member states to determine domestically.

The Information Commissioner's Office (ICO) is the UK's independent supervisory authority for data protection law. It can issue fines of up to GBP 17.5 million or 4% of global annual turnover (whichever is higher) for serious infringements of UK GDPR. Since 2018, the ICO has fined organisations including British Airways (GBP 20 million, 2020) and Marriott International (GBP 18.4 million, 2020). The GBP 48.65 million penalty imposed on TSB in December 2022 for operational resilience failures is sometimes cited alongside these, but it was issued by the Financial Conduct Authority and the Prudential Regulation Authority, not the ICO.
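
The "whichever is higher" rule can be made concrete with a small calculation. The sketch below is illustrative only: the function name and the turnover figures are hypothetical, and it models just the statutory ceiling, not how the ICO sets an actual penalty.

```python
def max_uk_gdpr_fine(global_annual_turnover_gbp: int) -> float:
    """Statutory ceiling: the greater of GBP 17.5 million or 4% of global annual turnover."""
    fixed_cap_gbp = 17_500_000
    turnover_cap_gbp = 4 * global_annual_turnover_gbp / 100
    return max(fixed_cap_gbp, turnover_cap_gbp)


# Hypothetical organisations, chosen to fall either side of the crossover point.
smaller_firm_cap = max_uk_gdpr_fine(100_000_000)    # 4% is GBP 4 million, so the fixed cap applies
larger_firm_cap = max_uk_gdpr_fine(1_000_000_000)   # 4% is GBP 40 million, so the turnover cap applies
```

The two caps cross at a turnover of GBP 437.5 million; above that figure the 4% limb sets the ceiling.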

The UK National Data Strategy (2020) positioned data as national infrastructure, comparable to roads and energy networks. The strategy identified that the UK's ability to generate value from data depended on improving data quality in public sector organisations, establishing trusted data sharing frameworks, and developing smart data schemes that enable consumers to access and share their own data held by regulated organisations.

UK GDPR and the EU GDPR are substantively aligned but not identical. Organisations that operate in both the UK and EU must comply with both regimes. The European Commission's adequacy decision for the UK (granted June 2021) allows personal data to flow from the EU to the UK without additional safeguards, but the decision is not permanent and remains subject to review.

Understanding the regulatory landscape sets the context. Section 13.2 goes deeper into the specific legal bases that must be satisfied before data can legally be shared, and the distinctions that trip organisations up in practice.

13.2 Legal Bases for Data Sharing

UK GDPR Article 6 specifies six lawful bases for processing personal data. Every instance of personal data processing must rely on at least one lawful basis. The choice of lawful basis is not discretionary and cannot be changed retrospectively: if an organisation relies on consent, it cannot later claim legitimate interests for the same processing activity if consent is withdrawn.

The six bases are:

  • Consent: the data subject has given consent that is freely given, specific, informed, and unambiguous.
  • Contract: processing is necessary for a contract with the data subject.
  • Legal obligation: processing is necessary to comply with a legal duty.
  • Vital interests: processing is necessary to protect someone's life.
  • Public task: processing is necessary for a public authority carrying out an official function.
  • Legitimate interests: processing is necessary for the controller's legitimate interests, balanced against the data subject's rights.

For public sector organisations processing data under their statutory functions, the public task basis is typically most appropriate. For research organisations processing NHS data, the research safeguards and exemptions in Article 89 may apply alongside an Article 6 basis such as public task or legitimate interests. For commercial organisations, legitimate interests is the most flexible basis but requires a Legitimate Interests Assessment (LIA) documenting that the interests are genuine, the processing is necessary, and the data subject's rights do not override those interests.

The ICO guidance identifies consent as the most demanding basis to maintain: consent must be freely given (cannot be a condition of service), specific, informed, and easily withdrawable. For data sharing arrangements between organisations, consent is rarely the appropriate basis because it requires ongoing management of consent records and creates operational fragility.

Legitimate interests is the most flexible lawful basis, but you cannot assume it will always apply. It will not be appropriate where your interests are overridden by the interests or fundamental rights and freedoms of the individual.

ICO, Guide to the UK General Data Protection Regulation - Lawful basis for processing

The ICO's framing of legitimate interests as 'flexible but not always appropriate' is practically important. Many organisations default to legitimate interests as a catch-all basis without conducting a Legitimate Interests Assessment. This creates regulatory risk: if challenged, the organisation must be able to demonstrate that it balanced its interests against the data subject's rights. Without a documented LIA, this defence is difficult to sustain.
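
One way to picture the documentation requirement is as a record that pairs each processing activity with its lawful basis and, where legitimate interests is claimed, a three-part LIA. This is a minimal sketch: the class and field names are hypothetical, not an ICO template, and it only flags a missing or failing LIA rather than assessing any other basis.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LawfulBasis(Enum):
    CONSENT = "consent"
    CONTRACT = "contract"
    LEGAL_OBLIGATION = "legal obligation"
    VITAL_INTERESTS = "vital interests"
    PUBLIC_TASK = "public task"
    LEGITIMATE_INTERESTS = "legitimate interests"


@dataclass(frozen=True)
class LegitimateInterestsAssessment:
    interest_is_genuine: bool        # is there a genuine interest?
    processing_is_necessary: bool    # is the processing necessary for it?
    rights_do_not_override: bool     # do the individual's rights leave it standing?

    def passes(self) -> bool:
        return (self.interest_is_genuine
                and self.processing_is_necessary
                and self.rights_do_not_override)


@dataclass(frozen=True)
class ProcessingRecord:
    activity: str
    basis: LawfulBasis
    lia: Optional[LegitimateInterestsAssessment] = None

    def lia_gap(self) -> bool:
        """True when legitimate interests is claimed without a passing, documented LIA."""
        if self.basis is not LawfulBasis.LEGITIMATE_INTERESTS:
            return False  # other bases have their own tests, not modelled here
        return self.lia is None or not self.lia.passes()


# Hypothetical activities for illustration.
undocumented = ProcessingRecord("marketing analytics", LawfulBasis.LEGITIMATE_INTERESTS)
documented = ProcessingRecord(
    "fraud screening",
    LawfulBasis.LEGITIMATE_INTERESTS,
    LegitimateInterestsAssessment(True, True, True),
)
```

Here `undocumented.lia_gap()` is true, which is the catch-all pattern described above, while `documented.lia_gap()` is false.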

Common misconception

Consent is always required for data sharing.

UK GDPR specifies six lawful bases for processing, of which consent is one. Public sector organisations typically rely on the public task basis for their statutory functions. Research organisations often rely on public task or legitimate interests, supported by the Article 89 research provisions. The NHS routinely shares patient data under the legal obligation and public task bases. Requiring consent for every data sharing arrangement would make most public sector data use impractical. The correct question is 'which lawful basis applies?' not 'have we obtained consent?'

Legal compliance defines the floor for data sharing. Section 13.3 covers emerging organisational structures - data trusts, cooperatives, and commons - that go beyond compliance to build the trust infrastructure needed for valuable data exchange.

13.3 Data Trusts, Cooperatives, and Commons

Collective data governance models address a problem that individual data protection law does not: how should groups of people or organisations govern data that has collective value, such as patient health data, urban mobility data, or agricultural sensor data?

A data trust is a legal structure in which a trustee holds stewardship rights over data on behalf of data subjects. The trustee is independent of the data holder and has a legal duty of care to the beneficiaries (data subjects). UK Biobank is the most prominent UK example: an independent body governs access to 500,000 participants' health and genetic data for research, with an access committee that evaluates requests against defined criteria.

A data cooperative is owned and governed by its members. Members pool their data collectively, with governance decisions made democratically. MIDATA in Switzerland enables members to control and selectively share their health and financial data with services of their choice. Data cooperatives give members negotiating power they would not have individually.

A data commons is data made openly available under defined governance conditions, typically for public benefit. The UK's Ordnance Survey OpenData provides geographic data as a commons. The Open Data Institute (ODI, founded by Tim Berners-Lee in 2012) promotes data commons through its open data standards and certification scheme.

[Figure: data governance spectrum from open commons to controlled trusts]
Data governance models span a spectrum from open commons to tightly controlled trusts. The appropriate model depends on the sensitivity of the data, the nature of the beneficiaries, and the legal context of the jurisdiction.

Data trusts and cooperatives are voluntary arrangements. Smart data schemes are regulatory frameworks that mandate data portability in specific sectors. Section 13.4 covers the UK government's Smart Data programme and how Open Banking demonstrated the model at scale.

13.4 Smart Data Schemes

Smart data schemes are regulatory frameworks that require organisations holding customer data to share that data with third parties at the customer's instruction. They extend the GDPR right to data portability into specific, interoperable standards.

Open Banking is the UK's most mature smart data scheme. The Competition and Markets Authority ordered the nine largest UK banks to implement it in 2016. By 2024, over 7 million UK consumers used Open Banking-enabled services, sharing their bank account data with budgeting apps, mortgage brokers, and credit reference services via standardised APIs. The scheme's standards are maintained by the Open Banking Implementation Entity (OBIE), which now trades as Open Banking Limited.

The Data (Use and Access) Act 2025 provides the statutory powers to extend the smart data framework beyond banking to sectors such as energy (smart meter data sharing), telecommunications (call records and usage data), and mortgage data. Each sector scheme is expected to specify the data standards, security requirements, and consumer consent mechanisms that participants must implement.
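
A consumer consent mechanism can be sketched as a grant that names the third party, the data categories, and an expiry date, with the data holder releasing only what the grant covers. Everything below is hypothetical (the `ConsentGrant` fields, the `release` function, the sample data); it is not the Open Banking API or any scheme's actual standard.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ConsentGrant:
    customer_id: str
    third_party: str
    scopes: frozenset   # data categories the customer agreed to share
    expires: date       # last day on which the grant is valid


def release(customer_data: dict, grant: ConsentGrant, requester: str, today: date) -> dict:
    """Return only the categories the customer consented to share with this requester."""
    if requester != grant.third_party or today > grant.expires:
        return {}  # wrong recipient or expired grant: nothing is shared
    return {key: value for key, value in customer_data.items() if key in grant.scopes}


# Hypothetical customer and grant.
grant = ConsentGrant("cust-1", "budget-app", frozenset({"transactions"}), date(2025, 12, 31))
data = {"transactions": [12.5, 40.0], "address": "1 High Street"}
shared = release(data, grant, "budget-app", date(2025, 6, 1))
```

In this sketch `shared` contains the transactions but not the address, because the address was never in scope.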

Smart data schemes create value by reducing information asymmetries: a consumer who can share their energy usage data with multiple providers gets more competitive quotes; a consumer who can share their full bank transaction history with a mortgage broker gets a faster, more accurate affordability assessment.

Smart data schemes require data to leave the data controller. Privacy-enhancing technologies offer an alternative: analysis that never requires sharing raw data at all. Section 13.5 covers federated learning, differential privacy, and secure multi-party computation.

13.5 Federated Data Analysis and Privacy-Enhancing Technologies

Privacy-Enhancing Technologies (PETs) are a set of techniques that enable data analysis while minimising the exposure of raw personal data. The UK ICO and the US National Institute of Standards and Technology (NIST) have both published guidance on PETs as a privacy-by-design mechanism.

Federated learning trains a machine learning model across multiple data holders without aggregating the data centrally. Each participant trains a local model on their own data and shares only model updates (such as gradients or updated weights) with a central aggregator. Google uses federated learning to train the Gboard keyboard prediction model on users' devices without the typed text leaving the device.
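
The train-locally, share-only-updates loop can be sketched with a one-parameter model. This is a toy version of federated averaging under simplifying assumptions: two hypothetical data holders, a model y ≈ w * x, and no encryption or noise on the updates. It is not how Gboard or any production system is implemented.

```python
def local_update(w, data, lr=0.1, epochs=5):
    """One participant's training: gradient descent on mean squared error for y ≈ w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w  # only this number leaves the participant, never the (x, y) records


def federated_average(client_weights, client_sizes):
    """Aggregator step: average the local weights, weighted by each participant's sample count."""
    total = sum(client_sizes)
    return sum(w * n for w, n in zip(client_weights, client_sizes)) / total


# Hypothetical local datasets, each held by a different participant (true relation: y = 2x).
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(1.0, 2.0), (3.0, 6.0), (0.5, 1.0)],
]

w_global = 0.0
for _ in range(20):  # communication rounds
    local_weights = [local_update(w_global, data) for data in clients]
    w_global = federated_average(local_weights, [len(data) for data in clients])
```

After the rounds complete, `w_global` is close to 2.0 even though the aggregator never saw a single training record.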

Differential privacy adds calibrated statistical noise to query results, so that a result reveals only a provably bounded amount about whether any specific individual's data contributed to it. Apple uses differential privacy for its usage analytics. The UK Statistics Authority uses differential privacy techniques for small-area census data releases to prevent re-identification.
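
A common way to add calibrated noise to a count is the Laplace mechanism, sketched below. The function names, the sample data, and the choice of epsilon are assumptions for illustration; real deployments also track a privacy budget across queries, which this sketch does not.

```python
import random


def laplace_noise(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two exponential draws with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(records, predicate, epsilon, rng):
    """Noisy count: a smaller epsilon means more noise and stronger privacy."""
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1  # adding or removing one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical ages; the true number aged 65 or over is 4.
ages = [34, 71, 68, 45, 80, 66]
rng = random.Random(0)  # seeded only so the sketch is repeatable
noisy = private_count(ages, lambda age: age >= 65, epsilon=1.0, rng=rng)
```

Any single release such as `noisy` differs from the true count, but the noise has mean zero, so aggregate statistics stay useful while individual contributions are masked.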

Secure multi-party computation allows multiple parties to jointly compute a function over their combined data without any party learning the others' inputs. Data clean rooms (offered by Google, Meta, and AWS) apply this principle to advertising measurement: two parties can compute the overlap of their customer lists without either party seeing the other's list.
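
The idea of computing jointly without revealing inputs can be illustrated with additive secret sharing, one of the simplest multi-party building blocks. This toy computes a joint sum rather than a list overlap, assumes honest participants, and uses hypothetical names throughout; it should not be read as how any commercial clean room works.

```python
import secrets

MODULUS = 2**61 - 1  # all arithmetic is done modulo this value


def share(value, n_parties):
    """Split a value into n random-looking shares that sum to the value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values):
    """Each party shares its value; parties add the shares they hold; only the total is revealed."""
    n = len(private_values)
    all_shares = [share(value, n) for value in private_values]  # row i: shares made by party i
    # Party j holds column j: one share from every party, which on its own reveals nothing.
    partial_sums = [sum(row[j] for row in all_shares) % MODULUS for j in range(n)]
    return sum(partial_sums) % MODULUS


# Hypothetical private inputs from three organisations.
total = secure_sum([120, 45, 300])
```

The result `total` equals 465, yet no party ever held another party's raw input, only uniformly random shares of it.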

OpenSAFELY combined federated analysis with output checking: researchers' code ran at the data source, and outputs were manually checked for re-identification risk before release. This human-in-the-loop output checking is an additional safeguard for high-risk datasets.
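
The pattern of running code at the source and releasing only checked aggregates can be sketched as follows. This is an illustration, not OpenSAFELY's implementation: the function names and the minimum cell size of 5 are assumptions, and the automated suppression shown here stands in for, rather than replaces, the manual output checking described above.

```python
from collections import Counter


def run_at_source(records, analysis, min_cell_size=5):
    """Run researcher-supplied code next to the data and release only checked aggregates."""
    raw_output = analysis(records)  # expected to be counts per group, never row-level data
    released = {}
    for group, count in raw_output.items():
        # Small cells carry a higher re-identification risk, so they are withheld.
        released[group] = count if count >= min_cell_size else "suppressed"
    return released


def count_by_age_band(records):
    """Example researcher code: it sees the records only inside the secure environment."""
    return Counter(record["age_band"] for record in records)


# Hypothetical records held at the data source.
records = [{"age_band": "40-49"}] * 12 + [{"age_band": "90+"}] * 2
result = run_at_source(records, count_by_age_band)
```

Here `result` reports 12 for the 40-49 band and withholds the 90+ band, whose count of 2 falls below the threshold.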

Common misconception

Sharing data always means giving someone a copy.

Federated analysis, data clean rooms, secure multi-party computation, and differential privacy all enable data to be used without disclosure. OpenSAFELY analysed 57 million NHS records without a single record leaving an NHS server. Google Ads Data Hub allows advertisers to measure campaign effectiveness without receiving individual user data. The question 'how do we share this data?' should be preceded by 'does the data need to move at all?'

Data is a new form of infrastructure. Like roads, clean water and energy, data helps people to travel through life, and to meet their basic needs.

UK National Data Strategy, November 2020 - Chapter 1: Data as Infrastructure

The infrastructure framing has regulatory consequences: infrastructure is a public good that requires governance, standards, and investment. When the UK government describes data as infrastructure, it is arguing that data governance is a policy question, not just a compliance question. This frames data trusts, smart data schemes, and PETs as public infrastructure investments, not merely data protection mechanisms.

[Figure: OpenSAFELY federated analysis architecture]
OpenSAFELY demonstrated that federated analysis can enable population-scale health research faster and with stronger privacy protections than conventional data sharing. The model is now being adopted for other sensitive datasets.

13.6 Check your understanding

An NHS trust wants to share patient data with a pharmaceutical company for research into drug side effects. The trust's data protection officer is asked which lawful basis applies. Which basis is most appropriate and why?

A retailer wants to allow customers to share their loyalty card transaction history with a competing retailer to receive a personalised switching offer. This is most analogous to which of the following regulatory frameworks?

OpenSAFELY analysed 57 million NHS patient records for COVID-19 vaccine safety without any patient record leaving NHS servers. What does this demonstrate about the relationship between data utility and data privacy?

Key takeaways

  • UK GDPR specifies six lawful bases for data processing; consent is the most demanding to maintain and often inappropriate for public sector or research contexts; the public task and legal obligation bases are more commonly relied upon in public sector organisations.
  • Data trusts, data cooperatives, and data commons are collective governance models for data that has shared value; each has different ownership and accountability structures suited to different contexts.
  • Smart data schemes (Open Banking, energy smart data) require regulated organisations to share customer data with third parties at the customer's instruction, using standardised APIs, enabling value creation from data portability.
  • Federated data analysis runs computation at the data source rather than extracting data, enabling population-scale analysis without data disclosure; OpenSAFELY used this model to analyse 57 million NHS records with stronger privacy protection than any conventional data sharing arrangement.
  • Privacy-Enhancing Technologies (differential privacy, federated learning, secure multi-party computation) enable data to be used without disclosure; the question 'does the data need to move?' should precede any data sharing conversation.
  • The assumed trade-off between data utility and data privacy is a technical design problem, not an inevitable constraint; well-designed federated systems can provide more data utility with stronger privacy than conventional sharing.

Standards and sources cited in this module

  1. ICO, Guide to the UK General Data Protection Regulation

    ico.org.uk/for-organisations/guide-to-data-protection

    Primary source for UK GDPR lawful bases guidance, legitimate interests assessment methodology, and data protection principles. Referenced throughout Sections 13.1 and 13.2.

  2. UK National Data Strategy

    DCMS, November 2020

    Source for data-as-infrastructure framing and smart data scheme policy context. Referenced in Sections 13.1 and 13.5.

  3. The OpenSAFELY Collaborative

    opensafely.org

    Primary source for federated analysis case study cited in the opening story and Section 13.5. 57 million records figure from Williamson et al., Nature 2020.

  4. Open Banking Limited, Open Banking Impact Report

    openbanking.org.uk, 2024

    Source for 7 million Open Banking users figure and smart data scheme design referenced in Section 13.4.

  5. ICO and NIST, Privacy-Enhancing Technologies Guide

    ico.org.uk/about-the-ico/research-and-reports/pet-guidance

    Primary source for PET definitions including federated learning, differential privacy, and secure multi-party computation referenced in Section 13.5.

  6. Hall, W. and Pesenti, J., Growing the Artificial Intelligence Industry in the UK

    DCMS, 2017

    Context for UK data policy landscape including data trust proposals and governance framework development.

Data sharing requires trust infrastructure. The next module examines the platforms and ecosystems that organisations operate within, and the vendor management decisions that determine whether digital investments create flexibility or dependency.

Module 13 of 15 in Practice and Strategy