Data Foundations · Module 6

Open data, data sharing, and FAIR thinking

Open data is not “everything on the internet”.

22 min · 4 outcomes

Why this matters

Most real-world data lives in the middle: shared with specific parties under agreements.

What you will be able to do

  • Explain open data, data sharing, and FAIR thinking in your own words and apply them to a realistic scenario.
  • Explain how open data and FAIR principles help when you are clear about audience, risk, and governance.
  • Check the assumption "Audience is known" and explain what changes if it is false.
  • Check the assumption "Risk is assessed" and explain what changes if it is false.

Before you begin

  • No previous technical background required
  • Read the section explanation before using tools

Common ways people get this wrong

  • Accidental disclosure. Publishing can expose sensitive patterns even without obvious identifiers.
  • Unusable open data. Open data without metadata is hard to reuse and easy to misinterpret.

Open data is not “everything on the internet”. It is a choice about access and reuse. Some data should be open because it improves transparency and innovation. Some data must stay restricted because it contains personal or security-sensitive information. A mature organisation can explain the difference without hand-waving.

The data spectrum (closed, shared, open)

Most real-world data lives in the middle: shared with specific parties under agreements. The useful question is not “open or closed”. It is “who can access, for what purpose, with what safeguards, and for how long”.
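The four questions can be written down as a record, which makes gaps easy to spot. The following is a minimal sketch, not a real access-control system: the `AccessRule` class, the `is_allowed` function, and the example organisation and dates are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AccessRule:
    """One sharing decision: who, for what purpose, with what safeguards, until when."""
    party: str                    # who can access (hypothetical organisation)
    purpose: str                  # what the data may be used for
    safeguards: tuple[str, ...]   # conditions attached to the access
    expires: date                 # last day the access is valid


def is_allowed(rule: AccessRule, party: str, purpose: str, today: date) -> bool:
    """Allow access only for the named party, the agreed purpose, and up to the expiry date."""
    return party == rule.party and purpose == rule.purpose and today <= rule.expires


# Hypothetical example of "shared" data: one party, one purpose, a fixed end date.
rule = AccessRule(
    party="City Transport Authority",
    purpose="service planning",
    safeguards=("aggregated to ward level", "no onward sharing"),
    expires=date(2026, 3, 31),
)

print(is_allowed(rule, "City Transport Authority", "service planning", date(2026, 1, 15)))  # True
print(is_allowed(rule, "City Transport Authority", "marketing", date(2026, 1, 15)))         # False
print(is_allowed(rule, "City Transport Authority", "service planning", date(2026, 4, 1)))   # False
```

The safeguards are recorded here but not enforced by the code. In practice they are usually enforced through agreements and process rather than a function call.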

FAIR as a quality lens

FAIR stands for findable, accessible, interoperable, and reusable. It does not automatically mean public. It is a lens you can use to judge whether a dataset is actually usable by someone who is not already in your team.

Common mistakes in data sharing

Common mistake

Publishing data with no metadata

Reality: A CSV file without a data dictionary is a liability. People will misinterpret units, miss updates, and make decisions you never intended.

Common mistake

Sharing data without a licence

Reality: If you do not specify usage rights, you will spend months arguing about ownership when someone does something you dislike with the data.

Common mistake

Assuming removing names makes data anonymous

Reality: Many datasets can be re-identified through linkage with other sources. True anonymisation is hard and requires proper techniques, not just column deletion.
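One way to see the linkage risk is to count how many records share the same combination of indirect identifiers, which is the idea behind k-anonymity. The sketch below uses invented records and assumes that postcode area, birth year, and sex are the fields an outsider could match against another source. It is a rough screening check, not an anonymisation technique, and passing it does not prove a dataset is safe.

```python
from collections import Counter

# Hypothetical records with names already removed.
records = [
    {"postcode_area": "LS6", "birth_year": 1984, "sex": "F", "diagnosis": "asthma"},
    {"postcode_area": "LS6", "birth_year": 1984, "sex": "F", "diagnosis": "diabetes"},
    {"postcode_area": "LS6", "birth_year": 1991, "sex": "M", "diagnosis": "asthma"},
    {"postcode_area": "YO1", "birth_year": 1957, "sex": "M", "diagnosis": "cancer"},
]

# Assumption: these are the fields someone could look up in another dataset.
QUASI_IDENTIFIERS = ("postcode_area", "birth_year", "sex")


def group_sizes(rows, keys):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def unique_rows(rows, keys):
    """Return the rows whose quasi-identifier combination appears exactly once."""
    sizes = group_sizes(rows, keys)
    return [row for row in rows if sizes[tuple(row[k] for k in keys)] == 1]


at_risk = unique_rows(records, QUASI_IDENTIFIERS)
print(len(at_risk))  # 2
```

Two of the four records are unique on these three fields, so anyone who knows a person's postcode area, birth year, and sex could learn their diagnosis even though no name is present.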

Verification. Make a dataset shareable in a way you would trust

  • Write a title, description, update frequency, and contact owner.

  • List the units and definitions for key fields.

  • State what the dataset can and cannot be used for.

  • State whether it is open, shared, or restricted, and why.
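The checklist above can be kept next to the dataset as a small metadata record and checked before release. This is a sketch under stated assumptions: the field names, the `check_metadata` function, and the example dataset are hypothetical and do not follow any particular metadata standard.

```python
# Hypothetical field names that mirror the checklist items.
REQUIRED_FIELDS = (
    "title", "description", "update_frequency", "contact_owner",
    "field_definitions", "permitted_uses", "prohibited_uses",
    "access_level", "access_rationale",
)
ACCESS_LEVELS = {"open", "shared", "restricted"}


def check_metadata(metadata: dict) -> list[str]:
    """Return a list of problems; an empty list means the checklist is met."""
    problems = [f"missing: {name}" for name in REQUIRED_FIELDS if not metadata.get(name)]
    level = metadata.get("access_level")
    if level and level not in ACCESS_LEVELS:
        problems.append(f"access_level must be one of {sorted(ACCESS_LEVELS)}")
    # Every key field needs a unit and a definition; write "n/a" for unitless fields.
    for column, info in (metadata.get("field_definitions") or {}).items():
        for key in ("unit", "definition"):
            if not info.get(key):
                problems.append(f"field {column}: no {key}")
    return problems


draft = {
    "title": "Hourly air quality readings",
    "description": "Roadside sensor readings for one city.",
    "update_frequency": "daily",
    "contact_owner": "data-team@example.org",
    "field_definitions": {
        "no2": {"unit": "micrograms per cubic metre", "definition": "Hourly mean nitrogen dioxide"},
        "site_id": {"unit": "", "definition": "Sensor site code"},
    },
    "permitted_uses": "Research and public reporting",
    "prohibited_uses": "",
    "access_level": "open",
    "access_rationale": "No personal data; public interest in air quality",
}

for problem in check_metadata(draft):
    print(problem)
# missing: prohibited_uses
# field site_id: no unit
```

A check like this only confirms that the fields are filled in. Whether the descriptions are accurate and the access decision is sound still needs a human review.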

Mental model

Open and FAIR as choices

Open data and FAIR principles help when you are clear about audience, risk, and governance.

  1. Publish
  2. Access rules
  3. Metadata
  4. Reuse

Assumptions to keep in mind

  • Audience is known. Publishing without knowing the audience creates risk and misunderstanding.
  • Risk is assessed. Some data should be restricted. Openness is not the default for everything.

Check yourself

Quick check. Open data and FAIR

Does FAIR automatically mean public?

No. FAIR is about usability. Data can be FAIR and still be restricted.

What is the difference between open and shared data?

Open data is available for broad reuse. Shared data is available to specific parties under agreements and safeguards.

Why is metadata part of trust?

Without definitions, units, and update rules, people misinterpret the dataset and make unsafe decisions.

Why is removing names not the same as anonymisation?

Other fields can still identify people when linked with other datasets. True anonymisation requires careful techniques.

Artefact and reflection

Artefact

A short module note with one key definition and one practical example

Reflection

Where in your work would thinking about open data, data sharing, and FAIR principles change a decision, and what evidence would make you trust that change?

Optional practice

Complete one guided exercise and explain your decision in plain language

Sources

  • DAMA DMBOK 2 (Data Management Body of Knowledge, 2nd Edition)
  • ISO/IEC 11179 metadata registries
  • ISO/IEC 27701:2025 privacy information management
  • ICO data protection principles and UK GDPR guidance