Data Foundations · Module 5
Standards, schemas, and interoperability
Interoperability is a boring word for a very expensive problem.
Why this matters
A standard can be a file format (CSV, JSON), a schema (field definitions), a data model (how entities relate), or a message contract (API request and response).
What you will be able to do
- 1 Explain standards, schemas, and interoperability in your own words and apply them to a realistic scenario.
- 2 Explain how standards reduce repeated translation work by creating shared meaning and stable interfaces.
- 3 Check the assumption "Standards are adopted" and explain what changes if it is false.
- 4 Check the assumption "Mappings are maintained" and explain what changes if it is false.
Before you begin
- No previous technical background required
- Read the section explanation before using tools
Common ways people get this wrong
- Paper standards. A PDF standard that nobody implements does not improve interoperability.
- Semantic drift. Shared terms drift unless stewardship exists.
Interoperability is a boring word for a very expensive problem. Two systems can both be “correct” and still disagree because they mean different things by the same word. Standards are the shared rules that reduce translation work. They matter not because they are morally pure, but because without shared meaning you spend your life reconciling spreadsheets and arguing in meetings.
What a standard really is
A standard can be a file format (CSV, JSON), a schema (field definitions), a data model (how entities relate), or a message contract (API request and response). Good standards do two jobs. They make systems compatible and they make errors visible earlier.
Worked example. “Customer” broke your dashboard, not your code
System A records “customer” as the bill payer. System B records “customer” as the person who contacted support. A dashboard joins them and reports “customers contacted”. Leadership changes policy based on that number. Nobody wrote a bug. The definition was the bug.
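The same failure can be reproduced in a few lines. This is a minimal sketch, not anyone's real system: the record layouts, the names `billing_records` and `support_contacts`, and every ID are invented for illustration.

```python
# Hypothetical data. System A's "customer" is the bill payer;
# System B's "customer" is whoever contacted support.
billing_records = [
    {"customer_id": "C1"},
    {"customer_id": "C2"},
    {"customer_id": "C3"},
]
support_contacts = [
    {"customer_id": "C1", "reason": "billing query"},
    {"customer_id": "P9", "reason": "outage"},  # a household member, not a bill payer
    {"customer_id": "P9", "reason": "outage follow-up"},
]


def customers_contacted(billing, support):
    """Naive join on customer_id, the way a dashboard might do it."""
    payers = {row["customer_id"] for row in billing}
    callers = {row["customer_id"] for row in support}
    return payers & callers


def unmatched_callers(billing, support):
    """People who contacted support but are invisible after the join."""
    payers = {row["customer_id"] for row in billing}
    callers = {row["customer_id"] for row in support}
    return callers - payers


matched = customers_contacted(billing_records, support_contacts)
dropped = unmatched_callers(billing_records, support_contacts)
print(len(matched), "customers contacted")
print("silently dropped by the join:", sorted(dropped))
```

Every line runs without error, yet the reported number describes bill payers who called, not people who called. Nothing in the code can tell you which of the two the dashboard was supposed to show.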
Common mistakes in standards work
Common mistake
Choosing field names first and definitions later
Reality: The name "customer_id" tells you nothing about whose definition of customer it uses. Write the definition first, then pick a name that matches.
Common mistake
Treating schemas as documentation only
Reality: If your schema is not validated in the pipeline, it is not a contract. It is a wish. Validate early, fail loudly, fix upstream.
Common mistake
Changing a contract without versioning
Reality: If you rename a field and downstream systems break, that is your fault, not theirs. Version your schemas and communicate changes.
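The last two mistakes can be guarded against with very little code. The sketch below assumes records arrive as plain dictionaries carrying a `schema_version` field. The field names, the version numbers, and the `ContractError` class are all hypothetical, and a real pipeline would more likely use a schema library than hand-written checks.

```python
# Hypothetical versions this pipeline has agreed to accept.
SUPPORTED_SCHEMA_VERSIONS = {"1.0", "1.1"}

# Hypothetical required fields and their expected Python types.
REQUIRED_FIELDS = {
    "customer_id": str,
    "reading_kwh": float,
}


class ContractError(ValueError):
    """Raised when a record does not satisfy the agreed contract."""


def validate_record(record):
    """Return the record unchanged if it is valid; otherwise fail loudly."""
    version = record.get("schema_version")
    if version not in SUPPORTED_SCHEMA_VERSIONS:
        raise ContractError(f"unsupported schema_version: {version!r}")
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            raise ContractError(f"missing required field: {name}")
        if not isinstance(record[name], expected_type):
            raise ContractError(
                f"{name} should be {expected_type.__name__}, "
                f"got {type(record[name]).__name__}"
            )
    return record


good = {"schema_version": "1.1", "customer_id": "C1", "reading_kwh": 12.5}
# A producer renamed a field and announced it by bumping the version.
renamed = {"schema_version": "2.0", "customer_id": "C1", "reading_kilowatt_hours": 12.5}

validate_record(good)
try:
    validate_record(renamed)
except ContractError as err:
    print("rejected:", err)
```

Because the producer bumped the version, the consumer rejects the record at the boundary with a clear message instead of quietly reading a missing field further downstream.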
Verification. A small contract you can write today
Pick one dataset. Write 5 fields with units, allowed ranges, and what “missing” means.
Write one identifier field and state whether it is a string or number, and why.
Write what changes are breaking, and how you would version them.
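One possible shape for such a contract, written as data so that it can be checked rather than just read, is sketched below. The meter-reading dataset, the `FieldSpec` class, and the rule for what counts as breaking are assumptions made for illustration. Only three fields are shown to keep the sketch short; the exercise asks for five.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    definition: str
    dtype: str
    unit: Optional[str]
    allowed: str
    missing_means: str


# Hypothetical contract, version 1, for a meter-reading dataset.
CONTRACT_V1 = {
    "meter_id": FieldSpec(
        definition="Identifier printed on the physical meter",
        dtype="string",  # a string: leading zeros matter and nobody does arithmetic on it
        unit=None,
        allowed="exactly 10 digits",
        missing_means="record is invalid and must be rejected",
    ),
    "reading_kwh": FieldSpec(
        definition="Cumulative energy recorded by the meter",
        dtype="number",
        unit="kWh",
        allowed=">= 0",
        missing_means="meter did not report; not the same as zero",
    ),
    "read_at": FieldSpec(
        definition="Time the reading was taken",
        dtype="string",
        unit="ISO 8601 timestamp, UTC",
        allowed="not in the future",
        missing_means="record is invalid and must be rejected",
    ),
}


def breaking_changes(old, new):
    """List changes that would break a consumer built against `old`.

    Assumed rule: removing or renaming a field, or changing its type or
    unit, is breaking. Adding a new field is not.
    """
    problems = []
    for name, old_spec in old.items():
        new_spec = new.get(name)
        if new_spec is None:
            problems.append(f"{name}: removed or renamed")
        elif (new_spec.dtype, new_spec.unit) != (old_spec.dtype, old_spec.unit):
            problems.append(f"{name}: type or unit changed")
    return problems


# A proposed revision: one unit change (breaking) and one added field (not breaking).
proposed = dict(CONTRACT_V1)
proposed["reading_kwh"] = replace(CONTRACT_V1["reading_kwh"], unit="Wh")
proposed["tariff_code"] = FieldSpec(
    definition="Tariff applied at read time",
    dtype="string",
    unit=None,
    allowed="one of the published tariff codes",
    missing_means="tariff unknown",
)
print(breaking_changes(CONTRACT_V1, proposed))
```

A non-empty result is a reasonable trigger for a new major version and a notice to downstream teams. An empty one suggests the change can ship as a minor revision.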
Mental model
Standards reduce translation
Standards reduce repeated translation work by creating shared meaning and stable interfaces.
- 1 Local meaning
- 2 Mapping
- 3 Standard
- 4 Exchange
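The four stages (local meaning, mapping, standard, exchange) can be sketched as a small translation table. The local field names, the standard field names, and the units below are hypothetical. The point is only that a mapping carries both a rename and a conversion, and that it should refuse anything it does not know how to translate.

```python
# Hypothetical mapping from one system's local names and units to the shared
# standard. Each entry: local name -> (standard name, converter function).
LOCAL_TO_STANDARD = {
    "cust_ref": ("customer_id", str),                         # number locally, string in the standard
    "usage_wh": ("reading_kwh", lambda wh: wh / 1000),        # watt-hours to kilowatt-hours
    "temp_f": ("temperature_c", lambda f: (f - 32) * 5 / 9),  # Fahrenheit to Celsius
}


def to_standard(local_record):
    """Translate a local record; refuse fields the mapping does not cover."""
    unmapped = set(local_record) - set(LOCAL_TO_STANDARD)
    if unmapped:
        raise KeyError(f"no mapping for local fields: {sorted(unmapped)}")
    return {
        standard_name: convert(local_record[local_name])
        for local_name, (standard_name, convert) in LOCAL_TO_STANDARD.items()
        if local_name in local_record
    }


standard_record = to_standard({"cust_ref": 42, "usage_wh": 12500, "temp_f": 68})
print(standard_record)
```

Raising on unmapped fields is a deliberate choice in this sketch. When the local system adds or renames a field, the mapping fails immediately instead of silently dropping data, which makes the maintenance cost visible.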
Assumptions to keep in mind
- Standards are adopted. A standard helps only if systems implement it consistently.
- Mappings are maintained. Mappings rot. Maintenance is part of the cost.
Failure modes to notice
- Paper standards. A PDF standard that nobody implements does not improve interoperability.
- Semantic drift. Shared terms drift unless stewardship exists.
Check yourself
Quick check. Standards and interoperability
Why can two systems be “correct” and still disagree?
They can use different definitions or units for the same words, so the meaning differs even if each system is consistent internally.
What makes a schema more than documentation?
Validation. If the pipeline validates it and fails when it breaks, it becomes a contract.
Why is versioning a contract change important?
Downstream systems depend on it. Unversioned breaking changes cause silent errors and outages.
What is one thing you should write for every field in a dataset?
A definition and a unit, plus what missing means for that field.
Artefact and reflection
Artefact
A short module note with one key definition and one practical example
Reflection
Where in your work would a shared standard or a validated schema change a decision, and what evidence would make you trust that change?
Optional practice
Complete one guided exercise and explain your decision in plain language