Standards and interoperability
Why data standards exist, who creates them, and which standards govern the most common data exchange scenarios from dates and geography to healthcare records.
By the end of this module you will be able to:
- Distinguish de jure from de facto data standards with examples
- Apply ISO 8601 date formatting rules to avoid common ambiguity errors
- Identify the correct standard for at least three common data exchange scenarios
- Explain why the NHS Test and Trace Excel incident occurred and how a standard would have prevented it

Real-world case · 3-4 October 2020
NHS Test and Trace: 15,841 COVID-19 results lost to an Excel row limit
On 3 and 4 October 2020, approximately 15,841 COVID-19 positive test results were temporarily missing from NHS Test and Trace reporting. The cause: Public Health England was aggregating test results from laboratories using Microsoft Excel's legacy XLS (Excel 97-2003) file format. XLS has a hard row limit of 65,536 rows. When the aggregation file exceeded this limit, Excel silently dropped subsequent records rather than raising an error.
The positive test results were not uploaded to the contact tracing system for several days. Contacts of infected people were not notified in time. A CSV file with no row limit, a database, or any standard data pipeline format would have prevented the failure entirely. The choice of a non-standard format with a hidden capacity limit, made as a convenience decision, caused a public health data failure during a pandemic.
Excel silently dropped 15,841 positive COVID-19 test results. How does a convenience format choice become a public health failure?
Why data standards exist and how they are classified
A data standard is a documented agreement specifying how data should be structured, formatted, encoded, or exchanged. Without standards, every system that exchanges data must negotiate a custom format with every other system. With standards, any two conforming systems can exchange data without prior negotiation. This is the economic argument for standardisation.
De jure standards are formally published and ratified by a recognised body: ISO (International Organisation for Standardisation), IEC (International Electrotechnical Commission), W3C (World Wide Web Consortium), or IETF (Internet Engineering Task Force). Compliance may be voluntary or mandated by regulation.
De facto standards achieve widespread adoption through market dominance or practical necessity, without formal standardisation. JSON began as a de facto standard (derived from JavaScript object syntax) before being formalised as RFC 4627 in 2006, updated as RFC 8259 in 2017. PDF was a de facto standard before ISO standardisation as ISO 32000.
With an understanding of why data standards exist and how they are classified in place, the discussion can now turn to key data exchange standards, which builds directly on these foundations.
Key data exchange standards
ISO 8601:2019 (dates and times) defines the internationally unambiguous date format as YYYY-MM-DD. The date "01/02/03" is ambiguous: it could mean 1 February 2003 (UK/European format), 2 January 2003 (US format), or 3 February 2001 (ISO with 2-digit year). ISO 8601 eliminates all ambiguity. Full datetime: 2024-06-14T09:30:00Z (Z indicates UTC). The date format DD/MM/YYYY is standard in the UK; MM/DD/YYYY in the US. Any data exchange that does not specify which convention is in use creates ambiguity for all dates where the day value is 12 or below.
RFC 4180 (CSV format) specifies: comma as field delimiter, CRLF as line terminator, double-quote for escaping fields containing commas or newlines, and the first line as an optional header row. Many CSV files in the wild do not conform, using semicolons (common in European locales where commas are decimal separators) or other delimiters.
RFC 7946 (GeoJSON) defines a JSON-based format for geographic features using WGS 84 coordinates. Critical: GeoJSON specifies longitude before latitude: [longitude, latitude]. Google Maps uses (lat, lng) order. Mixing these produces points plotted in the ocean or on the wrong continent. This is the most common GeoJSON mistake.
HL7 FHIR R4 (health data) defines resources (patient, observation, medication, condition, appointment) as structured JSON or XML documents with a RESTful API profile. NHS England mandated FHIR adoption as part of its NHS Digital interoperability programme from 2019 onwards. It allows any conforming system (GP surgery software, hospital EPR, pharmacy management) to exchange patient records without bespoke integration.
“Standards provide a common language enabling the exchange of data between different systems, organisations, and countries. Without them, every data exchange requires bespoke negotiation, and interoperability becomes an engineering project rather than an assumption.”
ISO/IEC 11179-1:2023, Introduction: The role of metadata standards in data management - Section 3.1, Data
Common misconception
“Using standard date formats is a minor formatting preference, not a correctness requirement.”
Date format ambiguity is a data correctness issue, not a style preference. '01/02/03' means three different dates depending on whether you are in the UK, the US, or using ISO short form. Any date where the day value is 12 or below is ambiguous in non-ISO formats. In a dataset combining records from UK and US teams, ambiguous dates will be silently mis-parsed. The systemic fix is ISO 8601 (YYYY-MM-DD) in all data files and APIs, not case-by-case manual checking. Requiring ISO 8601 at source prevents ambiguity; accepting non-standard formats creates it.
A government agency receives data from 12 organisations. Organisation A submits dates as '14/06/2024', Organisation B as '06/14/2024', and Organisation C as '2024-06-14'. The agency must merge all three files. Which approach best resolves the ambiguity?
A developer creates a GeoJSON file marking the location of 50 air quality monitoring stations in London. When plotted on a map, all stations appear in the Indian Ocean near the equator. What is the most likely cause?
Key takeaways
- Data standards exist to enable interoperability: any two conforming systems can exchange data without custom integration. The economic case is compelling at scale.
- De jure standards are formally ratified by ISO, IETF, W3C, and similar bodies. De facto standards emerge from widespread adoption and may be later formalised (JSON: de facto 2001, ratified as RFC 8259 in 2017).
- ISO 8601 (YYYY-MM-DD), RFC 4180 (CSV), RFC 7946 (GeoJSON), and HL7 FHIR R4 (healthcare) are the most commonly encountered data exchange standards.
- GeoJSON specifies (longitude, latitude) order, the opposite of many map application conventions. This single ambiguity causes the most common GeoJSON error.
- The NHS Test and Trace Excel failure is a direct example of what happens when a non-standard format with a hidden capacity limit is used for a critical data pipeline. A CSV file would have prevented it.
Standards solve the format problem. The next module asks a different question: once data exists in a standard format, who can use it and under what conditions? Open data, the FAIR principles, and licensing frameworks determine whether data is legally, technically, and practically reusable.
Standards and sources cited in this module
ISO 8601:2019: Date and time format
Authoritative international standard for date and time representation: YYYY-MM-DD and full datetime formats.
RFC 4180 (IETF, 2005): Common Format for CSV Files
CSV format specification: delimiter, quoting, line terminator, and header row conventions.
RFC 7946 (IETF, 2016): GeoJSON Format
Section 3.1: Position. GeoJSON coordinate order specification (longitude, latitude).
Fast Healthcare Interoperability Resources: patient, observation, and medication resource definitions.
Public Health England: Technical statement on test data (October 2020)
NHS Test and Trace Excel row limit failure: 15,841 positive test results delayed.