Module 6 of 15 · Applied

From meter to market: the 7-stage data lifecycle

40 min read · 3 outcomes · Quiz + lifecycle trace

By the end of this module you will be able to:

  • Trace the 7-stage lifecycle: Generate → Collect → Validate → Store → Process → Share → Archive
  • Follow one 1.47 kWh reading from a smart meter through settlement to a consumer bill
  • Compare the DTS, DIP, and DSI data platforms and explain their roles
[Image: an electricity smart meter mounted on a wall, its digital display showing consumption data]

Think about it

Every half-hourly reading travels through seven distinct stages before it changes anything.

The energy data lifecycle is not a metaphor. It is a literal chain of custody, defined by industry codes and enforced by Ofgem licence conditions. Every reading, every notification, and every meter event follows the same sequence of stages. Understanding this lifecycle is the foundation of every other topic in this course: governance, privacy, settlement, and digitalisation all depend on knowing where data is, who holds it, and what has been done to it.

This module maps the complete lifecycle using a concrete example: one half-hourly electricity reading, from the moment the meter records it to the moment it appears on your bill. Along the way, we compare three data platforms that serve different parts of this chain, and we trace how gas data follows a parallel but distinct path.

Your smart meter recorded 1.47 kWh between 07:00 and 07:30 this morning. Within hours, that single number will have been collected by the DCC, validated against tolerances, stored in multiple databases, processed by settlement agents, shared between licensed parties, and archived for regulatory audit. Each stage applies rules, transforms the data, and adds metadata. Miss any stage, and the reading either never reaches the market or arrives in a form nobody trusts.

We begin by tracing the complete energy data lifecycle stage by stage.

6.1 The seven stages

Every piece of energy data passes through seven stages. The terminology varies between organisations, but the stages themselves are universal. They apply to electricity meter readings, gas volume corrections, network loading data, weather forecasts used for demand modelling, and even the metadata about the meters themselves.

  1. Generate: data is created at the point of measurement or observation
  2. Collect: data is retrieved from the device and transmitted to a central system
  3. Validate: readings are checked for completeness, plausibility, and conformance
  4. Store: validated data is persisted in the authoritative systems of each market participant
  5. Process: stored data is transformed, aggregated, or enriched
  6. Share: data is made available to entitled parties via defined interfaces
  7. Archive: data is retained for regulatory and audit purposes, then destroyed on expiry
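
As a minimal sketch, the seven stages can be modelled as an ordered enumeration, using the order of the stage headings in this module (Generate, Collect, Validate, Store, Process, Share, Archive). The names `Stage`, `next_stage`, and `is_valid_history` are illustrative assumptions, not terms defined in the BSC or UNC.

```python
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    """The seven lifecycle stages, in the order used by this module."""
    GENERATE = 1
    COLLECT = 2
    VALIDATE = 3
    STORE = 4
    PROCESS = 5
    SHARE = 6
    ARCHIVE = 7


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None after ARCHIVE."""
    if current is Stage.ARCHIVE:
        return None
    return Stage(current + 1)


def is_valid_history(history: list[Stage]) -> bool:
    """A history is valid if it starts at GENERATE and skips no stage."""
    return len(history) > 0 and history == list(Stage)[: len(history)]
```

A reading whose recorded history jumps from Generate straight to Validate would fail `is_valid_history`, which mirrors the point that a missed stage leaves data nobody trusts.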

Each stage is detailed below.

Stage 1: Generate

Data is created at the point of measurement or observation. For a SMETS2 electricity meter, the metrology chip records instantaneous power in watts and accumulates energy in watt-hours. Every 30 minutes, it stores a register reading. For gas, a diaphragm meter records volume in cubic metres. For network data, a substation monitor records voltage, current, and power factor. The generation stage is physical: it depends on sensor accuracy, calibration certificates, and measurement standards traceable to the National Physical Laboratory.
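
The step from instantaneous power to accumulated energy can be sketched as a simple sum. This is a simplified illustration, not the meter's actual firmware: the one-second sampling interval and the steady 2,940 W load are assumptions chosen so the result matches the 1.47 kWh reading used later in this module.

```python
def accumulate_wh(power_samples_w: list[float], sample_interval_s: float) -> float:
    """Sum power samples (watts) taken at a fixed interval into energy (watt-hours)."""
    joules = sum(p * sample_interval_s for p in power_samples_w)  # W x s = J
    return joules / 3600.0  # 3600 J = 1 Wh


# Hypothetical half-hour: a steady 2,940 W load sampled once per second.
samples = [2940.0] * 1800  # 1800 seconds = 30 minutes
half_hour_kwh = accumulate_wh(samples, sample_interval_s=1.0) / 1000.0
print(f"{half_hour_kwh:.2f} kWh")  # prints "1.47 kWh"
```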

At this stage, the data exists only on the device. It has not been transmitted, validated, or seen by anyone. The meter holds up to 13 months of half-hourly data in its local registers, providing a buffer against communication failures.

Stage 2: Collect

Collection is the act of moving data from the device to a central system. For smart meters, the Data Communications Company (DCC) is the collection agent. The DCC sends a Service Request to the meter over its WAN (VMO2 cellular in the central and south regions, Arqiva long-range radio in the north). The meter responds with a meter read response containing the register values.

Collection introduces the first data quality risks. Communication failures, timeouts, incomplete responses, and WAN congestion can all prevent collection. The DCC reports a success rate above 96%, but readings in the remainder (up to 4%) must be estimated or recovered later. Gas meters add a further complication: the gas meter is battery-powered, so its Zigbee HAN link to the electricity meter's communications hub sends readings only every 30 minutes, introducing latency that does not exist for electricity.

Stage 3: Validate

Raw readings are checked against a set of validation rules before they enter any downstream system. Validation includes range checks (is the reading physically plausible?), consistency checks (is it higher than the previous reading?), format checks (correct number of digits, correct units), and identity checks (does this MPAN or MPRN match a registered meter?). Readings that fail validation are flagged for manual investigation or replaced with estimates calculated from historical consumption patterns.
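
The four families of check can be sketched as a single function. This is a minimal illustration under stated assumptions: the `Reading` type, the `max_advance_kwh` tolerance, and the placeholder identifier are hypothetical, and the real rules are those set out in the BSC and UNC procedures.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    meter_point_id: str   # MPAN or MPRN, treated here as an opaque string
    register_kwh: float   # cumulative register value
    unit: str


def validate(reading: Reading, previous_register_kwh: float,
             registered_ids: set[str], max_advance_kwh: float) -> list[str]:
    """Return the names of the checks that failed (empty list means pass)."""
    failures = []
    advance = reading.register_kwh - previous_register_kwh
    if reading.unit != "kWh":
        failures.append("format")        # wrong units
    if advance < 0:
        failures.append("consistency")   # register went backwards
    if advance > max_advance_kwh:
        failures.append("range")         # implausibly large consumption
    if reading.meter_point_id not in registered_ids:
        failures.append("identity")      # unknown meter point
    return failures


registered = {"MPAN-EXAMPLE-1"}  # placeholder, not a real MPAN
ok = Reading("MPAN-EXAMPLE-1", 10234.59, "kWh")
print(validate(ok, 10233.12, registered, max_advance_kwh=50.0))  # prints []
```

Returning the list of failed checks, rather than a single pass or fail flag, keeps enough detail to route a reading to manual investigation or estimation.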

The validation rules are defined in the Balancing and Settlement Code (BSC) for electricity and the Uniform Network Code (UNC) for gas. They are not optional: licensees are required to apply them, and compliance on the electricity side is monitored through the BSC Performance Assurance Framework, with Ofgem able to enforce through licence conditions. Poor validation has direct financial consequences. If an invalid reading enters settlement, it distorts imbalance calculations and misallocates costs across the entire market.

BSC parties shall ensure that all meter readings submitted for settlement have been validated in accordance with the validation rules set out in the BSC and associated procedures, and that no estimated reading is used where an actual reading is available.

Elexon, Balancing and Settlement Code - Section S

This BSC provision makes Stage 3 (Validate) a legal obligation, not a technical preference. A supplier that submits unvalidated readings faces Performance Assurance action and potential settlement charges if invalid data inflates or deflates their imbalance position.

Stage 4: Store

Validated data is stored in databases operated by the relevant market participant. Suppliers store customer consumption data. Distribution Network Operators (DNOs) store network monitoring data. Elexon stores settlement data. The DCC stores meter technical details and communication logs. There is no single central database for all energy data in GB. Instead, data is distributed across dozens of organisations, each with its own storage policies, retention periods, and access controls.

Retention periods vary enormously. BSC settlement data must be retained for at least 28 months (the settlement run-off period). GDPR requires that personal data is not kept longer than necessary. Network asset data may be retained for the life of the asset, which can be 40 years or more. These conflicting retention requirements create governance challenges that we explore in Module 8.

Stage 5: Process

Processing transforms raw data into information that supports decisions. Settlement processing is the most complex example: the Supplier Volume Allocation Agent (SVAA) takes half-hourly meter readings from thousands of meters and allocates them to Grid Supply Points (GSPs), and the Settlement Administration Agent (SAA) then calculates each supplier's contracted versus actual position and determines imbalance charges. This processing runs multiple times for each settlement period, with each run incorporating more actual data and fewer estimates.

Other processing includes demand forecasting (the National Energy System Operator combines meter data with weather models), network capacity analysis (DNOs aggregate substation data to identify thermal constraints), and billing calculations (suppliers apply tariff rates to consumption data). Every processing step adds value but also adds risk: errors compound through the chain, which is why validation at Stage 3 matters so much.

Stage 6: Share

Sharing makes processed data available to authorised parties. In settlement, the Funds Administration Agent (FAA) shares financial position data with all BSC parties. The National Energy System Operator shares demand forecasts and network constraints publicly to support market transparency. DNOs share network capacity data through their Long-Term Development Statements. Suppliers share consumption data with customers via bills and in-home displays.

Sharing is where the tension between transparency and privacy becomes acute. Aggregated data can be shared freely. Individual half-hourly consumption data is personal data under UK GDPR and requires either consent or a legitimate interest justification. The Data Best Practice Guidance (DBP v3.5) establishes a “presumed open” principle for non-personal data, but applying this principle in practice requires the data triage process covered in Module 8.

Stage 7: Archive

Archiving is the transition from active to long-term storage. Archived data is no longer used for operational decisions but must be retained for regulatory compliance, dispute resolution, and historical analysis. The BSC requires settlement data to be available for reconciliation for 28 months. Meter technical data may need to be accessible for the entire asset lifecycle. GDPR imposes a countervailing pressure: personal data must be deleted when the purpose for which it was collected no longer applies. Reconciling these requirements is a governance problem that no single code or regulation fully addresses.
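
The 28-month availability window can be turned into a date with a little calendar arithmetic. This is a rough sketch: it counts calendar months, whereas the settlement timetable itself is defined in the BSC, and the example settlement day (a Tuesday in January 2025) is a hypothetical choice.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    total = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


RECONCILIATION_WINDOW_MONTHS = 28  # figure quoted in this module


def available_until(settlement_day: date) -> date:
    """Date until which settlement data should remain available for reconciliation."""
    return add_months(settlement_day, RECONCILIATION_WINDOW_MONTHS)


print(available_until(date(2025, 1, 14)))  # prints 2027-05-14
```

A GDPR-driven deletion date computed the same way from a different starting rule would show directly where the two obligations pull apart.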

Check your understanding

Which stage of the data lifecycle is responsible for checking whether a meter reading is physically plausible before it enters downstream systems?

The lifecycle stages above describe the abstract pattern. Now let us trace one concrete reading — 1.47 kWh from a single half-hour — through every stage, from the meter's metrology chip to the 42p charge on a consumer bill.

6.2 Tracing 1.47 kWh from meter to bill

Abstract stages become concrete when you trace a single reading through the entire chain. Let us follow 1.47 kWh recorded by a SMETS2 electricity meter in a three-bedroom house in Birmingham between 07:00 and 07:30 on a Tuesday morning in January. The household is on a standard variable tariff with a mid-sized supplier.

07:00 – 07:30: Generate

The metrology chip records energy consumption. The family is having breakfast: the kettle, toaster, fridge-freezer, lighting, and a phone charger are all drawing power. The meter accumulates 1.47 kWh over the half-hour period and stores it in register 2 (import active energy, TOU rate 1).

07:31: Collect

The DCC's Data Service Provider (DSP), operated by CGI, polls the meter via a scheduled Service Request 4.1.1 (Read Instantaneous Import Registers). The meter responds with the register value over the VMO2 cellular WAN. The response includes the meter serial number (MSN), the MPAN, the timestamp, and the register reading. Transit time from meter to the DCC's central systems is typically under 10 seconds. The DCC logs the successful collection and forwards the data to the registered supplier.

08:00 – 10:00: Validate

The supplier's Meter Operator Agent (MOA) and Data Collector (DC) apply BSC validation rules. The 1.47 kWh reading is checked against the previous half-hour (1.12 kWh) and the same half-hour on the previous Tuesday (1.39 kWh). The reading passes all checks: it is within the expected range for this profile class and time period, the register has advanced (not reversed), and the MPAN matches a registered supply point.

Day+1: Store and process

The validated reading enters the supplier's billing system and is also submitted to the Supplier Volume Allocation Agent (SVAA). The SVAA allocates volumes across the approximately 350 Grid Supply Points (GSPs) in England, Wales, and Scotland, assigning the 1.47 kWh to the correct GSP based on the MPAN's registered location. This allocation determines which regional price applies.
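
Allocation to a GSP is, at its core, a grouped sum. The sketch below assumes a simple lookup table from meter point to GSP; the identifiers and the two extra readings are placeholders invented for illustration, and real allocation relies on registration data rather than a hard-coded dictionary.

```python
from collections import defaultdict

# Hypothetical lookup from meter point to Grid Supply Point.
GSP_BY_MPAN = {
    "MPAN-A": "GSP-EXAMPLE-1",
    "MPAN-B": "GSP-EXAMPLE-1",
    "MPAN-C": "GSP-EXAMPLE-2",
}


def allocate_to_gsp(readings_kwh: dict[str, float]) -> dict[str, float]:
    """Sum validated half-hourly readings (kWh) per GSP for one settlement period."""
    totals: dict[str, float] = defaultdict(float)
    for mpan, kwh in readings_kwh.items():
        totals[GSP_BY_MPAN[mpan]] += kwh
    return dict(totals)


# The 1.47 kWh reading alongside two made-up neighbours.
period_totals = allocate_to_gsp({"MPAN-A": 1.47, "MPAN-B": 0.83, "MPAN-C": 2.10})
print(period_totals)
```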

The Settlement Administration Agent (SAA) then calculates the supplier's total metered volume at this GSP and compares it to the supplier's contracted position (their Energy Contract Volume Notifications and Bid-Offer Acceptances). The difference is the supplier's imbalance volume, which is settled at the System Buy Price (SBP) or System Sell Price (SSP).

Working day+16 (SF run): Share

The SAA publishes the Initial Settlement (SF) run. This first calculation uses whatever actual meter data is available (about 95% with smart meters) and estimates the rest. The Funds Administration Agent (FAA) calculates the cash flows and arranges payments between BSC Trading Parties. The supplier discovers that its imbalance for this GSP, for this half-hour, was 2.3 MWh long (it had contracted more than its customers consumed). It receives a credit at the System Sell Price.
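
The imbalance arithmetic can be sketched in a few lines, following the sign convention used in this module (long means contracted more than consumed). The contracted and metered volumes and both prices below are hypothetical figures chosen only to reproduce the 2.3 MWh long position; they are not market data.

```python
def imbalance_mwh(contracted_mwh: float, metered_mwh: float) -> float:
    """Positive result = long (contracted more than consumed); negative = short."""
    return contracted_mwh - metered_mwh


def imbalance_cashflow(volume_mwh: float, sell_price: float, buy_price: float) -> float:
    """Credit (positive) for a long position at SSP, charge (negative) for short at SBP."""
    price = sell_price if volume_mwh >= 0 else buy_price
    return volume_mwh * price


# Hypothetical position: 52.3 MWh contracted against 50.0 MWh metered.
volume = imbalance_mwh(contracted_mwh=52.3, metered_mwh=50.0)
credit = imbalance_cashflow(volume, sell_price=60.0, buy_price=70.0)  # GBP per MWh, assumed
print(f"{volume:.1f} MWh long, credit of {credit:.2f} GBP")
```

Because later runs replace estimates with actual readings, `metered_mwh` changes from run to run, and the cash flow is recalculated each time.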

Month+2 to Month+28: Reconciliation and archive

Settlement does not stop at the SF run. Under the current timetable, the R1 reconciliation run at around month 2 uses more actual data. R2 at around month 4 uses even more. R3 at around month 7 and the Final Reconciliation (RF) run at around month 14 use nearly all actual data. The Dispute Final (DF) run at month 28 closes the books. Each run can change the imbalance calculation and the associated cash flows. After the DF run, the data is archived but must remain accessible for regulatory investigation.

Under Market-Wide Half-Hourly Settlement (MHHS), the timeline compresses dramatically. The SF run moves forward to 5 working days after the settlement day. The Final Reconciliation (RF) run moves to 4 months, against a reconciliation window that today stretches to 28 months. This compression reduces credit cover requirements by an estimated 71%, saving the industry hundreds of millions of pounds in working capital.

End of billing period: The 42p

At the end of the billing period, the supplier prices the 1.47 kWh at the customer's unit rate of approximately 24.5p per kWh plus standing charge. The 1.47 kWh contributes about 36p in energy cost, plus approximately 6p allocated to network charges, policy costs, and VAT. The customer's bill shows a single line: “Electricity used: 1.47 kWh — 42p.” Behind that line lies the entire lifecycle we have just traced.
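
The arithmetic behind the 42p can be checked directly. This sketch uses only the figures quoted above; the constant names are illustrative, and the 6p is treated as a single lump because the text does not break it down further.

```python
UNIT_RATE_P_PER_KWH = 24.5   # approximate unit rate from the text
OTHER_COSTS_P = 6.0          # network charges, policy costs, and VAT, as quoted

consumption_kwh = 1.47
energy_cost_p = consumption_kwh * UNIT_RATE_P_PER_KWH   # about 36p
total_p = round(energy_cost_p) + OTHER_COSTS_P          # about 42p
print(f"energy: {energy_cost_p:.1f}p, total: {total_p:.0f}p")
```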

Common misconception

Settlement happens once and the numbers are final.

Settlement runs multiple times over up to 28 months (compressing to 4 months under MHHS). Each run incorporates more actual meter data and fewer estimates, meaning imbalance charges and cash flows change with every reconciliation. The SF run is just the first draft.

The 1.47 kWh trace reveals how the lifecycle works for a single reading today. But the platforms that carry these data flows are actively changing. Understanding the DTS, DIP, and DSI explains where the infrastructure is heading.

6.3 Three platforms: DTS, DIP, and DSI

The GB energy system does not have one data platform. It has evolved three, each built for a different era and a different purpose. Understanding their roles, overlaps, and planned succession is essential for anyone working with energy data.

Data Transfer Service (DTS) — the legacy workhorse

The DTS has been the backbone of energy data exchange for over 25 years. It uses structured flat files in Data Transfer Catalogue (DTC) formats, exchanged between market participants via secure file transfer. Every settlement reading, meter registration, change of supplier notification, and exception report has flowed through DTS files. The format is rigid, well-understood, and deeply embedded in every participant's IT systems.

The DTS works, but it has serious limitations. Data flows are batch-based, with most files processed overnight. There is no real-time capability. The file formats are proprietary to the GB market and incompatible with modern API standards. Error detection relies on manual exception reports rather than automated validation. Onboarding a new participant requires implementing the full DTC specification, which is a significant barrier to entry for innovators and small market entrants.

Data Integration Platform (DIP) — the MHHS enabler

The DIP went live in August 2025 as the primary data exchange platform for Market-Wide Half-Hourly Settlement. Unlike the DTS, the DIP uses RESTful APIs and modern messaging patterns (event-driven architecture with publish-subscribe). Data flows are near real-time rather than batch. The DIP enforces standardised data models and validation at the point of entry, reducing downstream errors.

The DIP is not a database. It is a messaging platform: data flows through it but is not permanently stored in it. Participants publish messages (meter readings, registration events, exception notifications) and authorised subscribers receive them. This design supports the compressed settlement timelines required by MHHS and reduces the latency from days to minutes.
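
The publish-subscribe pattern described here can be illustrated with a toy in-memory broker. This is not the DIP's actual API: the class, the topic name, and the message fields are all hypothetical, and the sketch exists only to show that messages are delivered to subscribers and not retained by the broker.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class MessageBroker:
    """Toy publish-subscribe broker: messages pass through and are not retained."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler to be called for every message on `topic`."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver to every subscriber of `topic`; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = MessageBroker()
received: list[dict] = []
broker.subscribe("meter-reading", received.append)
delivered = broker.publish("meter-reading", {"kwh": 1.47, "period": "07:00-07:30"})
```

The broker holds a list of subscribers but no list of messages, so anything a participant needs to keep must be stored in its own systems, which is the Stage 4 responsibility described earlier.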

By the time MHHS reaches full migration (expected 2027), all settlement-relevant data will flow through the DIP rather than the DTS. However, the DTS will not disappear immediately: non-settlement data flows (change of tenancy, meter removals, ad-hoc reads) will continue to use DTS until they are migrated, which may take several more years.

Digital Spine Infrastructure (DSI) — the 2030 vision

The DSI is the most ambitious of the three platforms, proposed in Ofgem's digitalisation strategy and expected to develop between 2028 and 2030. Ofgem's own publications expand the acronym as Data Sharing Infrastructure; the “digital spine” label comes from the earlier feasibility study. Where the DIP handles settlement data between existing BSC participants, the DSI aims to be a sector-wide data infrastructure supporting all energy data, including network data, flexibility markets, and consumer-facing applications.

The DSI would implement the “presumed open” principle from the Data Best Practice Guidance, providing a single point of access for non-personal energy data. It would use open standards, provide a developer-friendly API gateway, and support third-party innovation. However, significant questions remain about governance (who operates it?), funding (who pays?), and the relationship between DSI and existing platforms like the DIP and the DCC.

The timeline for DSI is uncertain. Ofgem's Energy Data Strategy consultation (2024) proposed a phased approach, but no firm delivery dates have been committed. The most likely outcome is that DSI builds on the DIP's architecture rather than replacing it, extending the event-driven API model to non-settlement data domains.

Gas data: a parallel lifecycle

Gas Transporters shall publish daily Calorific Value data for each Local Distribution Zone, and this data shall be used by all parties for the conversion of gas volume to energy for billing and settlement purposes.

Uniform Network Code

This UNC provision creates the parallel gas lifecycle described in this section. Unlike electricity settlement which uses meter readings directly, gas settlement requires a two-step process: volumetric meter reading plus UNC-mandated calorific value data. The CV publication obligation ensures all parties use the same energy conversion factor.

Gas data follows the same seven stages but with critical differences. Gas meters record volume in cubic metres, which must be converted to energy (kWh) using the Calorific Value (CV) of the gas, measured daily at each Local Distribution Zone. The correction formula (Volume × CV × 1.02264 ÷ 3.6) introduces a variable that electricity does not have. Gas settlement operates under the Uniform Network Code (UNC) rather than the BSC, with its own agents (Xoserve operates the Central Data Service Provider role) and its own timelines.
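
The correction formula translates directly into code. In this sketch the CV is taken to be in MJ per cubic metre, which is why the formula divides by 3.6 (1 kWh = 3.6 MJ); the 10 m³ volume and the CV of 39.5 are hypothetical example inputs, and the function name is illustrative.

```python
VOLUME_CORRECTION_FACTOR = 1.02264  # from the formula in the text
MJ_PER_KWH = 3.6


def gas_volume_to_kwh(volume_m3: float, calorific_value_mj_per_m3: float) -> float:
    """Convert a metered gas volume to energy: Volume x CV x 1.02264 / 3.6."""
    return volume_m3 * calorific_value_mj_per_m3 * VOLUME_CORRECTION_FACTOR / MJ_PER_KWH


energy_kwh = gas_volume_to_kwh(volume_m3=10.0, calorific_value_mj_per_m3=39.5)
print(f"{energy_kwh:.1f} kWh")  # prints "112.2 kWh"
```

Because the CV is published daily, the same metered volume converts to a slightly different energy figure from one day to the next.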

Gas data also has a latency problem. Gas smart meters send readings via the electricity meter's communications hub using Zigbee, which adds delay. Battery life constraints mean gas meters transmit less frequently than electricity meters. And gas does not yet have half-hourly settlement: allocation is daily, which means the per-period accuracy that electricity achieves is not yet available for gas.

Check your understanding

What is the fundamental architectural difference between the DTS and the DIP?

Key takeaways

  • Energy data follows a 7-stage lifecycle: Generate, Collect, Validate, Store, Process, Share, Archive. Every stage is governed by industry codes (BSC for electricity, UNC for gas) and enforced through Ofgem licence conditions.
  • A single 1.47 kWh reading passes through DCC collection, BSC validation, SVAA allocation across ~350 GSPs, SAA imbalance calculation, FAA cash flow settlement, and multiple reconciliation runs over up to 28 months (4 months under MHHS) before contributing approximately 42p to a consumer bill.
  • Three data platforms serve different eras: DTS (25+ years, batch files, legacy but reliable), DIP (live August 2025, event-driven APIs, enables MHHS), and DSI (2028-2030 vision, sector-wide open data infrastructure). The transition is evolutionary, not revolutionary — DTS and DIP will coexist for years.
  • Gas data follows the same lifecycle but with a volume-to-energy conversion step (CV correction), daily rather than half-hourly settlement, and additional latency from battery-powered Zigbee communications through the electricity meter's hub.

Standards and sources cited in this module

  1. Elexon, Balancing and Settlement Code (BSC)

    Section S: Supplier Volume Allocation; Section T: Settlement Administration

    Defines the settlement lifecycle stages including validation rules, SVAA allocation, SAA imbalance calculation, and reconciliation timelines (SF through DF). Referenced throughout Section 6.2.

  2. Elexon, Data Integration Platform (DIP) Design Documentation

    Architecture Overview and Migration Plan

    Source for DIP go-live date (August 2025), API architecture, and the relationship between DIP and DTS during the MHHS transition. Referenced in Section 6.3.

  3. Ofgem, Digitalisation Strategy and Action Plan (2024)

    Digital Spine Infrastructure and Data Best Practice

    Source for the DSI vision, presumed open principle, and 2028-2030 delivery timeline. Referenced in Section 6.3.

Module 6 of 15 · Energy System Data Applied