The big picture: why energy data matters
By the end of this module you will be able to:
- Explain what energy system data is using Ofgem's official definition
- Describe the five key tensions shaping the GB energy data landscape
- Quantify the scale of data generation across smart meters, SCADA, and market systems

Think about it · Right now, as you read this
Energy data is everywhere - you just cannot see it.
Every time you switch on a light, boil a kettle, or charge your phone, a chain of events happens that you never see. Electricity is generated at a power station or wind farm, travels through high-voltage transmission lines, steps down through local distribution networks, and arrives at your home through your meter.
But here is the part most people never think about: alongside the electricity, there is an invisible river of data. Your smart meter records how much energy you use every 30 minutes. That data travels wirelessly to the DCC (the company that runs the smart meter communications network), then onwards to your energy supplier (so they can bill you) and to Elexon (so the national electricity market can be settled correctly).
This guide will take you on a journey through that invisible river of data. By the end, you will understand not just what energy data is, but who produces it, where it flows, how it is governed, and why it is one of the most important things you have never heard of.
Your smart meter is recording a half-hourly reading right now. A SCADA system at your nearest substation is sampling voltage every 5 seconds. NESO's control room is balancing supply and demand to hold system frequency close to 50 Hz. What makes all this work, and who decides what happens to the data?
With the learning outcomes established, this module begins by examining what energy system data actually is.
1.1 What is energy system data?
In everyday language, energy system data is any information created because energy is being generated, transported, or consumed. It includes everything from your smart meter readings to the technical specifications of a transformer in a substation, from the price of electricity on the wholesale market to the geographic coordinates of an underground cable.
But in the official regulatory world, the definition is precise. Ofgem, the energy regulator, formally broadened it in October 2024:
“Energy System Data means any data relating to the energy system, whether produced by, for, or about the energy system.”
Ofgem - Energy System Data Definition Decision, 8 October 2024
This broad definition captures everything: meter readings, network models, market data, planning forecasts, and consumer information. It replaced a narrower definition that only covered 'data necessary for the operation of the energy system', which excluded important categories like planning data and consumer consent records.
The definition matters because it determines which data falls under Ofgem's Data Best Practice Guidance - the rules that govern how energy data must be managed. Under the old definition, many valuable datasets were outside scope. The new definition brings them all in.
Now that we have the official definition, the next question is scale. How much data does this system actually produce? The numbers are larger than most people expect.
1.2 The scale you need to understand
Great Britain's energy system generates an extraordinary volume of data. To appreciate the scale:
- 40 million smart meters each recording 48 readings per day produce 1.92 billion readings per day (or roughly 700 billion per year). Under Market-wide Half-Hourly Settlement (MHHS, cutover July 2027), every one of these will flow through the entire settlement chain.
- Thousands of SCADA-monitored substations, each sampled every 2-10 seconds, produce approximately 69 million data points per hour for a large DNO. This data flows over air-gapped Operational Technology networks, physically separated from the internet.
- ~20 gas chromatographs at NTS entry points continuously measure the exact calorific value of gas flowing into the national pipeline system, determining how every gas bill in GB is calculated.
- 87 distinct data types across 13 categories (A through M), each with a defined producer, consumer, governing instrument, and sensitivity classification.
- 7 major industry codes (BSC, REC, SEC, Grid Code, DCUSA, CUSC, UNC), each governing specific aspects of data creation, flow, and use.
Managing this volume reliably, securely, and accurately is one of the greatest data engineering challenges in UK infrastructure.
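The headline figures above can be checked with simple arithmetic. The sketch below reproduces the smart meter totals from the numbers in the list and converts the hourly SCADA figure into a daily one. As an illustration only, it also back-calculates how many monitored signals the 69 million points-per-hour figure would imply at each end of the 2-10 second sampling range. That implied count is an assumption-driven estimate, not a published figure, and all names in the code are hypothetical.

```python
# Back-of-envelope check of the scale figures quoted in this section.

SMART_METERS = 40_000_000         # from the text
READINGS_PER_METER_PER_DAY = 48   # one reading per half hour

readings_per_day = SMART_METERS * READINGS_PER_METER_PER_DAY
readings_per_year = readings_per_day * 365

SCADA_POINTS_PER_HOUR = 69_000_000  # from the text, for one large DNO
scada_points_per_day = SCADA_POINTS_PER_HOUR * 24


def implied_signal_count(points_per_hour: int, interval_seconds: int) -> int:
    """Monitored signals implied by an hourly total at a fixed sampling interval.

    Assumes every signal is sampled at the same interval, which is a
    simplification: real SCADA estates mix several sampling rates.
    """
    samples_per_signal_per_hour = 3600 // interval_seconds
    return points_per_hour // samples_per_signal_per_hour


print(f"Smart meter readings per day:  {readings_per_day:,}")
print(f"Smart meter readings per year: {readings_per_year:,}")
print(f"SCADA points per day (1 DNO):  {scada_points_per_day:,}")
for interval in (2, 10):
    count = implied_signal_count(SCADA_POINTS_PER_HOUR, interval)
    print(f"Implied signals at {interval}s sampling: {count:,}")
```

Under these assumptions, the hourly SCADA figure corresponds to somewhere between roughly 38,000 and 192,000 monitored signals for a single large DNO, depending on the sampling interval.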
“Licensees shall collect, manage and share data in accordance with the Data Best Practice Guidance published by the Authority, and shall publish a Digitalisation Strategy and Action Plan.”
Ofgem, Distribution Licence Condition SpC 9.5
This licence condition makes data governance a regulated obligation for all distribution network operators, not a voluntary commitment. Non-compliance can trigger formal enforcement action by Ofgem.
Why did Ofgem broaden the definition of 'energy system data' in October 2024?
Understanding the sheer volume of data produced is one thing. But volume alone does not explain why energy data is genuinely difficult to govern. Five unresolved tensions shape every policy and technical decision in this landscape.
1.3 Five key tensions you need to know
The GB energy data landscape is defined by genuine, unresolved tensions. Understanding these helps you read every other module in this course in context. These are not academic debates - they affect how data flows, who can access it, and what happens to your privacy.
Tension 1: Open data vs security risk
Ofgem's Data Best Practice Guidance presumes data should be open by default. But SCADA telemetry from substations is a national security concern - the 2015 Ukrainian power grid cyberattack showed what happens when operational data reaches the wrong hands. The Data Triage Playbook (Module 8) tries to square this circle, but the line between "open for innovation" and "restricted for security" remains contested.
Tension 2: Centralisation vs fragmentation
Nordic countries use a single data hub - Norway's Elhub handles all meter data in one hop. GB uses five separate platforms (DCC, Elexon DIP, ElectraLink DTS, RECCo, NESO). The Data Sharing Infrastructure (DSI) aims to connect them, but governance is unresolved. A single hub is simpler but creates a single point of failure and raises governance questions about who controls it.
Tension 3: Consumer consent vs system need
Ofgem argues half-hourly data is needed for settlement (no individual consent required - it's a legal obligation under the BSC). The ICO argues it's personal data requiring explicit consent. As of March 2026, this debate remains unresolved. The Consumer Consent Service (CCS), being built by RECCo, will eventually formalise the rules.
Tension 4: Innovation speed vs regulatory pace
AI and digital twins can transform grid planning today. But RIIO price controls operate on 5-year cycles, and code modification takes 12-18 months. The system's data infrastructure risks being permanently one regulatory cycle behind the technology curve. RIIO-3 (starting April 2026) includes a £876.7M digitalisation baseline, but no DSI licence condition.
Tension 5: National standards vs local variation
CIM provides an international data model (IEC 61970). But each of the 14 DNOs implements it differently, with varying maturity levels. The LTDS programme is closing the gap - Stage 1.3 completed November 2025, Stage 2 targets May 2026, Stage 3 targets November 2026 - but full interoperability remains years away.
Common misconception
“Energy data is just smart meter readings and monthly bills.”
The GB energy system produces 87 distinct data types across 13 categories. Smart meter readings are one type within Category A. The full scope includes SCADA telemetry, gas chromatograph measurements, network topology data, wholesale market clearing prices, and regulatory compliance filings. Ofgem broadened the official definition in October 2024 to capture this full breadth.
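One way to picture a taxonomy entry is as a small record holding the attributes each data type is said to carry: category, producer, consumer, governing instrument, and sensitivity classification. The sketch below is a minimal illustration, not the course's actual schema. The class names, the sensitivity labels, and the example values (including the choice of the SEC as governing instrument) are assumptions made for demonstration.

```python
from dataclasses import dataclass
from enum import Enum

# The 13 taxonomy categories are lettered A through M.
VALID_CATEGORIES = set("ABCDEFGHIJKLM")


class Sensitivity(Enum):
    # Hypothetical labels; the real classification scheme may differ.
    OPEN = "open"
    RESTRICTED = "restricted"
    PERSONAL = "personal"


@dataclass(frozen=True)
class DataType:
    category: str                 # single letter, "A" to "M"
    name: str
    producer: str
    consumers: tuple[str, ...]
    governing_instrument: str
    sensitivity: Sensitivity

    def __post_init__(self) -> None:
        # Reject anything that is not one of the 13 category letters.
        if self.category not in VALID_CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")


# Illustrative entry only: the field values are assumptions, not quoted
# from the taxonomy itself.
half_hourly = DataType(
    category="A",
    name="Half-hourly smart meter reading",
    producer="Smart meter",
    consumers=("Energy supplier", "Elexon"),
    governing_instrument="SEC",
    sensitivity=Sensitivity.PERSONAL,
)

print(half_hourly.name, "is in Category", half_hourly.category)
```

Modelling the entry as an immutable record makes the point of the misconception box concrete: a smart meter reading is one row among 87, and every other row carries the same set of attributes.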
Which tension best describes the disagreement between Ofgem and the ICO about smart meter data?
Having mapped the tensions that make energy data difficult, it helps to know how this course structures the journey through them. Here is what each stage covers and how to get the most from the material.
1.4 How to use this course
This course has 15 modules across three stages:
- Foundations (Modules 1-5): what energy data is, who produces it, how the physical network generates it, the complete 87-type taxonomy, and smart meters.
- Applied (Modules 6-10): the data lifecycle from meter to market, the regulatory hierarchy, data governance, privacy rights, and settlement mechanics.
- Practice & Strategy (Modules 11-15): international comparison, transformation programmes, digitalisation governance, CIM interoperability, and strategic planning.
Each module starts with a learning contract (the three outcomes you will achieve), includes inline knowledge checks, and ends with key takeaways. Each stage has a practice assessment you can use before attempting the timed stage test.
Key takeaways
- Energy system data is any data produced by, for, or about the energy system - the definition was broadened by Ofgem in October 2024 to cover 87 data types across 13 categories.
- GB generates billions of energy data points every day: 1.92 billion smart meter readings, roughly 69 million SCADA points per hour (about 1.7 billion per day) for a single large DNO, and continuous gas chromatograph measurements.
- Five unresolved tensions define the landscape: open vs secure, centralised vs fragmented, consent vs system need, innovation vs regulation, national standards vs local variation.
- No tension has a clean answer - every decision involves trading one legitimate goal against another.
Standards and sources cited in this module
Ofgem, 'Energy System Data: Best Practice Guidance' (October 2024)
Section 2: Definitions and Scope
Provides the official broadened definition of energy system data and the 87-type taxonomy referenced throughout this module.
Energy Digitalisation Taskforce, 'Digitalising our Energy System for Net Zero' (2022)
Chapter 3: The role of data in the energy transition
Sets out the case for treating energy data as a strategic national asset. The five tensions explored in this module derive from its analysis of competing stakeholder priorities.