Metadata management and information publication

24 min read · 6 outcomes · 3 standards cited

Metadata belongs inside information architecture, and publication discipline depends on it: context, lineage, and meaning have to travel with information if anyone else is to trust it. The treatment here is anchored by G234, the TOGAF Series Guide on Metadata Management, and connects metadata to the LTDS-style publication requirements that the London case faces.

By the end of this module you will be able to:

Explain metadata management in plain language and show why it belongs inside information architecture
Identify the six minimum metadata questions a published or shared information asset should answer
Connect metadata to reuse, compliance, and consumer trust rather than treating it as administrative overhead
Describe the G234 information publication requirements and explain what each one protects
Explain why retrofitting metadata is more expensive than including it from the start
Apply metadata discipline to LTDS-style publication and internal reuse in the London case

Figure: Context, lineage, and meaning must travel with information if others are expected to trust it.

Real-world case · 2024

Technically accurate. Practically misleading.

A distribution network operator published its first major data-transparency output in early 2024. The published dataset contained network capacity figures for every primary substation in the region. Within two weeks, a developer queried the figures because several substations appeared to show negative headroom.

The data team investigated and found that the negative values were correct for the scenario assumptions used, but the published dataset did not state which scenario had been applied, when the model had last been refreshed, or what caveats applied to the interpretation.

The data was technically accurate. The published output was practically misleading because it carried no context. The consumer had no way to know whether the figures reflected firm capacity, a planning estimate, a worst-case scenario, or a stale model run. Metadata would have answered those questions before the confusion began.

If a published dataset is technically accurate but carries no scenario, date, or authority context, is it a governed output or an invitation to confusion?

That story shows why metadata is an architecture concern, not an administrative afterthought. Trust, discoverability, and reuse are architectural outcomes. If the enterprise publishes or shares information without context, consumers have to guess whether they can rely on it.

31.1 Why metadata belongs in architecture work

Metadata answers the context questions that raw information often leaves unresolved. It tells consumers what the information means, where it came from, how current it is, who owns it, and what conditions apply to its use.

G234 positions metadata management as an integral part of information architecture, not as a documentation exercise that follows later. The reasoning is practical: if the enterprise waits until after platforms are built and data is flowing to add context, the context will be incomplete, inconsistent, or missing. Retrofitting metadata typically costs more than including it from the start, largely because the people who understand the original meaning, assumptions, and provenance have by then moved on to other work.

Three architectural outcomes depend on metadata. Trust depends on consumers being able to verify what they are looking at. Discoverability depends on information assets being findable and their purpose being clear. Reuse depends on teams being confident enough in the meaning and quality of existing information to use it rather than creating their own copies.

“Metadata management provides the meaning, provenance, stewardship, lifecycle, and usage context that consumers need to interpret, trust, and govern the information they receive. It is an essential component of information architecture, not an administrative afterthought.”
TOGAF Series Guide G234, Metadata Management
G234 treats metadata as part of the architecture because trust, discoverability, and reuse are architectural outcomes. Without metadata, even correct data can mislead.

31.2 The six minimum metadata questions

G234 identifies six metadata concerns that form the minimum set any published or shared information asset should address. These are not optional extras. They are the baseline for governed information.

Meaning. What does this information item represent and what assumptions shape its interpretation? This includes definitions, units, scope boundaries, and any conditions that affect how the information should be read. For London capacity data, meaning includes whether the figure represents firm capacity, maximum capacity, or scenario-based headroom.
Provenance. Where did it come from, through which process or transformation, and on what basis was it produced? This is the lineage that allows consumers to trace the output back to its sources. For a London planning forecast, provenance includes which modelling tool produced it and which input datasets were used.
Stewardship. Who owns the meaning, quality, and governance of the information asset? This tells consumers who to contact when they have questions and who is accountable for the asset's integrity.
Lifecycle. When is this information current, how often is it refreshed, and when should it no longer be relied upon? This prevents consumers from treating stale or superseded information as current.
Classification and access. What sensitivity, security, or compliance classification applies? Who is permitted to access, modify, or redistribute the information? For regulated utilities, classification affects what can be published externally versus what must remain internal.
Quality and confidence. What quality thresholds apply and what known limitations exist? If the data has been estimated, extrapolated, or derived from incomplete sources, that should be stated. A planning estimate carries different confidence from a measured operational reading.

These six concerns form the information publication requirements that G234 describes. Every published London output should be testable against them. If any concern is unanswered, the publication is incomplete.
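The idea that a publication is testable against the six concerns can be made concrete with a small completeness check. The following is a minimal sketch only: the class name `AssetMetadata`, the field names, and the free-text representation of each concern are assumptions made for illustration, not structures defined by G234.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class AssetMetadata:
    """One free-text answer per minimum metadata concern (illustrative names)."""
    meaning: Optional[str] = None
    provenance: Optional[str] = None
    stewardship: Optional[str] = None
    lifecycle: Optional[str] = None
    classification_and_access: Optional[str] = None
    quality_and_confidence: Optional[str] = None


def unanswered_concerns(metadata: AssetMetadata) -> List[str]:
    """Return the names of concerns that are missing or blank, in field order."""
    missing = []
    for f in fields(metadata):
        value = getattr(metadata, f.name)
        if value is None or not value.strip():
            missing.append(f.name)
    return missing


def is_publishable(metadata: AssetMetadata) -> bool:
    """Treat the publication as complete only when every concern is answered."""
    return not unanswered_concerns(metadata)


# Hypothetical draft entry with only two concerns answered.
draft = AssetMetadata(
    meaning="Scenario-based headroom per primary substation",
    stewardship="Network planning data steward",
)
print(unanswered_concerns(draft))
print(is_publishable(draft))
```

A real catalogue would probably hold structured values rather than free text, but even this form lets a reviewer see at a glance which concerns are still open before publication.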

Four quality gates a catalogue entry clears before publication

Structure, lineage, quality and approval run in sequence. Each gate carries its pass criterion in green above the typical block signals in red, and the entry only publishes once it clears all four.

An entry that passes all four gates can be subscribed to safely. A block on any gate returns the entry to the analyst with a specific reason, so nothing reaches consumers without a named owner, traced lineage, fresh data and a sign-off.
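The sequential gates described above can be sketched as an ordered list of checks that stops at the first block and reports a reason. This is an illustration under stated assumptions: the `CatalogueEntry` fields, the pass criterion attached to each gate, and the 90-day freshness threshold are all hypothetical, chosen to mirror the named owner, traced lineage, fresh data, and sign-off mentioned in the text.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, List, Optional, Tuple


@dataclass
class CatalogueEntry:
    name: str
    owner: Optional[str]        # named owner (assumed structure criterion)
    lineage: List[str]          # upstream sources the entry was derived from
    last_refreshed: date        # date of the most recent data refresh
    approved_by: Optional[str]  # who signed the entry off, if anyone


MAX_AGE = timedelta(days=90)  # assumed freshness threshold, not from the guide

# Each check returns None on a pass, or a block reason on a failure.
Check = Callable[[CatalogueEntry, date], Optional[str]]


def check_structure(entry: CatalogueEntry, today: date) -> Optional[str]:
    return None if entry.owner else "no named owner"


def check_lineage(entry: CatalogueEntry, today: date) -> Optional[str]:
    return None if entry.lineage else "lineage not traced"


def check_quality(entry: CatalogueEntry, today: date) -> Optional[str]:
    fresh = today - entry.last_refreshed <= MAX_AGE
    return None if fresh else "data is stale"


def check_approval(entry: CatalogueEntry, today: date) -> Optional[str]:
    return None if entry.approved_by else "no sign-off recorded"


GATES: List[Tuple[str, Check]] = [
    ("structure", check_structure),
    ("lineage", check_lineage),
    ("quality", check_quality),
    ("approval", check_approval),
]


def run_gates(entry: CatalogueEntry, today: date) -> Tuple[bool, Optional[str]]:
    """Run the gates in order; stop at the first block and say why."""
    for gate_name, check in GATES:
        reason = check(entry, today)
        if reason is not None:
            return False, f"{gate_name}: {reason}"
    return True, None
```

Passing `today` explicitly, rather than reading the system clock inside the check, keeps the freshness gate repeatable when the same entry is reviewed again later.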

Figure: Metadata bridges the gap between what the producer knows and what the consumer needs to know. Without it, even correct data can mislead.

31.3 Why publication discipline depends on metadata

Published and shared information assets carry a higher burden of trust than internal working data. Metadata is what bridges the gap between what the producer knows and what the consumer needs to know.

Consumers need to distinguish current, historic, and scenario-based views. Without lifecycle metadata, every version looks equally valid.
Published outputs need enough context to prevent accidental misuse. A capacity figure published without scenario and date context invites misinterpretation.
Internal reuse becomes much easier when teams can discover what information exists and how trustworthy it is. Discoverability is a metadata outcome.
Governance review becomes more practical when meaning and ownership are explicit rather than assumed.
LTDS-style publication requires the enterprise to share network development information that external consumers can interpret without asking the data team for context. The published output must carry its own context through metadata.

In the London case, every published output should carry at minimum: the scenario assumption, the model-run date, the authority source, the stewardship contact, the quality confidence level, and interpretation notes. Those six pieces of metadata transform a raw data file into a governed publication.
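One way to make a published file carry its own context is to ship a small metadata record alongside it. The sketch below assumes a JSON sidecar; the field names, the function `build_sidecar`, and the choice of JSON are illustrative assumptions rather than anything the London case or G234 prescribes.

```python
import json
from typing import Dict

# The six pieces of context named in the text, as hypothetical field names.
REQUIRED_FIELDS = (
    "scenario_assumption",
    "model_run_date",
    "authority_source",
    "stewardship_contact",
    "quality_confidence",
    "interpretation_notes",
)


def build_sidecar(context: Dict[str, str]) -> str:
    """Return the publication context as JSON, refusing to publish if any field is blank."""
    missing = [
        name for name in REQUIRED_FIELDS
        if not str(context.get(name, "")).strip()
    ]
    if missing:
        raise ValueError("missing publication context: " + ", ".join(missing))
    # Emit only the required fields, in a stable order.
    return json.dumps({name: context[name] for name in REQUIRED_FIELDS}, indent=2)
```

Raising an error rather than writing a partial sidecar is a deliberate choice in this sketch: a file with missing context never reaches the publication step in the first place.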

Metadata publication pipeline: a catalogue entry from draft to subscribed version

Five stages carry a catalogue entry from analyst draft through schema check, steward review, and publication to the version that downstream consumers subscribe to.

Metadata is published like code. Each stage has a named owner and a versioned artefact, so consumers subscribe to a specific version and can roll back when a definition changes, the same discipline a code release pipeline uses.
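The versioning behaviour can be sketched as an append-only history per catalogue entry, where consumers pin a version number and can return to an earlier one. This is a minimal illustration; the `Catalogue` and `PublishedVersion` classes, their methods, and the integer version scheme are assumptions, not part of any specific catalogue product.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class PublishedVersion:
    version: int      # 1-based, incremented on every publication
    definition: str   # the meaning consumers rely on at this version
    owner: str        # named owner accountable for this release


@dataclass
class Catalogue:
    # Append-only history of published versions, keyed by entry name.
    versions: Dict[str, List[PublishedVersion]] = field(default_factory=dict)

    def publish(self, entry: str, definition: str, owner: str) -> PublishedVersion:
        """Add a new immutable version; earlier versions are never overwritten."""
        history = self.versions.setdefault(entry, [])
        released = PublishedVersion(len(history) + 1, definition, owner)
        history.append(released)
        return released

    def get(self, entry: str, version: int) -> PublishedVersion:
        """Fetch the exact version a consumer has pinned."""
        history = self.versions.get(entry, [])
        if not 1 <= version <= len(history):
            raise KeyError(f"{entry} has no version {version}")
        return history[version - 1]


catalogue = Catalogue()
catalogue.publish("headroom", "Firm capacity headroom", "data steward")
catalogue.publish("headroom", "Scenario-based headroom", "data steward")

# A consumer pinned to version 1 is unaffected by the changed definition,
# and a consumer on version 2 can roll back by requesting version 1 again.
pinned = catalogue.get("headroom", 1)
print(pinned.definition)
```

Because published versions are frozen and never replaced, a definition change produces a new version instead of silently altering what existing subscribers see.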

31.4 Why teams still postpone metadata work

Metadata is often delayed because it looks administrative and does not feel like visible transformation. The team is under pressure to deliver platforms, dashboards, and integrations. Recording meaning, provenance, and stewardship feels like overhead.

Later, the organisation discovers that publication, analytics, and integration all became harder because no one captured the context the consumers needed. By that point, retrofitting metadata is more expensive than including it from the start for three reasons.

First, the people who understood the original meaning and assumptions have moved on. Second, the number of information assets that need metadata has grown during the delay. Third, consumers have already built interpretations based on their own assumptions, and those interpretations now need to be corrected.

The architecture team's job is to make the case for metadata early enough that it is included in the platform and publication design rather than bolted on afterwards. G234 provides the framework. The authority ownership model from Module 28 provides the stewardship structure. Together they make metadata a planned architecture output rather than an afterthought.

Common misconception

“Metadata is optional documentation that can be added after the main platform work is complete.”

Treating metadata as optional usually means the enterprise is quietly outsourcing interpretation risk to every downstream user. The cost does not disappear. It gets distributed across every team that has to guess what the information means.

London Grid Distribution

The London case uses metadata to support LTDS-style publication, internal planning reuse, and consumer confidence. If a published London information asset lacks date context, ownership, meaning, or model assumptions, it cannot support reliable reuse.

London publication needs metadata because the consumer is not standing inside the producer's head.
Reuse improves when metadata is treated as part of the information asset, not as an optional attachment.
A London capacity figure published with clear scenario, date, authority, quality confidence, and interpretation context is an architecture output. The same figure published without that context is a liability.
All six G234 metadata concerns should be testable for every London published output in the Phase C pack.
The Phase C pack should specify which metadata fields are mandatory for each publication type and who is responsible for populating them.
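A specification of mandatory fields per publication type, with a responsible role for each, could be expressed as a simple lookup. The sketch below is hypothetical throughout: the publication types, field names, and role names are invented for illustration and would be replaced by whatever the Phase C pack actually defines.

```python
from typing import Dict

# Hypothetical mapping: publication type -> {mandatory field: responsible role}.
MANDATORY_FIELDS: Dict[str, Dict[str, str]] = {
    "external_capacity_publication": {
        "scenario_assumption": "planning analyst",
        "model_run_date": "planning analyst",
        "authority_source": "data steward",
        "stewardship_contact": "data steward",
        "quality_confidence": "planning analyst",
        "interpretation_notes": "data steward",
    },
    "internal_planning_extract": {
        "model_run_date": "planning analyst",
        "stewardship_contact": "data steward",
    },
}


def outstanding_fields(publication_type: str, supplied: Dict[str, str]) -> Dict[str, str]:
    """Map each missing mandatory field to the role responsible for populating it."""
    required = MANDATORY_FIELDS[publication_type]
    return {
        name: role
        for name, role in required.items()
        if not supplied.get(name, "").strip()
    }
```

Returning the responsible role with each gap means a failed check points directly at who needs to act, rather than only reporting that the publication is incomplete.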

Check your understanding

A distribution network operator publishes substation capacity figures. Consumers cannot tell whether the figures reflect firm capacity, a planning estimate, or a worst-case scenario. The data itself is technically accurate. What is missing?

A programme board asks why metadata work should be funded alongside platform delivery. What is the strongest argument?

G234 identifies six minimum metadata concerns. Which one answers the question: 'What known limitations exist and what confidence level should consumers attach to this information?'

Key takeaways

Metadata answers the context questions that raw data alone cannot answer reliably.
Publication and reuse both depend on metadata discipline. Without it, even correct data can mislead.
G234 identifies six minimum metadata concerns: meaning, provenance, stewardship, lifecycle, classification and access, and quality and confidence.
Metadata management belongs in architecture because trust at scale depends on it.
Retrofitting metadata is typically more expensive than including it from the start: the original knowledge holders have moved on, more assets need context, and consumers' own assumptions have to be corrected.
Every London published output should be testable against all six metadata concerns.

Standards and sources cited in this module

G234, Metadata Management
Full guide
Primary guide for metadata management discipline and information publication requirements within the TOGAF ecosystem.
G190, Information Mapping
Full guide
Provides the information-domain context that metadata management operates within.
The TOGAF Standard, 10th Edition (C220)
Part 1, Phase C
Core standard context for information architecture, including metadata as an architecture concern.

You now understand why metadata is an architecture concern and what six concerns every published output must address. The next question is: how does analytics architecture connect information to the decisions that planning, governance, and operational teams depend on? That is Module 32.


Module 31 of 64 · Information Systems Architecture