CPD timing for this level

Advanced time breakdown

This is the first pass of a defensible timing model for this level, based on what is actually on the page: reading, labs, checkpoints, and reflection.

Reading: 16m (2,272 words · base 12m × 1.3)
Labs: 60m (4 activities × 15m)
Checkpoints: 20m (4 blocks × 5m)
Reflection: 32m (4 modules × 8m)

Estimated guided time: 2h 8m. Based on page content and disclosed assumptions.

Claimed level hours: 14h. Claim includes reattempts, deeper practice, and capstone work.

The claimed hours are higher than the current on-page estimate by about 12h. That gap is where I will add more guided practice and assessment-grade work so the hours are earned, not declared.
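The arithmetic behind the breakdown can be checked in a few lines. This is a minimal sketch, not the tool that produced the figures: the function and parameter names are illustrative, and rounding reading time to the nearest minute is an assumption (it matches 12m × 1.3 = 15.6m being shown as 16m).

```python
def guided_minutes(reading_base_min, reading_multiplier, labs, checkpoints, reflections):
    """Return per-category minutes and the total for one level.

    labs, checkpoints and reflections are (count, minutes_each) pairs.
    """
    breakdown = {
        # Assumption: reading time is rounded to the nearest whole minute.
        "reading": round(reading_base_min * reading_multiplier),
        "labs": labs[0] * labs[1],
        "checkpoints": checkpoints[0] * checkpoints[1],
        "reflection": reflections[0] * reflections[1],
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


estimate = guided_minutes(12, 1.3, labs=(4, 15), checkpoints=(4, 5), reflections=(4, 8))
hours, minutes = divmod(estimate["total"], 60)
gap_minutes = 14 * 60 - estimate["total"]  # claimed 14h minus the on-page estimate

print(estimate)
print(f"guided: {hours}h {minutes}m, gap to claim: {gap_minutes / 60:.1f}h")
```

The total comes to 128 minutes (2h 8m), which leaves a gap of just under 12 hours against the 14h claim.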

What changes at this level

Level expectations

I want each level to feel independent, but also clearly deeper than the last. This panel makes the jump explicit so the value is obvious.

Anchor standards (course wide)

TOGAF Standard
ISO/IEC/IEEE 42010 (architecture description)

Assessment intent

Advanced

Governance, evolution, and operational design.

Assessment style

Format: mixed

Pass standard

Coming next

Not endorsed by a certification body. This is my marking standard for consistency and CPD evidence.

Evidence you can save (CPD friendly)

A bounded context map with ownership and language boundaries, plus one risk of boundary drift.
An architecture governance note: review cadence, decision rights, and how you avoid architecture as theatre.
A runbook for one failure mode: detection signal, triage steps, containment, rollback, and a post-incident improvement.

Software Development and Architecture Advanced

CPD tracking

Fixed hours for this level: 14. Timed assessment time is included once on pass.

CPD and certification alignment (guidance, not endorsed):

Advanced architecture is about long-lived systems: domains, distributed trade-offs, and governance that survives change. It maps well to:

iSAQB-style advanced architecture reasoning (boundaries, trade-offs, communication)
TOGAF (orientation) for enterprise constraints and governance language
Cloud architecture certifications for resilience, observability, and cost trade-offs

How to use Advanced

At this point, you are designing for the organisation you will become in two years, not the team you have today.

Good practice

Design boundaries around language and ownership, not just code. If the ownership is unclear, the architecture will drift.

Advanced architecture is about running big systems across many teams. You design for change, failures, and the messy reality of long-lived software.

Domains and bounded contexts

Concept block

Domains and bounded contexts

Bounded contexts keep meaning local so change does not break everything.

Assumptions

Meaning is local

Interfaces are explicit

Failure modes

Boundary denial

Shared database coupling

Domain driven design starts with language. A domain is not the same as a UI screen. A ubiquitous language keeps teams aligned.

When contexts blur, systems become expensive to change. Clear language and boundaries protect velocity.

Bounded contexts

Split the domain where language changes.

Customer context

Profiles, consent, contact methods.

Billing context

Invoices, payment status, tariffs.

Operations context

Outages, assets, field updates.

🧪
Worked example. “Customer” means two different things and your system pays the price

One team uses “customer” to mean “bill payer”. Another uses it to mean “occupier”. Both are reasonable in isolation. The problem is when services silently mix them. You then get strange behaviour: wrong notifications, bad reporting, and angry users.
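One way to stop the two meanings mixing silently is to give each its own type inside its own context. This is a minimal sketch: the class names, fields, and sample values are hypothetical, chosen only to mirror the "bill payer" and "occupier" meanings above.

```python
from dataclasses import dataclass


# Billing context: "customer" means the party who pays the bill.
@dataclass(frozen=True)
class BillPayer:
    account_id: str
    billing_email: str


# Operations context: "customer" means whoever occupies the premises.
@dataclass(frozen=True)
class Occupier:
    premises_id: str
    contact_phone: str


def outage_recipients(occupiers: list[Occupier], affected_premises: set[str]) -> list[str]:
    """Outage notices go to occupiers. The distinct type means a static type
    checker can flag a BillPayer passed here by mistake."""
    return [o.contact_phone for o in occupiers if o.premises_id in affected_premises]


occupiers = [Occupier("P-1", "07000 000001"), Occupier("P-2", "07000 000002")]
print(outage_recipients(occupiers, {"P-2"}))  # ['07000 000002']
```

Python does not enforce the annotations at runtime, so the protection comes from the naming and from whatever type checking you run in review or CI.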

⚠️
Common mistakes in bounded contexts

Naming contexts after systems instead of meaning.
Sharing one database across contexts because it is “easier”.
Letting one team own language for everybody else.

🔎
Verification. A quick boundary test

If a term is overloaded, can you write two definitions that do not overlap.
If a model changes, which teams break first.
Can you assign an owner to each context’s data and contracts.
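The boundary test can be partly automated once the context map is written down as data. The sketch below assumes a simple dictionary format; the team names and term lists are made up for illustration.

```python
# Hypothetical context map: each context lists its owner and the terms it uses.
context_map = {
    "customer": {"owner": "team-customer", "terms": {"customer", "consent"}},
    "billing": {"owner": "team-billing", "terms": {"customer", "invoice", "tariff"}},
    "operations": {"owner": None, "terms": {"outage", "asset"}},
}


def unowned_contexts(cmap):
    """Contexts with no named owner for their data and contracts."""
    return sorted(name for name, ctx in cmap.items() if not ctx["owner"])


def overloaded_terms(cmap):
    """Terms used in more than one context; each needs its own local definition."""
    seen = {}
    for name, ctx in cmap.items():
        for term in ctx["terms"]:
            seen.setdefault(term, []).append(name)
    return {term: sorted(ctxs) for term, ctxs in seen.items() if len(ctxs) > 1}


print(unowned_contexts(context_map))   # ['operations']
print(overloaded_terms(context_map))   # {'customer': ['billing', 'customer']}
```

The check only finds candidates. Writing the two non-overlapping definitions is still a conversation between the teams.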

📝
Reflection prompt

If you split your current system into contexts, where are the natural seams and why.

Quick check: domains and bounded contexts

Why does language matter in architecture

What is a bounded context

Scenario: Two teams both own 'Customer' but mean different things. What do you do first

Why do blurred contexts slow change

What should define a domain

What is ubiquitous language

Why do contexts reduce coupling

What is a sign of overloaded context

How does this connect to Intermediate styles

Advanced patterns and distributed systems

Concept block

Distributed patterns

Distributed patterns solve coupling and scale. They introduce new failure modes that must be owned.

Assumptions

Failure is expected

Ownership exists

Failure modes

Retry storms

Invisible queues

These patterns help when you need scale and clarity, but they add complexity. Use them when the problem demands it.

CQRS and events

Commands change state, queries read from projections.

Write model

Commands and validation.

Event stream

Immutable record of change.

Read model

Optimised for queries.
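The three parts can be sketched in memory to show how they relate. This is an illustration under simplifying assumptions, not a production design: the event name, the in-memory list standing in for the stream, and the full-replay projection are all choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvoicePaid:
    """An event: an immutable record of something that happened."""
    invoice_id: str
    amount_pence: int


event_stream: list[InvoicePaid] = []  # append-only log (in memory for the sketch)


def handle_pay_invoice(invoice_id: str, amount_pence: int) -> None:
    """Write side: validate the command, then record what happened."""
    if amount_pence <= 0:
        raise ValueError("amount must be positive")
    event_stream.append(InvoicePaid(invoice_id, amount_pence))


def project(events: list[InvoicePaid]) -> dict[str, int]:
    """Read side: rebuild a query-shaped view from the stream (a full replay)."""
    totals: dict[str, int] = {}
    for event in events:
        totals[event.invoice_id] = totals.get(event.invoice_id, 0) + event.amount_pence
    return totals


handle_pay_invoice("INV-1", 500)
handle_pay_invoice("INV-1", 250)
paid_totals = project(event_stream)
print(paid_totals)  # {'INV-1': 750}
```

In a real system the projection usually runs asynchronously, which is where lag and eventual consistency come from.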

🧪
Worked example. CQRS added for “scale”, but the real problem was a slow query

A team implements CQRS and events because reads are slow. After weeks of work, the system is more complex and the read model is still slow, because the real issue was one unindexed query and an N+1 access pattern.

⚠️
Common mistakes with advanced patterns

Using patterns to avoid fixing basic data access and caching.
No event versioning or replay strategy, so evolution becomes scary.
Treating “eventual consistency” as a surprise rather than a designed behaviour.
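Event versioning is less scary with a concrete shape in mind. One common approach is to "upcast" old events to the current schema at read time, so replays keep working after the model changes. The sketch below assumes events are plain dictionaries with an optional `version` field; the schema change (splitting `name` into two fields) is hypothetical.

```python
def upcast(event: dict) -> dict:
    """Bring an older event up to the current schema (version 2) without mutating it."""
    version = event.get("version", 1)  # assumption: events without a version are v1
    if version == 1:
        # Hypothetical change: v1 stored a single "name"; v2 splits it in two.
        first, _, last = event["name"].partition(" ")
        return {"version": 2, "type": event["type"], "first_name": first, "last_name": last}
    if version == 2:
        return dict(event)
    raise ValueError(f"unknown event version: {version}")


old = {"type": "CustomerRegistered", "name": "Ada Lovelace"}
new = {"version": 2, "type": "CustomerRegistered", "first_name": "Grace", "last_name": "Hopper"}

replayed = [upcast(e) for e in (old, new)]
print(replayed[0]["first_name"], replayed[1]["first_name"])  # Ada Grace
```

Failing loudly on an unknown version is deliberate: a replay that silently skips events is worse than one that stops.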

🔎
Verification. When CQRS is justified

Reads are high volume and have a different shape from writes.
You can operate projections and handle replays safely.
You have monitoring for lag, staleness, and failed consumers.
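Monitoring for lag and staleness can start as a small calculation. This sketch assumes each projection tracks the stream offset it has applied and when it last applied an event; the function name and thresholds are placeholders, not recommendations.

```python
def projection_health(head_offset, projected_offset, last_applied_at, now,
                      max_lag=100, max_staleness_s=30.0):
    """Compare a projection's position with the head of the event stream.

    lag: number of events not yet applied.
    staleness_s: seconds since the last applied event, counted only while behind.
    """
    lag = head_offset - projected_offset
    staleness_s = (now - last_applied_at) if lag > 0 else 0.0
    healthy = lag <= max_lag and staleness_s <= max_staleness_s
    return {"lag": lag, "staleness_s": staleness_s, "healthy": healthy}


print(projection_health(head_offset=1000, projected_offset=990, last_applied_at=100.0, now=105.0))
print(projection_health(head_offset=1000, projected_offset=500, last_applied_at=100.0, now=200.0))
```

The first call reports a projection that is slightly behind but healthy. The second reports one that has stalled, which is the case worth alerting on.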

📝
Reflection prompt

Which parts of your system would truly benefit from CQRS, and which would suffer.

Quick check: patterns and distribution

Why use CQRS

What is event sourcing

Why do sagas exist

What does idempotent mean

When should you avoid event sourcing

What is the risk of these patterns

Why do distributed systems need ordering rules

What is a good signal to use CQRS

Resilience, performance and scale

Concept block

Resilience and performance

Resilience is how you behave on bad days. Performance is how you behave on normal days.

Assumptions

Budgets exist

Degradation is designed

Failure modes

Cascading failure

Optimising the mean

Failures will happen. Resilience is about what you do when they do. A circuit breaker protects systems from cascades. Graceful degradation keeps things alive.

Caching helps, but it creates new risks. Always decide where stale data is acceptable.

Resilience mesh

Protect the path between services.

Service A

Timeouts and retries.

Service B

Circuit breaker and fallback path.
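The circuit breaker and fallback path on Service B can be sketched as a small class. This is a simplified illustration, assuming a consecutive-failure threshold and a fixed reset delay; real libraries add sliding windows, per-error rules, and thread safety.

```python
import time


class CircuitBreaker:
    """Opens after `threshold` consecutive failures; allows a trial call after `reset_after` seconds."""

    def __init__(self, threshold=3, reset_after=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()  # open: fail fast and leave the dependency alone
            # half-open: fall through and let one trial call decide
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        self.opened_at = None
        return result


def flaky():
    raise ConnectionError("dependency is down")


breaker = CircuitBreaker(threshold=2, reset_after=30.0)
for _ in range(3):
    print(breaker.call(flaky, fallback=lambda: "cached response"))
```

All three calls return the fallback, but the third never reaches the dependency. That is the point: the breaker gives a struggling service room to recover.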

🧪
Worked example. Retries turned a small outage into a full incident

A dependency slows down. Callers time out and retry with no jitter. Load multiplies, queues fill, and what started as “a bit slow” becomes total failure. This is why resilience is a system property, not a library checkbox.
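The missing pieces in that incident were a retry budget and jitter. Here is a minimal sketch of both, assuming a capped exponential backoff with "full jitter" (a random wait between zero and the cap). The function name, defaults, and the choice of `TimeoutError` as the only retryable error are assumptions for the example.

```python
import random
import time


def call_with_retries(fn, max_attempts=3, base_delay=0.1, max_delay=2.0,
                      retryable=(TimeoutError,), sleep=time.sleep, rng=random.random):
    """Retry `fn` on retryable errors using capped exponential backoff with full jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts:
                raise  # budget spent: surface the failure instead of piling on
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(rng() * cap)  # full jitter: wait a random time in [0, cap)


attempts = []


def slow_dependency():
    attempts.append(1)
    if len(attempts) < 3:
        raise TimeoutError("too slow")
    return "ok"


# sleep is stubbed out so the example runs instantly
print(call_with_retries(slow_dependency, sleep=lambda seconds: None))  # ok
```

Errors outside `retryable` propagate immediately, so a validation failure is never retried. The jitter spreads callers out so they do not all return at the same moment.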

⚠️
Common mistakes in resilience

Retrying everything. Not all errors are retryable.
No circuit breaker behaviour, so failure cascades are guaranteed.
No backpressure, so overload becomes collapse.
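Backpressure can be as simple as a queue with a hard limit that says "no" when it is full. The sketch below uses the standard library `queue.Queue`; the `BoundedIntake` wrapper and its counter are illustrative names.

```python
import queue


class BoundedIntake:
    """Accept work only while there is room; otherwise tell the caller to back off."""

    def __init__(self, capacity):
        self._q = queue.Queue(maxsize=capacity)
        self.rejected = 0

    def submit(self, item) -> bool:
        try:
            self._q.put_nowait(item)
            return True
        except queue.Full:
            self.rejected += 1  # shed load instead of queueing without limit
            return False

    def take(self):
        return self._q.get_nowait()


intake = BoundedIntake(capacity=2)
results = [intake.submit(job) for job in ("a", "b", "c")]
print(results, intake.rejected)  # [True, True, False] 1
```

A rejected submit is a signal the caller can act on, for example by returning an HTTP 429 or slowing its own producers. An unbounded queue hides the same overload until memory or latency gives out.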

🔎
Verification. A resilience review in five questions

What is the timeout. What is the retry budget.
Is the operation idempotent. If not, retries can be harmful.
What is the fallback path. Can we degrade safely.
How will we detect saturation early.
How do we roll back quickly.
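The idempotency question is the one most often answered with "probably". One way to make a non-idempotent operation safe to retry is an idempotency key supplied by the caller. This sketch keeps results in memory and uses a hypothetical `PaymentService`; a real service would persist the keys and expire them.

```python
class PaymentService:
    """Hypothetical service: a repeat of the same idempotency key returns the first result."""

    def __init__(self):
        self._results = {}
        self.charges_made = 0

    def charge(self, idempotency_key: str, amount_pence: int) -> dict:
        if idempotency_key in self._results:
            return self._results[idempotency_key]  # retry: no second charge
        self.charges_made += 1
        result = {"status": "charged", "amount_pence": amount_pence}
        self._results[idempotency_key] = result
        return result


service = PaymentService()
first = service.charge("order-42", 1999)
retry = service.charge("order-42", 1999)  # e.g. the caller timed out and tried again
print(first == retry, service.charges_made)  # True 1
```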

📝
Reflection prompt

Where do timeouts or retries make things worse in your current system.

Quick check: resilience and scale

Why use circuit breakers

What is backpressure

Why can retries be dangerous

Where should caches sit

What is graceful degradation

Why plan for scale early

What is a simple scaling model

What should you monitor in scale tests

Architecture evolution and governance

Concept block

Evolution and governance

Systems evolve safely when governance enables change and prevents accidental harm.

Assumptions

Decision rights are clear

Evidence is captured

Failure modes

Stale governance

No feedback loop

Architects guide change with small decisions, not massive documents. ADRs make intent visible. Fitness checks catch drift early.
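A fitness check can be a plain test that fails the build when the architecture drifts. The sketch below checks dependencies between contexts against an allowed list. The rules and the `actual` dependency data are hypothetical; in practice you would extract them from imports or service call graphs.

```python
# Hypothetical rule set: which contexts each context may depend on.
ALLOWED = {
    "billing": {"customer"},
    "operations": {"customer"},
    "customer": set(),
}


def dependency_violations(actual: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return (source, target) pairs that are not permitted by ALLOWED."""
    return sorted(
        (src, dst)
        for src, deps in actual.items()
        for dst in deps - ALLOWED.get(src, set())
    )


actual = {
    "billing": {"customer", "operations"},  # drift: billing now reaches into operations
    "operations": {"customer"},
    "customer": set(),
}
print(dependency_violations(actual))  # [('billing', 'operations')]
```

When the check fails, the useful response is either to fix the dependency or to change the rule through an ADR, not to silence the test.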

ADR lifecycle

Lightweight decisions with a clear trail.

Propose

Write the decision and options.

Decide

Pick and document trade-offs.

Review

Revisit when context changes.

Evolve

Refactor and update the rules.

🧪
Worked example. The same argument every quarter because nothing is written down

Teams re-litigate the same decisions: “monolith vs services”, “SQL vs NoSQL”, “build vs buy”. The debate burns time because context and trade-offs are not captured, so new people restart the argument from scratch.

⚠️
Common mistakes in governance

ADRs that record conclusions but not options and rationale.
Governance as meetings without decision rights.
Decisions made but not enforced through automation or review.

🔎
Verification. A healthy ADR set

Each ADR has context, options, decision, trade-offs, and consequences.
Status is explicit (proposed, accepted, deprecated, superseded).
There is a review trigger (when constraints change, revisit).
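These three checks are mechanical enough to run over every ADR in a repository. The sketch assumes each ADR has been parsed into a dictionary. The field names are illustrative; the statuses are the four listed above.

```python
REQUIRED_SECTIONS = ("context", "options", "decision", "trade_offs", "consequences")
VALID_STATUSES = {"proposed", "accepted", "deprecated", "superseded"}


def adr_problems(adr: dict) -> list[str]:
    """List what is missing from one ADR; an empty list means it passes."""
    problems = [f"missing section: {s}" for s in REQUIRED_SECTIONS if not adr.get(s)]
    if adr.get("status") not in VALID_STATUSES:
        problems.append(f"invalid status: {adr.get('status')!r}")
    if not adr.get("review_trigger"):
        problems.append("no review trigger")
    return problems


draft = {
    "context": "Reads are slow on the invoice list.",
    "decision": "Add an index before considering CQRS.",
    "status": "done",  # not one of the agreed statuses
}
for problem in adr_problems(draft):
    print(problem)
```

For this draft the check reports the missing options, trade-offs, and consequences sections, the invalid status, and the absent review trigger.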

📝
Reflection prompt

Which decisions in your system should be recorded as ADRs this quarter.

Quick check: evolution and governance

Why use ADRs

What is a fitness function

Why is technical debt risky

What keeps governance lightweight

Why revisit decisions

What is a sign of architecture drift

Why involve security and operations

What makes refactoring safer

🧾
CPD evidence (advanced, still practical)

What I studied: bounded contexts, distributed patterns, resilience under failure, and architecture governance.
What I practised: one context split, one event and projection scenario, one resilience review, and one ADR written from real constraints.
What changed in my practice: one habit. Example: “I write retry budgets and failure modes as part of design, not after incidents.”
Evidence artefact: a short pack of four pages (context map, event flow, resilience checklist, ADR).

