Deployment Strategies
By the end of this module you will be able to:
- Compare blue-green, canary, rolling, and feature flag deployment strategies
- Select the appropriate strategy for a given risk and team context
- Explain how feature flags decouple deployment from release
- Describe the rollback process and its speed for each strategy

Real-world incident · August 2023
A Kubernetes rolling deployment broke production for 22 minutes because two API versions ran simultaneously.
In August 2023 an engineering team at a logistics platform deployed a new version of their shipment service using Kubernetes rolling updates. The new version renamed a JSON field from shipment_id to tracking_id in the API response. The consuming notification service expected shipment_id.
During the 22-minute rolling update window, approximately 40% of shipment notifications failed silently. Requests hitting the new version received tracking_id, which the notification service did not recognise. Requests hitting the old version received shipment_id as expected. The load balancer distributed traffic between both versions throughout the rollout.
The fix required rolling back the deployment and deploying a version that supported both field names simultaneously. The deployment could then proceed safely. This is the backward compatibility requirement for rolling deployments: during the transition window, multiple versions must coexist without breaking consumers.
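As a concrete illustration of the dual-field fix (not the platform's actual code), a transitional response handler in Python might emit both field names until every consumer has migrated; the Shipment type and shipment_response function here are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Shipment:
    id: str
    status: str

def shipment_response(shipment: Shipment) -> dict:
    """Transitional API response: emit both the old and the new field name
    so consumers built against either contract keep working mid-rollout."""
    return {
        "tracking_id": shipment.id,  # new contract
        "shipment_id": shipment.id,  # legacy alias; drop only after every
                                     # consumer reads tracking_id
        "status": shipment.status,
    }

print(shipment_response(Shipment(id="SHP-1042", status="in_transit")))
```

Once telemetry shows that no consumer still reads shipment_id, a follow-up deployment removes the alias.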
During a rolling deployment, old and new versions of a service run simultaneously. If the new version renames an API response field that a consuming service depends on, some requests hit the new version and some hit the old. What should the team have done before deploying?
With the learning outcomes established, this module begins by examining blue-green deployment in depth.
19.1 Blue-green deployment
Blue-green deployment maintains two identical production environments: Blue (currently serving traffic) and Green (the new version, deployed and ready but not serving traffic). When Green has been validated, the load balancer redirects 100% of traffic from Blue to Green in a single atomic switch. Blue remains running and immediately available for rollback: redirecting traffic back to Blue reverses the deployment instantly.
The primary advantage is the instant rollback capability. No re-deployment is required. A bad deployment takes minutes to reverse rather than the time it takes to re-run the CI/CD pipeline. Green can also be smoke-tested with real production data (using a small internal traffic leak or synthetic tests) before receiving customer traffic.
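One common way to realise the atomic switch in Kubernetes is to repoint a Service's label selector from the Blue pods to the Green pods. The sketch below is a minimal example using the official Python client; the service name, namespace, and version label are illustrative assumptions:

```python
from kubernetes import client, config

def switch_traffic(service: str, namespace: str, target: str) -> None:
    """Repoint the Service's label selector at the target environment
    ("blue" or "green"). The change is a single API call, and rollback
    is the same call with the other label."""
    config.load_kube_config()
    v1 = client.CoreV1Api()
    v1.patch_namespaced_service(
        name=service,
        namespace=namespace,
        body={"spec": {"selector": {"app": service, "version": target}}},
    )

switch_traffic("shipment", "prod", "green")   # cut over to Green
# switch_traffic("shipment", "prod", "blue")  # instant rollback
```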
The primary disadvantage is cost: you must maintain double the infrastructure during the transition period. Database schema changes require careful staging: both Blue and Green must be compatible with the same database schema during the switch, because you cannot roll back after a destructive schema migration. The safe sequence, often called expand then contract, is: migrate the schema in a backward-compatible way first, deploy the new code second, remove old schema support last.
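Sketched as separately shipped steps, with illustrative SQL reusing the field names from the opening incident:

```python
# Expand then contract: each step is its own deployment, so Blue and Green
# always run against a schema both are compatible with, and rollback stays safe.

STEP_1_EXPAND = (
    "ALTER TABLE shipments ADD COLUMN tracking_id TEXT;"
    " UPDATE shipments SET tracking_id = shipment_id;"
)  # additive only: code that knows nothing about tracking_id keeps working

# Step 2: deploy application code that writes both columns but reads tracking_id.

STEP_3_CONTRACT = "ALTER TABLE shipments DROP COLUMN shipment_id;"
# Run only after no deployed version still reads or writes shipment_id.
```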
“By having two production environments, you can have the new version running in the green environment and tested before switching. Once you are satisfied it is working, flip the router to send all traffic to the new environment.”
Fowler, M. BlueGreenDeployment. martinfowler.com, 2010
Fowler's original description establishes the key insight: Green is tested before receiving traffic, not after. The atomic switch is not a leap of faith; it follows validation. The Blue environment remains as the safety net. This is the fundamental difference between blue-green and other strategies: rollback is a traffic redirect, not a re-deployment.
With an understanding of blue-green deployment in place, the discussion can now turn to canary deployment, which builds directly on these foundations.
19.2 Canary deployment
Canary deployment releases a new version to a small percentage of users first, monitors for errors and performance degradation, and gradually increases traffic if metrics are healthy. The name comes from the canaries coal miners carried to detect toxic gases: a small sentinel population surfaces the danger before the full workforce is exposed.
A typical canary progression starts at 1% to 5% of traffic, holds for a defined observation window (10 to 30 minutes for high-traffic services, longer for lower-traffic ones), then proceeds to 20%, 50%, and 100% if metrics remain within acceptable thresholds. Automated monitoring gates make this reliable: the deployment system checks error rate and p99 latency at each stage and either proceeds or rolls back automatically.
Canary deployments require good observability at low traffic volumes. Checking a 5xx error rate that normally averages 0.01% of traffic requires enough requests to the canary to produce a statistically meaningful rate: at that baseline, 1,000 canary requests would see only 0.1 errors in expectation, so a single error would read as a tenfold regression. For low-traffic services, the canary window must be longer to accumulate enough samples.
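A minimal sketch of such a monitoring gate, assuming hypothetical set_weight and get_metrics callables that stand in for the load balancer API and the observability stack:

```python
import time

STAGES = [5, 20, 50, 100]   # percent of traffic routed to the canary
MAX_ERROR_RATE = 0.001      # 0.1% 5xx threshold (tune to your SLO)
MAX_P99_MS = 250            # p99 latency threshold
MIN_SAMPLES = 20_000        # requests needed before the rate is trustworthy

def run_canary(set_weight, get_metrics) -> bool:
    """Advance through the traffic stages, holding at each until enough
    samples accumulate, and roll back automatically on a threshold breach."""
    for pct in STAGES:
        set_weight(pct)
        requests, error_rate, p99_ms = 0, 0.0, 0.0
        while requests < MIN_SAMPLES:   # low-traffic services simply hold longer
            time.sleep(60)
            requests, error_rate, p99_ms = get_metrics()
        if error_rate > MAX_ERROR_RATE or p99_ms > MAX_P99_MS:
            set_weight(0)               # automatic rollback
            return False
    return True                         # canary promoted to 100%
```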
Common misconception
“Canary deployments require dedicated canary infrastructure.”
Canary deployments can be implemented with a simple load balancer weight configuration. Kubernetes supports canary patterns using Ingress annotations or service mesh traffic splitting (Istio, Linkerd). The new version runs as a separate Deployment with a small replica count; the load balancer sends a proportional share of traffic based on replica count. No separate infrastructure is required.
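The arithmetic behind the replica-count approach is simple enough to sanity-check directly:

```python
def canary_share(canary_pods: int, stable_pods: int) -> float:
    """Approximate traffic share the canary receives when a Service
    load-balances evenly across all ready pods."""
    return canary_pods / (canary_pods + stable_pods)

print(f"{canary_share(1, 19):.0%}")  # 1 canary pod + 19 stable pods -> 5%
```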
Common misconception
“Blue-green deployment eliminates downtime completely.”
Blue-green deployment eliminates planned downtime for the web tier, but long-running database migrations can still cause issues. If the green environment uses a schema incompatible with the blue environment's code, you cannot roll back without data loss. The solution is to decouple schema migrations from code deployments, applying backward-compatible schema changes first.
With an understanding of canary deployment in place, the discussion can now turn to rolling deployments and feature flags, which build directly on these foundations.
19.3 Rolling deployments and feature flags
Rolling deployment replaces instances one by one (or in small batches), with health checks between each replacement. Kubernetes rolling updates are the canonical implementation: the maxUnavailable setting limits how many pods are unavailable at once, and maxSurge allows temporary over-provisioning during the rollout.
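As a sketch of those two settings, here is a patch against a hypothetical shipment Deployment using the official Python client (a YAML manifest expresses the same fields):

```python
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()

# maxSurge=1: at most one extra pod above the desired count during rollout.
# maxUnavailable=0: never drop below the desired count; each new pod must
# pass readiness checks before an old pod is terminated.
apps.patch_namespaced_deployment(
    name="shipment",
    namespace="prod",
    body={"spec": {"strategy": {
        "type": "RollingUpdate",
        "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 0},
    }}},
)
```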
The critical constraint for rolling deployments is backward API compatibility. During the rollout window, old and new versions serve traffic simultaneously. Any change that breaks the old version's ability to consume the new version's output (or vice versa) will cause errors during the window. The logistics platform incident in the opening illustrates exactly this failure mode.
Feature flags (also called feature toggles) decouple deployment from release. The new feature's code ships to production in a disabled state. The flag controls whether the new code path runs. This enables dark launching (testing with internal users before enabling for customers), kill switches (instantly disabling a problematic feature without a rollback deployment), and progressive rollout (enabling for 1%, 10%, 50%, 100% of users over time).
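A minimal flag check might look like the sketch below; the in-memory FLAGS dict stands in for whatever flag service or config store a team actually uses. Hashing the flag name together with the user id gives each user a stable bucket per flag, so the same users stay enabled as the percentage grows and different flags roll out to independent slices of the user base:

```python
import hashlib

# Stand-in for a real flag store; rollout_pct drives progressive rollout.
FLAGS = {"new_checkout": {"enabled": True, "rollout_pct": 10}}

def is_enabled(flag: str, user_id: str) -> bool:
    """Kill switch first (enabled=False turns the feature off instantly),
    then a stable percentage bucket for progressive rollout."""
    cfg = FLAGS.get(flag)
    if not cfg or not cfg["enabled"]:
        return False
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < cfg["rollout_pct"]

user = "user-8421"  # hypothetical user id
print("new checkout" if is_enabled("new_checkout", user) else "old checkout")
```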
Feature flag debt is a real operational risk. Every long-lived flag is a branch in production code. Teams that accumulate 50 or more active flags report difficulty reasoning about system behaviour and testing all combinations. Establish a policy: temporary flags are removed within two sprints; permanent flags are operational toggles with explicit ownership and governance.
“Feature Toggles (often also referred to as Feature Flags) are a powerful technique, allowing teams to modify system behavior without changing code. They fall into various usage categories, and it's important to be thoughtful about how and when to use them.”
Hodgson, P. Feature Toggles (aka Feature Flags). martinfowler.com, 2017
The phrase 'without changing code' captures the key benefit: a feature flag rollback is a configuration change, not a deployment. This makes it faster and safer than re-running the deployment pipeline. The reference to 'various usage categories' is important: release toggles, ops toggles, experiment toggles, and permission toggles each have different lifecycles and governance requirements.
With an understanding of rolling deployments and feature flags in place, the discussion can now turn to choosing the right strategy, which builds directly on these foundations.
19.4 Choosing the right strategy
Blue-green is best for major releases and database migrations where instant rollback is essential. The double infrastructure cost is acceptable for high-value, high-risk changes. The DORA (DevOps Research and Assessment) 2023 State of DevOps Report found that elite-performing teams deploy multiple times per day with change failure rates below 5%. Blue-green enables that cadence for high-risk changes without sacrificing rollback safety.
Canary is best for high-risk changes where gradual exposure is more important than instant rollback. It produces real user signal before full rollout and has minimal infrastructure overhead. The limitation is that it requires enough traffic to generate statistically meaningful metrics at low percentages.
Rolling deployment is the Kubernetes default and is appropriate for routine deployments where the new version is backward-compatible with the old version. It requires no additional infrastructure and is built into Kubernetes' native deployment controller.
Feature flags are best for decoupling release from deployment and for providing instant kill switches. They are not a replacement for deployment strategies; they complement them. A canary deployment can use feature flags to further restrict who sees the new feature within the canary population.
A team is deploying a rewritten payment flow. The change includes a database schema migration that adds a column and a code change that writes to it. Why does blue-green deployment require special care here?
A canary deployment routes 5% of payment requests to the new version. What metrics should the automated monitoring gate check, and at what point should the rollout pause?
What is the primary benefit of feature flags compared to canary deployment for managing risk?
In the opening logistics platform incident, what should the team have ensured before performing the rolling deployment?
Key takeaways
- Blue-green provides instant rollback via traffic switch; it requires double infrastructure during transition and careful database migration sequencing (expand then contract).
- Canary limits blast radius by initially routing a small percentage of traffic to the new version; automated monitoring gates on SLI metrics determine whether to proceed or roll back.
- Rolling deployment is the Kubernetes default; it requires backward-compatible API changes during the transition window when old and new versions serve traffic simultaneously.
- Feature flags decouple deployment from release: code ships disabled, and the flag controls visibility. Flag rollback is a configuration change, not a deployment. Manage flag debt by removing temporary flags within two sprints.
- Most mature teams combine strategies: rolling for routine backward-compatible changes, blue-green or canary for high-risk changes, and feature flags for controlled rollout and instant kill switches.
Standards and sources cited in this module
DORA State of DevOps Report 2023
Key metrics: deployment frequency, change failure rate, time to restore
Documents the correlation between deployment frequency, change failure rate, and business outcomes. Cited in Section 19.4 for the elite performer benchmarks.
Fowler, M. BlueGreenDeployment. martinfowler.com, 2010
Full article including database migration guidance
The definitive description of the blue-green pattern. Quoted in Section 19.1 for the pre-switch validation principle. The database migration guidance (expand then contract) is drawn from this source.
Hodgson, P. Feature Toggles (aka Feature Flags). martinfowler.com, 2017
Toggle categories; Managing toggle debt
The thorough reference for feature flag patterns, categories, lifecycle, and debt management. Quoted in Section 19.3 for the decoupling of release from deployment.
What comes next: Deploying a single service is one problem. Coordinating transactions across multiple services is another. Module 20 covers integration patterns at scale: sagas, choreography, orchestration, the transactional outbox pattern, and API versioning for cross-team boundaries.
Module 19 of 22 in Practice and Strategy