MODULE 10 OF 22 · APPLIED

Microservices Architecture

35 min read · 4 outcomes · Interactive diagram

By the end of this module you will be able to:

  • Define microservices in terms of business capability, independent deployment, and data ownership
  • Distinguish microservices from monoliths and modular monoliths and choose the appropriate option
  • Explain service mesh, service discovery, and the saga pattern for distributed data management
  • Apply the Netflix case to identify the operational prerequisites for a production microservices system

Real-world incident · 2008 to 2012

Netflix's 3-day outage became a 4-year migration and 700 microservices.

In August 2008, a database corruption event took Netflix offline for three days. DVD shipments stopped. Streaming customers could not load titles. The failure was in a single database in their DVD-era monolith. One corrupt table cascading into dependent systems brought the entire platform down.

The engineering team made a decision: move entirely to AWS and decompose the monolith into independent services. The migration took four years. By 2012, Netflix was running over 700 microservices on AWS. Each service was independently deployable, owned its own data, and could fail in isolation. To test this, Netflix built Chaos Monkey: a tool that randomly terminates production instances to confirm services survive neighbour failures.

The outcome was not just availability. Netflix published the patterns publicly: Hystrix for circuit breakers, Eureka for service discovery, Zuul as an API gateway. The Netflix OSS stack became a blueprint for capabilities that many enterprises now obtain off the shelf from Kubernetes and Istio. One 3-day outage seeded a decade of industry architecture.

When a 3-day outage causes a 4-year architectural rewrite, what are you actually paying for?

The module starts with a precise definition of a microservice.

10.1 What microservices are

A microservice is a service that implements a single business capability, owns its own data store, and can be deployed independently of every other service, which in practice means it runs in its own process. All three properties are required. A service that shares a database with another service cannot be deployed independently: a schema change in one service breaks the other. A service that handles both user authentication and order fulfilment violates the single business capability constraint.

The term was popularised, and given its most widely cited definition, in a 2014 article by Martin Fowler and James Lewis. Before that, related approaches existed under various names: service-oriented architecture (SOA), fine-grained services, and component services. The defining distinction from SOA is the emphasis on independent deployability and data ownership rather than shared enterprise service buses.

Netflix operated 700 microservices by 2012. Amazon operates tens of thousands. At that scale, the organisational benefit is as important as the technical one: teams own their services end-to-end and release independently. Conway's Law states that systems mirror their organisation's communication structures. Microservices only deliver their benefits when team boundaries match service boundaries.

The microservice architectural style is an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API.

Martin Fowler and James Lewis - martinfowler.com, 2014

This definition emphasises two properties that are often overlooked: each service runs in its own process (separate deployment unit, not just a separate class), and communication uses lightweight mechanisms. Both properties enforce the independence that makes microservices worth their operational cost.

With the definition in place, the next section compares microservices with the monolith and the modular monolith.

10.2 Microservices vs monolith vs modular monolith

The architecture spectrum runs from a monolith at one end to fully distributed microservices at the other, with the modular monolith in between. The right choice depends on team size, deployment frequency, and domain clarity.

A monolith deploys as a single unit. All code is in one repository, sharing a single database. Changes anywhere require a full deployment. Shopify, the e-commerce platform with over 10,000 engineers and billions in annual revenue, runs on a Rails monolith they call the “Shopify Monolith”. They manage complexity through modular design within the single codebase, not through service decomposition.

A modular monolith is a single-process application with strong internal boundaries. Modules communicate through explicit public interfaces rather than reaching into each other's internals across domain lines. It deploys as one unit, but the internal structure prevents the coupling that makes monoliths painful to evolve. The related term “majestic monolith”, coined by David Heinemeier Hansson, describes a monolith kept by deliberate choice and maintained with this kind of discipline.
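The boundary discipline described above can be sketched in a few lines. This is a minimal illustration, not a prescribed structure: the module names (`BillingModule`, `OrderModule`), the `BillingApi` contract, and the `charge` method are all hypothetical. The point is that the orders module depends only on a narrow published interface, while billing's internal state stays private, even though both run in one process.

```python
from dataclasses import dataclass
from typing import Protocol


class BillingApi(Protocol):
    """The only surface other modules may depend on (hypothetical contract)."""

    def charge(self, customer_id: str, amount_cents: int) -> bool: ...


class BillingModule:
    """Billing internals stay private; callers see only charge()."""

    def __init__(self) -> None:
        # Private state: no other module imports or touches this ledger.
        self._ledger: dict[str, int] = {}

    def charge(self, customer_id: str, amount_cents: int) -> bool:
        if amount_cents <= 0:
            return False
        self._ledger[customer_id] = self._ledger.get(customer_id, 0) + amount_cents
        return True


@dataclass
class OrderModule:
    """Orders depends on the BillingApi contract, not on BillingModule internals."""

    billing: BillingApi

    def place_order(self, customer_id: str, amount_cents: int) -> str:
        charged = self.billing.charge(customer_id, amount_cents)
        return "confirmed" if charged else "rejected"


orders = OrderModule(billing=BillingModule())
print(orders.place_order("c-1", 2500))  # confirmed
print(orders.place_order("c-1", 0))     # rejected
```

Because the call is an ordinary in-process method call, there is no network hop and no distributed transaction. If billing later needs to become a separate service, the same `BillingApi` contract could be backed by a remote client instead.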

Microservices deliver their benefits when independent teams need to release at different velocities, when components have dramatically different scaling requirements, or when technology diversity is genuinely required. They carry the full cost of distributed systems: network failures, distributed transactions, and observability complexity. Both Netflix and Amazon paid those costs deliberately because the organisational benefits at their team sizes justified them.

Common misconception

Microservices are always better than monoliths.

A well-structured monolith is a valid architectural choice, and the sensible default for teams under 30 engineers. It can also stretch much further: Shopify runs on a monolith with 10,000 engineers. The benefit of microservices is independent deployability for independent teams. A single team with a monolith has no coordination problem to solve. Microservices introduce distributed systems complexity that only pays off when the team autonomy benefits exceed those costs.

Once a system is split into services, those services have to find and talk to each other. The next section covers service discovery and the service mesh.

10.3 Service mesh and service discovery

In a microservices system of 700 services, no service knows the network address of any other service at compile time. Instances scale up, scale down, and restart continuously. Service discovery solves this: each service registers its address with a registry on startup and queries the registry to find other services by name. Netflix built Eureka for this. Kubernetes provides DNS-based discovery natively. HashiCorp Consul is a widely used option outside Kubernetes.
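The register-and-lookup cycle can be sketched with an in-memory registry. This is an illustrative stand-in, not the API of Eureka, Consul, or Kubernetes DNS: the class name `ServiceRegistry`, the heartbeat TTL, and the round-robin `pick` method are assumptions made for the example. The `now` parameter exists only so the example can run deterministically.

```python
import itertools
import time
from typing import Optional


class ServiceRegistry:
    """In-memory stand-in for a service registry (illustrative only)."""

    def __init__(self, ttl_seconds: float = 30.0) -> None:
        self._ttl = ttl_seconds
        # service name -> {address: time of last heartbeat}
        self._instances: dict[str, dict[str, float]] = {}
        self._counters: dict[str, itertools.count] = {}

    def register(self, name: str, address: str, now: Optional[float] = None) -> None:
        """Called on startup and again on every heartbeat."""
        stamp = time.monotonic() if now is None else now
        self._instances.setdefault(name, {})[address] = stamp

    def deregister(self, name: str, address: str) -> None:
        """Called on graceful shutdown."""
        self._instances.get(name, {}).pop(address, None)

    def lookup(self, name: str, now: Optional[float] = None) -> list[str]:
        """Return addresses whose last heartbeat is still within the TTL."""
        current = time.monotonic() if now is None else now
        known = self._instances.get(name, {})
        return sorted(a for a, seen in known.items() if current - seen <= self._ttl)

    def pick(self, name: str, now: Optional[float] = None) -> str:
        """Client-side round robin over the live instances."""
        addresses = self.lookup(name, now)
        if not addresses:
            raise LookupError(f"no live instances of {name!r}")
        counter = self._counters.setdefault(name, itertools.count())
        return addresses[next(counter) % len(addresses)]


registry = ServiceRegistry(ttl_seconds=30.0)
registry.register("inventory", "10.0.0.5:8080", now=0.0)
registry.register("inventory", "10.0.0.6:8080", now=0.0)
print(registry.pick("inventory", now=1.0))    # 10.0.0.5:8080
print(registry.pick("inventory", now=2.0))    # 10.0.0.6:8080
registry.register("inventory", "10.0.0.6:8080", now=40.0)  # only .6 sends a heartbeat
print(registry.lookup("inventory", now=45.0))  # ['10.0.0.6:8080']
```

The TTL is what lets the registry cope with instances that crash without deregistering: an instance that stops sending heartbeats simply ages out of the lookup results.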

A service mesh is the infrastructure layer that manages service-to-service communication. It provides: mutual TLS between services (encrypted and authenticated), traffic routing and load balancing, circuit breaker enforcement, retry policies, and distributed tracing. Istio and Linkerd are two widely used service meshes for Kubernetes. In the classic deployment model, the mesh runs a sidecar proxy (Envoy in Istio, linkerd2-proxy in Linkerd) alongside each service instance, intercepting all network traffic without requiring application-layer changes.

Before the service mesh concept existed, Netflix implemented all these concerns as libraries: Hystrix for circuit breaking, Ribbon for client-side load balancing, Eureka for discovery. The service mesh abstracts these concerns out of the application entirely. A Go service and a Java service can participate in the same mesh with identical policies, without either language needing to implement the resilience logic.
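To make the library-era approach concrete, here is a minimal circuit breaker sketch of the kind an application would embed before a mesh took over the job. It is not Hystrix's API: the class name, the `failure_threshold` and `reset_after` parameters, and the injectable `clock` are assumptions chosen for the example. It implements the usual three states: closed (calls pass through), open (calls are rejected immediately), and half-open (one trial call is allowed after a cool-down).

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected without reaching the downstream service."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._threshold = failure_threshold
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self._reset_after:
            return "half-open"
        return "open"

    def call(self, fn: Callable[[], T]) -> T:
        if self.state == "open":
            raise CircuitOpenError("downstream call skipped")
        try:
            result = fn()
        except Exception:
            self._failures += 1
            # A failed trial call in half-open, or too many failures while
            # closed, opens the circuit and restarts the cool-down.
            if self._opened_at is not None or self._failures >= self._threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


now = [0.0]  # fake clock so the example is deterministic
breaker = CircuitBreaker(failure_threshold=2, reset_after=5.0, clock=lambda: now[0])


def flaky() -> str:
    raise ConnectionError("payment service unreachable")


for _ in range(2):
    try:
        breaker.call(flaky)
    except ConnectionError:
        pass
print(breaker.state)                # open
now[0] = 6.0
print(breaker.state)                # half-open
print(breaker.call(lambda: "ok"))   # ok
print(breaker.state)                # closed
```

Every service that embeds logic like this must reimplement it in its own language, which is the duplication a sidecar proxy removes.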

Discovery and the mesh deal with how services reach each other. The next section deals with the harder problem of how they keep their data consistent.

Distributed service communication patterns, analogous to a microservices topology where each node is an independent service.

10.4 Data management in microservices

The most operationally challenging aspect of microservices is data management. Each service owns its own database, which means operations that would be a single transaction in a monolith become distributed operations across multiple services.

Consider an order placement: the Order Service creates an order record, the Inventory Service reserves the items, and the Payment Service charges the card. In a monolith, these are three rows in a single ACID transaction: either all succeed or all roll back. In microservices, each step is a separate database write in a separate service. If the Payment Service fails after Inventory has reserved the items, the system is in an inconsistent state.

The saga pattern solves this. A saga is a sequence of local transactions, each publishing an event that triggers the next step. If any step fails, compensating transactions undo the steps already completed. In the order example, the Inventory Service publishes an ItemsReserved event once its reservation is written. If the charge then fails, the Payment Service publishes a PaymentFailed event, and the Inventory Service, which listens for that event, executes the compensating ReleaseReservation transaction.
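The event flow of the order example can be traced with a small in-process simulation. This is a sketch under stated assumptions, not a production design: the `EventBus` is a synchronous stand-in for a message broker, a shared `state` dictionary stands in for what would be three separate databases, and the event names `OrderCreated`, `PaymentSucceeded`, and `ReservationReleased` are added for illustration alongside the `ItemsReserved` and `PaymentFailed` events named in the text.

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """Synchronous in-memory stand-in for a message broker (illustrative only)."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)
        self.log: list[str] = []  # every event published, in order

    def subscribe(self, event: str, handler: Callable[[dict], None]) -> None:
        self._handlers[event].append(handler)

    def publish(self, event: str, payload: dict) -> None:
        self.log.append(event)
        for handler in list(self._handlers[event]):
            handler(payload)


def run_order_saga(card_ok: bool) -> tuple[dict, EventBus]:
    bus = EventBus()
    # Each key stands in for a different service's own database.
    state = {"order": "none", "reserved": 0}

    def create_order(p: dict) -> None:         # Order Service: local transaction
        state["order"] = "pending"
        bus.publish("OrderCreated", p)

    def reserve_items(p: dict) -> None:        # Inventory Service: local transaction
        state["reserved"] += p["qty"]
        bus.publish("ItemsReserved", p)

    def charge_card(p: dict) -> None:          # Payment Service: local transaction
        bus.publish("PaymentSucceeded" if card_ok else "PaymentFailed", p)

    def release_reservation(p: dict) -> None:  # Inventory Service: compensation
        state["reserved"] -= p["qty"]
        bus.publish("ReservationReleased", p)

    def confirm_order(p: dict) -> None:        # Order Service: final step
        state["order"] = "confirmed"

    def cancel_order(p: dict) -> None:         # Order Service: compensation
        state["order"] = "cancelled"

    bus.subscribe("OrderCreated", reserve_items)
    bus.subscribe("ItemsReserved", charge_card)
    bus.subscribe("PaymentSucceeded", confirm_order)
    bus.subscribe("PaymentFailed", release_reservation)
    bus.subscribe("PaymentFailed", cancel_order)

    create_order({"order_id": "o-1", "qty": 2})
    return state, bus


failed_state, failed_bus = run_order_saga(card_ok=False)
print(failed_bus.log)
# ['OrderCreated', 'ItemsReserved', 'PaymentFailed', 'ReservationReleased']
print(failed_state)
# {'order': 'cancelled', 'reserved': 0}
```

In the failure run the reservation exists between `ItemsReserved` and `ReservationReleased`. With a real broker that window is observable by other services, which is the temporary inconsistency a saga accepts.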

Sagas accept eventual consistency: the system is briefly inconsistent during the transaction. This is a deliberate trade-off. Distributed ACID transactions (via two-phase commit) preserve immediate consistency but are slower, harder to scale, and create coordination coupling between services. Most real-world microservices systems accept eventual consistency with sagas.

Sagas, meshes, and registries all have to be built and operated. The next section looks at what that operational load costs.


10.5 Operational complexity trade-offs

DORA (DevOps Research and Assessment) metrics are the standard way to measure software delivery performance: deployment frequency, lead time for changes, change failure rate, and time to restore service. In high-performing organisations that deploy multiple times per day, microservices are a frequent architectural pattern. But the correlation is not causal: the organisations that deploy microservices successfully do so because they first built the engineering capability.
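The four metrics are simple arithmetic once deployment records exist. The sketch below computes them from a hypothetical `Deployment` record. The field names, the use of medians, and measuring restore time from the failed deployment (rather than from incident detection) are simplifying assumptions for the example, not DORA's official survey method.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], window_days: int) -> dict[str, float]:
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    lead_times = [(d.deployed_at - d.committed_at).total_seconds() / 3600
                  for d in deployments]
    failures = [d for d in deployments if d.failed]
    restores = [(d.restored_at - d.deployed_at).total_seconds() / 3600
                for d in failures if d.restored_at is not None]
    return {
        "deployments_per_day": len(deployments) / window_days,
        "median_lead_time_hours": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "median_time_to_restore_hours": median(restores) if restores else 0.0,
    }


t0 = datetime(2024, 1, 1, 9, 0)
sample = [
    Deployment(t0, t0 + timedelta(hours=2)),
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0, t0 + timedelta(hours=6), failed=True,
               restored_at=t0 + timedelta(hours=7)),
    Deployment(t0 + timedelta(days=1), t0 + timedelta(days=1, hours=8)),
]
metrics = dora_metrics(sample, window_days=2)
print(metrics["deployments_per_day"])           # 2.0
print(metrics["median_lead_time_hours"])        # 5.0
print(metrics["change_failure_rate"])           # 0.25
print(metrics["median_time_to_restore_hours"])  # 1.0
```

Tracking these numbers before and after a decomposition is one way to check whether microservices are delivering the independent-release benefit they were adopted for.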

Netflix at 700 services manages: 700 separate CI/CD pipelines, 700 separate deployment configurations, distributed tracing across all inter-service calls, on-call rotations for each service, and capacity planning for each service independently. Each of those operational concerns grows at least linearly with the number of services.

Sam Newman's guideline, drawn from working with hundreds of organisations, is that a team should be able to explain why they need microservices before adopting them. The specific pain points that justify the cost are: independent deployment speed across multiple teams, significantly different scaling requirements per component, or mandatory technology diversity. Adopting microservices for future flexibility, because it is fashionable, or because it is what large companies do, almost always produces a distributed monolith with no autonomy benefit.

You must be this tall to use microservices. You need to be this tall: a high-functioning CI/CD pipeline, monitoring and tracing across services, and a team mature enough to operate distributed systems before the benefits outweigh the costs.

Sam Newman - Chapter 1

Newman is among the most widely cited practitioners on microservices. His point is that microservices require operational maturity as a prerequisite. Teams that adopt microservices before building this foundation spend most of their time managing infrastructure failures, not shipping features.

Common misconception

Each microservice should be as small as possible.

Size is not the goal; a single business capability is. A service that handles user authentication is the right size if authentication is a distinct business capability with its own team, its own release cadence, and its own scaling requirements. Making it smaller by splitting into sub-services adds network complexity without benefit. The naming of the pattern is misleading: 'micro' describes the scope of responsibility, not the lines of code.

At scale, microservices run across thousands of physical and virtual servers, each hosting one or more service instances managed by Kubernetes.

10.6 Check your understanding

A 15-person startup plans to build their SaaS platform as microservices from day one. They have no CI/CD pipeline and are still discovering their domain model. What is the most significant risk?

An Order Service needs to reserve inventory and process payment as part of order placement. The Payment Service fails after the Inventory Service has reserved the items. Which pattern prevents the system from being left in an inconsistent state?

Netflix built Chaos Monkey to randomly terminate production service instances. What does this tool prove about their microservices architecture, and what would it reveal about a distributed monolith?

A team splits a monolith into microservices. Two services need the same customer data. Service A owns the customer table. What data management pattern should Service B use to access customer information?

Key takeaways

  • A microservice implements a single business capability, owns its own data store, and is independently deployable. All three properties are required; a shared database destroys independent deployability.
  • The microservices vs monolith decision is organisational. Shopify runs a monolith with 10,000 engineers. Netflix uses 700 microservices. Both are correct for their team structure.
  • Service discovery (Consul, Kubernetes DNS) is required infrastructure for a production microservices system, and a service mesh (Istio, Linkerd) moves encryption, routing, and resilience policy out of application code.
  • The saga pattern handles distributed transactions by chaining local transactions with compensating rollback steps, accepting eventual consistency in exchange for service autonomy.
  • Netflix's 4-year migration was enabled by building Hystrix, Eureka, Zuul, and Chaos Monkey first. Operational infrastructure precedes service decomposition, not the reverse.

Standards and sources cited in this module

  1. Newman, S. Building Microservices. 2nd edition. O'Reilly Media, 2021.

    Chapters 1, 2, 4, and 6

    The most thorough practical reference for microservices. Sam Newman's 'you must be this tall' threshold is drawn from Chapter 1. The saga pattern coverage in Section 10.4 follows his treatment in Chapter 6.

  2. Fowler, M. and Lewis, J. Microservices. martinfowler.com, 2014.

    Characteristics of a Microservice Architecture

    The article that named the pattern. The definition quoted in Section 10.1 is from this article. The bounded context decomposition approach in Section 10.2 is also discussed here.

  3. Richardson, C. Microservices Patterns. microservices.io

    Saga pattern, API Gateway pattern, Service Discovery pattern

    The canonical pattern catalogue. Used in Section 10.3 for service discovery and mesh patterns, and in Section 10.4 for the saga pattern with compensating transactions.

  4. Fowler, M. MonolithFirst. martinfowler.com, 2015.

    Full article

    The primary counterpoint to microservices-first thinking. Referenced in Section 10.2 for the recommendation to start with a modular monolith.

  5. Netflix Technology Blog. Completing the Netflix Cloud Migration. 2016.

    Full post

    Primary source for the Netflix migration timeline, the 700 microservices figure, and the Chaos Monkey tool described in the opening story.

What comes next: Microservices create service boundaries. The next question is how those services communicate without creating tight coupling. Module 11 introduces event-driven architecture: publish-subscribe patterns, event streaming, and the architectural distinction between commands, events, and queries.

Module 10 of 22 in Applied