System Design Notes
Don’t forget to get your copy of Designing Data-Intensive Applications, the single most important book to read for system design interview prep!

Replication

Keeping a copy of the same data on multiple machines connected by a network is known as “replication.”

Why Replicate?

There are several reasons to replicate data, including:

  • Keeping data geographically closer to its consumers.
  • Tolerating failures, so the system keeps working when some of its parts fail.
  • Scaling out the number of machines that can serve read queries.

Replication becomes tricky when the replicated data can change over time. If the data never changed, we could copy it to every machine once and be done. However, data in any meaningful distributed system changes over time, and keeping the replicated copies identical on all machines becomes challenging. For now, we’ll assume that each copy of the data we are trying to replicate fits on a single machine. We’ll explore the case of data that can’t fit on a single machine later.
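To make the challenge concrete, here is a minimal sketch in Python. It assumes three in-memory dictionaries standing in for replicas, and the helper `replicas_agree` and the `balance` key are hypothetical names chosen for illustration. It shows only that a write reaching one copy leaves the replicas inconsistent until it is propagated to the rest.

```python
# Three identical copies of the same data, one per (simulated) machine.
replicas = [{"balance": 100} for _ in range(3)]


def replicas_agree(replicas):
    """Return True when every replica holds exactly the same data."""
    return all(r == replicas[0] for r in replicas)


agree_before = replicas_agree(replicas)

# A write reaches only the first replica: the copies now differ.
replicas[0]["balance"] = 80
agree_after_partial_write = replicas_agree(replicas)

# Propagating the same write to the remaining replicas restores agreement.
for replica in replicas[1:]:
    replica["balance"] = 80
agree_after_propagation = replicas_agree(replicas)
```

How and when that propagation step happens is what the replication approaches discussed in these notes differ on.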

Ways to Replicate

There are three primary algorithms used to replicate data, each with its own pros and cons.

  • Single-leader replication
  • Multi-leader replication
  • Leaderless replication

Additionally, replication can be synchronous or asynchronous, and systems such as databases usually have knobs and levers to tweak replication behavior.
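The difference between synchronous and asynchronous replication can be sketched in a few lines of Python. This is an illustrative toy, not how any particular database implements it: the `Leader` and `Follower` classes, the `synchronous` flag, and the `flush` method are all hypothetical, and a simple queue stands in for the network and the background replication stream.

```python
from collections import deque


class Follower:
    """A replica that applies writes forwarded to it."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Leader:
    """Accepts writes and forwards them to followers (hypothetical sketch)."""

    def __init__(self, followers, synchronous):
        self.data = {}
        self.followers = followers
        self.synchronous = synchronous
        self.pending = deque()  # writes not yet sent to followers (async mode)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Synchronous: do not return until every follower has the write.
            for follower in self.followers:
                follower.apply(key, value)
        else:
            # Asynchronous: return immediately and replicate later.
            self.pending.append((key, value))

    def flush(self):
        """Simulate the background replication stream catching up."""
        while self.pending:
            key, value = self.pending.popleft()
            for follower in self.followers:
                follower.apply(key, value)


# Synchronous: the follower has the write as soon as write() returns.
sync_follower = Follower()
sync_leader = Leader([sync_follower], synchronous=True)
sync_leader.write("x", 1)
sync_read = sync_follower.data.get("x")

# Asynchronous: the follower lags until the pending writes are delivered.
async_follower = Follower()
async_leader = Leader([async_follower], synchronous=False)
async_leader.write("x", 1)
stale_read = async_follower.data.get("x")  # follower has not caught up yet
async_leader.flush()
fresh_read = async_follower.data.get("x")  # follower has caught up
```

In this sketch the synchronous write does more work before returning but leaves the follower up to date, while the asynchronous write returns sooner and leaves a window in which a read from the follower sees old data.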