System Design Notes
Don’t forget to get your copy of Designing Data Intensive Applications the single most important book to read for system design interview prep!

Multi Leader Replication Issues

Some of the problems with multi-leader replication include:

  • Copies of the same data being modified concurrently in different datacenters requiring conflict resolution.
  • Writes being lost from a data center that experiences permanent failure if they haven’t been replicated to other leaders.
  • In case of databases, auto incrementing keys, triggers, and integrity constraints can be problematic to handle.

Handling Conflicts

One of the major problems of multileader replication is conflicting writes. Suppose you are editing a Google document simultaneously with a colleague sitting across the globe. One of you changes the first heading in the document from A to B and the other changes the same heading from A to C respectively. The changes by the two users are committed at the local leader in the nearest data center to each of the users. Unbeknownst to the parties, a conflicting write has been recorded for the document. Also, it can’t be said if the change from A to B is correct or the change from A to C. Both are equally valid. The conflicting write only gets detected at a later time when the write is asynchronously copied between datacenters and too late to ask either of the users to resolve the conflict.

The single leader architecture doesn’t suffer from conflicting writes, since the writes are serialized and the conflicting second write can be blocked or aborted and the user forced to retry. We could potentially fix conflicting writes in a multi-leader environment by detecting a conflict synchronously. A write request is replicated to all the leaders and followers before acknowledging success. However, doing so will essentially prevent the leaders in a multileader architecture from accepting writes independently of other leaders in the system.

Conflict avoidance

A reasonable strategy is to avoid conflicts altogether in multi-leader architectures. Consider a record for a user’s profile on TikTok. The system can enforce updates to a given record to always route to the same leader or datacenter. This essentially serializes the writes to the record at a single leader. These writes are asynchronously replicated to other datacenters. Thus, the issue of conflicts doesn’t arise. However, there are caveats to plans such as:

  1. The user moves geographically or uses a different device which causes the assigned leader to change.
  2. The assigned leader or datacenter experiences failure.

In systems with multi-leader architectures, we eventually want the data copies at each leader to converge to the same snapshot of data. If a record is changed from A to B at leader#1 and from A to C at leader#2 around the same time, we have a couple of ways to resolve the conflicting writes at the two leaders:

  1. The timestamp of the latest write wins and is considered the final value for a record. This policy is called LWW (Last Write Wins). This approach is prone to data loss. This approach can be generalized to assign unique IDs to every write and then allow the write with the highest ID in value to be the last value for a record.
  2. Another approach is to assign IDs to replicas and the write request received at the replica with the highest ID be the final value for the record and discard conflicting write requests received at replicas with IDs lower in value. This approach is also prone to data loss.
  3. Merge or concatenate the conflicting writes. For instance, Evernote uses this approach. If a note within the app has conflicting writes, the app will merge/concatenate the two writes and leave it up to the user to resolve them. In our hypothetical record example, if the values B and C are strings then they can be concatenated as B/C and left for the user to resolve.
  4. The fourth approach is to store the conflicting writes in separate data structures and then have custom application code that automatically resolves the conflict and stores the correct value.

Automatic Conflict Resolution

Automatic resolution of conflicting writes is a complex and broad topic but we’ll briefly discuss some aspects of it here. There are data structures such as lists, maps, etc that can be edited by multiple users concurrently and have the ability to intelligently resolve conflicts automatically. These CRDT (conflict-free replicated datatypes) use sensible rules to auto-resolve conflicts. Similarly, there exist the mergeable persistent data structures, which track history of changes like a version control system (e.g. Git) and have the ability to merge conflicts. Collaborative editing products such as Google docs or Apache Wave use operational transformational technology for resolving conflicts in multi user scenarios.