Video Streaming Queue: A Concurrent Writes Example
Consider two users A and B, using the same credentials to log-in to a video streaming service such as
YouTube. Let’s assume these two users sit in different geographical locations unaware of each other’s
actions. We’ll also assume the service has a single replica to make the example simple. Both users try
to add movies for various actors in the queue and their interactions with the service are captured
below. Instead of a two user scenario, we could also play the sequence of events for a single user who
interacts with the service using two different devices.
- User A starts first and adds a movie for Tom Cruise in the queue. The browser sends this update to
the backend server which notes the state of the queue [Tom] as version number#1. We’ll use first
names of actors in the queue to keep the example simple. Furthermore, the server responds to the
client request with the current state of the queue as well as its version. Since the queue was empty
to begin with, the state returned by the server is [Tom] and version#1.
- Next user B adds a Brad Pitt movie to the queue. Note, that from user B’s perspective the queue is
empty and she doesn’t know what movies user A has already put into the queue. When the server
receives the entry from user B as [Brad], the server must merge the state of the queue with that
received from user A, since the writes from the two users are concurrent, i.e. neither knows about
the other. The server thus stores the queue state as [Tom, Brad] and increments the queue state to
version#2 and the same is sent back to user B.
- Next, user A wants to add a George Cloony movie in the queue. User A received [Tom] and version#1
from the server in response to the first write request. User A sends the previously received version
and the new queue state [Tom, George] to the server. When the server receives user A’s second write
request, it can determine from the version number sent by user A that the user wants to overwrite
the version#1 state of the queue with the new state [Tom, George]. The server, however, already
received the write request from user B to include a Brad Pitt movie in the queue. You can now
appreciate the shortcomings of a LWW-like conflict resolution mechanism, which would simply override
the queue state with [Tom, George] and drop user B’s write request for Brad. We can assume that the
server simply merges the write requests as a conflict resolution mechanism. Thus the server will
override the queue state versioned#1 and merge the outcome with any concurrent writes received from
B. The resulting queue state will be [Tom, Brad, George] and the version will be bumped to 3. Note,
the server responds to the write request of user A with the updated queue state as well as the
version.
- Say now user B wants to add a Johnny Depp movie to the queue. This new write request builds upon the
response of the first write request by user B, which was [Tom, Brad]. The new write request is
therefore causally dependent on the response of the first write request. The second write request
from user B should overwrite the version#2 of the queue state. Thus the user B sends to the server
[Tom, Brad, Johnny] and version#2. The server on receiving the write request overwrites version#2 of
the queue state and also merges the concurrent writes from user A that occurred after version#2 of
the queue state. The queue state is thus changed to [Tom, Brad, George, Johnny] and the server sends
it back to user B along with the version bumped-up to 4.
- The pattern should be apparent now. User A wants to add a Angelina movie to the queue and sends the
state [Tom, Brad, George, Angelina] along with version#3 to the server. The server overwrites the
queue state at version#3 [Tom, Brad, George] with the new write [Tom, Brad, George, Angelina]
received from user A and at the same time it merges the concurrent writes from user B received after
version#3 of the queue state. Finally, the queue state at the server is [Tom, Brad, George,
Angelina, Johnny] and the version is 5, which are also sent back to user A.
We can observe from the above events that the server attempts to distinguish between overwrittes and
concurrent writes based on version numbers of the queue’s state. The client has the onus of merging the
contents it receives from its immediately preceding write with the new write the client intends to make
and ship the result to the server. The server can safely replace a queue state with the same or a higher
versioned queue state because the client must have sent the merged results back. The contents from the
higher versioned queue states are merged with the lower versioned queue state as the contents in the
higher versioned queue state represent concurrent writes with respect to the contents of the lower
versioned queue state. As a concrete example, consider when user B sends an update to the server in
step#4. The update consists of the new write Johnny, and the previous state of the queue which already
contains the merged concurrent writes from user A and user B i.e. [Tom, Brad]. The server receives [Tom,
Brad, Johnny] with version 2. The server holds a higher versioned queue state numbered 3, which implies
that there are concurrent writes with respect to what the server just received from user B and attempts
to resolve the conflict by taking the union of values. Also, note that in case if user B were the only
user then each time the server received a request from user B, the server would overwrite the existing
version and would not need to increment the version of the queue’s state.
The above example treats the user as the key and the state of the user’s queue as the value of the key.
We could have used any other key/value pair for the example. The concurrent writes are resolved by
taking a union of the contents, in some systems such as Riak the concurrent writes are referred to as
siblings. Taking a union of siblings works well for some scenarios but may not be suitable in others
e.g. when movies can also be removed by a user. A user adding and another user removing the same movie
from the queue as concurrent writes will lead to a queue state with the movie present when the union of
siblings is applied as a conflict resolution mechanism.
Version vectors
The scheme we described above assigns a version per key. When multiple replicas are involved, a version
is assigned per key per replica. Each replica maintains the version number for a given key locally that
is incremented when the replica processes a write to the key. Additionally, each replica also tracks the
version numbers for a given key from other replicas. All the versions for a key across all the replicas
is referred to as a version vector and is sent to the client upon a read of the key.