Data Consistency in Apache Cassandra — Part 3

Rajendra Uppal
Software Architecture
Aug 24, 2017

In part 2, I explained how to achieve immediate and eventual consistency using different write and read consistency levels.

In this part, I’ll go a bit deeper into understanding different configuration settings and consistency levels.

Immediate Consistency with Write CL = ALL, Read CL = ONE

  1. The write request is sent to all replicas.
  2. The write is considered successful only when all replicas acknowledge it. If any replica is down or fails the write, the whole write request fails. Once all replicas have acknowledged, we are guaranteed that the data or partition we just wrote is identical on every replica.
  3. A read request reads from only 1 of the 3 replicas, because Read CL is set to ONE. Because the data is the same on all 3 replicas, the read is guaranteed to return the most recent data.
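The acknowledgement rule in step 2 can be sketched in a few lines of Python. This is only an illustration, not Cassandra code: the function name `write_succeeds` and the replication factor of 3 are assumptions made for the example.

```python
def write_succeeds(replica_acks, required_acks):
    """Return True if enough replicas acknowledged the write.

    replica_acks is a list of booleans, one per replica (hypothetical model).
    """
    return sum(replica_acks) >= required_acks


RF = 3  # replication factor assumed for this example

# All three replicas acknowledge, so Write CL = ALL is satisfied.
all_up = write_succeeds([True, True, True], required_acks=RF)

# One replica is down, so Write CL = ALL fails even though two replicas responded.
one_down = write_succeeds([True, True, False], required_acks=RF)

print(all_up, one_down)
```

With Write CL = ALL, a single unavailable replica is enough to fail the write, which is the availability cost of this configuration.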

Immediate Consistency with Write CL = ONE, Read CL = ALL

  1. With Write CL = ONE, the write request is still sent to all replicas, but only one of them needs to acknowledge it for the whole write request to succeed. Cassandra attempts to write the data to all 3 nodes, but once the operation succeeds you are only guaranteed that the data is current on 1 node.
  2. The read request queries all 3 replicas. The replicas respond to the coordinator, which merges the responses (that is, finds the most recent data) and sends a single response to the client. Because all replicas are queried, at least one of them is guaranteed to hold the most recent copy of the requested data.
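The merge in step 2 can be pictured as picking the response with the newest write timestamp. The sketch below is a simplified model, not the coordinator's real implementation: `ReplicaResponse`, `merge_responses` and the timestamp values are hypothetical names and data chosen for illustration.

```python
from typing import NamedTuple


class ReplicaResponse(NamedTuple):
    replica: str
    value: str
    write_timestamp: int  # hypothetical timestamps, larger means newer


def merge_responses(responses):
    """Return the response that carries the newest write timestamp."""
    return max(responses, key=lambda r: r.write_timestamp)


# Only replica-1 has seen the latest write; the other two still hold the old value.
responses = [
    ReplicaResponse("replica-1", "v2", 2000),
    ReplicaResponse("replica-2", "v1", 1000),
    ReplicaResponse("replica-3", "v1", 1000),
]

latest = merge_responses(responses)
print(latest.value)  # v2
```

Because Read CL = ALL includes every replica in the merge, the one node that acknowledged the write is always part of the comparison.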

So, this configuration is a good way to achieve immediate consistency if you need very fast, highly available writes. You optimize writes at the expense of slower and less available reads, because a read at CL = ALL fails if any replica is down.

Immediate Consistency with Write CL = QUORUM, Read CL = QUORUM

  1. A write request is sent to all 3 replicas, with the expectation that the data will be written successfully to all 3 replica nodes.
  2. Because Write CL is set to QUORUM, the write operation succeeds as soon as a simple majority of the replicas (2 nodes here) acknowledge it.
  3. A read request is sent to only a simple majority of the replicas, because Read CL is also set to QUORUM. The coordinator then merges the responses from both replicas and returns the most recent data.
  4. We are guaranteed immediate consistency because at least 1 replica node overlaps between the 2 nodes that acknowledged the write and the 2 nodes that are read.

So, using QUORUM as the consistency level for both reads and writes gives a balanced approach: read and write performance, availability and throughput are all balanced.
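The overlap argument in step 4 can be checked by brute force. The following is a minimal sketch assuming 3 replicas named A, B and C (hypothetical labels) and a quorum defined as a simple majority.

```python
from itertools import combinations


def quorum(replication_factor):
    """Simple majority of the replicas."""
    return replication_factor // 2 + 1


replicas = ["A", "B", "C"]  # hypothetical replica names, RF = 3
q = quorum(len(replicas))   # 2 for RF = 3

# Check every possible pair of (replicas that acknowledged the write,
# replicas that served the read) and confirm they share at least one node.
always_overlap = all(
    bool(set(written) & set(read))
    for written in combinations(replicas, q)
    for read in combinations(replicas, q)
)

print(q, always_overlap)  # 2 True
```

Any two majorities of the same 3 nodes must share a node, so a QUORUM read always includes at least one replica that holds the latest QUORUM write.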

Eventual Consistency with Write CL = ONE, Read CL = ONE

  1. The write request is sent to all 3 replicas but is considered successful even if only 1 node acknowledges it.
  2. The read request queries only 1 replica node, so it is possible that the data just read is not the most recently written data.

As we saw earlier, eventual consistency is typically only a few milliseconds away, so in normal scenarios we are not going to read inconsistent data for very long.

What you gain from this configuration is the highest read and write performance, the highest read and write throughput and the highest read and write availability. To gain all those benefits, you trade immediate consistency for eventual consistency.
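The four configurations above can be compared side by side. The sketch below uses the common rule of thumb that reads see the latest write when the number of replicas read plus the number of replicas written exceeds the replication factor. The names `LEVELS` and `describe` are hypothetical, and a replication factor of 3 is assumed.

```python
RF = 3  # replication factor assumed throughout this article

# Number of replicas that must respond at each consistency level.
LEVELS = {"ONE": 1, "QUORUM": RF // 2 + 1, "ALL": RF}


def describe(write_cl, read_cl):
    """Summarize consistency and how many replicas may be down per operation."""
    w, r = LEVELS[write_cl], LEVELS[read_cl]
    return {
        "consistency": "immediate" if w + r > RF else "eventual",
        "write_tolerates_down": RF - w,
        "read_tolerates_down": RF - r,
    }


for write_cl, read_cl in [
    ("ALL", "ONE"),
    ("ONE", "ALL"),
    ("QUORUM", "QUORUM"),
    ("ONE", "ONE"),
]:
    print(write_cl, read_cl, describe(write_cl, read_cl))
```

Only the ONE/ONE combination falls below the overlap threshold, and it is also the only one where both reads and writes keep working with 2 of the 3 replicas down.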

Comments and thoughts welcome. Cheers!

Software Engineering Leader, Formerly at Microsoft, Adobe, Studied at IIT Delhi and IIT Kanpur. www.linkedin.com/in/rajendrauppal