Apache Cassandra is an open source, replicated store of key-value pairs (also commonly known as a NoSQL database) modeled after Amazon’s Dynamo. Like Dynamo, Cassandra’s replication can be described as active-active, fault-tolerant, highly available and peer-to-peer. Unlike WANdisco’s DConE replication technology, however, they do not guarantee consistency. Instead, they guarantee what they call “eventual consistency”.
“Eventual consistency” in this context means that Cassandra has to lie to you sometimes.
Let’s look at an example to illuminate this further.
Imagine an online merchant using Cassandra with a dozen worldwide replicas for inventory management. Our merchant’s supply of Widgets is down to the last one, and we start with all Cassandra replicas accurately reflecting this worldwide supply of one Widget.
Two customers, Michael and James, order a Widget at the nearly same time. Both orders are fulfilled because the ordering software accesses two different replicas of Cassandra, each replica reporting one available Widget. One replica has a record indicating that the last Widget was sold to Michael. The other has a conflicting record indicating that the last Widget was sold to James. Nobody is aware of this conflict.
Using the “gossip protocol”, the replicas now push records around until they are fully distributed.
Eventually, a replica sees both records, detects a conflict and moves to resolve the conflict. A simplistic algorithm may resolve the conflict in favor of Michael, perhaps using timestamps and automatically generating an apology to James. Over time, James’ order is rescinded in favor of Michael’s at all replicas, and other than the effect on customer relationship with a disappointed James, everything returns to normal. Let’s just hope that in the midst of this process of conflict resolution and gossip protocol, the shipping department did not consult a replica with the wrong information and sent the last Widget to James!
Let’s now look at the same scenario using a true active-active technology with absolute consistency like WANdisco’s DConE replication.
As before, Michael and James try to buy a Widget.
Two replicas generate proposals. One proposes that a Widget be sold to Michael. The other proposes that a Widget be sold to James. DConE delivers these proposals to the replicas, and they conclude Michael’s order was fulfilled and James’ was not.
Since all replicas have the same information, the replica accessed by Michael informs him that his order is fulfilled. The replica accessed by James informs him that Widgets are sold out. The replica accessed by Shipping indicates that the Widget should ship to Michael.
So while Cassandra may not want to lie to you, it can’t help it due to the effect of its eventual consistency. DConE never lies, so when your data is really important to you, DConE’s absolute consistency is the ultimate in data-safe replication technology.