Let’s play with Cassandra… (Part 1/3)
As I have said before, NoSQL is about diversity: it covers many different tools, and even different kinds of tools. Cassandra is one of them and is currently among the most popular in the NoSQL ecosystem. Built by Facebook and running in production at web giants such as Digg and Twitter, Cassandra is a hybrid between Dynamo and BigTable.
It is a hybrid, first, because Cassandra models data in a column-oriented way (inspired by BigTable) and lets you run Hadoop Map/Reduce jobs, and second, because it borrows patterns from Dynamo such as eventual consistency, gossip protocols, and a master-master approach to serving both read and write requests…
Another trait in Cassandra's DNA (shared, in fact, by many NoSQL solutions) is that it was built to be fully decentralized, designed for failure, and datacenter aware (in the sense that you can configure Cassandra to replicate data across several datacenters…). For instance, Cassandra is currently used across Facebook's US west and east coast datacenters and, around two years ago, stored 50+ TB of data on a 150-node cluster.

