Visual Guide To NoSQL Systems - Nathan Hurst's Blog
Nathan Hurst
I'm on the technical side of entrepreneurship in NYC. I love programming, board games, and my wife. Find out more about
me at nahurst.com.
Posted over 3 years ago
256,083 views
Tags: guide, database, nosql, comparison, data, cap, visual
Without further ado, here's what you came here for (and further explanation after the visual).
Note: RDBMSs (MySQL, Postgres, etc) are only featured here for comparison purposes. Also, some of these systems can
vary their features by configuration (I use the default configuration here, but will try to delve into others later).
1 of 10 29.07.2013 11:07
Visual Guide to NoSQL Systems - Nathan Hurst's Blog http://blog.nahurst.com/visual-guide-to-nosql-systems
As you can see, there are three primary concerns you must balance when choosing a data management system: consistency,
availability, and partition tolerance.
Consistency means that each client always has the same view of the data.
Availability means that all clients can always read and write.
Partition tolerance means that the system works well across physical network partitions.
According to the CAP Theorem, you can only pick two. So how does this all relate to NoSQL systems?
One of the primary goals of NoSQL systems is to bolster horizontal scalability. To scale horizontally, you need strong
network partition tolerance which requires giving up either consistency or availability. NoSQL systems typically
accomplish this by relaxing relational abilities and/or loosening transactional semantics.
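To make the trade-off concrete, here is a minimal toy sketch (not how any real system is implemented) of a two-replica store that behaves as either CP or AP once the network partitions. The class names, the `mode` flag, and the `partitioned` switch are all hypothetical, invented for illustration.

```python
class Replica:
    """One copy of a single value, tagged with a version counter."""

    def __init__(self):
        self.value = None
        self.version = 0


class TwoNodeStore:
    """Toy two-replica store; `mode` is "CP" or "AP" (hypothetical names)."""

    def __init__(self, mode):
        self.mode = mode
        self.replicas = [Replica(), Replica()]
        self.partitioned = False

    def write(self, node, value):
        if self.partitioned and self.mode == "CP":
            # CP: refuse the write rather than let the replicas diverge.
            raise RuntimeError("unavailable during partition")
        target = self.replicas[node]
        target.version += 1
        target.value = value
        if not self.partitioned:
            # Healthy network: copy the write to the other replica at once.
            other = self.replicas[1 - node]
            other.value, other.version = target.value, target.version

    def read(self, node):
        return self.replicas[node].value


cp, ap = TwoNodeStore("CP"), TwoNodeStore("AP")
for store in (cp, ap):
    store.write(0, "v1")       # replicated to both nodes
    store.partitioned = True   # the network splits

ap.write(0, "v2")              # AP accepts the write...
print(ap.read(0), ap.read(1))  # ...but node 1 still serves the old value
try:
    cp.write(0, "v2")          # CP rejects it to stay consistent
except RuntimeError as err:
    print("CP:", err)
```

The AP store stays writable but its two nodes disagree until they can sync again; the CP store keeps both nodes in agreement by turning the write away.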
In addition to CAP configurations, another significant way data management systems vary is by the data model they use:
relational, key-value, column-oriented, or document-oriented (there are others, but these are the main ones).
Relational systems are the databases we've been using for a while now. RDBMSs and systems that support ACIDity
and joins are considered relational.
Key-value systems basically support get, put, and delete operations based on a primary key.
Column-oriented systems still use tables but have no joins (joins must be handled within your application).
They store data by column rather than by row, as traditional row-oriented databases do. This makes aggregations
over a single column much cheaper, because only that column has to be read.
Document-oriented systems store structured "documents" such as JSON or XML but have no joins (joins must be
handled within your application). It's very easy to map data from object-oriented software to these systems.
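The differences between these data models can be sketched with plain Python structures. This is only an illustration under simplifying assumptions: the sample data and function names are made up, and real systems store and index data very differently.

```python
# Key-value: get, put, and delete by primary key.
kv = {}
kv["user:ann"] = "Ann"         # put
name = kv.get("user:ann")      # get
del kv["user:ann"]             # delete

# The same three orders in a row-oriented and a column-oriented layout.
rows = [
    {"id": 1, "customer": "ann", "total": 30},
    {"id": 2, "customer": "bob", "total": 12},
    {"id": 3, "customer": "ann", "total": 8},
]
columns = {
    "id": [1, 2, 3],
    "customer": ["ann", "bob", "ann"],
    "total": [30, 12, 8],
}


def sum_total_rows(rows):
    # Walks every row record just to pull out one field.
    return sum(row["total"] for row in rows)


def sum_total_columns(columns):
    # Reads the single column it needs and nothing else.
    return sum(columns["total"])


# Document-oriented: no joins in the store, so the application joins.
customers = {
    "ann": {"name": "Ann", "city": "NYC"},
    "bob": {"name": "Bob", "city": "Boston"},
}


def orders_with_customer(orders, customers):
    # Application-side join: look up each order's customer document by key.
    return [
        {**order, "customer_doc": customers[order["customer"]]}
        for order in orders
    ]


joined = orders_with_customer(rows, customers)
```

Both sum functions return the same number; the point is which data each one has to touch to get there.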
Now for the particulars of each CAP configuration and the systems that use each configuration:
Consistent, Available (CA) Systems have trouble with partitions and typically deal with them through replication. Examples of
CA systems include:
Consistent, Partition-Tolerant (CP) Systems have trouble with availability while keeping data consistent across
partitioned nodes. Examples of CP systems include:
BigTable (column-oriented/tabular)
Hypertable (column-oriented/tabular)
HBase (column-oriented/tabular)
MongoDB (document-oriented)
Terrastore (document-oriented)
Redis (key-value)
Scalaris (key-value)
MemcacheDB (key-value)
Berkeley DB (key-value)
Available, Partition-Tolerant (AP) Systems achieve "eventual consistency" through replication and verification. Examples
of AP systems include:
Dynamo (key-value)
Voldemort (key-value)
Tokyo Cabinet (key-value)
KAI (key-value)
Cassandra (column-oriented/tabular)
CouchDB (document-oriented)
SimpleDB (document-oriented)
Riak (document-oriented)
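As a rough sketch of what "eventual consistency" can look like, the snippet below lets two replicas accept writes independently and then reconciles them with a last-write-wins merge. This is one simple strategy chosen for illustration, not the method any particular system above uses (several rely on vector clocks or application-level conflict resolution instead). The timestamps, keys, and the tie-break rule are assumptions.

```python
def merge(a, b):
    """Merge two replica states of the form {key: (timestamp, value)}.

    The newest timestamp wins per key. On equal timestamps the larger
    value wins, which is an arbitrary but deterministic tie-break
    (an assumption of this sketch).
    """
    merged = dict(a)
    for key, stamped in b.items():
        if key not in merged or stamped > merged[key]:
            merged[key] = stamped
    return merged


# Both replicas start out in agreement.
replica_a = {"cart": (1, "book")}
replica_b = {"cart": (1, "book")}

# During a partition each side keeps accepting writes.
replica_a["cart"] = (2, "book,pen")
replica_b["cart"] = (3, "book,mug")
replica_b["wish"] = (2, "lamp")

# Once the partition heals, both sides exchange state and converge.
replica_a = replica_b = merge(replica_a, replica_b)
print(replica_a)
```

Note that the "pen" update is silently dropped: last-write-wins guarantees that the replicas converge, not that every concurrent write survives.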
If you're a developer and looking for a job or if you're hiring developers and these data systems are important to you,
consider coming to Hirelite: Speed Dating for the Hiring Process on Tuesday.
This guide draws heavily from a recent Ruby meetup (by Matthew Jording and Michael Bryzek) and a recent
MongoDB presentation (given by Dwight Merriman).
Thanks to DBNess and ansonism for their help with validating system categorizations.
Thanks to those who helped shape the post after it was written: Stan, Dwight, and others who commented here and on
this Hacker News thread.
Update: Here's a print version of the Visual Guide To NoSQL Systems if you need one quickly (warning: it's not all that
pretty and I may not keep it updated, but as of 3/17/2010, it's current).
83 responses
I'm new to the NoSql movement, but I'm wondering where a graph database such as neo4j fits in.
— ariejdl over 3 years ago
I believe Neo4J locks nodes and edges until commit, so it would be a CP system with a graph data model.
— Nathan Hurst over 3 years ago
MongoDB has consistency? How can that be true if MongoDB does "lazy writes"?
http://ivoras.sharanet.org/blog/tree/2009-11-05.a-short-time-with-mongodb.html
— Stan Harris over 3 years ago
Stan, I am on the fence about MongoDB. Dwight Merriman, CEO of 10gen (the commercial MongoDB backer), says that
MongoDB is headed in the CP direction:
http://www.leadit.us/hands-on-tech/MongoDB-High-Performance-SQL-Free-Database
However, from what I understand, MongoDB does not currently use Paxos or 2PC to provide consistency. I'll look into this
a bit more (thanks for the link) and update as necessary.
File-system), complete query language.. maybe its the missing part for connecting objects (in an external index?).. take a
look: sones.com
— Steven Bailey over 3 years ago
Nathan, since CouchDB fully supports ACID properties, shouldn't it be under consistency?
http://couchdb.apache.org/docs/overview.html
— Stan Harris over 3 years ago
Without further _ado_
— Association for the Preservation of the English Language over 3 years ago
Thanks APEL
— Nathan Hurst over 3 years ago
This is a great topic that has not sufficiently been covered yet. A few comments:
- MySQL isn't really a distributed system, so I don't know where it belongs on here. Anyone else have thoughts? Also, the
answer is complex: master/slave replication in MySQL, if you do reads on the slave, is eventually consistent. But normally, as
a single server, it's strongly consistent.
- Not sure, but I thought Riak is Dynamo-based and thus AP.
- With CouchDB, it depends. On a single server it's strongly consistent, like almost all single-server systems. With
master-master replication in action, it's AP.
@stan: I think of CAP as somewhat orthogonal to ACID durability. They are related, but fairly separate topics: CAP is about the
distributed portion of the system. You could argue a system without full durability is theoretically inconsistent, period, in
which case there is nowhere to put it on this chart. So I think the best thing is to make this chart about distribution. Plus,
single-server durability is on the MongoDB roadmap: http://blog.mongodb.org/post/381927266/what-about-durability
"CP" is right for MongoDB: it's a lot like BigTable in terms of sharding, with a few twists. Currently the metadata storage
is updated with two-phase commits (so strongly consistent). Further, on each shard, at a given point in time, all writes go
first to one server (although over time the server in charge can vary), so that's strongly consistent too.
For example, suppose I am using MySQL and 3 data centers: 1 master and 2 slaves, 1 instance at each DC. The network
partitions. I can still read the local slave's data, but I can't write as I can't reach the master. So reads are available, and writes
aren't. Now, perhaps this qualifies as eventually consistent, as the data I'm reading may not be the most current. It is, however,
a consistent snapshot from the past. It's also easy to do with a conventional DB; what Dynamo added was availability for
writing.
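The scenario above can be mimicked with a small toy model. It is only a sketch: the `Master` and `Slave` classes and the `connected` flag are hypothetical and have nothing to do with MySQL's actual replication protocol, which ships changes asynchronously through a binary log.

```python
class Master:
    """Accepts all writes and forwards them to reachable slaves."""

    def __init__(self):
        self.data = {}
        self.slaves = []

    def write(self, key, value):
        self.data[key] = value
        for slave in self.slaves:
            if slave.connected:
                # Real replication is asynchronous; here it is immediate.
                slave.snapshot[key] = value


class Slave:
    """Serves reads from its local copy; writes must go through the master."""

    def __init__(self, master):
        self.master = master
        self.snapshot = {}
        self.connected = True
        master.slaves.append(self)

    def read(self, key):
        return self.snapshot.get(key)

    def write(self, key, value):
        if not self.connected:
            raise ConnectionError("cannot reach master")
        self.master.write(key, value)


master = Master()
local = Slave(master)            # the slave in my data center
local.write("balance", 100)      # healthy network: write goes via the master

local.connected = False          # the network partitions
master.write("balance", 80)      # someone else updates the master
stale = local.read("balance")    # still readable, but from the past
try:
    local.write("balance", 90)   # writes are unavailable on this side
    write_ok = True
except ConnectionError:
    write_ok = False
```

After the partition, `stale` still holds the last value the slave received, and `write_ok` is false: reads are available from an older consistent snapshot, writes are not.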
I'm not sure the best way to articulate the above, but throwing up the topic for discussion.
I just wanted to point out that Terrastore is a CA or CP system depending on server-to-master reconnection parameters: that
is, Terrastore can be configured to try to reconnect servers to master(s) for a given time window: in such a window, the
system will behave as CA, rather than CP.
Anyways, the default configuration is to behave as CP (no reconnection attempts).
Sergio B.
it's exactly the same in how it shards -- it has "chunks" ("tablets"?) which have key ranges that are stored as metadata.
It is very different from Bigtable in some other respects, such as the storage engine and data model (JSON rather than
tabular)
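A minimal sketch of range-based sharding as described here: the metadata is a sorted list of chunk boundaries, and routing a key is a binary search over it. The boundaries and shard names below are made up, and real systems also split and migrate chunks, which this ignores.

```python
from bisect import bisect_right

# Hypothetical chunk metadata: each chunk covers keys from its lower bound
# (inclusive) up to the next chunk's lower bound (exclusive).
chunk_starts = ["", "g", "n", "t"]
chunk_shards = ["shard-a", "shard-b", "shard-a", "shard-c"]


def shard_for(key):
    """Return the shard that owns the chunk whose key range contains `key`."""
    index = bisect_right(chunk_starts, key) - 1
    return chunk_shards[index]


for key in ["apple", "g", "mongo", "zebra"]:
    print(key, "->", shard_for(key))
```

Because the lookup only needs the boundary list, a router can cache the metadata and send each request straight to the owning shard.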
It would also be worth mentioning that AsterData, Vertica, and Greenplum are database appliances that are highly optimized
for data distribution; all three employ a shared-nothing architecture and maintain consistency during loads.
Sergio B.
- text updates to clarify that this diagram is for non-single-server environments (thus I'm going to keep CouchDB on AP per
@dmerr's comments)
- text updates to clarify that systems are categorized in their default configurations. In the future, I'll try to delve into
multiple configurations (ex: SimpleDB, Terrastore, Riak)
- moving Riak to AP
- updating the definition of A to "each client can always read and write"
- removing the sharding reference on CA systems
And of course since "A" is almost always the most important thing, basically every production service at Google uses
multiple replicated bigtables (and is designed for this), thus for this graphic I think it would be misleading for Bigtable to be
anywhere but AP.
So basically in a worst case scenario where 2 different "set" operations occur on the same row at the exact same time, one of
them will essentially randomly win and that will be reflected in all of the Bigtables after the operation is replicated.
There's a kind of blurry line between availability and partition tolerance, but Nathan is right here: many systems choose CP
over CA because choosing CA would mean potentially blocking the whole cluster in case of network partitions. Choosing
CP instead means that the system is allowed to put partitioned nodes "out" of the cluster (making them unavailable),
so that the remaining nodes can continue working.
For a more detailed explanation: http://pl.atyp.us/wordpress/?p=2521
Cheers,
Sergio B.
My follow-up comment:
I believe the notion of fault tolerance lies between availability and partition tolerance. The systems that best showcase
it are AP systems, which are truly distributed and decentralized and give us eventual
consistency.
You are definitely right that the diagram somewhat implies that CA systems are more available than CP systems, but that rests
on the assumption that the high-end machines serving them, which avoid the cost of partitioning, are more available than
a grid of low-end servers, i.e. a distributed deployment.
Summarizing, I second Bradford (@LusciousPear) that consistency is your main choice, while partitioning and
availability are both knobs contributing to different levels and kinds of fault tolerance. It's best to treat them as knobs, and
NoSQL systems like Dynamo, Cassandra and Riak allow us to do that. Depending on your application context and
service requirements, you can tune those knobs and have the best of all three.
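One way to picture these knobs is the Dynamo-style quorum setting: with N replicas, a write waits for W acknowledgements and a read consults R replicas, and the two sets are guaranteed to overlap when R + W > N. The sketch below is a deliberately worst-case toy (writes reach only the first W replicas, reads consult only the last R); the class and its method names are hypothetical.

```python
def quorums_overlap(n, r, w):
    """Read and write quorums must share a replica when r + w > n."""
    return r + w > n


class QuorumStore:
    """Toy store with n replicas, each holding a (version, value) pair."""

    def __init__(self, n, r, w):
        self.n, self.r, self.w = n, r, w
        self.replicas = [(0, None)] * n

    def write(self, value, version):
        # Worst case: only the first w replicas receive the write.
        for i in range(self.w):
            self.replicas[i] = (version, value)

    def read(self):
        # Worst case: consult the last r replicas and return the newest value.
        contacted = self.replicas[self.n - self.r:]
        return max(contacted, key=lambda pair: pair[0])[1]


strict = QuorumStore(n=3, r=2, w=2)   # r + w > n: quorums overlap
strict.write("old", version=1)
strict.write("new", version=2)

loose = QuorumStore(n=3, r=1, w=1)    # r + w <= n: a read can miss the write
loose.write("new", version=1)

print(strict.read(), loose.read())
```

Lower R and W make each operation cheaper and more available; raising them until the quorums overlap buys back read-your-latest-write behavior at the cost of needing more replicas reachable.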
Probably most suited for single server environment, but it does not require a server daemon. Dead simple to use for Python
apps.
For instance, Sybase IQ also uses such a "vertically fragmented" data storage, but it is also an RDBMS that supports SQL.
A more interesting example is MonetDB. The core of the MonetDB server is just a vertically fragmented data storage.
However, on top of this data storage, the MonetDB developers have built an RDBMS, a native XML DB, an RDF
store and a spatial DB! How do you categorize this polymorphic DBMS?
I think it is better to clearly distinguish between logical data models and physical data models. Otherwise you should
introduce a formal definition of the column-oriented "logical" data model.
also as a document store when looking at its REST API (some people we showed it to compared its data store to
CouchDB). Here are some talks about it:
- http://www.wakandasoftware.com/blog/nosql-but-so-much-more/
- http://www.slideshare.net/alexandre_morgaut/wakanda-js-conf-eu-09-slideshare
- http://jsconfeu.blip.tv/ (third video)
— amorgaut over 3 years ago
What about a NoSQLite initiative?
I really miss the embedded side. AFAIK, only BerkeleyDB (both Java and C editions), Tokyo/Kyoto Cabinet, and Neo4J fill this
gap for different requirements. Of course, I know about good ol' GDBM and the various filesystem-based serialization
mechanisms dynamic languages provide, but some points such as replication, speed and safety are always nice to have.
— Nando Sola over 3 years ago
...Not to mention that a JSON document-oriented embedded database à la CouchDB would be awesome.
— Nando Sola over 3 years ago
Nice Post Nathan. This is a great summary that will be very useful in helping me evaluate data store systems.
— MightyByte over 2 years ago
@MightyByte Thanks! Glad it was helpful.
— Nathan Hurst over 2 years ago
This is a great discussion, thanks to all. +1 on sugibuchi's request for better distinction between logical and physical data
models. As a big fan of the relational *data model*, I wince when I see generalizations about traditional relational DB
implementations that imply more than they should about the relational model itself. And I smile at the irony of the NoSQL
movement's unofficial slogan ("select fun, profit from real_world where relational=false;") because it implies a relational
model itself (ref. http://en.wikipedia.org/wiki/Nosql). In fact, relational purists know SQL barely qualifies as a relational
query language and it is merely one of those quirks of history that SQL is the dominant query language. This consideration
for distinguishing between data model and implementation applies equally to non-relational DBs, so I think it would help
everyone if we could try to keep them separate in our conversations. I personally would love to see the relational data model
creatively applied to other parts of the CAP universe. And I would love for critics of the relational model to familiarize
themselves with the old 1970s debates about navigational and relational data models, so that we avoid rehashing old
arguments.
Changing topics... consistency. Am I right that the C in CAP is a combination of A/C/I/D in ACID? If so that is unfortunate.
In various discussions there seem to be at least three interpretations of what Consistency means. One is about whether
multiple updates (that are related to each other and thus have to all be kept consistent with each other) happen in an all-or-
nothing (atomic) fashion. Another is about whether multiple clients see the same data values modulo time, geography, and
partitioning. And another is whether data that is stored can violate various rules like referential constraints or value
constraints. This seems to be a source of wasted energy, but I don't know the solution.
I don't have a good answer to your consistency question. Anyone else? I'll have to get back to you on it.
Consistency in distributed systems (hence in NoSQL systems) is all about the order of read/write operations as seen by the
clients.
That is, in a consistent system (such as MongoDB or Terrastore), clients are guaranteed to read/write the latest version that
has been previously written (or read, in case it was unmodified); in eventually consistent systems (such as Cassandra or
Riak) clients have no such guarantee, so they may actually read/write stale data.
So it's pretty different from the C in ACID, which refers to data constraints, and is maybe more a mix of A and I, where
atomicity and isolation must be taken in the context of a fully distributed system.
HTH,
Cheers,
Sergio B.
Furthermore, I wonder where to place ParStream (www.parstream.com) in your category system. ParStream is an analytical
database with a hybrid data store (row and/or column) that uses a highly compressed bitmap index, operates as MPP in
distributed environments (including redundancy, automatic rebalancing, and import and updates of distributed data),
provides interfaces for SQL, JDBC and a C++ API, and offers joins. Currently, it does not offer full ACID support as
known from RDBMSs.
Can anybody help me with the classification?
Thanks
If you have tables to join, use SQL. But if these are only a few tables and you normalized your data to have more tables to
index, you could denormalize your data, store it on a document-oriented DB and forget about joins.
Or do you store lists in a SQL DB? I once used a table like a fifo queue and it was a really stupid decision, because I had to
rewrite the index every few minutes. Use Redis for something like that.
http://www.sdbexplorer.com/