Wide-Column Stores
Big Data Management
Phil Bartie
EM G.29
Using material from
Alasdair Gray, HWU
Aidan Hogan, Universidad de Chile
Guillaume Marquis
https://www.tutorialspoint.com/cassandra/
https://pandaforme.gitbooks.io/introduction-to-cassandra/
Database Landscape
RDF: Virtuoso, Jena, Stardog, RDF4J, GraphDB, Blazegraph
Object: Caché, Db4o, Versant, ObjectStore
XML: MarkLogic, Sedna, Tamino, BaseX, eXist-db
Relational: Oracle, MySQL, MS SQL Server, PostgreSQL, DB2, SQLite, MS Access, Teradata, SAP Adaptive Server, Hive, FileMaker, MariaDB, Informix, Vertica
NoSQL
Key-Value: Redis, Memcached, DynamoDB, Riak KV, Aerospike, SimpleDB
Document: MongoDB, CouchBase, Elasticsearch
Wide-Column: Cassandra, HBase, Accumulo, HyperTable
Graph: Neo4J, Titan, Giraph, InfiniteGraph
NewSQL: SAP HANA, Google Spanner, Clustrix, VoltDB, MemSQL, NuoDB
Relational Databases Recap
Two-dimensional tables
Relationships between tables
Fixed schema
Homogeneous
Highly structured
NULLs – arrghh!
Source : http://excel.quebec/attachments/Image/excel-quebec-requete-sql-excel-1.jpg
Key-value and Tabular
Key–Value = a Distributed Map
Countries
Primary Key Value
Afghanistan capital:Kabul,continent:Asia,pop:31108077#2011
Albania capital:Tirana,continent:Europe,pop:3011405#2013
… …
Tabular = Two-dimensional Maps
Countries
Primary Key capital continent pop-value pop-year
Afghanistan Kabul Asia 31108077 2011
Albania Tirana Europe 3011405 2013
… … … … …
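
The difference between the two views can be sketched in Python. This is only an illustration using the two rows from the slide; the helper name parse_value and the dictionary layout are assumptions, not part of any particular store.

```python
# Key-value: the store sees one opaque string per key.
kv_store = {
    "Afghanistan": "capital:Kabul,continent:Asia,pop:31108077#2011",
    "Albania": "capital:Tirana,continent:Europe,pop:3011405#2013",
}


def parse_value(raw):
    """Client-side parsing of the opaque value (format copied from the slide)."""
    fields = dict(item.split(":", 1) for item in raw.split(","))
    pop_value, pop_year = fields.pop("pop").split("#")
    fields["pop-value"] = int(pop_value)
    fields["pop-year"] = int(pop_year)
    return fields


# Tabular: a two-dimensional map addressed by (primary key, column name).
table = {key: parse_value(raw) for key, raw in kv_store.items()}

capital_kv = parse_value(kv_store["Albania"])["capital"]  # client has to parse
capital_tab = table["Albania"]["capital"]                 # cell addressed directly
```

In the key-value view the client must fetch and parse the whole value to read one field; in the tabular view a single cell can be addressed.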
Wide-Column Stores
a sparse, distributed, persistent, multi-dimensional, sorted map
Sparse – not a value for every column
(i.e. not dense square)
Distributed – each node has the same role
– no single point of failure
Masterless – each node can service any request
New nodes can be added without downtime
Keyspace: container for column families
Column Family: container for rows
Rows: ordered columns
https://www.tutorialspoint.com/cassandra/cassandra_data_model.htm
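
A minimal in-memory sketch of the keyspace / column family / row nesting, assuming each cell is stored as (value, timestamp). The class and keyspace names are hypothetical, and nothing here is distributed or persistent; it only illustrates "sparse" and "sorted".

```python
import time


class ColumnFamily:
    """Container for rows; each row holds only the columns actually written."""

    def __init__(self):
        self.rows = {}  # row key -> {column name -> (value, timestamp)}

    def put(self, row_key, column, value, timestamp=None):
        ts = time.time() if timestamp is None else timestamp
        self.rows.setdefault(row_key, {})[column] = (value, ts)

    def get_row(self, row_key):
        """Return (column, value) pairs in sorted column-name order."""
        row = self.rows.get(row_key, {})
        return [(name, row[name][0]) for name in sorted(row)]


# Keyspace: container for column families (names are illustrative).
keyspace = {"countries": ColumnFamily()}
cf = keyspace["countries"]
cf.put("Albania", "continent", "Europe")
cf.put("Albania", "capital", "Tirana")
cf.put("Afghanistan", "capital", "Kabul")  # no 'continent' cell is stored: sparse
```

A missing cell costs nothing here, unlike a fixed-schema table where it would be a NULL.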
Column Family (Table)
Row
Row: smallest unit that stores related data
Data partition mechanism
https://pandaforme.gitbooks.io/introduction-to-cassandra/content/understand_the_cassandra_data_model.html
Keys
Composite Row Key
Composite Column Key
https://pandaforme.gitbooks.io/introduction-to-cassandra/content/understand_the_cassandra_data_model.html
Column Family View: Single-row partitions
https://pandaforme.gitbooks.io/introduction-to-cassandra/content/understand_the_cassandra_data_model.html
Column Family: Multi-row partitions
https://pandaforme.gitbooks.io/introduction-to-cassandra/content/understand_the_cassandra_data_model.html
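
The figures for the key and partition slides are not reproduced here, so the following is a sketch under an assumption: the first part of a composite key (the partition key) selects the partition, and the remaining part (the clustering key) orders rows inside it. The user names, sequence numbers and function names are hypothetical.

```python
import bisect
from collections import defaultdict

# partition key -> list of (clustering key, value), kept sorted by clustering key
partitions = defaultdict(list)


def insert(user, seq, text):
    """Place the row in the partition chosen by 'user', ordered by 'seq'."""
    bisect.insort(partitions[user], (seq, text))


def range_query(user, first_seq, last_seq):
    """Read a contiguous slice of one partition; other partitions are untouched."""
    rows = partitions[user]
    keys = [seq for seq, _ in rows]
    lo = bisect.bisect_left(keys, first_seq)
    hi = bisect.bisect_right(keys, last_seq)
    return rows[lo:hi]


insert("alice", 3, "see you")
insert("alice", 1, "hello")
insert("bob", 1, "hi")
insert("alice", 2, "lunch?")
```

A single-row partition is the special case where each partition key maps to exactly one row.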
Wide-Column Advantages
Highly scalable: designed for distributing across:
Cluster
Data centres
Data manipulation: includes limited query language
Data stored in sorted order
Wide-columns: increased granularity of operation
Not affected by increasing number of rows
Cassandra
Wide-Column Store
Meta (Facebook) Stats (2022)
Messenger
2.91 billion users
Search requires an inverted index
Search term to message id
Continuous data arrival
Instantaneous responses
Cassandra developed as a solution
https://www.statista.com/topics/4625/facebook-messenger/#dossierKeyfigures
https://www.messenger.com/
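
The "search term to message id" mapping can be sketched as follows. This is a toy single-machine illustration, not how inbox search was implemented; the message ids, texts and whitespace tokenisation are assumptions.

```python
from collections import defaultdict

inverted_index = defaultdict(set)  # search term -> ids of messages containing it


def index_message(message_id, text):
    """Called as each message arrives (continuous data arrival)."""
    for term in text.lower().split():
        inverted_index[term].add(message_id)


def search(query):
    """Return ids of messages that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(inverted_index.get(terms[0], set()))
    for term in terms[1:]:
        result &= inverted_index.get(term, set())
    return result


index_message(1, "Lunch tomorrow")
index_message(2, "Running late")
index_message(3, "late lunch today")
```

A lookup touches only the entries for the query terms, never the messages themselves, which is what makes fast responses possible.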
Cassandra History
History
Avinash Lakshman, one of the authors of Amazon's Dynamo, and Prashant Malik initially developed Cassandra at Facebook
to power the Facebook inbox search feature. Facebook released Cassandra as an open-source project on Google Code in
July 2008. In March 2009 it became an Apache Incubator project. On February 17, 2010 it graduated to a top-level project.
Facebook developers named their database after the Trojan mythological prophet Cassandra - with classical allusions to a
curse on an oracle.
Free and open-source
Distributed
Can add more hardware nodes with no downtime
Wide Column Store
Should always be able to read/write to Cassandra
NoSQL database
Consistency can be adjusted – at expense of availability
Masterless replication
Each node has same role
Low latency
Secondary Index support is weak (single columns only; equality comparisons only)
https://en.wikipedia.org/wiki/Apache_Cassandra
http://cassandra.apache.org/
CONSISTENT HASHING
https://www.scnsoft.com/blog/cassandra-performance
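
A minimal sketch of consistent hashing, assuming one token per node and SHA-256 as a stand-in hash function (both are simplifications chosen for illustration; the class and node names are hypothetical). Keys and nodes are hashed onto the same ring, and a key belongs to the first node found walking clockwise from its position.

```python
import bisect
import hashlib


def ring_position(key, ring_size=2**32):
    """Map a string onto the ring (stand-in hash, not Cassandra's partitioner)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % ring_size


class HashRing:
    def __init__(self, nodes):
        self.tokens = sorted((ring_position(n), n) for n in nodes)

    def replicas(self, key, replication_factor=1):
        """Walk clockwise from the key's position, collecting distinct nodes."""
        positions = [token for token, _ in self.tokens]
        start = bisect.bisect_left(positions, ring_position(key))
        count = min(replication_factor, len(self.tokens))
        return [self.tokens[(start + i) % len(self.tokens)][1] for i in range(count)]


ring3 = HashRing(["A", "B", "C"])
ring4 = HashRing(["A", "B", "C", "D"])  # same ring after adding node D
keys = [f"key{i}" for i in range(100)]
moved = [k for k in keys if ring3.replicas(k)[0] != ring4.replicas(k)[0]]
```

Every key in moved now belongs to the new node D; all other keys stay where they were, which is why nodes can be added without reshuffling the whole data set.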
Commit log: append-only log = very fast
MemTable stored in memory
Acknowledge to client
Flush MemTable to SSTable (Sorted Strings Table)
SSTable – Sequential, Immutable
Every so often Cassandra carries out a COMPACTION
Does big sequential READ, MERGE, WRITE
See intro video: https://youtu.be/B_HTdrTgGNs?t=947
Check video: https://youtu.be/B_HTdrTgGNs?t=1143
https://www.scnsoft.com/blog/cassandra-performance
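
The steps above (commit log, MemTable, acknowledge, flush, compaction) can be sketched in a few lines. This is a single-node toy model held entirely in memory; the class name, the memtable_limit parameter and the linear SSTable scan are assumptions made for brevity.

```python
class Node:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only; would be replayed after a crash
        self.memtable = {}     # mutable, in memory
        self.sstables = []     # immutable tuples, each sorted by key
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. sequential append
        self.memtable[key] = value            # 2. update the MemTable
        # Simplification: the flush runs inline here; on the slide the
        # acknowledgement comes first and the flush happens later.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()
        return "ack"                          # 3. acknowledge to client

    def flush(self):
        """Write the MemTable out as one sorted, immutable SSTable."""
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}

    def compact(self):
        """Sequential read, merge and write of all SSTables into one."""
        merged = {}
        for sstable in self.sstables:  # oldest first, so newer values win
            merged.update(sstable)
        self.sstables = [tuple(sorted(merged.items()))]

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # newest SSTable first
            for k, v in sstable:
                if k == key:
                    return v
        return None


node = Node(memtable_limit=2)
for k, v in [("b", 1), ("a", 2), ("a", 3), ("c", 4)]:
    node.write(k, v)
```

Because SSTables are never modified in place, the overwritten value of "a" stays on disk until compaction merges the files.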
Cassandra Write Path
Fully distributed with no single point of failure (masterless)
QUORUM Consistency:
⌊n/2⌋ + 1 (n/2 rounded down, plus 1)
where n = replication factor
https://www.scnsoft.com/blog/cassandra-performance
Distributed, Replicated and Fault Tolerant
Consistent Hashing
Hashed to ring
Order preserving hash function
Gossip style membership algorithm
Data replication
Eventual Consistency
Merkle Tree
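
A sketch of how a Merkle tree lets two replicas compare their data by exchanging a single root hash, assuming that is its role in the list above. The row encoding, the duplicate-last-hash rule for odd levels and the replica contents are illustrative assumptions.

```python
import hashlib


def _h(data):
    return hashlib.sha256(data).digest()


def merkle_root(rows):
    """Root hash over key/value pairs, taken in sorted key order."""
    level = [_h(f"{k}={v}".encode("utf-8")) for k, v in sorted(rows.items())]
    if not level:
        return _h(b"")
    while len(level) > 1:
        if len(level) % 2:  # odd count: pair the last hash with itself
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


replica_a = {"k1": "v1", "k2": "v2", "k3": "v3"}
replica_b = {"k3": "v3", "k1": "v1", "k2": "v2"}     # same data, other order
replica_c = {"k1": "v1", "k2": "stale", "k3": "v3"}  # one out-of-date value

in_sync = merkle_root(replica_a) == merkle_root(replica_b)
needs_repair = merkle_root(replica_a) != merkle_root(replica_c)
```

Only when the roots differ do the replicas need to descend the tree to find which rows are out of date.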
Where is Cassandra?
C = Consistency, A = Availability, P = Partition tolerance
CA: Guarantees to give a correct response but only while network works fine (Centralised / Traditional)
CP: Guarantees responses are correct even if there are network failures, but response may fail (Weak availability)
AP: Always provides a “best-effort” response even in presence of network failures (Eventual consistency)
Cassandra: AP (like Dynamo), but tables are tunable towards CP
Tuneable Consistency
Write = Commit Log + Memtable
Quorum = Majority of replicas: ⌊R/2⌋+1 for R the replication factor
Hinted handoff: the coordinator keeps a "to-do" log of writes for an unavailable replica (3-hour window); hinted data is not readable until delivered
Level    Explanation
ANY      One replica node or a hinted handoff (highest availability)
ONE      One replica node (hinted handoff not enough)
TWO      Two replica nodes
THREE    Three replica nodes
QUORUM   A quorum of replica nodes
ALL      All replica nodes (highest consistency)
Tuneable Consistency
For write operations, ANY is the lowest consistency (but highest availability), and ALL is the highest consistency (but lowest availability).
For read operations, ONE is the lowest consistency (but highest availability), and ALL is the highest consistency (but lowest availability).
QUORUM is a good middle-ground ensuring strong consistency (when used for both reads and writes), yet still tolerating some level of failure.
The size of the quorum is calculated as (replication_factor / 2) + 1, with the division rounded down.
Replication factor is the total number of replicas of each row across the cluster.
https://blog.imaginea.com/consistency-tuning-in-cassandra
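
The quorum arithmetic can be checked with a short sketch. The function names are hypothetical; the overlap rule used (replicas read + replicas written > replication factor) is the standard argument for why QUORUM reads see QUORUM writes.

```python
def quorum(replication_factor):
    """floor(RF / 2) + 1: the smallest majority of replicas."""
    return replication_factor // 2 + 1


def overlaps(read_replicas, write_replicas, replication_factor):
    """True when any read set must share a replica with any write set."""
    return read_replicas + write_replicas > replication_factor


sizes = {rf: quorum(rf) for rf in (1, 2, 3, 4, 5)}
quorum_both = overlaps(quorum(3), quorum(3), 3)  # QUORUM read + QUORUM write
one_both = overlaps(1, 1, 3)                     # ONE read + ONE write
```

With a replication factor of 3, QUORUM needs 2 replicas, so one replica can be down and both reads and writes still succeed.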
Cassandra Query Language (CQL)
SQL-like declarative query language
CQL: Create Keyspace (Database)
CQL
Create Keyspace
CREATE KEYSPACE MyKeySpace
WITH REPLICATION = {
'class' : 'SimpleStrategy',
'replication_factor' : 3 };
Load in keyspace
USE MyKeySpace;

MySQL (equivalent)
Create Database
CREATE DATABASE MyKeySpace;
Load in Database
USE MyKeySpace;
CQL: Create Column Family(Table)
CQL
Create Column Family
CREATE COLUMNFAMILY MyColumns
(id varint,
lastname varchar,
firstname varchar,
PRIMARY KEY (id));
Load data
INSERT INTO MyColumns
(id, lastname, firstname)
VALUES (1, 'Doe', 'John');
(inserts will always overwrite)

MySQL (equivalent)
Create Table
CREATE TABLE MyColumns (
id int NOT NULL,
lastname varchar(50),
firstname varchar(100),
PRIMARY KEY (id));
Load data
INSERT INTO MyColumns
(id, lastname, firstname)
VALUES (1, 'Doe', 'John');
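
"Inserts will always overwrite" means an INSERT on an existing primary key behaves as an update rather than raising a duplicate-key error. A sketch of that behaviour, assuming each cell carries a write timestamp and the newest timestamp wins; the timestamps and the second insert are made up for illustration.

```python
table = {}  # primary key -> {column -> (value, write timestamp)}


def insert(pk, columns, timestamp):
    """Upsert: create the row if missing, otherwise overwrite the given cells."""
    row = table.setdefault(pk, {})
    for name, value in columns.items():
        current = row.get(name)
        if current is None or timestamp >= current[1]:
            row[name] = (value, timestamp)


insert(1, {"lastname": "Doe", "firstname": "John"}, timestamp=100)
insert(1, {"firstname": "Jane"}, timestamp=200)  # same key: no error raised
```

After the second call there is still one row, with firstname replaced and lastname untouched.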
CQL: Retrieve data
CQL
Retrieve all rows
SELECT * FROM MyColumns;

MySQL (equivalent)
Retrieve all rows
SELECT * FROM MyColumns;
CQL: Retrieve data
CQL
Retrieve row 1
SELECT * FROM MyColumns
WHERE id = 1;

MySQL (equivalent)
Retrieve id 1
SELECT * FROM MyColumns
WHERE id = 1;
CQL: Retrieve data
CQL
Retrieve all Johns
SELECT * FROM MyColumns
WHERE firstname = 'John';
Response from Cassandra:
Bad Request: Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING.
Fix: create a secondary index on the column first
CREATE INDEX ON MyColumns (firstname);

MySQL (equivalent)
Retrieve all Johns
SELECT * FROM MyColumns
WHERE firstname = 'John';
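
Why Cassandra rejects a query on a non-key column can be sketched by comparing the two access paths. Rows 2 and 3, and both function names, are hypothetical; only row 1 comes from the slides.

```python
from collections import defaultdict

rows = {
    1: {"lastname": "Doe", "firstname": "John"},
    2: {"lastname": "Smith", "firstname": "Jane"},   # hypothetical row
    3: {"lastname": "Brown", "firstname": "John"},   # hypothetical row
}


def select_with_filtering(firstname):
    """ALLOW FILTERING: every row is read and tested, so cost grows with the table."""
    return sorted(pk for pk, row in rows.items() if row["firstname"] == firstname)


# Secondary index: maps one column's value to the matching primary keys.
firstname_index = defaultdict(set)
for pk, row in rows.items():
    firstname_index[row["firstname"]].add(pk)


def select_with_index(firstname):
    """Equality lookup on a single column, as the index supports."""
    return sorted(firstname_index.get(firstname, set()))
```

Both paths return the same ids; the index version avoids the full scan but only supports equality on that one column.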
How is this different from RDBMS?
In a static-column storage engine, each row must reserve space for every column
ALTER TABLE users ADD birth_date INT;
new columns can be added on the fly while running
and processing queries
https://www.datastax.com/dev/blog/schema-in-cassandra-1-1
Using Columns
RDBMS approach
siteid   date         mean_temp
1        2012-09-01   20.6
1        2012-09-02   21.9
1        2012-09-03   21.7
CASSANDRA approach
siteid   2012-09-01   2012-09-02   2012-09-03
1        20.6         21.9         21.7
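
The pivot from one row per reading to one wide row per site can be sketched as below, using the three dates and temperatures of the Cassandra row. The variable names are illustrative and the extra reading for 2012-09-04 is hypothetical.

```python
# RDBMS approach: one (siteid, date, mean_temp) row per reading.
readings = [
    (1, "2012-09-01", 20.6),
    (1, "2012-09-02", 21.9),
    (1, "2012-09-03", 21.7),
]

# Cassandra approach: one row per site, one column per date.
wide_rows = {}
for siteid, date, mean_temp in readings:
    wide_rows.setdefault(siteid, {})[date] = mean_temp

# A new day is just a new column in the existing row (hypothetical reading).
wide_rows[1]["2012-09-04"] = 19.8

columns_in_order = sorted(wide_rows[1])  # columns are kept sorted by name
```

All readings for a site live in one row, so a date-range read for that site touches a single partition.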
CQL: Consistency
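SELECT totalsales
FROM sales
USING CONSISTENCY QUORUM
WHERE customerid=5;
UPDATE sales
USING CONSISTENCY ONE
SET totalsales=50000
WHERE customerid=4;
(USING CONSISTENCY is CQL 2 syntax; CQL 3 removed it, and the level is instead set per session, e.g. CONSISTENCY QUORUM; in cqlsh, or per request in the client driver)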
Limitations of CQL
No join or subquery support, and limited support for aggregation.
- This is by design, to force you to denormalize into partitions that can be efficiently queried from a single replica,
instead of having to gather data from across the entire cluster.
A single column value may not be larger than 2GB
- in practice, "single digits of MB" is a more reasonable limit, since there is no streaming or random access of blob
values.
The maximum number of cells (rows x columns) in a single partition is 2 billion.
https://wiki.apache.org/cassandra/CassandraLimitations
Using Bloom Filters for Fast Data Retrieval
Each SSTable (Sorted Strings Table) has
an associated Bloom Filter
Bloom Filter Stored in Memory
Highly Efficient
Can produce false positives
Bloom Filters
Efficient test for data location
Hash object on insert using k hash functions
Set bit to 1
Hash object on read using k hash functions
Any 0s then not present
Bit would have been set to 1 on insert
They can give FALSE POSITIVES, but not FALSE NEGATIVES – so a good way to check if data has been
processed before
Video on Bloom Filters: https://youtu.be/bEmBh1HtYrw
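
A minimal Bloom filter sketch following the insert and read rules above. The bit-array size, the number of hash functions and the way the k functions are derived (salting SHA-256 with the function index) are assumptions for illustration, not what Cassandra uses.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [0] * num_bits

    def _positions(self, item):
        """k bit positions, one per (salted) hash function."""
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        """False means definitely absent; True means possibly present."""
        return all(self.bits[pos] for pos in self._positions(item))


bloom = BloomFilter()
bloom.add("aardvark")
bloom.add("bat")
```

An inserted item always tests True (no false negatives); an item that was never inserted usually tests False but can occasionally test True.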
Bloom Filters: Insert A
Hash object on insert using k
hash functions
Set bit to 1
e.g. input word = ‘aardvark’
Output from hash function 1 = 3
Output from hash function 2 = 1
Output from hash function 3 = 14
Big Data Management: http://chimera.labs.oreilly.com/books/1234000001802/ch06.html#_bloom_filters
Bloom Filters: Insert B
Hash object on insert using k
hash functions
Set bit to 1
e.g. input word = ‘bat’
Output from hash function 1 = 16
Output from hash function 2 = 1
Output from hash function 3 = 7
Bloom Filters: Read Y
Hash object on read using k
hash functions
Any 0s then not present
(as Bit would have been set to 1 on
insert)
e.g. input word = ‘elephant’
Output from Hash 1 = 16
Output from Hash 2 = 2
Output from Hash 3 = 7
Bit 2 was never set by ‘aardvark’ or ‘bat’, so ‘elephant’ is definitely not present
Bloom Filters: Read X
Hash object on read using k
hash functions
All 1s then data may be
present
Bit would have been set to 1 on
insert
e.g. input word = ‘bat’
Hash results [16,1,7]
e.g. input word = ‘snake’
Hash results [1,14,16] FALSE POSITIVE (bits 1, 14 and 16 were all set by ‘aardvark’ and ‘bat’)
DEMO of a BLOOM Filter
https://llimllib.github.io/bloomfilter-tutorial/
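
The insert and read slides can be replayed in a few lines by plugging in the hash outputs they quote instead of real hash functions. The bit-array length of 17 (positions 0 to 16) is an assumption, since the slides only show the positions used.

```python
# Hash outputs exactly as given on the slides (three hash functions per word).
SLIDE_HASHES = {
    "aardvark": [3, 1, 14],
    "bat": [16, 1, 7],
    "elephant": [16, 2, 7],
    "snake": [1, 14, 16],
}

bits = [0] * 17  # assumed size: large enough for position 16


def insert(word):
    for pos in SLIDE_HASHES[word]:
        bits[pos] = 1


def might_contain(word):
    return all(bits[pos] for pos in SLIDE_HASHES[word])


insert("aardvark")
insert("bat")
set_positions = [i for i, bit in enumerate(bits) if bit]
```

Only five bits end up set, because both words hash to position 1. 'elephant' fails on position 2, while 'snake' was never inserted yet finds all three of its positions set.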
Cassandra vs MongoDB
Problem domain needs a rich data model = MongoDB
Need secondary indexes and flexibility in the query model = MongoDB
(by contrast Cassandra secondary indexes only support single columns and equality
comparisons)
100% uptime = Cassandra
Write scalability = Cassandra
Query language support = CQL is similar to SQL
The Apache Cassandra database is a good choice when you need scalability and high
availability without compromising performance, and with no single point of failure.
https://scalegrid.io/blog/cassandra-vs-mongodb/
Summary: Wide-column stores
Conceptually a big table
Sparse: not all cells have values
Distributed and persistent
Multi-dimensional: multiple values
Sorted Map
Model:
Keyspace: container for column families
Column Family: container for rows
Rows: Set of ordered columns
Column: (name, value, timestamp)
Cassandra:
Distributed, replicated, and fault tolerant
SQL-like query language
Bloom filters: efficient data presence testing
Reading
Cassandra Vs MongoDB In 2018 by Matan Sarig
https://blog.panoply.io/cassandra-vs-mongodb
Cassandra Scales well (linear with more nodes)
Apache Cassandra introduction video
https://www.youtube.com/watch?v=B_HTdrTgGNs