Chapter 1: Introduction to Databases and NoSQL
1.1 Understanding Data and Databases
Data is everywhere. Every click on a website, every online purchase, and even the
temperature reading from a sensor produces data. At its most basic level, data refers to raw
facts and figures: for instance, numbers like 56, names like Rajesh, or dates like 2025-09-07.
By itself, data has little meaning. When this data is organized, processed, and
interpreted, it becomes information, which provides value and supports decision-making.
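The step from data to information can be illustrated with a small sketch. The sensor readings and the alert threshold below are hypothetical values chosen for illustration. The raw numbers say little on their own, while the summary supports a decision.

```python
from statistics import mean

# Raw data: hypothetical temperature readings (degrees Celsius) from one sensor.
readings = [21.5, 22.0, 23.5, 22.5, 38.0]


def summarize(values, alert_above=35.0):
    """Organize raw readings into a summary that can support a decision."""
    return {
        "count": len(values),
        "average": round(mean(values), 2),
        "maximum": max(values),
        # The threshold is an assumption made for this example.
        "alert": max(values) > alert_above,
    }


info = summarize(readings)
print(info)
```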
A database is a structured collection of such data, organized so that large amounts of it
can be systematically stored and retrieved. Consider:
- A university maintains a database of student records.
- An online store uses a database to track product inventory.
- E-commerce platforms like Amazon rely on databases to manage their product catalogs.
Databases form the foundation of almost every digital system today.
Figure 1: Data vs Information
1.2 Database Management Systems (DBMS)
Handling large amounts of data manually is both inefficient and error-prone. This challenge
is addressed by the Database Management System (DBMS)—software that acts as an
interface between users, applications, and the database itself.
Functions of a DBMS include:
- Data storage and retrieval.
- Security to protect against unauthorized access.
- Backup and recovery in case of failures.
- Data integrity to maintain accuracy and consistency.
Example: A railway reservation system uses a DBMS to manage bookings, cancellations, and
seat availability—all in real time.
1.3 Relational Database Management Systems (RDBMS)
The most widely adopted model for decades has been the Relational Database Management
System (RDBMS), based on the relational model developed by Edgar F. Codd in 1970.
Features of RDBMS:
- Tables (relations) where data is stored in rows and columns.
- SQL (Structured Query Language) to manage data.
- ACID properties to guarantee reliable transactions.
- Relationships via foreign keys.
Example: Banks use Oracle or PostgreSQL to maintain customer accounts, balances, and
transactions.
Figure 2: RDBMS: Tables and Foreign Key Relationship
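The four features listed above can be seen together in a short sketch. It uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for a production RDBMS. The customers and accounts tables, the helper functions, and the sample values are assumptions made for illustration, not the schema of any real bank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Two tables (relations) linked by a foreign key.
conn.execute("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE accounts (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        balance     INTEGER NOT NULL CHECK (balance >= 0)
    )""")


def open_account(conn, account_id, customer_id, balance):
    """Insert an account; return False if a constraint rejects the row."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "INSERT INTO accounts (id, customer_id, balance) VALUES (?, ?, ?)",
                (account_id, customer_id, balance),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def transfer(conn, src, dst, amount):
    """Move money between two accounts as one all-or-nothing transaction."""
    try:
        with conn:
            for account_id, delta in ((dst, amount), (src, -amount)):
                cur = conn.execute(
                    "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                    (delta, account_id),
                )
                if cur.rowcount != 1:
                    raise sqlite3.IntegrityError(f"no such account: {account_id}")
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


with conn:
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Rajesh')")
open_account(conn, 10, 1, 500)
open_account(conn, 11, 1, 100)

print(transfer(conn, 10, 11, 200))   # enough funds: both updates are committed
print(transfer(conn, 10, 11, 1000))  # would overdraw: both updates are rolled back
print(balances(conn))
```

In the second transfer the credit is applied first and the debit then violates the CHECK constraint, so the whole transaction is rolled back and neither balance changes. Calling open_account with a customer_id that does not exist is rejected in the same way by the foreign key.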
1.4 Emergence of NoSQL Databases
The explosion of the internet, mobile apps, and IoT devices created a new problem:
massive, rapidly changing, and often unstructured data. Traditional RDBMSs, built around
fixed schemas and usually scaled by upgrading a single server, struggled under this load.
This gave rise to NoSQL databases, a name commonly read as 'Not Only SQL.' Unlike an RDBMS,
a NoSQL database does not rely solely on rigid tables. Instead, it supports flexible, scalable data models.
Characteristics of NoSQL include:
- Schema-less or dynamic schema.
- Horizontal scalability.
- BASE properties (Basically Available, Soft state, Eventually consistent).
- Multiple data models: key-value, document, column, graph.
Case Studies:
- Amazon (Key-Value Store – DynamoDB) serves shopping cart operations at very high
request rates.
- Netflix (Column Store – Cassandra) manages user viewing history and recommendations.
- Twitter (Graph Database – FlockDB) stores its 'follows' graph, which spans billions of relationships.
Figure 3: ACID vs BASE
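The 'eventually consistent' part of BASE can be made concrete with a toy simulation. The Replica and EventuallyConsistentStore classes below are hypothetical and written only for this illustration. Real systems propagate updates over a network and must also resolve conflicting writes, which this sketch ignores.

```python
class Replica:
    """One copy of the data; it sees an update only once it is delivered."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class EventuallyConsistentStore:
    """Toy store: a write lands on one replica and reaches the others later."""

    def __init__(self, replica_names):
        self.replicas = [Replica(name) for name in replica_names]
        self.pending = []  # queued (replica, key, value) updates

    def write(self, key, value):
        first, *others = self.replicas
        first.data[key] = value  # acknowledged immediately (basically available)
        for replica in others:
            self.pending.append((replica, key, value))

    def read(self, replica_index, key):
        return self.replicas[replica_index].data.get(key)

    def propagate(self):
        """Deliver every queued update, in the order it was written."""
        for replica, key, value in self.pending:
            replica.data[key] = value
        self.pending.clear()


store = EventuallyConsistentStore(["a", "b", "c"])
store.write("cart:42", ["book"])

stale = store.read(1, "cart:42")  # replica "b" has not received the write yet
store.propagate()
fresh = store.read(1, "cart:42")  # after propagation all replicas agree

print(stale, fresh)
```

Between the write and the call to propagate, a reader may see old data on some replicas. That window is the 'soft state' that BASE accepts in exchange for availability.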
1.5 Types of NoSQL Databases
NoSQL databases can be classified into four major types:
1. Key-Value Stores:
- Data stored as key-value pairs.
- Extremely fast for lookups.
- Example: Redis.
2. Document Stores:
- Store data in JSON-like documents.
- Flexible for semi-structured data.
- Example: MongoDB.
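3. Column-Family (Wide-Column) Stores:
- Store each row under a key, with its columns grouped into column families; rows can be sparse.
- Suited to write-heavy workloads spread across many servers.
- Example: Cassandra.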
4. Graph Databases:
- Store data as nodes and edges.
- Excellent for relationship-heavy data.
- Example: Neo4j.
Figure 4: NoSQL Data Models Quadrant
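The key-value model, the simplest of the four types above, can be sketched in a few lines. The KeyValueStore class below is a hypothetical in-memory illustration. Its optional expiry is loosely modeled on the time-to-live feature that stores such as Redis offer, but it is not the API of Redis or of any other product.

```python
import time


class KeyValueStore:
    """In-memory key-value store with optional per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._items = {}  # key -> (value, expiry time or None)
        self._clock = clock

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._items[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._items[key]  # drop the expired entry lazily, on access
            return default
        return value

    def delete(self, key):
        """Remove a key; return True if it was present."""
        return self._items.pop(key, None) is not None


# A fake clock keeps the example deterministic.
now = [0.0]
kv = KeyValueStore(clock=lambda: now[0])

kv.set("session:abc", "rajesh", ttl=30)
first = kv.get("session:abc")   # still valid at time 0

now[0] = 31.0
second = kv.get("session:abc")  # expired after 30 seconds

print(first, second)
```

Every operation is addressed by a single key, which is why lookups are fast. The trade-off is that the store cannot query by value without scanning every entry.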
1.6 Horizontal Scaling in NoSQL
One of the biggest strengths of NoSQL databases is their ability to scale horizontally. Instead
of upgrading a single server to a more powerful one (vertical scaling), a NoSQL system adds
servers and spreads the data across them (horizontal scaling). Partitioning the data in this way is called sharding.
Each shard stores a portion of the data, and a router or coordinator directs queries to the
correct shard. Replicas of each shard provide availability and fault tolerance if a server fails.
Figure 5: Horizontal Scaling via Sharding
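The routing described above can be sketched with hash-based sharding. The ShardedStore class and its method names are hypothetical, and plain dictionaries stand in for servers. The sketch writes to every replica synchronously, which real systems often do not.

```python
import hashlib


class ShardedStore:
    """Toy router: hashes each key to a shard and copies writes to replicas."""

    def __init__(self, num_shards, replicas_per_shard=2):
        # shards[i] is a list of replica dictionaries holding the same data.
        self.shards = [
            [{} for _ in range(replicas_per_shard)] for _ in range(num_shards)
        ]

    def shard_for(self, key):
        """Map a key to a shard index using a stable hash."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        for replica in self.shards[self.shard_for(key)]:
            replica[key] = value

    def get(self, key, replica=0):
        """Read from one replica of the shard that owns the key."""
        return self.shards[self.shard_for(key)][replica].get(key)


store = ShardedStore(num_shards=3)
for i in range(9):
    store.put(f"user:{i}", {"n": i})

# Number of keys held by each shard (counted on its first replica).
sizes = [len(shard[0]) for shard in store.shards]
print(sizes)
print(store.get("user:4"), store.get("user:4", replica=1))
```

A stable hash such as SHA-256 is used instead of Python's built-in hash, which varies between runs for strings. Taking the hash modulo the shard count is the simplest scheme, but it moves most keys when a shard is added, so production systems often use consistent hashing or range-based partitioning instead.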
1.7 NoSQL Models in Depth
Document Store Example:
MongoDB stores data in flexible JSON-like documents. This allows nested structures such as
arrays and objects.
Figure 6: Document Store Example
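A document of this kind can be sketched with plain Python dictionaries and lists. The student records and the find helper below are hypothetical. The sketch does not use MongoDB or its driver, and only imitates the shape of the data.

```python
import json

# One document: nested objects and arrays live inside a single record.
student = {
    "_id": "S1001",
    "name": "Rajesh",
    "enrolled_on": "2025-09-07",
    "address": {"city": "Pune", "pin": "411001"},
    "courses": [
        {"code": "DB101", "title": "Databases", "grade": "A"},
        {"code": "CS102", "title": "Algorithms"},  # no grade recorded yet
    ],
}

# A second document in the same collection may have a different shape.
other = {"_id": "S1002", "name": "Meera", "courses": []}

collection = [student, other]


def find(collection, predicate):
    """Return the documents for which predicate(document) is true."""
    return [doc for doc in collection if predicate(doc)]


takes_db = find(
    collection,
    lambda doc: any(c["code"] == "DB101" for c in doc.get("courses", [])),
)

print(json.dumps(takes_db[0], indent=2))
```

Because the courses are nested inside the student document, one lookup returns everything about the student. A relational design would typically split this into separate tables and combine them with a join.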
Column-Family Store Example:
Cassandra organizes data into column families. Each row is identified by a key and
can contain sparse columns grouped into families.
Figure 7: Column-Family Store Layout (Cassandra-like)
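The layout can be pictured as a nested mapping from row key to column family to column. The ColumnFamilyTable class below is a hypothetical in-memory sketch of that idea. It is not Cassandra's storage engine or query language, and the sample rows are invented for illustration.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy wide-column table: row key -> column family -> column -> value."""

    def __init__(self, families):
        self.families = set(families)  # families are declared up front
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Columns inside a family need no declaration, so rows can be sparse.
        self.rows[row_key][family][column] = value

    def get_family(self, row_key, family):
        """Return a copy of one family's columns for a row ({} if absent)."""
        row = self.rows.get(row_key)
        if row is None:
            return {}
        return dict(row.get(family, {}))


users = ColumnFamilyTable(["profile", "history"])
users.put("user:1", "profile", "name", "Rajesh")
users.put("user:1", "history", "2025-09-07", "watched: episode 3")
users.put("user:2", "profile", "name", "Meera")
users.put("user:2", "profile", "email", "meera@example.com")

print(users.get_family("user:1", "profile"))
print(users.get_family("user:2", "profile"))
print(users.get_family("user:2", "history"))
```

The two rows hold different columns within the same family, and a row with no entries in a family stores nothing for it. This is what 'sparse' means in this model.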
Graph Store Example:
Neo4j stores data as nodes and edges. Nodes represent entities such as people, and edges
represent relationships such as 'follows' or 'friends.'
Figure 8: Graph Database Example
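The node-and-edge model can be sketched with an adjacency structure. The Graph class and the names in the sample data are hypothetical. The sketch is plain Python and does not use Neo4j or its Cypher query language.

```python
from collections import defaultdict


class Graph:
    """Toy directed graph with labeled edges, stored as adjacency sets."""

    def __init__(self):
        # (source node, relationship) -> set of target nodes
        self.edges = defaultdict(set)

    def add_edge(self, source, relationship, target):
        self.edges[(source, relationship)].add(target)

    def neighbors(self, node, relationship):
        """Nodes reachable from node in one hop along the relationship."""
        return set(self.edges.get((node, relationship), set()))

    def suggestions(self, node, relationship):
        """Nodes two hops away that node is not already connected to."""
        direct = self.neighbors(node, relationship)
        two_hops = set()
        for neighbor in direct:
            two_hops |= self.neighbors(neighbor, relationship)
        return two_hops - direct - {node}


g = Graph()
g.add_edge("asha", "follows", "bala")
g.add_edge("asha", "follows", "dev")
g.add_edge("bala", "follows", "chitra")
g.add_edge("bala", "follows", "dev")
g.add_edge("chitra", "follows", "asha")

print(sorted(g.neighbors("asha", "follows")))
print(sorted(g.suggestions("asha", "follows")))
```

A 'who to follow' query like suggestions only walks the edges around one node. In a relational database the same question would usually require joining a relationship table with itself.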
1.8 SQL vs NoSQL Comparison
SQL databases typically rely on fixed schemas, vertical scaling, and ACID properties, while NoSQL
databases emphasize dynamic schemas, horizontal scaling, and BASE properties.
SQL is best suited for structured, transactional applications such as banking, whereas
NoSQL thrives in large-scale, real-time applications with unstructured or rapidly changing data, such as social media.
1.9 Use Cases of NoSQL
NoSQL databases are widely used in:
- Social Media: Facebook, Instagram.
- Real-Time Analytics: Advertising platforms.
- IoT Systems: Smart home devices.
- Personalized Recommendations: Amazon, Netflix.
1.10 Summary
Databases have evolved significantly over time. From traditional DBMS to RDBMS and now
to NoSQL systems, the goal remains the same: effective data management. RDBMS continues
to serve structured, transactional needs, while NoSQL addresses the challenges of
unstructured, large-scale, and rapidly changing data.