NoSQL NOTES:
➔ What is NoSQL?
NoSQL stands for "Not Only SQL" — a term used to describe non-relational databases that
are designed to handle:
● Large volumes of data
● Unstructured or semi-structured data
● Scalable and distributed systems
● Modern application needs (e.g. real-time web, IoT, big data)
➔ Why NoSQL is Needed:
1. Impedance Mismatch
○ Difficulty in mapping complex in-memory data (like objects, arrays) to relational
tables.
○ Example: An Order object split into Order, Customer, Product, etc. tables,
then reassembled using joins.
2. Scalability Limitations
○ Traditional relational databases were designed to run on a single server and are
harder to run efficiently on clusters (multiple servers working together).
○ NoSQL is designed to scale out across many machines.
3. Big Data Needs
○ Growing need to store and process huge volumes of data.
○ A single relational database server can struggle with the load of high-traffic
web apps and real-time analytics.
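The impedance mismatch from point 1 can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 and json modules; the table names, column names, and sample values are assumptions and do not come from any particular system.

```python
import json
import sqlite3

# The order as the application sees it in memory (hypothetical sample data).
order = {
    "order_id": 1,
    "customer": {"customer_id": 10, "name": "Aiza"},
    "items": [
        {"product": "Keyboard", "qty": 1},
        {"product": "Mouse", "qty": 2},
    ],
}

# Relational approach: the single object is split across three tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE order_items (order_id INTEGER, product TEXT, qty INTEGER);
""")
conn.execute(
    "INSERT INTO customers VALUES (?, ?)",
    (order["customer"]["customer_id"], order["customer"]["name"]),
)
conn.execute(
    "INSERT INTO orders VALUES (?, ?)",
    (order["order_id"], order["customer"]["customer_id"]),
)
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [(order["order_id"], i["product"], i["qty"]) for i in order["items"]],
)

# Reading the order back requires joins to reassemble the pieces.
rows = conn.execute("""
    SELECT o.order_id, c.name, i.product, i.qty
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    JOIN order_items i ON i.order_id = o.order_id
    ORDER BY i.product
""").fetchall()

# Document-style approach: the whole object is stored and loaded as one unit.
stored = json.dumps(order)
loaded = json.loads(stored)
```

The relational version needs three inserts and a two-way join for one logical object, while the document version round-trips the nested structure unchanged.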
➔ Common Characteristics of NoSQL Databases:
1. Non-Relational: Doesn’t use tables or SQL as the primary model.
2. Cluster-Friendly: Designed to run efficiently on clusters (distributed systems).
3. Open Source: Most NoSQL databases are community-driven and freely available.
4. Built for the Web: Handles large-scale, modern web application needs.
5. Schemaless: No fixed schema; data can be stored flexibly. However, the implicit data
structure still needs to be managed by developers.
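The point about an implicit schema in item 5 can be illustrated with a minimal sketch. The records, field names, and the contact_line helper below are hypothetical; the idea is only that when the database enforces no schema, the application code has to.

```python
# Two records in the same collection with different fields (hypothetical data).
users = [
    {"id": "user_1", "name": "Aiza", "email": "aiza@example.com"},
    {"id": "user_2", "name": "Ali", "age": 24},  # no email, extra age field
]


def contact_line(user):
    """Application code carries the implicit schema: it decides which
    fields are required and what to do when optional ones are missing."""
    if "name" not in user:
        raise ValueError("every user record is expected to have a name")
    email = user.get("email", "no email on file")
    return f"{user['name']} <{email}>"


lines = [contact_line(u) for u in users]
```

Both records are accepted by the store, but the code that reads them must know that name is required and email is optional.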
Feature | NoSQL | SQL (Relational DB)
Scalability | Horizontal (scale out) | Vertical (scale up)
Architecture | Distributed, fault-tolerant | Centralized or limited distribution
Best For | Big data, fast growth, flexible data | Strong consistency, complex queries
Transaction Model | Often eventual consistency | Strict ACID compliance (ensures data integrity and consistency)
Performance on Cluster | Efficient and optimized | Slower, complex to manage
➔ NoSQL vs. SQL Comparison Table
Category | SQL Databases | NoSQL Databases
Data Storage Model | Data stored in rows and columns (tables). Separate entities (e.g., Employees, Offices) are stored in separate tables and linked with joins. | Data format varies by type: key-value stores have a "key" and a "value"; document stores keep all data in one JSON/XML document. Supports hierarchical and nested data.
Schema | Fixed schema: Structure and data types must be defined in advance. Altering schema often requires downtime. | Dynamic schema: Fields can be added or changed without taking the database offline. Data can be dissimilar across records.
Scalability | Vertical scaling: Add more power (CPU/RAM) to one server. | Horizontal scaling: Add more servers (nodes) to handle increased load. Ideal for big data applications.
Development Model | Mix of open-source (e.g., MySQL, PostgreSQL) and closed-source (e.g., Oracle). | Mostly open-source (e.g., MongoDB, Cassandra, CouchDB).
Transactions | Strong support for ACID transactions: either all changes apply or none. | Varies by database: Some support ACID at document level (e.g., MongoDB); others offer eventual consistency (e.g., Cassandra).
Data Manipulation | Uses SQL language, e.g., SELECT, INSERT, UPDATE. | Uses APIs and object-oriented methods. Query language varies (e.g., MongoDB uses its own query syntax).
Consistency | Typically strong consistency: data is always accurate and up to date. | Varies: Some offer strong consistency, others eventual consistency depending on use case and setup.
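Horizontal scaling from the table above can be sketched as hash-based sharding: each key is hashed to decide which server holds it. The node names and the put/get helpers are hypothetical, and plain dictionaries stand in for real servers.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical server names


def node_for(key, nodes=NODES):
    """Hash the key so that the same key always maps to the same node."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


# Each node holds only its own share of the data.
cluster = {name: {} for name in NODES}


def put(key, value):
    cluster[node_for(key)][key] = value


def get(key):
    return cluster[node_for(key)].get(key)


for i in range(100):
    put(f"user_{i}", {"n": i})
```

Simple modulo hashing like this moves most keys when a node is added or removed. Distributed databases typically use more elaborate schemes, such as consistent hashing, to limit that reshuffling.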
➔ Four types of NoSQL data model
1. Key-value
2. Document
3. Column-family
4. Graph
➔ Key-Value Store
What it is:
● Stores data as key-value pairs
● The key is unique; the value holds the actual data (often a JSON document, string, or binary blob)
Examples: Redis, DynamoDB, Riak
Structure:
Key => Value
user_123 => {name: "aiza", age: 24, email: "[email protected]"}
Strengths:
● Simple to use
● Fast performance for key-based access
● Great for scalability
Weaknesses:
● No complex queries (can’t search inside value)
● Can’t do joins or aggregations
Common Use Cases:
Use Case | Example | Schema
Session Storage | Store user sessions in web apps | Key: session_id → Value: session data
Caching | Store results of expensive DB queries | Key: query_key → Value: result
User Profiles | Store user data by user ID | Key: user_id → Value: user info
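The session and caching use cases above can be sketched with a toy in-memory store. The KeyValueStore class, its ttl parameter, and the sample keys are assumptions for illustration; real key-value databases expose their own APIs.

```python
import time


class KeyValueStore:
    """Toy in-memory key-value store with optional expiry, for illustration only."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock  # injectable so expiry can be shown without sleeping

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily drop expired entries on read
            return default
        return value


# A fake clock makes the expiry behaviour easy to demonstrate.
now = [0.0]
store = KeyValueStore(clock=lambda: now[0])
store.set("session_abc", {"user_id": "user_123"}, ttl=30)
store.set("user_123", {"name": "aiza", "age": 24})

before_expiry = store.get("session_abc")
now[0] = 31.0  # 31 seconds later the session has expired
after_expiry = store.get("session_abc")
```

Every lookup goes through the key. Finding all users with age 24 would mean reading every value, which is the "no complex queries" weakness listed above.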
➔ Document Data Model
What it is:
● Stores data as documents (usually in JSON, XML, or BSON).
● All related data (like an order) is stored in one document.
Examples: MongoDB, CouchDB, RavenDB
Structure:
{
  "_id": "user123",
  "name": "Ali",
  "age": 24,
  "email": "[email protected]"
}
● Documents are self-describing and can have nested fields.
Strengths:
● Flexible Schema: Easy to add/remove fields as data evolves.
● Better Querying: Can search inside fields, unlike key-value stores.
Weaknesses:
● Performance issues with very large or deeply nested documents.
Common Use Cases:
Use Case | Example Description | Document Fields
CMS (Content Mgmt) | Store blog/articles as documents | title, author, content, tags, published_date
Product Catalogs | Each product stored with details | name, description, price, category, stock
User-Generated Content | Store blog posts with comments | title, author, content, comments[], post_date
Example – Blog Post Document:
{
  "_id": "post123",
  "title": "My First Blog Post",
  "author": "user123",
  "content": "This is the content of my first blog post...",
  "comments": [
    {"user": "user456", "comment": "Great post!", "date": "2024-06-03T13:00:00Z"}
  ],
  "posted_date": "2024-06-03T12:00:00Z"
}
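The "Better Querying" strength can be sketched by filtering documents on top-level and nested fields. The find helper and the second document are hypothetical, and a plain Python list stands in for a collection; a real document database would use indexes and its own query language instead of scanning every document.

```python
posts = [
    {
        "_id": "post123",
        "title": "My First Blog Post",
        "author": "user123",
        "comments": [{"user": "user456", "comment": "Great post!"}],
    },
    {
        "_id": "post124",  # hypothetical second document
        "title": "Second Post",
        "author": "user789",
        "comments": [],
    },
]


def find(collection, predicate):
    """Return every document for which predicate(document) is true."""
    return [doc for doc in collection if predicate(doc)]


# Query on a top-level field.
by_author = find(posts, lambda d: d.get("author") == "user123")

# Query on a field nested inside the comments array.
commented_by = find(
    posts,
    lambda d: any(c.get("user") == "user456" for c in d.get("comments", [])),
)
```

Unlike a key-value store, the lookup does not need the document's key: the store can see inside the value.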
➔ Column-Family Data Model – Quick Notes
What it is:
● Stores each entry under a row key, with a flexible set of columns per row instead of a
fixed table layout.
● Columns are grouped into column families (like a folder of related columns).
● Related data about an entity (e.g., customer or order) is stored together in one group of
columns.
Examples:
● Apache Cassandra
● HBase
● ScyllaDB
Structure:
● Row Key identifies the entry.
● Columns are grouped into families based on related data.
Strengths:
● Highly Scalable: Great for large datasets and distributed systems.
● Efficient for Writes and Reads in time-series and analytical apps.
Weaknesses:
● Complex to Design: Harder to model and manage than key-value or document stores
Common Use Cases:
Use Case | Example Description | Schema
Time-Series Data | Store sensor readings over time | Row Key: sensor + time; Columns: temp, humidity, pressure
Data Warehousing | Store sales or analytics data across time/regions | Row Key: product_id; Columns: sales_q1, sales_q2, etc.
Logging | Store system logs with log details | Row Key: log_id + time; Columns: level, message, user_id
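The time-series row from the table above can be sketched with nested dictionaries: row key, then column family, then column. The put and scan_prefix helpers, the "#" separator in the row key, and the sensor values are assumptions for illustration only.

```python
from bisect import bisect_left

# row key -> column family -> column -> value (hypothetical sensor data)
table = {}


def put(row_key, family, column, value):
    table.setdefault(row_key, {}).setdefault(family, {})[column] = value


def scan_prefix(prefix):
    """Return (row_key, row) pairs whose key starts with prefix, in key order."""
    keys = sorted(table)  # a real store keeps keys sorted; here we sort per scan
    start = bisect_left(keys, prefix)
    result = []
    for key in keys[start:]:
        if not key.startswith(prefix):
            break
        result.append((key, table[key]))
    return result


# Row key combines sensor id and timestamp, as in the Time-Series row above.
put("sensor1#2024-06-03T12:00", "readings", "temp", 21.5)
put("sensor1#2024-06-03T12:00", "readings", "humidity", 40)
put("sensor1#2024-06-03T12:05", "readings", "temp", 21.7)  # no humidity column
put("sensor2#2024-06-03T12:00", "readings", "temp", 19.0)

sensor1_rows = scan_prefix("sensor1#")
```

Because the row key starts with the sensor id, all readings for one sensor sit next to each other in key order and can be read with a single range scan. Rows may also carry different columns, as the second sensor1 row shows.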
➔ Graph Data Model – Quick Notes
What it is:
● Stores data as nodes (entities) and edges (relationships).
● Ideal for highly connected data and complex relationships.
Examples:
● Neo4j
● ArangoDB
● Amazon Neptune
Structure:
● Node: Entity (e.g., User, Product)
● Edge: Relationship (e.g., FRIENDS_WITH, PURCHASED)
● Both nodes and edges can have properties
(User1) ---[FRIENDS_WITH]---> (User2)
(Product1) <---[PURCHASED]--- (User1)
Strengths:
● Best for relationships: Easily models and queries complex connections.
● Pattern detection: Great for finding paths, clusters, or indirect links.
Weaknesses:
● Scalability: Can be hard to scale horizontally (especially with huge datasets).
Common Use Cases:
Use Case | Example Description | Schema
Social Networks | Users and friendships | Node: User; Edge: FRIENDS_WITH
Recommendation Engines | Users/products with interest/purchase links | Node: User, Product; Edge: PURCHASED, INTERESTED_IN
Fraud Detection | Transactions and suspicious links | Node: User, Transaction; Edge: MADE_BY, CONNECTED_TO
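Finding indirect links, as in the social network use case, can be sketched with a breadth-first search over stored edges. The helper names and the sample users are hypothetical, and FRIENDS_WITH is assumed to be symmetric here; graph databases provide their own query languages for this kind of traversal.

```python
from collections import deque

# Edges stored as (source, relationship) -> set of targets (hypothetical data).
edges = {}


def add_edge(source, relationship, target):
    edges.setdefault((source, relationship), set()).add(target)


def add_friendship(a, b):
    # FRIENDS_WITH is treated as symmetric here, so store both directions.
    add_edge(a, "FRIENDS_WITH", b)
    add_edge(b, "FRIENDS_WITH", a)


def within_hops(start, relationship, max_hops):
    """Breadth-first search: map each node reachable from start in at most
    max_hops edges to its hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in edges.get((node, relationship), ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[start]
    return distance


add_friendship("User1", "User2")
add_friendship("User2", "User3")
add_friendship("User3", "User4")
add_edge("User1", "PURCHASED", "Product1")

reachable = within_hops("User1", "FRIENDS_WITH", 2)
```

Here reachable contains User2 (a direct friend) and User3 (a friend of a friend), but not User4, who is three hops away. The same traversal in a relational database would need one self-join per hop.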
➔ Why Use NoSQL?
1. Better fit for modern apps
○ Matches in-memory structures like JSON, objects, arrays
○ Reduces impedance mismatch (no need to split objects into multiple tables)
2. Improved performance
○ Handles large data volumes with low latency and high throughput
3. Scales horizontally
○ Can run on multiple servers (good for big data and distributed apps)
➔ Why Stick with Relational Databases?
1. Familiarity : Easier to find developers and DBAs with SQL skills
2. Mature and reliable : Stable, well-tested, and proven technology
3. Tool support : Many enterprise tools and software depend on SQL databases
Fun Quote:
“A DBA walks into a NoSQL bar, but turns and leaves because he couldn’t find a
table.”
➔ Polyglot Persistence
● Polyglot = using multiple languages or technologies
● Future apps will likely use different types of databases together
● Example:
○ Use SQL for structured financial data
○ Use NoSQL (e.g., MongoDB) for flexible user content
○ Use Graph DB for social relationships
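The example above can be sketched as one application object that routes each kind of data to a different store. Everything here is a stand-in: the AppStorage class and its method names are hypothetical, sqlite3 plays the SQL database, and plain dictionaries play the document and graph stores.

```python
import sqlite3


class AppStorage:
    """Hypothetical facade that routes each kind of data to a different store."""

    def __init__(self):
        # Relational store for structured financial data.
        self.sql = sqlite3.connect(":memory:")
        self.sql.execute(
            "CREATE TABLE payments (user_id TEXT, amount_cents INTEGER)"
        )
        # Stand-in for a document store: flexible user content keyed by id.
        self.documents = {}
        # Stand-in for a graph store: user -> set of followed users.
        self.follows = {}

    def record_payment(self, user_id, amount_cents):
        with self.sql:  # commits on success, rolls back on error
            self.sql.execute(
                "INSERT INTO payments VALUES (?, ?)", (user_id, amount_cents)
            )

    def total_paid(self, user_id):
        row = self.sql.execute(
            "SELECT COALESCE(SUM(amount_cents), 0) FROM payments WHERE user_id = ?",
            (user_id,),
        ).fetchone()
        return row[0]

    def save_post(self, post_id, post):
        self.documents[post_id] = post

    def follow(self, follower, followed):
        self.follows.setdefault(follower, set()).add(followed)


storage = AppStorage()
storage.record_payment("user123", 1500)
storage.record_payment("user123", 250)
storage.save_post("post123", {"title": "My First Blog Post", "tags": ["intro"]})
storage.follow("user123", "user456")
```

The rest of the application talks only to AppStorage, so each store can be chosen (or replaced) for the kind of data it handles best.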