Graph NoSQL Databases
Graph Database
Graph databases are a type of NoSQL database that employ graph structures to store and
manage data.
Unlike traditional relational databases that use tables and rows, graph databases utilize
nodes, edges, and properties to represent and store information.
Nodes represent entities such as people, products, or places, while edges represent the
relationships between these entities.
Properties are used to store additional attributes or metadata about nodes and edges.
Graph databases excel at modelling and querying complex relationships, making them
particularly suited for applications with highly connected data.
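As a minimal sketch of the node, edge, and property model described above, the following Python code keeps a small property graph in memory. The class and method names (`PropertyGraph`, `add_node`, `add_edge`, `neighbours`) and the sample data are illustrative assumptions, not the API of any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str                                       # kind of entity, e.g. "Person"
    properties: dict = field(default_factory=dict)   # extra attributes


@dataclass
class Edge:
    source: str
    target: str
    rel_type: str                                    # kind of relationship
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical in-memory property graph, for illustration only."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, properties)

    def add_edge(self, source, target, rel_type, **properties):
        # An edge may only connect nodes that already exist.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before an edge is added")
        self.edges.append(Edge(source, target, rel_type, properties))

    def neighbours(self, node_id, rel_type=None):
        # Follow outgoing edges, optionally filtered by relationship type.
        return [
            e.target
            for e in self.edges
            if e.source == node_id and (rel_type is None or e.rel_type == rel_type)
        ]


graph = PropertyGraph()
graph.add_node("alice", "Person", name="Alice")
graph.add_node("laptop", "Product", name="Laptop", price=999)
graph.add_edge("alice", "laptop", "PURCHASED", year=2024)
print(graph.neighbours("alice", "PURCHASED"))
```

Here the person and the product are nodes, the purchase is an edge, and `name`, `price`, and `year` are properties attached to them.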
Advantages of Graph Databases
Flexible Schema: Graph databases offer a flexible schema that can adapt to evolving data
models without requiring costly schema migrations.
Relationship Modelling: Graph databases excel at modelling complex relationships
between entities, making them well-suited for applications with highly interconnected data.
Efficient Queries: Graph databases enable efficient traversal of relationships, allowing for
fast and complex queries on connected data.
Disadvantages of Graph Databases
Performance Challenges: Complex queries and traversals can pose performance
challenges, requiring careful optimization to ensure acceptable response times.
Scalability Issues: While many graph databases can scale horizontally, relationships that
span machines make traversals slower, so maintaining performance at scale can be
challenging, especially for write-heavy workloads.
Data Modelling Complexity: Designing effective graph schemas and managing evolving
data structures can be complex, requiring expertise in graph database design and modelling.
Popular Graph Databases
Several graph database systems are available, including:
o Neo4j
o InfiniteGraph
o OrientDB
o FlockDB
Features of Graph Database
o Consistency
o Transaction
o Availability
o Query features
o Scaling
Consistency
Graph databases keep data consistent by maintaining the integrity of relationships: every
edge must connect two existing nodes.
When a change touches several nodes or edges, the related updates are applied together, so
that the data remains coherent and accurate.
Consistency - Example
Social Media Friendships
In a social media platform, when a user adds another user as a friend, it's crucial that both
users' friend lists are updated together to maintain consistency.
If user A adds user B as a friend but user B's friend list is not updated, the two lists
disagree, which can cause confusion and errors when displaying mutual connections.
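The friendship example can be sketched in Python as follows. This is only an in-memory illustration under assumed names (`FriendGraph`, `add_friendship`, `is_consistent`); a real graph database would store a single relationship between the two user nodes rather than two separate lists.

```python
class FriendGraph:
    """Hypothetical friend store that keeps both sides of a friendship in step."""

    def __init__(self):
        self.friends = {}  # user -> set of that user's friends

    def add_user(self, user):
        self.friends.setdefault(user, set())

    def add_friendship(self, a, b):
        # Validate first, so nothing is written if either user is missing.
        if a not in self.friends or b not in self.friends:
            raise KeyError("both users must exist")
        self.friends[a].add(b)
        self.friends[b].add(a)

    def is_consistent(self):
        # Every friendship must be visible from both users' lists.
        return all(
            user in self.friends[friend]
            for user, friend_set in self.friends.items()
            for friend in friend_set
        )


network = FriendGraph()
network.add_user("A")
network.add_user("B")
network.add_friendship("A", "B")
print(network.friends["B"], network.is_consistent())
```

Because both sets are updated in the same operation, user B's list always reflects the friendship that user A created.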
Transaction
Many graph databases, Neo4j among them, support ACID-compliant transactions, providing
atomicity, consistency, isolation, and durability.
This ensures that database transactions are executed reliably, and the database remains in a
consistent state even in the event of failures.
Transaction - Example
Banking Transactions
In banking systems, when a customer transfers funds from one account to another, the
transaction must be completed securely and reliably to maintain the integrity of the
customer's financial data.
The transaction must adhere to the principles of ACID (Atomicity, Consistency, Isolation,
Durability) to ensure that funds are transferred accurately and that the customer's account
balances are updated correctly.
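The all-or-nothing behaviour of a transfer can be illustrated with a small Python sketch. It covers only atomicity and consistency, and it uses an in-memory dictionary with a snapshot for rollback; the `transfer` function, the `InsufficientFunds` exception, and the account names are assumptions made for this example, not part of any database API.

```python
import copy


class InsufficientFunds(Exception):
    """Raised when a transfer would overdraw the source account."""


def transfer(accounts, source, target, amount):
    snapshot = copy.deepcopy(accounts)  # state to restore if anything fails
    try:
        accounts[source] -= amount
        if accounts[source] < 0:
            raise InsufficientFunds(f"{source} cannot cover {amount}")
        accounts[target] += amount  # raises KeyError for an unknown account
    except Exception:
        # Roll back: undo the partial debit, then re-raise the error.
        accounts.clear()
        accounts.update(snapshot)
        raise


accounts = {"alice": 100, "bob": 50}
transfer(accounts, "alice", "bob", 30)
print(accounts)

try:
    transfer(accounts, "alice", "carol", 10)  # "carol" does not exist
except KeyError:
    print("transfer rolled back:", accounts)
```

The failed second transfer debits the source account before the error occurs, yet the balances afterwards are unchanged, which is the property that atomicity guarantees.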
Availability
High availability is a critical feature of graph databases, ensuring that users can access data
without interruption.
This is achieved through features such as data replication, fault tolerance, and automatic
failover mechanisms, which ensure continuous access to data even in the face of hardware
failures or network outages.
Availability - Example
Online Retail Recommendation System
In an online retail recommendation system, when a user browses products, the system
needs to be continuously available to provide personalized recommendations in real time.
If the recommendation system becomes unavailable due to server downtime or network
issues, users may experience frustration and may abandon their shopping session, resulting
in lost revenue for the retailer.
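Replication with failover can be sketched as a client that tries each copy of the data in turn. The `Replica` class, the `read_with_failover` function, and the sample recommendations are hypothetical; production systems usually handle this inside the database driver or cluster rather than in application code.

```python
class Replica:
    """A hypothetical database replica holding a copy of the data."""

    def __init__(self, name, data, healthy=True):
        self.name = name
        self.data = dict(data)
        self.healthy = healthy

    def read(self, key):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unavailable")
        return self.data[key]


def read_with_failover(replicas, key):
    """Return (value, replica name) from the first replica that responds."""
    failures = []
    for replica in replicas:
        try:
            return replica.read(key), replica.name
        except ConnectionError as exc:
            failures.append(str(exc))  # remember the failure, try the next copy
    raise ConnectionError("all replicas failed: " + "; ".join(failures))


recommendations = {"user-42": ["keyboard", "monitor"]}
replicas = [
    Replica("primary", recommendations, healthy=False),  # simulated outage
    Replica("replica-1", recommendations),
]
value, served_by = read_with_failover(replicas, "user-42")
print(served_by, value)
```

Even though the primary is down, the shopper still receives recommendations because a second copy of the data answers the request.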
Query Features
Graph databases offer powerful query capabilities for traversing and analysing
relationships between nodes.
This enables users to perform complex graph queries efficiently, such as finding the
shortest path between two nodes, identifying patterns within the graph, or performing
graph-based analytics.
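To make the shortest-path query concrete, here is a minimal breadth-first search over an unweighted graph stored as an adjacency dictionary. The function name `shortest_path` and the sample graph are assumptions for illustration; graph databases provide such traversals through their own query languages and algorithm libraries.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Return the path with the fewest edges from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node it was first reached from
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency.get(current, ()):
            if neighbour in previous:
                continue  # already reached by a path that is at least as short
            previous[neighbour] = current
            if neighbour == goal:
                # Walk the predecessor links back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


sample = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
print(shortest_path(sample, "A", "E"))
```

Breadth-first search visits nodes in order of distance from the start, so the first time the goal is reached the path found uses the fewest edges.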
Query Features - Example
Healthcare Network Analysis
In a healthcare network, doctors, patients, hospitals, and medical conditions can be
represented as nodes, and relationships between them (e.g., doctor-patient, patient-hospital)
as edges.
Graph database query features allow healthcare providers to perform complex analyses,
such as identifying patterns of disease spread, optimizing patient care pathways, and
predicting healthcare outcomes based on historical data.
Scaling
Many graph databases are designed to scale horizontally, allowing them to handle
large-scale connected datasets.
Horizontal scaling involves distributing the data and workload across multiple nodes,
enabling the database to handle increasing data volumes and user loads without sacrificing
performance.
Scaling - Example
Social Network Growth
In a social networking platform, as the number of users and connections between them
grows, the graph database must be able to scale horizontally to accommodate the increasing
volume of data.
Horizontal scaling allows the platform to distribute the workload across multiple servers,
ensuring that performance remains consistent even as the social network expands in size
and complexity.
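One simple way to distribute a graph across servers is to assign each node to a shard by hashing its identifier. The sketch below assumes this hash-based scheme and hypothetical names (`shard_for`, `partition`); real systems may use very different placement strategies.

```python
import zlib


def shard_for(node_id, shard_count):
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(node_id.encode("utf-8")) % shard_count


def partition(edges, shard_count):
    """Place every node on a shard and count edges that span two shards."""
    shards = {index: set() for index in range(shard_count)}
    cross_shard = 0
    for source, target in edges:
        source_shard = shard_for(source, shard_count)
        target_shard = shard_for(target, shard_count)
        shards[source_shard].add(source)
        shards[target_shard].add(target)
        if source_shard != target_shard:
            cross_shard += 1  # traversing this edge needs a network hop
    return shards, cross_shard


friendships = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "dave"),
    ("dave", "erin"),
    ("erin", "alice"),
]
shards, cross_shard_edges = partition(friendships, 3)
print(shards, cross_shard_edges)
```

The count of cross-shard edges matters because every such edge turns a local traversal step into a call to another server, which is the main cost of distributing connected data.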
Use Cases of Graph Database
Connected Data
Graph databases are well-suited for applications that deal with highly interconnected data,
such as social networks, where relationships between entities are as important as the
entities themselves.
Examples include social media platforms, recommendation systems, and knowledge
graphs.
IoT (Internet of Things)
Graph databases can effectively manage and analyse the complex networks of devices and
sensors in IoT deployments.
They enable organizations to track relationships between devices, monitor performance
metrics, detect anomalies, and optimize resource allocation.
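Tracking relationships between devices allows questions such as "which devices are affected if this hub fails?" to be answered with a traversal. The following sketch assumes a hypothetical topology and function name (`downstream_devices`), with links pointing from each device to the devices that depend on it.

```python
def downstream_devices(links, start):
    """Return every device reachable from start by following dependency links."""
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for child in links.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen - {start}  # the failing device itself is not "downstream"


topology = {
    "gateway-1": ["hub-a", "hub-b"],
    "hub-a": ["sensor-1", "sensor-2"],
    "hub-b": ["sensor-3"],
}
print(sorted(downstream_devices(topology, "hub-a")))
```

Calling the function for `gateway-1` instead would return both hubs and all three sensors, which shows how the impact of a failure grows with the position of the device in the network.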
Recommendation Engine
Graph databases power recommendation engines by modelling user preferences, item
attributes, and historical interactions as a graph.
This enables personalized recommendations based on the connections between users,
items, and their attributes, leading to improved user engagement, satisfaction, and
conversion rates.
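A very small version of such a recommendation can be computed by walking from a user to the items they bought, on to other users who bought the same items, and finally to those users' other purchases. The scoring rule below (weight each candidate by the number of shared purchases) is an assumption chosen for simplicity, as are the function name `recommend` and the sample data.

```python
from collections import Counter


def recommend(purchases, user, limit=3):
    """Suggest items bought by users whose purchases overlap with the user's."""
    own_items = purchases.get(user, set())
    scores = Counter()
    for other, items in purchases.items():
        if other == user:
            continue
        overlap = len(own_items & items)
        if overlap == 0:
            continue  # no shared purchases, so no connection in the graph
        for item in items - own_items:
            scores[item] += overlap
    # Highest score first; ties are broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:limit]]


purchases = {
    "alice": {"laptop", "mouse"},
    "bob": {"laptop", "mouse", "keyboard"},
    "carol": {"mouse", "monitor"},
    "dave": {"phone"},
}
print(recommend(purchases, "alice"))
```

The keyboard ranks above the monitor because it comes from a user who shares two purchases with Alice rather than one, and the phone is never suggested because its buyer has nothing in common with her.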
Fraud Detection
Graph databases are used in fraud detection applications to identify patterns of fraudulent
behaviour and detect suspicious activities.
By modelling relationships between entities such as users, transactions, and IP addresses,
graph databases can uncover complex fraud networks and help block fraudulent activities in
real time.
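One common graph pattern in fraud analysis is a group of accounts linked through shared IP addresses. The sketch below finds such groups as connected components using a union-find structure. The function name `suspicious_groups`, the size threshold, and the login records are assumptions for illustration; a shared address alone is of course not proof of fraud.

```python
from collections import defaultdict


def suspicious_groups(logins, min_size=3):
    """Group accounts connected through shared IPs; return the large groups."""
    parent = {}

    def find(item):
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path halving
            item = parent[item]
        return item

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Each login is an edge between an account node and an IP node.
    for account, ip in logins:
        union(("account", account), ("ip", ip))

    groups = defaultdict(set)
    for kind, name in list(parent):
        if kind == "account":
            groups[find((kind, name))].add(name)
    return sorted(sorted(group) for group in groups.values() if len(group) >= min_size)


logins = [
    ("acc1", "10.0.0.1"),
    ("acc2", "10.0.0.1"),
    ("acc2", "10.0.0.2"),
    ("acc3", "10.0.0.2"),
    ("acc4", "10.0.0.9"),
]
print(suspicious_groups(logins))
```

Accounts `acc1` and `acc3` never share an address directly, but both are linked to `acc2`, so all three end up in one group. This kind of indirect connection is hard to spot in row-by-row checks and natural to find in a graph.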
Graph Database Using Neo4j
Neo4j is a leading graph database management system known for its performance,
scalability, and ease of use.
It offers a robust set of features, including support for ACID transactions, built-in graph
algorithms, and a powerful query language called Cypher.
Neo4j's architecture consists of a graph database engine, storage layer, and query
processing components, all optimized for handling graph data efficiently.
Neo4j is widely used in various industries, including social networking, recommendation
systems, fraud detection, and network and IT operations management.
Neo4j Architecture
Neo4j's architecture consists of three main components:
Graph Database Engine: Responsible for executing graph queries and managing the
traversal of relationships between nodes.
Storage Layer: Stores graph data efficiently on disk and manages data retrieval and storage
operations.
Query Processing: Executes Cypher queries and performs optimizations to ensure efficient
query execution.
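Example Queries:
Creating Nodes: `CREATE (alice:Person {name: 'Alice'}), (bob:Person {name: 'Bob'})`
Creating a Relationship: `MATCH (alice:Person {name: 'Alice'}), (bob:Person {name: 'Bob'}) CREATE (alice)-[:FRIENDS_WITH]->(bob)`
Variables such as `alice` and `bob` exist only within a single statement, so the relationship
query first matches the existing nodes before connecting them.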
Neo4j Cypher Query Language
Cypher is a declarative query language for Neo4j, designed specifically for querying and
manipulating graph data.
It allows users to express patterns and relationships in a concise and readable manner.
Examples of Cypher Queries
Create a Single Node: `CREATE (:Node)`
Create Multiple Nodes: `CREATE (:Node), (:Node)`
Create a Node with a Label: `CREATE (:Label)`
Create a Node with Multiple Labels: `CREATE (:Label1:Label2)`
Create a Node with Properties: `CREATE (:Node {property: 'value'})`