Ramaiah Institute of Technology
(Autonomous Institute, Affiliated to VTU)
Department of AIML
Course Name : NoSQL Databases
Course Code : AIE734
Credits : 3:0:0
Unit-2
NoSQL Data Architecture Patterns
Ø An architecture pattern is a logical way of categorizing the data that will be stored in the
database.
Ø In other words, it is a way of organizing data in a logical and structured manner.
Ø NoSQL is a class of databases that stores big data in a suitable format and supports
operations on it.
Ø It is widely used because of its flexibility and the wide variety of services built on it.
Architecture Patterns of NoSQL
Data in a NoSQL database is stored using one of the following four data architecture patterns.
1. Document data model
2. Key Value data model
3. Column data model
4. Graph based data model
Document data model
Ø A document database stores and retrieves data as key-value pairs, but here the values are
called documents.
Ø A document is a complex, self-describing data structure.
Ø A document can be text, an array, a string, JSON, XML, or a similar format.
Ø This is very effective because most data created today is semi-structured, often in JSON form.
Ø Each document contains all the necessary data, and documents can be indexed for easy retrieval.
Advantages
• This type of format is very useful and apt for semi-structured data.
• Storage, retrieval, and management of documents are easy.
• Flexible schema allows for easy changes to data structure.
• High performance for read-heavy workloads.
Disadvantages
• Limited support for joins and multi-document transactions.
• Write-heavy workloads can perform poorly.
Example
MongoDB is a popular document store with support for dynamic schemas and horizontal scaling.
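The document model above can be sketched with plain Python dictionaries. This is a minimal illustration only, with made-up data; real stores such as MongoDB add persistence, rich indexing, and a query language on top of this idea.

```python
# Each document is self-contained and keyed by a unique _id.
documents = {
    "u1": {"_id": "u1", "name": "Ramya", "course": "BE", "tags": ["ai", "ml"]},
    "u2": {"_id": "u2", "name": "Kiran", "course": "BE", "tags": ["db"]},
}

# A simple secondary index on "course", so lookups avoid a full scan.
course_index = {}
for doc in documents.values():
    course_index.setdefault(doc["course"], []).append(doc["_id"])

def find_by_course(course):
    """Look up documents through the index instead of scanning them all."""
    return [documents[_id] for _id in course_index.get(course, [])]

print([d["name"] for d in find_by_course("BE")])  # ['Ramya', 'Kiran']
```

Note how each document carries all its own fields, so no join is needed to assemble a result.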
Figure – Document Store Model in form of JSON documents
Key-Value data model
Ø This model is one of the most basic models of NoSQL databases.
Ø As the name suggests, the data is stored in form of Key-Value Pairs.
Ø The key is usually a string, integer, or character sequence, but it can also be a more advanced data type.
Ø Each value is associated with its key.
Ø The key-value pair storage databases generally store data as a hash table where each key is unique.
Ø The value can be of any type (JSON, BLOB(Binary Large Object), strings, etc).
Ø This type of pattern is usually used in shopping websites or e-commerce applications.
Advantages
• Simple and efficient architecture.
• Allows fast read and write operations.
• Can handle large amounts of data and heavy load.
Disadvantages
• Not suitable for complex data structures.
Example
• DynamoDB
• Berkeley DB
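A minimal key-value store can be sketched with a Python dict, which is itself a hash table. The keys and values here (a session-scoped shopping cart, echoing the e-commerce use case above) are illustrative only; real systems such as DynamoDB add durability, partitioning, and replication.

```python
store = {}

def put(key, value):
    store[key] = value          # each key is unique; put overwrites

def get(key, default=None):
    return store.get(key, default)

# Typical e-commerce usage: a shopping cart keyed by session id.
put("session:42:cart", ["laptop", "mouse"])
print(get("session:42:cart"))   # ['laptop', 'mouse']
```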
Column Data Model
Ø This pattern stores data in column families: rather than storing data in relational tuples (rows),
the data is stored in individual cells grouped into columns.
Ø A relational database stores data in rows and reads it row by row; a column store is organized
as a set of columns.
Ø So if someone wants to run analytics on a small number of columns, those columns can be read
directly without loading the unwanted data into memory.
Ø For example: Cassandra and Apache HBase.
Sl.No | Name    | Course | ID
1     | Ramya   | BE     | 4
2     | Kiran   | BE     | 7
3     | Kapilan | M.Tech | 20
4     | Priya   | M.Tech | 8
Fig: Example of a row-oriented table

Sl.No | Name    | ID
1     | Ramya   | 4
2     | Kiran   | 7
3     | Kapilan | 20
4     | Priya   | 8
Fig: Column-oriented table

Sl.No | Course | ID
1     | BE     | 4
2     | BE     | 7
3     | M.Tech | 20
4     | M.Tech | 8
Fig: Column-oriented table
Working of Column data model
Ø The columnar data model organizes information into columns instead of rows.
Ø Column families superficially resemble tables in a relational database, but the data is stored
and read column by column.
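The column layout can be sketched by keeping the table from the figures above as one list per column, so an analytic query touches only the columns it needs. This is a toy illustration; real column stores add compression, column families, and distribution.

```python
# The table is stored column-by-column instead of row-by-row.
table = {
    "Sl.No":  [1, 2, 3, 4],
    "Name":   ["Ramya", "Kiran", "Kapilan", "Priya"],
    "Course": ["BE", "BE", "M.Tech", "M.Tech"],
    "ID":     [4, 7, 20, 8],
}

def column_sum(col):
    """Read a single column directly; the other columns are never loaded."""
    return sum(table[col])

print(column_sum("ID"))  # 39
```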
Advantages
• Well Structured
• Flexibility
• Scalability
• Load Time
Disadvantages
• Designing indexing schema
• Online transaction processing
• Security vulnerabilities
Graph Based Data Model
Ø The graph-based data model in NoSQL focuses on the relationships between data elements.
Ø As the name suggests, each element is stored as a node, and the associations between these
elements are known as links (edges).
Ø Associations are stored directly, as they are first-class elements of the data model.
Ø This model gives us a conceptual view of the data.
Ø It is based on a network (graph) structure of nodes and edges.
• Nodes: instances of data representing the objects to be tracked.
• Edges: represent the relationships between nodes.
• Properties: information associated with nodes (and, in many systems, with edges).
Fig: A simple graph with two vertices and one edge.
Fig : Image represents Nodes with properties from relationships represented by edges
Working of graph data model.
Ø In this data model, connected nodes are linked physically, and the physical connection between
them is itself stored as a piece of data.
Ø Connecting data this way makes it easy to query a relationship.
Ø The model reads relationships directly from storage instead of computing the connection at
query time.
Ø Like many NoSQL databases, graph databases typically have a flexible (or no fixed) schema,
which keeps the model easy to evolve and edit.
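The working described above can be sketched with adjacency lists: nodes carry properties, and edges are stored directly, so following a relationship is a lookup rather than a computed join. All names here are illustrative; systems like Neo4j provide this model with a full query language.

```python
# Nodes with properties.
nodes = {
    "alice": {"label": "Person", "age": 30},
    "bob":   {"label": "Person", "age": 27},
    "acme":  {"label": "Company"},
}

# Edges stored as adjacency lists of (relationship, target-node) pairs.
edges = {
    "alice": [("FRIEND_OF", "bob"), ("WORKS_AT", "acme")],
    "bob":   [("WORKS_AT", "acme")],
}

def neighbours(node, rel):
    """Read relationships straight from storage; no join is computed."""
    return [t for (r, t) in edges.get(node, []) if r == rel]

print(neighbours("alice", "FRIEND_OF"))  # ['bob']
```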
Advantages
• Structure
• Real time output results
Disadvantages
• No standard query language across products.
• Large graphs can become difficult to visualize and manage.
NoSQL system ways to handle big data problems
Ø Datasets that are too large to store and analyze with conventional database tools are referred to as big data.
Ø As data volumes grow, the question arises of how this data can be processed effectively with
current IT approaches.
Ø Ideas, techniques, tools, and technologies are needed to handle such data and transform it into
business value and knowledge.
Ø The major NoSQL features that help handle large amounts of data are stated below.
NoSQL databases that are best for big data are:
• MongoDB
• Cassandra
• CouchDB
• Neo4j
Different ways to handle Big Data problems
1. Moving query to the data, not data to the query.
2. Hash rings to distribute the data on clusters.
3. Replication to scale read.
4. Distributed queries to data nodes.
Fig: One or many databases? These are some of the challenges you face when you move from a single processor to a distributed
computing system. Moving to a distributed environment is a nontrivial endeavor and should be done only if the business problem really
warrants the need to handle large data volumes in a short period of time. This is why platforms like Hadoop exist: they provide a
framework that makes distributed computing easier for the application developer.
Moving query to the data, not data to the query
Ø With the exception of large graph databases, most NoSQL systems use commodity processors that each hold a
subset of the data on their local shared-nothing drives.
Ø When a client wants to send a general query to all nodes that hold data, it’s more efficient to send the query to
each node than it is to transfer large datasets to a central processor.
Ø Keeping all the data within each data node in the form of logical documents means that only the query itself and
the final result need to be moved over a network.
Ø This keeps your big data queries fast.
Hash rings to distribute the data on clusters.
Ø One of the most challenging problems in a distributed database is assigning each document to a
processing node in a consistent way.
Ø Hash rings are common in big data solutions because they deterministically decide which
processor a piece of data is assigned to.
Ø A hash ring takes the leading bits of a document's hash value and uses them to determine which
node the document should be assigned to.
Ø This allows any node in a cluster to know which node a piece of data lives on, and to adapt to
new assignment rules as your data grows.
Ø Hashing each document to a randomly distributed 40-character key is a good way to spread a
big data load evenly over many servers.
Fig: Using a hash ring to assign a node to a key that uses a 40-character hex number (2^160
possible values). The first bits in the hash can be used to map a document directly to a node.
This allows documents to be randomly but evenly assigned to nodes, and assignment rules to be
updated as you add nodes to your cluster.
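Hash-based placement can be sketched as follows: a document's key is hashed with SHA-1 (a 40-character hex value, i.e. 2^160 possibilities) and the leading bits pick the node. This is the simple modulo variant for illustration; a production consistent-hash ring also minimizes data movement when nodes are added.

```python
import hashlib

NODES = ["node0", "node1", "node2", "node3"]  # illustrative cluster

def assign_node(doc_key):
    """Deterministically map a document key to a node."""
    digest = hashlib.sha1(doc_key.encode()).hexdigest()  # 40 hex chars
    # Use the leading bits of the hash to choose a node.
    return NODES[int(digest[:8], 16) % len(NODES)]

# Every node computes the same answer, so any node in the cluster
# knows where a given document lives.
print(assign_node("order-1001"))
```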
Replication to scale read.
Ø Databases use replication to make backup copies of data in real time.
Ø Replication lets you scale read requests horizontally: reads can be served from any replica.
Ø There are only a few times when you must be concerned about the lag between a write to the
read/write node and a client reading that same record from a replica.
Ø One of the most common operations after a write is a read of that same record.
Ø If a client writes and then immediately reads from that same node, there's no problem.
Ø The problem occurs when the read hits a replica before the update has propagated to it.
Ø The best way to avoid this problem is to direct a client's reads to the write node until its
writes have been replicated (read-your-writes consistency).
Ø This logic can be added to a session or state-management system at the application layer.
Ø If your application needs strict read-after-write consistency, you must deal with it at the
application layer.
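The application-layer fix described above can be sketched like this: after a session writes a key, its reads are routed to the primary until the (simulated) replica has caught up. All names are illustrative; a real system would track replication state with versions or timestamps.

```python
primary, replica = {}, {}

def write(session, key, value):
    primary[key] = value
    session.add(key)             # remember what this session wrote

def replicate(key):
    replica[key] = primary[key]  # replication happens with some lag

def read(session, key):
    # Route to the primary for keys this session just wrote;
    # otherwise the cheaper replica copy is fine.
    return primary[key] if key in session else replica.get(key)

session = set()
write(session, "profile:1", "v2")
print(read(session, "profile:1"))  # 'v2', even before replication
```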
Distributed queries to data nodes.
Ø In order to get high performance from queries that span multiple nodes, it's important to
separate the concerns of query evaluation from query execution.
Ø The NoSQL database moves the query to the data rather than moving the data to the query:
queries are sent to data nodes, but data is never shipped to a query node.
Ø In this design, all incoming queries arrive at query-analyzer nodes.
Ø These nodes then forward the query to each data node.
Ø Data nodes with matching documents return them to the query node.
Ø The query won't return until all data nodes (or a replica standing in for each) have responded
to the original request.
Ø If a data node is down, the query is redirected to a replica of that node.
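The query-analyzer pattern above can be sketched as follows: the query is forwarded to every data node (or its replica if the node is down), and the matching documents are gathered centrally. Node names and records are made up for illustration.

```python
data_nodes = {
    "n1": [{"id": 1, "course": "BE"}, {"id": 2, "course": "M.Tech"}],
    "n2": [{"id": 3, "course": "BE"}],
}
replicas = {"n2": [{"id": 3, "course": "BE"}]}  # replica copy of n2

def distributed_query(predicate, down=frozenset()):
    """Forward the query to each data node; use a replica if a node is down."""
    results = []
    for name, docs in data_nodes.items():
        if name in down:
            docs = replicas[name]        # redirect to the node's replica
        results.extend(d for d in docs if predicate(d))
    return results

# The query still completes even though node n2 is down.
print(distributed_query(lambda d: d["course"] == "BE", down={"n2"}))
# [{'id': 1, 'course': 'BE'}, {'id': 3, 'course': 'BE'}]
```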
Here are some typical big data use cases:
1. Bulk image processing
2. Public web page data
3. Event log data
4. Remote sensor data
5. Mobile phone data
6. Social media data
7. Game data
8. Open linked data
Exercise
1. Create a JSON document for a "user" profile with fields like username, email, age, and an array posts that
contains several objects, each representing a user's post with attributes like post_id, title, and content.
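One possible shape for such a document, built and serialized with Python's json module (all field values here are made up), might look like:

```python
import json

user = {
    "username": "ramya_k",
    "email": "ramya@example.com",
    "age": 21,
    "posts": [
        {"post_id": 1, "title": "Intro to NoSQL", "content": "Four patterns..."},
        {"post_id": 2, "title": "Hash rings", "content": "Consistent hashing..."},
    ],
}

# Serialize the nested structure to a JSON document.
print(json.dumps(user, indent=2))
```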