DynamoDB
Topics to be covered:
1. Introduction to Data
1.1 Types of Data
2. Introduction to DynamoDB
3. Lab on DynamoDB
4. Case Study: Real-Time Data Processing with AWS DynamoDB
1. Introduction to Data:
Data is a collection of discrete pieces of information. It can take many forms, including
text, numbers, media, and raw bytes, and it can be stored on paper, in electronic memory,
and so on.
The term "data" comes from the Latin word "datum," which means "a single piece of
information." Data is the plural of datum.
In computing, data is information that has been translated into a form suitable for rapid
transfer and processing. In this form, data can be exchanged between programs and systems.
1.1 Types of Data:
● Structured data: This type of data is highly organized and follows a defined format.
Structured data is typically stored in tables with rows and columns, each
representing a specific data type or attribute. Examples of structured data include
financial data, customer information in a database, and inventory lists. Structured
data is easy to analyze using algorithms and queries since it can be easily processed,
searched, and organized.
● Unstructured data: This type of data has no predefined format or organization.
Unstructured data can come in various forms, such as text, images, audio, or video,
and is usually stored in free-form documents. Examples of unstructured data include
emails, social media posts, and web pages. Unstructured data is difficult to analyze
using traditional algorithms since it requires natural language processing and
machine learning techniques to extract useful information.
● Semi-structured data: This type of data has some structure but also includes
unstructured elements. Semi-structured data can be represented in various formats,
such as XML or JSON, and includes tags or labels that provide some level of
organization. Examples of semi-structured data include email headers, sensor data,
and log files. Semi-structured data can be analyzed using both traditional algorithms
and machine learning techniques.
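As a small illustration of semi-structured data, the hypothetical JSON record below combines a few fixed fields with free-form, nested elements that can vary from record to record:

```python
import json

# A hypothetical semi-structured record: fixed fields (id, timestamp)
# plus nested, variable elements (tags, payload).
raw = """
{
    "id": 42,
    "timestamp": "2023-05-01T12:00:00Z",
    "tags": ["sensor", "warehouse-7"],
    "payload": {"temperature_c": 21.5, "notes": "door left open"}
}
"""

record = json.loads(raw)  # parse the JSON text into Python objects
print(record["payload"]["temperature_c"])  # nested fields are reachable by key
```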
2. Introduction to DynamoDB:
Amazon DynamoDB is a NoSQL database that supports both key-value and document data
models. Rather than storing data in relational tables with fixed rows and columns, it stores
items whose attributes can include nested, JSON-style documents, which makes it well suited
to both structured and semi-structured data.
DynamoDB also provides fast and consistent performance, with the ability to scale up or
down automatically in response to changing application demands. It includes built-in
security features, such as encryption at rest and in transit, and integrates with other AWS
services like Lambda, CloudWatch, and IAM.
DynamoDB offers on-demand backups, enabling you to create full backups of your tables for
long-term retention and archiving to meet regulatory compliance requirements.
With DynamoDB Time to Live (TTL), you can automatically remove expired items from tables,
which helps you reduce storage usage and cost by not retaining obsolete data.
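As a sketch, TTL can be enabled with the AWS SDK for Python (boto3); the table name and the TTL attribute name (expires_at) here are assumptions:

```python
import time
import boto3

client = boto3.client("dynamodb")

# Enable TTL on an assumed table; DynamoDB deletes an item some time
# after the epoch-seconds value in its "expires_at" attribute has passed.
client.update_time_to_live(
    TableName="Department",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expires_at"},
)

# Items then carry an expiry timestamp, e.g. 7 days from now:
expires_at = int(time.time()) + 7 * 24 * 3600
```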
● Tables: In Amazon DynamoDB, tables are the primary data storage containers that
store and organize items. Tables in DynamoDB are similar to tables in traditional
relational databases, but with some key differences.
Here are some key features of DynamoDB tables:
I. Table structure: A DynamoDB table consists of a primary key and,
optionally, one or more secondary indexes. The primary key uniquely
identifies each item in the table. It can be either a single attribute
(partition key) or a combination of two attributes (partition key and
sort key). Secondary indexes can be created to allow fast lookups
based on other attributes.
II. Capacity: DynamoDB tables can be provisioned with read and write
capacity units, which determine the maximum number of reads and
writes per second that the table can support.
III. Data types: DynamoDB supports scalar types (string, number, binary,
Boolean, and null), set types, and document types (lists and maps,
which can represent JSON documents). Only the key attributes are
defined in the table schema; all other attributes are schemaless.
IV. Consistency: DynamoDB provides two read consistency models: eventually
consistent and strongly consistent. With eventually consistent reads,
a read may not reflect the most recent write but catches up shortly
afterward. With strongly consistent reads, a read always reflects the
latest successful write (see the read sketch after this list).
V. Access control: DynamoDB provides fine-grained access control using
AWS Identity and Access Management (IAM). You can create policies
to control who can perform operations on tables and items, and what
actions they can perform.
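As a minimal sketch of the two read models using boto3 (the Department table and its DepartmentID key follow the diagram shown later in this section):

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Department")  # assumed: partition key "DepartmentID"

# Default read: eventually consistent (cheaper, may briefly lag a write).
resp = table.get_item(Key={"DepartmentID": 101})

# Strongly consistent read: always reflects the latest successful write,
# at roughly twice the read-capacity cost.
resp = table.get_item(Key={"DepartmentID": 101}, ConsistentRead=True)
print(resp.get("Item"))
```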
● Items: Each table contains zero or more items. An item is a group of attributes
that is uniquely identifiable among all the other items in the table.
● Attributes: Each item is composed of one or more attributes. An attribute is a
fundamental data element that does not need to be broken down any further.
The following diagram shows a table named Department with some example items
and attributes.
Department
{
    "DepartmentID": 101,
    "Name": "Production",
    "Manager": "John"
}
{
    "DepartmentID": 102,
    "Name": "Marketing",
    "Manager": "Smith"
}
● Every item in the table has a primary key, a unique identifier that distinguishes
it from every other item.
● Other than the primary key, the Department table is schemaless: neither the
attributes nor their data types need to be defined in advance.
● Most of the attributes are scalar, meaning they can hold only a single value.
Strings and numbers are common scalar types.
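A brief sketch of writing the items from the diagram with boto3 (credentials and Region configuration are assumed to be in place):

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Department")

# Each put_item call writes one item; attributes beyond the primary key
# are schemaless and may differ from item to item.
table.put_item(Item={"DepartmentID": 101, "Name": "Production", "Manager": "John"})
table.put_item(Item={"DepartmentID": 102, "Name": "Marketing", "Manager": "Smith"})
```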
DynamoDB supports two kinds of primary keys:
I. Partition key - A simple primary key, composed of a single attribute known
as the partition key. No two items in a table with just a partition key can
have the same partition key value.
II. Partition key and sort key - This kind of key, also known as a composite
primary key, is composed of two attributes. The first attribute is the
partition key, and the second is the sort key. DynamoDB uses the partition
key value as input to an internal hash function; the output of that hash
function determines the partition (physical storage internal to DynamoDB)
in which the item is stored. All items with the same partition key value
are stored together, ordered by sort key value.
In a table with a partition key and a sort key, multiple items can have the
same partition key value, provided their sort key values differ.
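As an illustrative sketch, a composite key lets you fetch all items sharing a partition key, optionally narrowed by a sort-key condition. The Orders table and its key names here are hypothetical:

```python
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
orders = dynamodb.Table("Orders")  # hypothetical: partition key CustomerID, sort key OrderDate

# Fetch every order for one customer, returned in sort key order...
resp = orders.query(KeyConditionExpression=Key("CustomerID").eq("C-1001"))

# ...or only the orders from May 2023, using a sort-key range condition.
resp = orders.query(
    KeyConditionExpression=Key("CustomerID").eq("C-1001")
    & Key("OrderDate").begins_with("2023-05")
)
print(resp["Items"])
```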
Secondary indexes:
You can create one or more secondary indexes on a table. A secondary index lets you query
the data in the table using an alternative key, in addition to queries against the primary
key. Indexes are not required in DynamoDB, but they give your applications more flexibility
when accessing your data.
After you create a secondary index on a table, reading data from the index is very similar
to reading data from the table itself. DynamoDB supports two types of indexes:
Global secondary index - An index with a partition key and sort key that can differ from
those on the table.
Local secondary index - An index that has the same partition key as the table, but a
different sort key.
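A short sketch of querying a secondary index with boto3; the index name ManagerIndex and its key choice are assumptions layered on the Department diagram above:

```python
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Department")

# Query an assumed global secondary index whose partition key is "Manager";
# the base table's primary key is not needed when reading from the index.
resp = table.query(
    IndexName="ManagerIndex",
    KeyConditionExpression=Key("Manager").eq("Smith"),
)
for item in resp["Items"]:
    print(item["DepartmentID"], item["Name"])
```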
Control plane operations let you create and manage DynamoDB tables. They also let you work
with objects that depend on tables, such as indexes and streams.
● CreateTable – Creates a new table. Optionally, you can create one or more
secondary indexes and enable DynamoDB Streams for the table.
● DescribeTable – Returns information about a table, such as its primary key schema,
throughput settings, and index information.
● ListTables – Returns the names of all of your tables in a list.
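A minimal sketch of the inspection calls above with boto3 (the table name is assumed; a CreateTable sketch appears in the lab section below):

```python
import boto3

client = boto3.client("dynamodb")

# ListTables: enumerate table names in the current account and Region.
print(client.list_tables()["TableNames"])

# DescribeTable: inspect key schema, throughput, and index information.
desc = client.describe_table(TableName="Department")["Table"]
print(desc["KeySchema"], desc.get("GlobalSecondaryIndexes"))
```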
Deleting data –
DeleteItem – Removes a single item from a table. You must specify the primary key
of the item you want to delete.
BatchWriteItem – Deletes up to 25 items from one or more tables in a single call.
This is more efficient than calling DeleteItem repeatedly, because your application
needs only one network round trip to delete the items.
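A sketch of both deletion paths with boto3; the table and key names follow the earlier examples and are assumptions:

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Department")

# DeleteItem: remove one item by its primary key.
table.delete_item(Key={"DepartmentID": 102})

# Batched deletes: batch_writer buffers requests and sends them as
# BatchWriteItem calls of up to 25 items, retrying unprocessed items.
with table.batch_writer() as batch:
    for dept_id in (103, 104, 105):
        batch.delete_item(Key={"DepartmentID": dept_id})
```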
Amazon DynamoDB has two read/write capacity modes for processing reads and writes on
your tables:
● On-demand: With on-demand mode, you pay per request, and DynamoDB accommodates your
workload as traffic rises or falls, with no capacity planning required.
● Provisioned: If you select provisioned mode, you specify how many reads and writes
per second your application requires. The provisioned capacity of your table can be
adjusted automatically in response to traffic changes using auto scaling. This helps
you keep your DynamoDB usage at or below a defined request rate, giving you cost
predictability.
Provisioned mode is a good choice if any of the following is true:
I. You run applications whose traffic is predictable and either stays constant
or ramps up gradually.
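As a sketch, the capacity mode is chosen at table creation and can be switched later; the call below flips an assumed table to on-demand billing:

```python
import boto3

client = boto3.client("dynamodb")

# Switch an existing table from provisioned capacity to on-demand
# (PAY_PER_REQUEST) billing; the reverse switch works the same way with
# BillingMode="PROVISIONED" plus a ProvisionedThroughput argument.
client.update_table(TableName="Department", BillingMode="PAY_PER_REQUEST")
```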
3. Lab on DynamoDB
Step 1: Create a table
To create a new table using the DynamoDB console:
• In the navigation pane on the left side of the console, choose Dashboard.
• On the right side of the console, choose Create table.
• In the table list, choose the Department table to view its details.
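For reference, the same table could be created programmatically. This sketch assumes the lab's key schema: Department_ID (number) as partition key and Department_Name (string) as sort key:

```python
import boto3

client = boto3.client("dynamodb")

# Create the lab's Department table with a composite primary key.
client.create_table(
    TableName="Department",
    AttributeDefinitions=[
        {"AttributeName": "Department_ID", "AttributeType": "N"},
        {"AttributeName": "Department_Name", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "Department_ID", "KeyType": "HASH"},     # partition key
        {"AttributeName": "Department_Name", "KeyType": "RANGE"},  # sort key
    ],
    BillingMode="PAY_PER_REQUEST",  # on-demand; provisioned also works
)
client.get_waiter("table_exists").wait(TableName="Department")  # block until active
```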
Step 2: Write data to a table using the console
• Select Create item and enter the attribute values for the first item.
• Repeat this process and create another item with the following values:
Department_ID: 1002
Department_Name: Marketing
Number_of_Employee: 70
• Do this one more time to create another item with the following values:
Department_ID: 1003
Department_Name: Sales
Number_of_Employee: 101
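The equivalent writes with boto3, assuming the Department table and key schema from Step 1 (the first item's values were shown only in the console screenshots, so they are assumed here):

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Department")

# The first item's values are assumed; the other two match the lab text.
items = [
    {"Department_ID": 1001, "Department_Name": "Production", "Number_of_Employee": 50},
    {"Department_ID": 1002, "Department_Name": "Marketing", "Number_of_Employee": 70},
    {"Department_ID": 1003, "Department_Name": "Sales", "Number_of_Employee": 101},
]
for item in items:
    table.put_item(Item=item)
```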
Step 3: Read data from a table
● On the Items tab, view the list of items stored in the table, sorted by Department_ID and
Department_Name.
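Programmatically, the same listing can be produced with a scan (fine for a small lab table; targeted queries are preferred at scale):

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Department")

# Scan reads every item in the table (no particular order is guaranteed).
resp = table.scan()
for item in resp["Items"]:
    print(item["Department_ID"], item["Department_Name"], item["Number_of_Employee"])
```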
Step 4: Update data in a table
In this step, you update an item that you created in previous steps.
• Choose View items.
• Choose the item whose Department_ID value is 1002 and Department_Name value
is Marketing.
• Update the Number_of_Employee value to a new value, and then choose Save.
• The item now shows the updated attribute value.
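The equivalent update with boto3, using an update expression; the new employee count (75) is an assumed example value:

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Department")

# Update one attribute of the Marketing item; the full primary key
# (partition key + sort key) identifies the item to modify.
table.update_item(
    Key={"Department_ID": 1002, "Department_Name": "Marketing"},
    UpdateExpression="SET Number_of_Employee = :n",
    ExpressionAttributeValues={":n": 75},  # assumed new value
)
```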
Step 6: Create a global secondary index
In this step, you create a global secondary index for the Department table.
• In the navigation pane on the left side of the console, choose Tables.
• Choose the Department table from the table list.
• Choose the Indexes tab for the Department table.
• Choose Create index.
Global secondary indexes allow you to perform queries on attributes that are not part of a
table's primary key.
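As a sketch, the same kind of index could be added with UpdateTable and then queried; the index name Department_Name-index and the access pattern (lookup by name alone) are assumptions:

```python
import boto3
from boto3.dynamodb.conditions import Key

client = boto3.client("dynamodb")

# Add an assumed index so items can be fetched by name alone (base-table
# queries always require the Department_ID partition key).
client.update_table(
    TableName="Department",
    AttributeDefinitions=[
        {"AttributeName": "Department_Name", "AttributeType": "S"},
    ],
    GlobalSecondaryIndexUpdates=[{
        "Create": {
            "IndexName": "Department_Name-index",
            "KeySchema": [{"AttributeName": "Department_Name", "KeyType": "HASH"}],
            "Projection": {"ProjectionType": "ALL"},  # copy all attributes into the index
            # Provisioned-mode tables also require a ProvisionedThroughput entry here.
        }
    }],
)

# Once the index becomes ACTIVE, query it much like a table.
table = boto3.resource("dynamodb").Table("Department")
resp = table.query(
    IndexName="Department_Name-index",
    KeyConditionExpression=Key("Department_Name").eq("Sales"),
)
print(resp["Items"])
```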
4. Case Study: Real-Time Data Processing with AWS DynamoDB
The case study project revolves around a global e-commerce organization that faced
challenges in efficiently processing and analyzing large volumes of real-time data. The
organization needed a scalable and reliable database solution to handle the increasing
data demands while ensuring low latency and high availability.
Challenges:
The organization's existing database infrastructure struggled to keep up with the rapidly
growing data ingestion rates and complex query patterns. The challenges included:
1. Scalability: The existing infrastructure could not scale with the rapidly growing data
volumes and ingestion rates.
2. Latency: The requirement for real-time data processing necessitated low-latency access
to the database, enabling fast query responses and real-time analytics.
3. High Availability: Downtime or disruptions in the database service were unacceptable due
to the critical nature of the business. The organization needed a solution that offered high
availability and fault tolerance.
Solution:
To address these challenges, the organization opted for AWS DynamoDB, a fully managed
NoSQL database service that provides seamless scalability, low latency, and high availability.
The solution involved the following AWS services and their implementation:
1. DynamoDB: The primary database was migrated to DynamoDB, taking advantage of its
ability to automatically scale throughput capacity in response to changing workloads. This
eliminated the need for manual provisioning and ensured that the database could handle
the increasing data ingestion rates.
2. DynamoDB Streams: DynamoDB Streams was enabled on the tables to capture item-level
changes as an ordered stream of events.
3. AWS Lambda: By integrating DynamoDB Streams with AWS Lambda, the organization was
able to process the incoming data streams in real time. Lambda functions were written to
perform specific business logic, such as aggregations, transformations, and analytics, on the
streaming data (a handler sketch follows this list).
4. AWS Glue and Amazon Redshift: For further analysis and reporting, the processed data
was periodically loaded into the Amazon Redshift data warehouse using AWS Glue, allowing
the organization to run complex queries and generate insights.
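A minimal sketch of such a stream-triggered Lambda function; the event shape is the standard DynamoDB Streams record format, while the OrderTotal attribute and the aggregation logic are invented placeholders:

```python
# Hypothetical AWS Lambda handler subscribed to a DynamoDB stream.
def lambda_handler(event, context):
    total_new_items = 0
    for record in event["Records"]:
        # Each record describes one item-level change (INSERT/MODIFY/REMOVE).
        if record["eventName"] == "INSERT":
            new_image = record["dynamodb"]["NewImage"]
            # Attribute values arrive in DynamoDB's typed JSON, e.g. {"N": "42"}.
            amount = float(new_image.get("OrderTotal", {}).get("N", "0"))
            total_new_items += 1
            # ... placeholder: aggregate/transform and forward downstream ...
    print(f"processed {total_new_items} new items")
    return {"processed": total_new_items}
```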
Results:
The implementation of DynamoDB and associated AWS services resulted in several benefits
for the organization:
1. Scalability: DynamoDB's auto-scaling capability ensured that the database could handle
high data ingestion rates during peak traffic, eliminating performance bottlenecks and
ensuring a seamless user experience.
2. Low Latency: With DynamoDB's single-digit millisecond latency, the organization achieved
real-time data processing and fast query responses. This enabled timely decision-making
and improved customer experience.
3. High Availability: DynamoDB's fully managed, replicated architecture provided the fault
tolerance and uptime the business required.
4. Cost Efficiency: The pay-per-request pricing model of DynamoDB allowed the organization
to optimize costs, paying only for the resources utilized during peak traffic periods. This
eliminated the need for upfront investments in provisioning capacity.
Lessons learned:
1. Scalable and managed database services like DynamoDB can effectively handle large
volumes of real-time data, eliminating the need for manual scaling and resource
provisioning.
2. The integration of event-driven services like DynamoDB Streams and AWS Lambda
enables real-time processing of streaming data, allowing organizations to extract immediate
value from their data.
4. By combining multiple AWS services like DynamoDB, Glue, and Redshift, organizations can
build end-to-end data processing and analytics pipelines that drive actionable insights.
In conclusion, the adoption of AWS DynamoDB and associated services empowered the
e-commerce organization to efficiently handle real-time data processing, resulting in improved
scalability, low latency, high availability, and cost efficiency. The lessons learned from this
project can be applied to similar scenarios where real-time data processing and analysis are
crucial for business success.