
AWS Solutions Architect Professional Week-4

Topics to be covered:

Week 4 | Broader Topic: AWS Database Services | Tools: Amazon DynamoDB

1. Introduction to Data
   1.1 Types of Data
2. What is Amazon DynamoDB?
   2.1 Core components of Amazon DynamoDB
   2.2 Primary key
   2.3 DynamoDB Streams
   2.4 DynamoDB control plane
   2.5 Classic APIs
   2.6 Read consistency
   2.7 Table classes
   2.8 Partitions and data distribution
   2.9 Data distribution
3. Lab on DynamoDB
4. Case Study: Real-Time Data Processing with AWS DynamoDB


1. Introduction to Data:
Data is a collection of discrete pieces of information. It can take many forms, including
text, numbers, media, and raw bytes, and it can be stored on paper, in electronic memory,
and so on.
The term "data" comes from the Latin word "datum," meaning "a single piece of
information"; data is the plural of datum.
In computing, data is information that can be translated into a form suitable for rapid
transfer and processing, and it can be exchanged between programs and machines.

1.1 Types of Data:


Three forms of data—structured, unstructured, and semi-structured—can be found in a
variety of settings, including computer science, information management, and data analysis.

● Structured data: This type of data is highly organized and follows a defined format.
Structured data is typically stored in tables with rows and columns, each
representing a specific data type or attribute. Examples of structured data include
financial data, customer information in a database, and inventory lists. Structured
data is easy to analyze using algorithms and queries since it can be easily processed,
searched, and organized.
● Unstructured data: This type of data has no predefined format or organization.
Unstructured data can come in various forms, such as text, images, audio, or video,
and is usually stored in free-form documents. Examples of unstructured data include
emails, social media posts, and web pages. Unstructured data is difficult to analyze
using traditional algorithms since it requires natural language processing and
machine learning techniques to extract useful information.
● Semi-structured data: This type of data has some structure but also includes
unstructured elements. Semi-structured data can be represented in various formats,
such as XML or JSON, and includes tags or labels that provide some level of
organization. Examples of semi-structured data include email headers, sensor data,
and log files. Semi-structured data can be analyzed using both traditional algorithms
and machine learning techniques.

2. What is Amazon DynamoDB?


Amazon DynamoDB is a fully managed NoSQL database service provided by Amazon Web
Services (AWS). DynamoDB is designed to be highly scalable, with the ability to handle
petabytes of data and millions of requests per second.

2
AWS Solutions Architect Professional Week-4

DynamoDB is a key-value and document database, meaning that it stores and retrieves data
as items composed of attributes rather than as rows and columns in rigid tables. It
supports both structured and semi-structured data, including JSON-style documents.
DynamoDB also provides fast, consistent performance, with the ability to scale up or
down automatically in response to changing application demands. It includes built-in
security features, such as encryption at rest and in transit, and integrates with other
AWS services like Lambda, CloudWatch, and IAM.
DynamoDB offers on-demand backups, which let you take complete backups of your tables
for long-term storage and archiving to meet regulatory compliance requirements.
With Time to Live (TTL), DynamoDB can also automatically remove expired items from
tables, helping you save space and money by not storing obsolete data.

2.1 Core components of Amazon DynamoDB:


Tables, items, and attributes are the core building blocks you work with in DynamoDB.

● Tables: In Amazon DynamoDB, tables are the primary data storage containers that
store and organize items. Tables in DynamoDB are similar to tables in traditional
relational databases, but with some key differences.
Here are some key features of DynamoDB tables:
I. Table structure: DynamoDB tables consist of a primary key and an
optional secondary index. The primary key is used to uniquely identify
each item in the table. It can be either a single attribute (partition key)
or a combination of two attributes (partition key and sort key).
Secondary indexes can be created to allow fast lookups based on
other attributes.
II. Capacity: DynamoDB tables can be provisioned with read and write
capacity units, which determine the maximum number of reads and
writes per second that the table can support.
III. Data types: DynamoDB supports various data types, including strings,
numbers, binary data, Booleans, sets, and document types (lists and
maps). Only the key attributes' data types are defined in the table
schema; all other attributes are schemaless.
IV. Consistency: DynamoDB provides two consistency models: eventually
consistent and strongly consistent. In the eventually consistent model,
reads may not reflect the latest write but will eventually catch up.
In the strongly consistent model, reads always reflect the latest write.
V. Access control: DynamoDB provides fine-grained access control using
AWS Identity and Access Management (IAM). You can create policies
to control who can perform operations on tables and items, and what
actions they can perform.

3
AWS Solutions Architect Professional Week-4

● Items: Each table contains zero or more items. An item is a group of attributes
that is uniquely identifiable among all the other items in the table.
● Attributes: Each item is composed of one or more attributes. An attribute is a
fundamental data element that does not need to be broken down any further.

The following diagram shows a table named Department with some example items
and attributes.

Department
{
  "DepartmentID": 101,
  "Name": "Production",
  "Manager": "John"
}
{
  "DepartmentID": 102,
  "Name": "Marketing",
  "Manager": "Smith"
}

Note the following about the Department table:

● Every item in the table has a unique identifier, the primary key, that distinguishes
it from every other item.
● Other than the primary key, the Department table is schemaless: neither the
attributes nor their data types need to be defined in advance.
● Most of the attributes are scalar, meaning they hold exactly one value; strings and
numbers are the most common scalar types.

2.2 Primary key:


When you create a table, you must specify the table name and the table's primary key.
The primary key uniquely identifies each item in the table, so no two items can have
the same key.
Two types of primary keys are supported by DynamoDB:
I. Partition key - A simple primary key, composed of a single attribute: the
partition key.
DynamoDB uses the partition key's value as input to an internal hash
function. The output of the hash function determines the partition (the
physical storage internal to DynamoDB) in which the item is stored.

4
AWS Solutions Architect Professional Week-4

No two entries in a table with just a partition key can have the same partition
key value.
II. Partition key and sort key - Also known as a composite primary key, this
kind of key is composed of two attributes. The first attribute is the
partition key, and the second is the sort key. DynamoDB uses the partition
key value as input to an internal hash function; the output determines the
partition (physical storage internal to DynamoDB) in which the item is
stored. All items with the same partition key value are stored together,
in sorted order by sort key value.
In a table with a partition key and a sort key, multiple items can have
the same partition key value.

Secondary indexes:
You can create one or more secondary indexes on a table. A secondary index lets you
query the data using an alternate key, in addition to queries against the primary key.
Indexes are optional in DynamoDB, but they give your applications more flexibility when
accessing your data.
After you create a secondary index on a table, reading data from the index is much like
reading data from the table itself. DynamoDB supports two kinds of indexes:
Global secondary index - An index with a partition key and sort key that can differ
from those on the table.
Local secondary index - An index with the same partition key as the table but a
different sort key.
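
To make this concrete, here is a minimal sketch in Python (boto3) that creates a table
with a composite primary key and a global secondary index. The table and attribute names
are hypothetical, not taken from this document:

    import boto3

    dynamodb = boto3.client("dynamodb")

    # Composite primary key: "OrderID" is the partition key, "OrderDate" the
    # sort key. "CustomerIndex" is a global secondary index keyed on a
    # different attribute.
    dynamodb.create_table(
        TableName="Orders",
        AttributeDefinitions=[
            {"AttributeName": "OrderID", "AttributeType": "S"},
            {"AttributeName": "OrderDate", "AttributeType": "S"},
            {"AttributeName": "CustomerID", "AttributeType": "S"},
        ],
        KeySchema=[
            {"AttributeName": "OrderID", "KeyType": "HASH"},     # partition key
            {"AttributeName": "OrderDate", "KeyType": "RANGE"},  # sort key
        ],
        GlobalSecondaryIndexes=[
            {
                "IndexName": "CustomerIndex",
                "KeySchema": [{"AttributeName": "CustomerID", "KeyType": "HASH"}],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        BillingMode="PAY_PER_REQUEST",  # on-demand, so no throughput settings needed
    )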

2.3 DynamoDB Streams:


DynamoDB Streams is an optional feature that captures data modification events in
DynamoDB tables. The data about these events appears in the stream in near-real time,
in the chronological order in which the events occurred.
Each event is represented by a stream record. If you enable a stream on a table,
DynamoDB Streams writes a stream record whenever one of the following occurs:
• A new item is added to the table: the stream captures an image of the entire item,
including all of its attributes.
• An item is updated: the stream captures the "before" and "after" images of any
attributes that were modified in the item.
• An item is deleted from the table: the stream captures an image of the entire item
before it was deleted.
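
If you script this rather than using the console, a stream can be enabled on an
existing table with an UpdateTable call. A minimal boto3 sketch (the table name is
hypothetical) that requests both "before" and "after" images:

    import boto3

    dynamodb = boto3.client("dynamodb")

    # Enable DynamoDB Streams on an existing table; NEW_AND_OLD_IMAGES captures
    # both the "before" and "after" images of each modified item.
    dynamodb.update_table(
        TableName="Orders",
        StreamSpecification={
            "StreamEnabled": True,
            "StreamViewType": "NEW_AND_OLD_IMAGES",
        },
    )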

2.4 DynamoDB Control plane:


Control plane operations let you create and manage DynamoDB tables. They also let you
work with objects that depend on tables, such as indexes and streams.

● CreateTable – Creates a new table. Optionally, you can create one or more
secondary indexes and enable DynamoDB Streams for the table.
● DescribeTable – Returns information about a table, such as its primary key schema,
throughput settings, and index information.
● ListTables – Returns the names of all of your tables in a list.
● UpdateTable – Modifies the settings of a table or its indexes, creates or removes
new indexes on a table, or modifies DynamoDB Streams settings for a table.
● DeleteTable – Removes a table and all of its dependent objects from DynamoDB.
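
A short boto3 sketch of two of these control plane calls, assuming a table named
Department (as created in the lab later in this document) already exists:

    import boto3

    dynamodb = boto3.client("dynamodb")

    # ListTables returns the names of your tables (paginated, up to 100 per call).
    print(dynamodb.list_tables()["TableNames"])

    # DescribeTable returns the key schema, throughput settings, and index details.
    desc = dynamodb.describe_table(TableName="Department")["Table"]
    print(desc["KeySchema"], desc["TableStatus"])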

2.5 Classic APIs:


Creating data -
PutItem - Writes a single item to a table. You must specify the primary key
attributes, but the other attributes are optional.
BatchWriteItem - Writes up to 25 items to one or more tables. This is more efficient
than calling PutItem repeatedly because your application needs only a single network
round trip to write the items. You can also use BatchWriteItem to delete multiple
items from one or more tables.
Reading data -
GetItem - Retrieves a single item from a table. You must specify the primary key of
the item you want. You can retrieve the entire item or just a subset of its
attributes.
BatchGetItem - Retrieves up to 100 items from one or more tables. This is more
efficient than calling GetItem repeatedly because your application needs only a
single network round trip to read the items.
Query - Retrieves all items that have a specific partition key. You must specify the
partition key value. You can retrieve entire items or just a subset of their
attributes.
Scan - Retrieves all items in the specified table or index. You can retrieve entire
items or just a subset of their attributes.
Updating data -
UpdateItem - Modifies one or more attributes of an item. You must specify the primary
key of the item you want to modify. You can add new attributes and modify or remove
existing ones.
Deleting data -
DeleteItem - Deletes a single item from a table. You must specify the primary key of
the item you want to delete.
BatchWriteItem - Deletes up to 25 items from one or more tables. This is more
efficient than calling DeleteItem repeatedly because your application needs only a
single network round trip to delete the items.
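
A minimal boto3 sketch of these data plane operations, assuming a Department table
keyed on DepartmentID alone, as in the example in section 2.1:

    import boto3
    from boto3.dynamodb.conditions import Key

    table = boto3.resource("dynamodb").Table("Department")

    # PutItem: the primary key attributes are required; all others are optional.
    table.put_item(Item={"DepartmentID": 101, "Name": "Production", "Manager": "John"})

    # GetItem: retrieve one item by its full primary key.
    item = table.get_item(Key={"DepartmentID": 101}).get("Item")

    # Query: retrieve all items that share a partition key value.
    resp = table.query(KeyConditionExpression=Key("DepartmentID").eq(101))

    # Scan: retrieve every item in the table (expensive on large tables).
    all_items = table.scan()["Items"]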

2.6 Read consistency


Each AWS Region consists of multiple distinct locations called Availability Zones.
Each Availability Zone is isolated from failures in the other Availability Zones while
offering inexpensive, low-latency network connectivity to the other Availability Zones
in the same Region. This enables rapid replication of your data across multiple
Availability Zones in a Region.
When your application receives an HTTP 200 (OK) response to a write, the data has been
written successfully and durably to DynamoDB. The data is usually consistent across
all storage locations within a second or so.
DynamoDB supports eventually consistent and strongly consistent reads.
Eventually Consistent Reads- In Amazon DynamoDB, eventually consistent reads are a type
of read operation that may not immediately reflect the latest updates to the data.
When you perform an eventually consistent read operation in DynamoDB, you may get a
response that reflects a stale view of the data, where the changes made by recent write
operations may not be immediately visible. This is because DynamoDB uses multiple copies
of data across multiple servers, and it may take time for all the copies to be updated with
the latest changes.
Strongly Consistent Reads- In Amazon DynamoDB, strongly consistent reads are a type of
read operation that provides immediate access to the latest updates to the data. When you
perform a strongly consistent read operation in DynamoDB, you are guaranteed to receive
the most recent data changes, including updates made by recently completed write
operations. DynamoDB achieves strong consistency by ensuring that all copies of the data
are updated before responding to read requests.
Strongly consistent reads are resource-intensive and more costly than eventually consistent
reads, as they require more capacity units to complete. However, they are suitable for use
cases where you need to have immediate access to the latest data, such as in real-time
applications. To perform a strongly consistent read in DynamoDB, you can specify the
"ConsistentRead" parameter as "true" when making a read request.

Read/write capacity mode:

7
AWS Solutions Architect Professional Week-4

Amazon DynamoDB has two read/write capacity modes for processing reads and writes on
your tables:

● On-demand

● Provisioned (default, free-tier eligible)

On-demand: Amazon DynamoDB on-demand is a flexible billing option capable of serving
thousands of requests per second without capacity planning. With on-demand, DynamoDB
offers pay-per-request pricing for read and write requests, so you pay only for what
you actually use.
An on-demand mode is a good option if any of the following occurs:
I. You create new tables with unknown workloads.
II. You have unpredictable application traffic.
III. You prefer the ease of paying for only what you use.
On-demand read request units and write request units:
With on-demand mode tables, you don't need to specify how much read and write
throughput your application will require. DynamoDB charges you in read request units
and write request units for the reads and writes your application performs on your
tables.
• A strongly consistent read request of an item up to 4 KB requires one read request
unit.
• An eventually consistent read request of an item up to 4 KB requires one-half read
request unit.
• A transactional read request of an item up to 4 KB requires two read request units.
One write request unit represents one write for an item up to 1 KB in size. Writes
larger than 1 KB consume additional write request units. Transactional write requests
require two write request units per write for items up to 1 KB. The total number of
write request units scales with item size: for example, a 2 KB item requires two write
request units for a standard write and four write request units for a transactional
write request.
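
The arithmetic above can be expressed as a small helper, shown here as a sketch only
(billing is computed by DynamoDB itself, not by your code):

    import math

    def write_request_units(item_size_kb: float, transactional: bool = False) -> int:
        """Write request units consumed by one on-demand write: one unit per
        1 KB of item size (rounded up), doubled for transactional writes."""
        units = math.ceil(item_size_kb / 1.0)
        return units * 2 if transactional else units

    # The 2 KB example from the text: 2 units for a standard write,
    # 4 units for a transactional write.
    print(write_request_units(2.0))                      # 2
    print(write_request_units(2.0, transactional=True))  # 4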

Provisioned: If you select provisioned mode, you specify how many reads and writes per
second your application requires. You can use auto scaling to adjust the table's
provisioned capacity automatically in response to changes in traffic. This helps you
keep your DynamoDB usage at or below a defined request rate to get cost predictability.
Provisioned mode is a good option if any of the following is true:
I. You run applications with predictable traffic that remains constant or ramps up
gradually.

8
AWS Solutions Architect Professional Week-4

II. You can forecast your capacity requirements and want to control costs.


Provisioned read capacity units and write capacity units:
• One read capacity unit represents one strongly consistent read per second, or two
eventually consistent reads per second, for an item up to 4 KB in size.
• Transactional read requests require two read capacity units to perform one read per
second for items up to 4 KB.
One write capacity unit represents one write per second for an item up to 1 KB in
size. Writes larger than 1 KB consume additional write capacity units. Transactional
write requests require two write capacity units to perform one write per second for
items up to 1 KB. The total number of write capacity units needed depends on the item
size.
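
A similar sketch for provisioned read capacity; again this is illustrative arithmetic
under the rules above, not an official formula:

    import math

    def read_capacity_units(item_size_kb: float, reads_per_second: int,
                            strongly_consistent: bool = True) -> float:
        """Provisioned read capacity: one RCU per strongly consistent read per
        second of an item up to 4 KB; eventually consistent reads cost half."""
        units_per_read = math.ceil(item_size_kb / 4.0)
        if not strongly_consistent:
            units_per_read /= 2
        return units_per_read * reads_per_second

    # E.g. 80 eventually consistent reads/second of 4 KB items needs 40 RCUs.
    print(read_capacity_units(4.0, 80, strongly_consistent=False))  # 40.0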

2.7 Table classes:


Every Amazon DynamoDB table is associated with a table class. DynamoDB is a NoSQL
database service, which means you can store and retrieve data in flexible, schemaless
tables; the table class lets you choose the cost profile that suits the data.
DynamoDB offers two table classes designed to help you optimize for cost:
• DynamoDB Standard (the default)
• DynamoDB Standard-Infrequent Access (Standard-IA)
Example: Tables that contain rarely accessed data, such as old social media posts,
order histories for online stores, and past gaming achievements, are good candidates
for the Standard-IA table class.

2.8 Partitions and data distribution:


Partitioning is how Amazon DynamoDB stores data. A partition is a storage allocation for a
table that is supported by SSDs and automatically replicated across several Availability
Zones inside of an AWS Region. DynamoDB manages all aspects of partition management;
you never have to do it yourself.
When a table is created, its initial state is CREATING. DynamoDB allocates enough partitions
to the table at this phase for it to meet your specified throughput needs. After the table
state switches to ACTIVE, you may start writing and reading table data.
DynamoDB allocates additional partitions to a table in the following situations: if
you increase the table's provisioned throughput settings beyond what the existing
partitions can support, or if an existing partition fills to capacity and more storage
space is required.

9
AWS Solutions Architect Professional Week-4

2.9 Data distribution


Data distribution: Partition key
DynamoDB stores and retrieves each item based on its partition key value if your table has a
simple primary key (partition key only).
DynamoDB utilizes the value of the partition key as input to an internal hash function to
write an item to the table. The partition in which the item will be kept is determined by the
hash function's output value.
You must provide the partition key value for the item in order to read it from the table. This
value is sent into DynamoDB's hash function, which returns the partition where the item
may be located.
Data distribution: Partition key and sort key
If the table has a composite primary key (partition key and sort key), DynamoDB still
calculates the hash value of the partition key, but it physically stores all of the
items with the same partition key value close together, ordered by sort key value. To
write an item to the table, DynamoDB computes the hash value of the partition key to
determine which partition should hold the item. Multiple items in that partition may
share the same partition key value, so DynamoDB stores the item among the other items
with the same partition key, in ascending order by sort key.
To read an item from the table, you must specify its partition key value and sort key
value. DynamoDB computes the hash value of the partition key to determine the
partition in which the item can be found.
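
As an illustration, a Query against such a composite-key table retrieves one partition
key's items in sort key order. The Orders table and its attributes below are
hypothetical:

    import boto3
    from boto3.dynamodb.conditions import Key

    # Hypothetical table keyed on (CustomerID, OrderDate): all orders for one
    # customer live in the same partition, sorted by OrderDate.
    table = boto3.resource("dynamodb").Table("Orders")

    resp = table.query(
        KeyConditionExpression=(
            Key("CustomerID").eq("C-100") & Key("OrderDate").begins_with("2023-")
        ),
        ScanIndexForward=True,  # return items in ascending sort key order
    )
    print(resp["Items"])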

3. Lab on DynamoDB
Step 1: Create a table
To create a new table using the DynamoDB console:
• In the navigation pane on the left side of the console, choose Dashboard.
• On the right side of the console, choose Create table.


• For the table name, enter Department.
• For the partition key, enter Department_ID.
• For the sort key, enter Department_Name.
• Leave Default settings selected.
• Choose Create table.
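
If you prefer scripting this step, a rough boto3 equivalent is shown below. It assumes
String key types (the console default) and fixed provisioned throughput rather than
the console's default settings:

    import boto3

    dynamodb = boto3.client("dynamodb")

    # Scripted equivalent of the console steps in Step 1.
    dynamodb.create_table(
        TableName="Department",
        AttributeDefinitions=[
            {"AttributeName": "Department_ID", "AttributeType": "S"},
            {"AttributeName": "Department_Name", "AttributeType": "S"},
        ],
        KeySchema=[
            {"AttributeName": "Department_ID", "KeyType": "HASH"},    # partition key
            {"AttributeName": "Department_Name", "KeyType": "RANGE"}, # sort key
        ],
        ProvisionedThroughput={"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    )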


• In the table list, choose the Department table to view its details.


Step 2: Write data to a table using the console
• Select Explore table items.
• In the Items view, choose Create item.
• Fill in values for the attributes you have created.


• To add another attribute, choose Add new attribute and enter Number_of_Employee.
• Repeat this process to create more attributes.
• Choose Create item.
• Repeat this process and create another item with the following values:
Department_ID: 1002
Department_Name: Marketing
Number_of_Employee: 70
• Do this one more time to create another item:
Department_ID: 1003
Department_Name: Sales
Number_of_Employee: 101
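
The same two items can be written programmatically. This sketch assumes the key
attributes were created as String types in Step 1:

    import boto3

    table = boto3.resource("dynamodb").Table("Department")

    # The same items created through the console, written with PutItem.
    table.put_item(Item={"Department_ID": "1002",
                         "Department_Name": "Marketing",
                         "Number_of_Employee": 70})
    table.put_item(Item={"Department_ID": "1003",
                         "Department_Name": "Sales",
                         "Number_of_Employee": 101})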


Step 3: Read data from a table


• In the navigation pane on the left side of the console, choose Tables.
• Choose the Department table from the table list.
• Select Explore table items.
• On the Items tab, view the list of items stored in the table, sorted by
Department_ID and Department_Name.
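
Programmatically, "Explore table items" corresponds roughly to a Scan:

    import boto3

    table = boto3.resource("dynamodb").Table("Department")

    # Scan returns every item in the table.
    for item in table.scan()["Items"]:
        print(item["Department_ID"], item["Department_Name"])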


Step 4: Update data in a table
In this step, you update an item that you created in previous steps.
• Choose View items.
• Choose the item whose Department_ID value is 1002 and Department_Name value is
Marketing.
• Update the Number_of_Employee value, and then choose Save.

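
The equivalent UpdateItem call in boto3 might look like the following; the new
employee count of 75 is a hypothetical value:

    import boto3

    table = boto3.resource("dynamodb").Table("Department")

    # Update Number_of_Employee on the Marketing item; the full primary key
    # (partition key and sort key) identifies the item.
    table.update_item(
        Key={"Department_ID": "1002", "Department_Name": "Marketing"},
        UpdateExpression="SET Number_of_Employee = :n",
        ExpressionAttributeValues={":n": 75},  # hypothetical new value
    )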


Step 5: Query data in a table


• In the navigation pane on the left side of the console, choose Tables.
• Choose the Department table from the table list.
• Choose View items.
• Choose Query
• For the Partition key, enter 1002, and then choose Run
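
The same query in boto3, assuming the String key type from Step 1:

    import boto3
    from boto3.dynamodb.conditions import Key

    table = boto3.resource("dynamodb").Table("Department")

    # Query by partition key, as in the console's Query pane.
    resp = table.query(KeyConditionExpression=Key("Department_ID").eq("1002"))
    print(resp["Items"])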


Step 6: Create a global secondary index
In this step, you create a global secondary index for the Department table.
• In the navigation pane on the left side of the console, choose Tables.
• Choose the Department table from the table list.
• Choose the Indexes tab for the Department table.
• Choose Create index.


• For the Partition key, enter Department_ID.
• For the Index name, enter Department_ID-index.
• Leave the other settings at their default values and choose Create index.
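
Scripted, adding a global secondary index to an existing table uses UpdateTable. A
sketch, where the provisioned throughput values are placeholders:

    import boto3

    dynamodb = boto3.client("dynamodb")

    # Any attribute used as an index key must appear in AttributeDefinitions.
    dynamodb.update_table(
        TableName="Department",
        AttributeDefinitions=[
            {"AttributeName": "Department_ID", "AttributeType": "S"},
        ],
        GlobalSecondaryIndexUpdates=[{
            "Create": {
                "IndexName": "Department_ID-index",
                "KeySchema": [{"AttributeName": "Department_ID", "KeyType": "HASH"}],
                "Projection": {"ProjectionType": "ALL"},
                "ProvisionedThroughput": {"ReadCapacityUnits": 5,
                                          "WriteCapacityUnits": 5},
            }
        }],
    )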

Step 7: Query the global secondary index
In this step, you query the global secondary index on the Department table using the
Amazon DynamoDB console.
• Choose the Department table from the table list.
• Choose View items.
• Choose Query.
• For the Partition key, enter a Department_ID value, and then choose Run.


Step 8: Create a second global secondary index
In this step, you create another global secondary index for the Department table.
• In the navigation pane on the left side of the console, choose Tables.
• Choose the Department table from the table list.
• Choose the Indexes tab for the Department table.
• Choose Create index.

Global secondary indexes allow you to perform queries on attributes that are not part of a
table's primary key.

• For the Partition key, enter Number_of_Employee.
• For the Index name, enter Number_of_Employee-index.
• Leave the other settings at their default values and choose Create index.


Step 9: Query the second global secondary index
In this step, you query the Number_of_Employee-index global secondary index on the
Department table using the Amazon DynamoDB console.
• In the navigation pane on the left side of the console, choose Tables.
• Choose the Department table from the table list.
• Select View items.
• Choose Query.
• In the drop-down list under Query, choose Number_of_Employee-index.
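
In boto3, querying the index rather than the base table means passing IndexName. This
sketch assumes Number_of_Employee was stored as a number:

    import boto3
    from boto3.dynamodb.conditions import Key

    table = boto3.resource("dynamodb").Table("Department")

    # Query the secondary index using its own partition key in the condition.
    resp = table.query(
        IndexName="Number_of_Employee-index",
        KeyConditionExpression=Key("Number_of_Employee").eq(70),
    )
    print(resp["Items"])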


Step 10: Clean up resources
If you no longer need the Amazon DynamoDB table that you created, delete it. This step
helps ensure that you aren't charged for resources that you aren't using.
• Choose the Department table from the table list.
• Choose Delete table.
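
The scripted equivalent is a single DeleteTable call:

    import boto3

    # Deleting the table also deletes its items and indexes.
    boto3.resource("dynamodb").Table("Department").delete()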

4. Case Study: Real-Time Data Processing with AWS DynamoDB


Introduction:

The case study project revolves around a global e-commerce organization that faced
challenges in efficiently processing and analyzing large volumes of real-time data.
The organization needed a scalable and reliable database solution to handle the
increasing data demands while ensuring low latency and high availability.

Challenges:

The organization's existing database infrastructure struggled to keep up with the rapidly
growing data ingestion rates and complex query patterns. The challenges included:

1. Scalability: The existing database couldn't scale horizontally to accommodate the


increasing data load, leading to performance bottlenecks during peak traffic.

2. Latency: The requirement for real-time data processing necessitated low-latency access
to the database, enabling fast query responses and real-time analytics.

3. High Availability: Downtime or disruptions in the database service were unacceptable due
to the critical nature of the business. The organization needed a solution that offered high
availability and fault tolerance.

Solution:

To address these challenges, the organization opted for AWS DynamoDB, a fully managed
NoSQL database service that provides seamless scalability, low latency, and high availability.
The solution involved the following AWS services and their implementation:

1. DynamoDB: The primary database was migrated to DynamoDB, taking advantage of its
ability to automatically scale throughput capacity in response to changing workloads. This
eliminated the need for manual provisioning and ensured that the database could handle
the increasing data ingestion rates.

2. DynamoDB Streams: To enable real-time data processing, DynamoDB Streams were


leveraged. Streams capture a time-ordered sequence of item-level modifications, allowing
for real-time updates and triggering of downstream processes or services.

3. AWS Lambda: By integrating DynamoDB Streams with AWS Lambda, the organization was
able to process the incoming data streams in real time. Lambda functions were written to
perform specific business logic, such as aggregations, transformations, and analytics, on the
streaming data.
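
A minimal sketch of such a Lambda handler is shown below. The event shape follows the
standard DynamoDB Streams record format; the business logic itself is left as a
placeholder, since the case study does not include the organization's code:

    # Lambda handler subscribed to a DynamoDB stream via an event source mapping.
    def lambda_handler(event, context):
        for record in event["Records"]:
            if record["eventName"] in ("INSERT", "MODIFY"):
                # NewImage holds the item's "after" image in DynamoDB JSON form.
                new_image = record["dynamodb"].get("NewImage", {})
                # ... apply aggregations/transformations to new_image here ...
                print(record["eventName"], new_image)
        return {"processed": len(event["Records"])}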

4. AWS Glue and Amazon Redshift: For further analysis and reporting, the processed data
was periodically loaded into the Amazon Redshift data warehouse using AWS Glue, allowing
the organization to run complex queries and generate insights.


Results:

The implementation of DynamoDB and associated AWS services resulted in several benefits
for the organization:

1. Scalability: DynamoDB's auto-scaling capability ensured that the database could handle
high data ingestion rates during peak traffic, eliminating performance bottlenecks and
ensuring a seamless user experience.

2. Low Latency: With DynamoDB's single-digit millisecond latency, the organization achieved
real-time data processing and fast query responses. This enabled timely decision-making
and improved customer experience.

3. High Availability: DynamoDB's built-in replication and fault tolerance mechanisms


provided high availability, eliminating downtime and ensuring business continuity even in
the face of hardware failures or infrastructure disruptions.

4. Cost Efficiency: The pay-per-request pricing model of DynamoDB allowed the organization
to optimize costs, paying only for the resources utilized during peak traffic periods. This
eliminated the need for upfront investments in provisioning capacity.

Learnings and Conclusion:

This case study highlights several important lessons:

1. Scalable and managed database services like DynamoDB can effectively handle large
volumes of real-time data, eliminating the need for manual scaling and resource
provisioning.

2. The integration of event-driven services like DynamoDB Streams and AWS Lambda
enables real-time processing of streaming data, allowing organizations to extract immediate
value from their data.

3. Leveraging serverless technologies, such as Lambda, reduces operational overhead and


simplifies the development and deployment of real-time data processing pipelines.

4. By combining multiple AWS services like DynamoDB, Glue, and Redshift, organizations can
build end-to-end data processing and analytics pipelines that drive actionable insights.

In conclusion, the adoption of AWS DynamoDB and associated services empowered the e-
commerce organization to efficiently handle real-time data processing, resulting in improved
scalability, low latency, high availability, and cost efficiency. The lessons learned from this
project can be applied to similar scenarios where real-time data processing and analysis are
crucial for business success.
