Database on AWS
Week 5 – Module 5
Hung Nguyen Gia
Senior Solutions Architect
Champion Authorized Instructor
© 2023, Amazon Web Services, Inc. or its Affiliates.
Table of contents
• Relational Databases on AWS
• Database Migration
• Purpose-Built Databases - DynamoDB
• Data Lake Introduction
• Kahoot Game
• Labs
Amazon RDS
Amazon Relational Database Service (Amazon RDS)
Managed relational database service with a choice of six popular database engines
[Engine logos; the one whose label survives is Microsoft SQL Server]
• Easy to administer: No need for infrastructure provisioning or installing and maintaining database software
• Available and durable: Automatic Multi-AZ data replication; automated backup, snapshots, and failover
• Highly scalable: Scale database compute and storage with a few clicks with no application downtime
• Fast and secure: SSD storage and guaranteed provisioned I/O; data encryption at rest and in transit
Amazon RDS - fully managed
Spend time innovating and building new apps, not managing infrastructure
You:
• Schema design
• Query construction
• Query optimization
AWS:
• Automatic failover
• Backup & recovery
• Isolation & security
• Industry compliance
• Push-button scaling
• Automated patching & upgrades
• Advanced monitoring
• Routine maintenance
Monitoring RDS/Aurora databases
Instance: Amazon CloudWatch
• CPU / Memory / IOPS / Network
• Per-minute metric storage in Amazon CloudWatch
Operating System: Amazon RDS Enhanced Monitoring
• Process / Thread list
• Per-second metric storage in Amazon CloudWatch Logs
Database Engine: Amazon RDS Performance Insights
• SQL / State / User / Host (“Database Load”)
• Per-second metric storage in Amazon RDS
Performance Insights increases productivity
Amazon RDS Performance Insights measures database load over time
Easy to identify database bottlenecks
• Top SQL/most intensive queries
Enables problem discovery
Adjustable timeframe
• Hour, day, week, and longer
Available for all Amazon RDS database engines
Security and compliance
• Network security
  • Amazon Virtual Private Cloud (VPC) security groups act as a virtual firewall to control inbound and outbound traffic
• Resource access permissions
  • AWS Identity and Access Management (IAM) provides resource-level role permission controls
• Data encryption
  • Encryption at rest using AWS KMS or Oracle/Microsoft TDE
  • SSL protection for data in transit
• Compliance and assurance programs for finance, healthcare, government, and more
  • HIPAA eligibility under a Business Associate Agreement (BAA) with AWS
• Active Directory / Kerberos integration
  • RDS for Oracle, SQL Server, PostgreSQL
Multi-AZ deployments
Enterprise-grade high availability
Fault tolerance across multiple data centers
• Automatic failover
• Synchronous replication
• Enabled with one click
[Diagram: application servers, with a primary and a standby placed across Availability Zone A and Availability Zone B; labels: Database failure, New standby]
Read Replicas
Read scaling and disaster recovery
RDS for MySQL, PostgreSQL, MariaDB, and Oracle
• Relieve pressure on your master node with additional read capacity
• Bring data close to your applications in different regions
• Promote a read replica to a master for faster recovery in the event of disaster
[Diagram: application servers send read/write traffic to the primary database server; asynchronous replication feeds a read replica, which serves read-only traffic from a BI/reporting application server]
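The read-scaling idea on this slide can be sketched in application code. The following is a minimal illustration, not an AWS API: the class name, the endpoint hostnames, and the "starts with SELECT" rule are all assumptions made for the example. Because replication is asynchronous, reads sent to a replica may lag slightly behind the primary.

```python
import itertools


class ReplicaAwareRouter:
    """Send writes to the primary and spread reads across read replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        self._reset_cycle()

    def _reset_cycle(self):
        # Round-robin iterator over the current replicas (None if there are none).
        self._next_replica = itertools.cycle(self.replicas) if self.replicas else None

    def endpoint_for(self, sql):
        """Pick an endpoint based on whether the statement only reads data."""
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and self._next_replica is not None:
            return next(self._next_replica)
        return self.primary

    def promote(self, replica):
        """Model promoting a read replica to primary during disaster recovery."""
        self.replicas.remove(replica)
        self.primary = replica
        self._reset_cycle()


# Hypothetical endpoint names, for illustration only.
router = ReplicaAwareRouter(
    "primary.example.internal",
    ["replica-1.example.internal", "replica-2.example.internal"],
)
print(router.endpoint_for("SELECT * FROM orders"))         # replica-1.example.internal
print(router.endpoint_for("SELECT * FROM customers"))      # replica-2.example.internal
print(router.endpoint_for("INSERT INTO orders VALUES (1)"))  # primary.example.internal
```

Reporting or BI workloads are a natural fit for the replica endpoints, since they tolerate slightly stale data.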
Automated backups
Point-in-time recovery for your DB instance
• Scheduled daily volume backup of entire instance
• Archive database change logs
• 35-day maximum retention
• Minimal impact on database performance
• Taken from standby when running Multi-AZ
Every day during your backup window, RDS creates a storage volume snapshot of your instance.
Every five minutes, RDS backs up the transaction logs of your database.
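To make the point-in-time recovery window concrete, here is a small sketch of the arithmetic. It is a simplified model, not RDS behavior in full: it assumes the window opens exactly `retention_days` ago and closes at the most recent transaction-log backup. The function names and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION_DAYS = 35  # maximum retention stated on the slide


def restorable_window(now, retention_days, last_log_backup):
    """Return (earliest, latest) restorable timestamps in this simplified model."""
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 1 and 35")
    earliest = now - timedelta(days=retention_days)
    latest = min(last_log_backup, now)
    return earliest, latest


def can_restore_to(target, now, retention_days, last_log_backup):
    """True if `target` falls inside the restorable window."""
    earliest, latest = restorable_window(now, retention_days, last_log_backup)
    return earliest <= target <= latest


now = datetime(2023, 6, 1, 12, 0, tzinfo=timezone.utc)  # hypothetical clock
last_log = now - timedelta(minutes=3)                   # hypothetical last log backup

print(can_restore_to(now - timedelta(days=6), now, 7, last_log))     # True
print(can_restore_to(now - timedelta(days=8), now, 7, last_log))     # False
print(can_restore_to(now - timedelta(minutes=1), now, 7, last_log))  # False
```

The last call fails because the most recent log backup is three minutes old; with logs backed up every five minutes, the latest restorable time typically trails the current time by a few minutes.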
Database snapshots
Backups of your entire DB instance in Amazon S3
• Always incremental
• Amazon S3 99.999999999% durability
• Supports encryption
• Copy across accounts, across regions
[Diagram: blocks of an Amazon EBS volume backed up incrementally to an Amazon S3 bucket as Snapshot 1, Snapshot 2, and Snapshot 3]
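The incremental behavior can be illustrated with a toy model in which each snapshot stores only the blocks that changed since the previous one. This is a sketch of the idea rather than how RDS or EBS implement it; the block contents reuse the letters from the slide's diagram, the function names are hypothetical, and deleted blocks are not handled.

```python
def restore(snapshots):
    """Rebuild a full volume by layering snapshots from oldest to newest."""
    volume = {}
    for snap in snapshots:
        volume.update(snap)
    return volume


def take_snapshot(volume, previous_snapshots):
    """Return only the blocks that differ from the last restorable state."""
    last_state = restore(previous_snapshots)
    return {
        block_id: data
        for block_id, data in volume.items()
        if last_state.get(block_id) != data
    }


volume = {0: "A", 1: "B", 2: "C"}
snapshots = [take_snapshot(volume, [])]  # the first snapshot holds every block

volume[2] = "C1"  # block 2 is modified
volume[3] = "D"   # a new block is written
snapshots.append(take_snapshot(volume, snapshots))

print(snapshots[1])        # {2: 'C1', 3: 'D'}
print(restore(snapshots))  # {0: 'A', 1: 'B', 2: 'C1', 3: 'D'}
```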
Amazon Aurora
You asked for a cost-effective, enterprise database…
So, we designed Amazon Aurora - enterprise database at open source price, delivered as a managed service
• Speed and availability of high-end commercial databases
• Simplicity and cost-effectiveness of open source databases
• Drop-in compatibility with MySQL and PostgreSQL
• Simple pay as you go pricing
Amazon Aurora is fast…
Up to 5x the throughput of MySQL; 3x the throughput of PostgreSQL
Traditional Database Architecture
Databases are all about I/O…
Design principles over the last 40+ years:
• Increase I/O bandwidth
• Decrease number of I/Os consumed
[Diagram: a compute node running the SQL, Transactions, Caching, and Logging layers on top of attached storage]
Scale-out, distributed, multi-tenant storage architecture
• Purpose-built log-structured distributed storage
• Storage volume is striped across hundreds of storage nodes
• Storage nodes with locally attached SSDs
• Continuous backup to Amazon S3
[Diagram: cluster storage volume spanning AZ 1, AZ 2, and AZ 3]
Tolerating compute failures
• Any reader node can be promoted to writer/primary
• Failed instances/nodes will be replaced after failover and come online as readers.
[Diagram: a Cluster Endpoint and a Reader Endpoint in front of writer and reader instances (each with SQL, Transactions, and Caching layers) in AZ 1, AZ 2, and AZ 3, all on a shared cluster storage volume]
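The failover behavior described on this slide can be modeled in a few lines. This is a toy simulation under simplifying assumptions: the class names and instance names are hypothetical, and the promotion candidate is simply the first healthy reader, whereas the real service applies its own selection rules.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    az: str
    role: str  # "writer" or "reader"
    healthy: bool = True


@dataclass
class Cluster:
    instances: list = field(default_factory=list)

    def writer(self):
        return next(i for i in self.instances if i.role == "writer")

    def readers(self):
        return [i for i in self.instances if i.role == "reader" and i.healthy]

    def fail_over(self):
        """Promote a healthy reader, then bring the failed node back as a reader."""
        failed = self.writer()
        failed.healthy = False
        candidates = self.readers()
        if not candidates:
            raise RuntimeError("no healthy reader available for promotion")
        promoted = candidates[0]
        promoted.role = "writer"
        # The replaced instance rejoins the cluster as a reader.
        failed.role = "reader"
        failed.healthy = True
        return promoted


cluster = Cluster([
    Instance("db-1", "AZ 1", "writer"),
    Instance("db-2", "AZ 2", "reader"),
    Instance("db-3", "AZ 3", "reader"),
])
new_writer = cluster.fail_over()
print(new_writer.name)                      # db-2
print([i.name for i in cluster.readers()])  # ['db-1', 'db-3']
```

Because every instance works against the same shared storage volume, promotion in this model only changes roles; no data is copied.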
Database Migration
Overview
• Simple to use
• Reliable
• Supports widely used databases
• Low cost
• Fast and easy to set up
Supports widely used databases
Sources*        Targets**
Oracle          Oracle
SQL Server      SQL Server
Azure SQL       PostgreSQL
PostgreSQL      MySQL
MySQL           Amazon Redshift
SAP ASE         SAP ASE
MongoDB         Amazon S3
Amazon S3       Amazon DynamoDB
IBM DB2         Amazon Kinesis
                Amazon Elasticsearch Service
[Diagram label: On-premises database]
* https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.html
** https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.html
Fast and easy to set up
Set up a migration task in minutes
• Connect to the source database
• Connect to the target database
• Create a replication instance to run the migration
• Create a task
• Run the task
You can use different tasks with different settings for different environments
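As a rough illustration of "different tasks with different settings for different environments", the sketch below builds the table-mapping document that a DMS task takes (a JSON string with selection rules) and pairs it with a migration type per environment. The helper names, the `sales` schema, and the rule that `dev` uses a full load while other environments also replicate ongoing changes are assumptions for the example; nothing here calls AWS.

```python
import json


def selection_rule(rule_id, schema, table="%", action="include"):
    """Build one selection rule in the DMS table-mapping JSON format."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": str(rule_id),
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": action,
    }


def task_definition(environment, schema):
    """Collect per-environment task settings (hypothetical helper)."""
    # Assumption: dev only needs a one-off copy; other environments also
    # replicate ongoing changes after the full load.
    migration_type = "full-load" if environment == "dev" else "full-load-and-cdc"
    return {
        "MigrationType": migration_type,
        "TableMappings": json.dumps({"rules": [selection_rule(1, schema)]}),
    }


for env in ("dev", "prod"):
    definition = task_definition(env, "sales")  # "sales" is a made-up schema name
    print(env, definition["MigrationType"])
    print(definition["TableMappings"])
```

The resulting dictionary holds the kind of values a task needs alongside the source endpoint, target endpoint, and replication instance created in the earlier steps.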
AWS Schema Conversion Tool (SCT)
The AWS Schema Conversion Tool helps automate many database schema and code conversion tasks when migrating to a new database engine.
Features
• Schema conversion between database engines
• Database Migration Assessment report for choosing the best target engine
• Code browser that highlights places where manual edits are required
AWS Purpose-Built Databases
The best tool for a job usually differs by use case
Build new applications with purpose-built databases
Purpose-built databases
• Relational: Amazon Aurora, Amazon RDS
• Key-Value: Amazon DynamoDB
• Document: Amazon DocumentDB
• Wide Column: Amazon Keyspaces
• In-memory: Amazon ElastiCache
• Graph: Amazon Neptune
• Time Series: Amazon Timestream
• Ledger: Amazon QLDB
Amazon DynamoDB
The Amazon NoSQL journey
• Dec 2004: Database scalability challenges
• Oct 2007: Dynamo paper published
• Jan 2012: DynamoDB general availability
• Q3 2016: DynamoDB leader in Gartner MQ, Forrester Wave
• Today: Tier 0 service powering most of Amazon
Retail
Amazon DynamoDB supports multiple high-traffic sites and systems including Alexa, the Amazon.com sites, and 442 Amazon fulfillment centers. Across the 66-hour 2020 Prime Day, these sources made 16.4 trillion calls to the DynamoDB API, peaking at 80.1 million requests per second.
The internal Amazon.com Herd system supports 100s of millions of active workflows. It was migrated from Oracle to DynamoDB:
• Improved customer experience: Workflow processing delays dropped from 1 second to 100 milliseconds.
• Reduced cost: Scaling and maintenance effort dropped 10 times.
• Reduced complexity and risk: Retired more than 300 Oracle hosts.
Performance at any scale
https://aws.amazon.com/blogs/database/amazon-dynamodb-auto-scaling-performance-and-cost-optimization-at-any-scale/
• High request volume: many millions of requests per second per table
• Consistent low latency: millisecond variance
You work with tables…
Table 1, Table 2, Table 3
DynamoDB does the rest under the hood…
[Diagram: table partitions T1.p1 … T1.pn spread across Server 1 … Server N; each partition serves 1K WCU or 3K RCU and stores up to 10 GB]
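The per-partition figures on this slide (1K WCU or 3K RCU, up to 10 GB) allow a back-of-the-envelope estimate of how many partitions a table needs. The formula below is a commonly used rule of thumb, not something DynamoDB exposes or guarantees, and the function name and sample workloads are hypothetical.

```python
import math

MAX_WCU_PER_PARTITION = 1_000  # from the slide
MAX_RCU_PER_PARTITION = 3_000  # from the slide
MAX_GB_PER_PARTITION = 10      # from the slide


def estimate_partitions(rcu, wcu, size_gb):
    """Estimate partitions as the larger of the throughput and size requirements."""
    by_throughput = math.ceil(rcu / MAX_RCU_PER_PARTITION + wcu / MAX_WCU_PER_PARTITION)
    by_size = math.ceil(size_gb / MAX_GB_PER_PARTITION)
    return max(by_throughput, by_size, 1)


print(estimate_partitions(rcu=6_000, wcu=2_000, size_gb=25))  # 4 (throughput-bound)
print(estimate_partitions(rcu=300, wcu=100, size_gb=80))      # 8 (size-bound)
```

The estimate matters for key design: provisioned throughput is shared across partitions, so a key that concentrates traffic on one partition can hit that partition's limit while the table as a whole is underused.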
DynamoDB Table
[Diagram: a table made of items; every item has attribute A1 (partition key) and A2 (sort key), plus a varying set of other attributes such as A3, A4, A5, A6, and A7]
Partition key
• Mandatory
• Key-value access pattern
Sort key
• Optional
• Model 1:N relationships
• Enables rich query capabilities
All items for a partition key:
• ==, <, >, >=, <=
• “begins with”
• “between”
• sorted results
• counts
• top/bottom N values
• paged responses
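The query capabilities listed for the sort key can be illustrated with an in-memory stand-in. This is a teaching sketch only, not the DynamoDB API: `ToyTable`, `begins_with`, `between`, and the sample keys are hypothetical names, and the sort key here is a date string so that lexical order matches chronological order.

```python
class ToyTable:
    """In-memory stand-in for a table with a partition key and a sort key."""

    def __init__(self):
        self._partitions = {}  # partition key -> {sort key: item}

    def put_item(self, pk, sk, **attributes):
        self._partitions.setdefault(pk, {})[sk] = {"pk": pk, "sk": sk, **attributes}

    def query(self, pk, sort_condition=None, limit=None, descending=False):
        """Return items for one partition key, ordered by sort key."""
        rows = self._partitions.get(pk, {})
        keys = sorted(rows, reverse=descending)
        if sort_condition is not None:
            keys = [k for k in keys if sort_condition(k)]
        if limit is not None:
            keys = keys[:limit]
        return [rows[k] for k in keys]


def begins_with(prefix):
    return lambda sk: sk.startswith(prefix)


def between(low, high):
    return lambda sk: low <= sk <= high


orders = ToyTable()
orders.put_item("customer#1", "2023-01-15", total=40)
orders.put_item("customer#1", "2023-02-03", total=15)
orders.put_item("customer#1", "2023-02-20", total=99)
orders.put_item("customer#2", "2023-02-10", total=7)

february = orders.query("customer#1", begins_with("2023-02"))
print([item["sk"] for item in february])  # ['2023-02-03', '2023-02-20']

latest = orders.query("customer#1", limit=1, descending=True)
print(latest[0]["sk"])                    # 2023-02-20
```

One customer with many orders is an example of the 1:N relationship the sort key models: the partition key identifies the parent, and each sort key value identifies one child item.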
Item Distribution
The partition key determines data distribution
[Diagram: items (aggregates) in the Orders DynamoDB table are hashed by partition key onto a keyspace running from Hash.MIN = 0 (00) to Hash.MAX = FF, divided at 55 and AA into Partition A, Partition B, and Partition C]
• OrderId: 1, CountryCode: 1, ASIN: [B00X4WHP5E], Hash(1) = 7B
• OrderId: 2, CountryCode: 1, ASIN: [B00OQVZDJM], Hash(2) = 48
• OrderId: 3, CountryCode: 1, ASIN: [B00U3FPN4U], Hash(3) = CD
Related data (aggregate) is stored together for efficient access
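The mapping from hash value to partition in this diagram can be sketched as a range lookup. The boundaries (55 and AA) and the three hash values come from the slide; treating each boundary as belonging to the upper partition is an assumption, and `toy_hash` is only a stand-in, since DynamoDB's actual hash function is internal to the service.

```python
import hashlib
from bisect import bisect_right

# Keyspace 00..FF split at the boundaries drawn on the slide.
BOUNDARIES = [0x55, 0xAA]
PARTITIONS = ["Partition A", "Partition B", "Partition C"]


def partition_for_hash(hash_value):
    """Map a one-byte hash (0x00 to 0xFF) to the partition owning that range."""
    if not 0x00 <= hash_value <= 0xFF:
        raise ValueError("hash must fit in one byte")
    return PARTITIONS[bisect_right(BOUNDARIES, hash_value)]


def toy_hash(partition_key):
    """Stand-in hash for illustration: the first byte of a SHA-256 digest."""
    return hashlib.sha256(str(partition_key).encode()).digest()[0]


# Hash values as given on the slide for OrderId 1, 2, and 3.
for order_id, hash_value in ((1, 0x7B), (2, 0x48), (3, 0xCD)):
    print(order_id, partition_for_hash(hash_value))
# 1 Partition B
# 2 Partition A
# 3 Partition C
```

Consecutive order IDs land on different partitions, which is the point of hashing: a high-cardinality partition key spreads items and traffic evenly across the keyspace.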
Path of a PutItem request
[Diagram: grids of RR nodes in Availability Zone 1, Availability Zone 2, and Availability Zone 3, connected by the network]
Data Lake Introduction
Companies are increasingly embracing data-driven decision making and fostering an open culture where data is not siloed within departments.
Changing Requirements for Analytics
I want support for:
• Any scale and concurrency, with low cost, high throughput, and high performance
• Data from new sources: streaming, batch, real-time
• Increasingly diverse types of data
• Democratization of data: usage by many people with various skills, easy to run and operate
• Choice of tools, techniques, and applications
Using the Right Tool for the Task
[Architecture diagram, summarized from its labels]
• Data sources: systems of record, systems of engagement, sensor and log data, external data
• Stages: Raw Data, Move Data, Prepared Data, Consumable Data, Insights, Consumption
• Storage, movement, and serving: Amazon S3, Amazon S3 Glacier, Amazon Kinesis, AWS Database Migration Service, AWS Glue, Amazon Redshift, Amazon DynamoDB, Amazon Neptune, Amazon Elasticsearch Service, Amazon ElastiCache, AWS Lambda, Amazon API Gateway
• Data processing, metadata management, machine learning: Amazon Athena, AWS Glue, Amazon EMR, Amazon Transcribe, Amazon Rekognition, Amazon Comprehend, Amazon SageMaker
• Insights: Amazon Athena, Amazon QuickSight, Amazon SageMaker
• Security, identity and compliance; management and governance: AWS KMS, AWS IAM, AWS CloudTrail, Amazon CloudWatch, AWS CloudFormation, AWS Config
• Consumers: data scientists (exploration, integration, predictive models); data experts (ad-hoc reports, create KPIs); business users (dashboarding, use KPIs, slice and dice); downstream systems (data feeds, information hub); insights applications (actionable insights at the point of impact)
Serverless data lakes and analytics
Data sources: web app data, Amazon RDS, other databases, on-premises data, streaming data
Pipeline: Amazon S3 → AWS Glue crawler → AWS Glue Data Catalog → Amazon Athena → Amazon QuickSight
AWS Glue - Data Catalog
Make data discoverable
Discover data and extract schema
• Automatically discovers data and stores schema
• Catalog makes data searchable and available for ETL
• Catalog contains table and job definitions
• Computes statistics to make queries efficient
[Diagram labels: Glue Data Catalog, Compliance]
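To give a feel for what "discovers data and stores schema" means, here is a toy schema-inference sketch over a CSV sample. It is not how AWS Glue crawlers are implemented: the function names, the dictionary used as a catalog, the sample data, and the three-type scheme (`bigint`, `double`, `string`) are all assumptions made for illustration.

```python
import csv
import io


def infer_type(values):
    """Pick the narrowest of three types that fits every sampled value."""
    for caster, name in ((int, "bigint"), (float, "double")):
        try:
            for value in values:
                caster(value)
            return name
        except ValueError:
            continue
    return "string"


def crawl_csv(table_name, text, catalog):
    """Read a CSV sample, infer a schema, and store it in the catalog dict."""
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = list(rows[0].keys()) if rows else []
    catalog[table_name] = {col: infer_type([r[col] for r in rows]) for col in columns}
    return catalog[table_name]


catalog = {}
sample = "order_id,amount,country\n1,19.99,VN\n2,5,US\n"  # made-up sample data
print(crawl_csv("orders", sample, catalog))
# {'order_id': 'bigint', 'amount': 'double', 'country': 'string'}
```

Once a schema like this is stored centrally, query engines and ETL jobs can look tables up by name instead of re-deriving the structure from raw files.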
AWS Glue - ETL Service
Make ETL scripting and deployment easy
• Automatically generates ETL code
• Code is customizable with Python and Spark
• Endpoints provided to edit, debug, and test code
• Jobs are scheduled or event-based
• Serverless
Amazon Athena
Example Query
QuickSight
Create Beautiful, Interactive Dashboards
• Add rich interactivity like filters, drill downs, zooming, and more
• Blazing fast navigation
• Accessible on any device
• Data Refresh
• Publish to everyone with a click
Labs
Labs
Required - use the account provided by AWS:
• Amazon RDS
  https://000005.awsstudygroup.com/vi/
Challenge labs - use a personal account or the one provided by AWS:
• Data lake / Data Analytics on AWS
  https://000070.awsstudygroup.com/vi/
  https://000072.awsstudygroup.com/vi/
  https://000073.awsstudygroup.com/vi/
• DMS and SCT
  https://000043.awsstudygroup.com/vi/
Thank you!