Microservices for Cloud Scaling

PART IV Cloud Programming Paradigms

Chapter 12 Microservices

Introduction
• The previous chapter describes the MapReduce paradigm that employs parallel computation to solve a single
problem.

• This chapter introduces an alternative way to structure software that takes advantage of multiple computers in a
data center to scale computation. Instead of focusing on one problem, however, the approach divides each
application into pieces, allowing the pieces to scale independently, as needed.

Traditional Monolithic Applications


• Two approaches:

o Build each application as a single, self-contained program (monolithic).

o Divide applications into multiple pieces that work together (microservices).

• A monolithic application is constructed as a single, self-contained piece of software where everything is bundled
into a single executable so a user only has one piece of software to install, configure, and use.

• Bundling eliminates failures that arise if a user fails to install all the needed pieces as well as incompatibility
problems that can arise if a user upgrades or reconfigures one piece without upgrading or reconfiguring the
others.

Figure 12.1 Illustration of a monolithic application with all the functions needed to support online
shopping built into a single program.

Monolithic Applications In A Data Center


• Traditional monolithic applications can run in a data center exactly the way they run on a traditional server:
simply launch a VM and run the monolithic application in it.

• This is not a good approach because a monolithic application cannot be replicated as quickly as a cloud-native
application: starting a VM has higher overhead than starting a container, and a monolithic design means all
code must be downloaded when the application starts, even if pieces are not used.

The Microservices Approach


• A microservices architecture divides functionality into multiple, independent applications. Each of the
independent applications is much smaller than a monolithic program, and only handles one function. To
perform a task, the independent applications communicate over a network.
• The microservices approach can be used to implement a new application or to divide an existing monolithic
application.

• Disaggregation refers to the division of a monolithic application into microservices.

Figure 12.2 Illustration of one possible way the shopping application can be disaggregated into a set of
microservices that communicate with one another.

The Advantages Of Microservices


• The microservices approach does introduce the extra overhead of running multiple, small applications and using
network communication among the pieces instead of internal function invocation.

• In a cloud environment, however, the microservices approach has advantages that can outweigh the overhead.
The advantages can be divided into two broad categories: advantages for software development and advantages
for operations and maintenance.

Advantages For Software Development


• Smaller scope and better modularity - Software engineers focus on one small piece of the problem at a time and
define clean interfaces. The limited scope encourages better decomposition and allows engineers to understand
each piece completely.

• Smaller teams - Because a microservice can be designed and implemented independently of other microservices,
each microservice only requires a small development team, meaning that the resulting code will be more
uniform and less prone to errors.

• Less complexity - Complexity leads to errors, and the monolithic approach creates complexity.

• Choice of programming language - When using the monolithic approach, all code must be written in a single
programming language. With the microservices approach, software engineers can choose the best language for
each service.

• More extensive testing - With the microservices approach, each service can be tested independently, allowing
more extensive and thorough assessment.

Advantages For Operations And Maintenance


• Rapid deployment - Because microservices are small, they can be created, tested, and deployed rapidly as well
as be changed easily and quickly.

• Improved fault isolation – Having multiple microservices makes fault isolation easier because a problem with
one microservice can be diagnosed and resolved while applications and other microservices continue normal
operation.
• Better control of scaling – Microservices can be scaled independently.

• Compatibility with containers and orchestration systems – Because it is small and only performs one task, a
microservice fits best into the container paradigm. Furthermore, using containers means that microservices can be
monitored, scaled, and load balanced by a conventional container orchestration system, such as Kubernetes.

• Independent upgrade of each service - Once an improved version of a microservice has been created, the new
version can be introduced without stopping existing applications and without disturbing other microservices.

The Potential Disadvantages Of Microservices


• Cascading errors - One microservice can invoke another, which can invoke another, and so on, so if one of the
microservices fails, the failure may affect many others as well as the applications that use them.

• Duplication of functionality and overlap - Because microservices are easy to create, when functionality is
needed that differs slightly from an existing microservice, it is often easier to create a completely new
microservice than to modify the existing one, resulting in many similar microservices.

• Management complexity - Each microservice must be monitored and when hundreds of microservices are
running simultaneously, it can be difficult to understand their behaviors, interdependencies, and the interactions
among them.

• Replication of data and transmission overhead - Each microservice must obtain a copy of the needed data,
either from a storage server or by being passed a copy when the microservice is invoked.

• Increased security attack surface - A monolithic application represents a single attack point. The microservices
approach has a much larger security attack surface having multiple points that an attacker can try to exploit.

• Workforce training - The microservices approach requires software engineers to consider the cost of running
each microservice and the data communication costs between microservices, and to develop new skills to
create software for microservices.

Microservices Granularity
• The question of microservice size forms one of the key decisions a software engineer faces when following the
microservices approach.

Figure 12.3 Two alternative designs for the payment microservice from Figure 12.2. (a) divides the functionality into four
separate microservices, and (b) places the four under an intermediate microservice.

• The figure only shows two ways to structure the payment microservices and does not show several of the other
microservices from Figure 12.2 that are needed for the application.

• The following are three heuristics that can help developers choose a granularity:
o Business process modeling - Each microservice should be based on a business process. Instead of merely
disaggregating existing applications, development teams identify how the applications are being used so that the
steps along the workflow can be transformed into microservices.

o Identification of common functionality - Instead of building a microservice for exactly one application,
consider how related applications might use the service and plan accordingly.

o Adaptive resizing and restructuring - The small size of a microservice means it can be redesigned quickly
and accommodates new applications and new underlying functionality.

Communication Protocols Used For Microservices


• Communication protocols specify the message details that enable microservices to have meaningful and
unambiguous communication.

• Like other applications in a data center, microservices communicate using Internet protocols. Doing so means a
microservice can be reached from inside or outside the data center, subject to security restrictions.

• For the transport layer protocol, most microservices use the Transmission Control Protocol (TCP), with
TCP being sent in Internet Protocol (IP) packets. TCP merely delivers streams of bytes between a pair of
communicating entities.

• When communicating over TCP, the pair must also employ a transfer protocol that defines how bytes are
organized into messages. In essence, the set of transfer protocol messages defines the service being offered.

• Using an existing protocol makes it easier to write code as several transfer protocols exist and satisfy most needs.
Thus, a software engineer merely needs to choose one when designing a microservice. Consider two examples:
o HTTP – The HyperText Transfer Protocol used in the Web

o gRPC – An open source high-performance, universal RPC framework

• HTTP
o When an entity uses HTTP to communicate with a microservice, the entity can send data to the
microservice or request that the microservice send data.

o In addition to specifying an operation to be performed, each request message specifies a data item by
giving the item’s name in the form of a Uniform Resource Identifier (URI).

o For some operations, the sender must also supply data to be used for the request. Below are six basic
operations that HTTP supports:
▪ GET - retrieve a copy of the data item specified in the request
▪ HEAD - retrieve metadata for the data item specified in the request (e.g., last
modified time)
▪ PUT - replace the specified data item with the data sent with the request
▪ POST - append the data sent with the request onto the specified data item
▪ PATCH - use data sent with request to modify part of the data item specified in
the request
▪ DELETE - remove the data item specified in the request
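The six operations above all take the form of request messages that name a data item by URI. As a minimal sketch, the helper below composes the raw text of such requests; the host name and URI paths are illustrative, not part of any real service.

```python
# Sketch: composing raw HTTP/1.1 request messages for a hypothetical
# inventory microservice.  The host "inventory.example" and the
# /items/... URIs are assumptions made for illustration.

def build_request(method: str, uri: str, host: str = "inventory.example",
                  body: str = "") -> str:
    """Return the text of an HTTP/1.1 request for the given operation."""
    lines = [
        f"{method} {uri} HTTP/1.1",
        f"Host: {host}",
    ]
    if body:
        lines.append(f"Content-Length: {len(body.encode())}")
    lines.append("")          # blank line separates headers from the body
    lines.append(body)
    return "\r\n".join(lines)

# GET retrieves a copy of the data item named by the URI
print(build_request("GET", "/items/42"))

# PUT replaces the named data item with the data sent in the request
print(build_request("PUT", "/items/42", body='{"price": 9.99}'))
```

In practice a client library handles this formatting, but the messages it sends have exactly this shape: an operation, a URI identifying the data item, and optional data.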
• gRPC
o Unlike most transfer protocols, gRPC does not define a specific set of operations that can be performed.
Instead, it provides a general framework for communication and allows a specific set of operations to be
defined for each instance (microservice).

o gRPC incorporates the Remote Procedure Call (RPC) approach that has been used to build distributed
systems for decades. The general idea is straightforward: create a program that runs on multiple
computers by placing one or more of the procedures from the program on remote computers. A remote
procedure is invoked by sending a message containing the arguments for the call, and a reply message
carries back the value the procedure returns.

o To make RPC easy to use, technologies exist that generate message passing code automatically. RPC
technologies generate code known as stubs. In the program, a stub replaces each procedure that has
been moved to a remote computer; on the remote computer, a stub calls the procedure with a local
procedure call, exactly like a program does.

Figure 12.6 (a) An application program that calls two procedures, and (b) the same application using RPC technology that
allows the procedures to run remotely.
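The stub mechanism described above can be sketched in a few lines. In this toy version (all names are hypothetical; real RPC systems generate the stubs automatically), the "network" is just a function call that carries serialized messages between the client-side stub and the server-side stub.

```python
import json

# Toy illustration of RPC stubs.  A real system would send the JSON
# messages over a network; here server_stub() stands in for transport.

def add(a, b):
    """The procedure that 'lives' on the remote computer."""
    return a + b

PROCEDURES = {"add": add}   # server-side dispatch table

def server_stub(message: str) -> str:
    """Server-side stub: decode the request, make a local call, encode a reply."""
    request = json.loads(message)
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result})

def add_stub(a, b):
    """Client-side stub: replaces add() in the program and sends a message."""
    message = json.dumps({"proc": "add", "args": [a, b]})
    reply = server_stub(message)          # stands in for network transport
    return json.loads(reply)["result"]

print(add_stub(2, 3))   # the caller cannot tell the work happened "remotely"
```

The key property is that the caller invokes `add_stub` exactly as it would invoke a local procedure; message passing is hidden inside the stubs.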

o gRPC extends traditional RPC technology by supporting many programming languages, allowing
the user to define data serialization using a technology known as protocol buffers, and supporting
streaming of multiple data items.

Communication Among Microservices


• A variety of interaction styles have been used with microservices, primarily separated into two broad types:

o Request-response (REST/RESTful interface) - The request-response style of interaction used on the
Web: a browser sends a request to which a web server sends a response. Performing a task typically
requires multiple requests and responses. A REST API implies that the microservice uses HTTP as its
transfer protocol.

o Data streaming (continuous interface) - Data streaming avoids repeated requests. A microservice can
combine a small number of items into a single response, but only if all the items are available and the
processing required to create the combined response is reasonable. With a streaming interface, an entity
instead establishes a network connection with the microservice and sends a single request; the
microservice responds with a sequence of one or more data items (i.e., the microservice streams a
sequence of data items).

• An important distinction between traditional RPC and gRPC lies in the interactions they support.
o A traditional RPC follows a request-response interaction: each call is accomplished by sending a single
message over the network to the computer containing the remote procedure, with a single reply
message traveling back.

o gRPC extends remote procedure call to allow a remote procedure to stream multiple data items in
response to a request.

• Microservices have also used variations of basic communication interactions. For example, some microservices
follow the publish-subscribe variant of data streaming.

o subscribe – An entity contacts a microservice and specifies a topic. The network connection remains
open perpetually, and the microservice sends any data items that arrive for the specified topic.

o publish - An entity contacts the microservice and sends data items labeled with a topic.
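The two operations above can be sketched with a minimal in-memory broker (a hypothetical API; a real publish-subscribe system keeps network connections open and pushes items to subscribers asynchronously).

```python
from collections import defaultdict

# Minimal sketch of the publish-subscribe variant of data streaming.
# Subscribers register a callback per topic; publishing a data item
# delivers it to every subscriber of that topic.

class Broker:
    def __init__(self):
        self.subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """An entity registers interest in a topic."""
        self.subscribers[topic].append(callback)

    def publish(self, topic, item):
        """An entity sends a data item labeled with a topic."""
        for callback in self.subscribers[topic]:
            callback(item)

broker = Broker()
received = []
broker.subscribe("orders", received.append)
broker.publish("orders", {"id": 1})
broker.publish("billing", {"id": 2})   # no subscriber for this topic
print(received)   # only the "orders" item was delivered
```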

Using A Service Mesh Proxy


• A service mesh is a software system that handles such tasks as:
o Creating instances of a microservice
o Forwarding requests to a given instance
o Translating requests to an internal form
o Discovery (i.e., how does one microservice discover another?)

• Instead of allowing entities to contact an instance of a microservice directly, a service mesh requires that the
entities contact a proxy that manages a set of instances and forwards each request to one instance.

• The use of a proxy allows a microservice to be scaled and isolates the communication used internally from
the communication used to access the microservice.

Figure 12.7 Illustration of a proxy for a service mesh. To use the service, an external entity contacts the proxy.

The Potential For Deadlock


• Although the microservice approach has many advantages, a distributed system composed of many
microservices can fail in unexpected ways and is susceptible to circular dependencies, in which a set of
microservices all depend on one another. If each microservice in the cycle is waiting for another
microservice, a deadlock can result.

• Consider a trivial example of four microservices: a time service, a location service, a file storage service, and an
authentication service. Although the services access one another, the system runs with no problems. The file
storage service uses the time service to obtain the time of day that it uses for timestamps on files. The location
service uses the file storage service to obtain a list of locations. The authentication service uses the location
service to find the location of the file storage service (which it uses to obtain stored encryption keys). If any
service is completely terminated and restarted, the system continues to work correctly as soon as the restart
occurs. However, a deadlock could occur if all four microservices attempt to start at the same time.
Figure 12.8 A dependency cycle among four microservices.
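A dependency cycle like the one in Figure 12.8 can be found mechanically with a depth-first search over the "depends on" graph. In the sketch below, the edge from the time service to the authentication service is an assumption added to close the cycle the figure suggests; the other three edges come from the example above.

```python
# Sketch: detecting a startup dependency cycle among microservices.
# Three edges come from the text; time -> authentication is an assumed
# edge that completes the cycle of Figure 12.8.

DEPENDS_ON = {
    "file_storage": ["time"],            # timestamps on files
    "location": ["file_storage"],        # list of locations
    "authentication": ["location"],      # location of the storage service
    "time": ["authentication"],          # assumed edge closing the cycle
}

def find_cycle(graph):
    """Return one dependency cycle as a list of services, or None."""
    def visit(node, path):
        if node in path:
            return path[path.index(node):]        # cycle found
        for dep in graph.get(node, []):
            cycle = visit(dep, path + [node])
            if cycle:
                return cycle
        return None
    for start in graph:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None

print(find_cycle(DEPENDS_ON))
```

Running such a check against a service dependency inventory is one way to expose a deadlock risk before a simultaneous restart triggers it.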

Microservices Technologies
• Many technologies have been created to aid software engineers in the design and operation of microservices:

o Commercial and open source service mesh technologies exist, including Linkerd, a project of the Cloud
Native Computing Foundation (CNCF), and Istio, a joint project among Google, IBM, and Lyft.

o Many frameworks exist that help developers create and manage microservices, including Spring Boot.

Summary
• The microservices approach disaggregates a monolithic application into multiple services, allowing them to scale
independently.

• The approach has advantages for software development, including smaller scope, smaller teams, less complexity,
a choice of programming language, and more extensive testing.

• The approach has advantages for operations and maintenance, including rapid deployment, improved fault
isolation, better control of scaling, compatibility with containers and orchestration systems, and independent
upgrade of each microservice.

• The approach also has potential disadvantages, including cascading errors, duplication of functionality and
overlap, management complexity, replication of data and transmission overhead, an increased security attack
surface, and the need for workforce training.

• Microservices communicate over a network using transport protocols, such as HTTP and gRPC. HTTP supports
a request-response (REST) interaction. gRPC generalizes conventional RPC to provide a streaming interface in
addition to remote procedure invocation and return.

• Service mesh software automates various aspects of running a microservice. The use of a proxy for the service
hides both the internal structure and internal communication protocols, which allows the protocols used for
external access and internal microservice invocation to differ.

• An important weakness of the microservices approach arises because each microservice is created and
maintained independently. Subtle problems, such as circular dependencies, can arise and remain hidden until an
unusual event occurs, such as a power failure that causes all microservices to restart simultaneously.
PART IV Cloud Programming Paradigms
Chapter 13 Controller-Based Management Software

Introduction
• Previous chapters describe paradigms used to create cloud-native software systems, including the MapReduce
and microservices paradigms.

• This chapter focuses on software that automates the management of resources. The chapter explains controller-
based designs and concepts of declarative specification and automated state transition.

Traditional Distributed Application Management


• An understanding of controller-based designs begins with understanding how traditional distributed systems are
managed.

• Consider an application that maintains information about employees. One way to deploy such an application
requires each department to run an instance of the application that stores data about the employees in the
department. Authorized users run client applications that access the instance.

• Using multiple instances means the system is general because it allows an authorized user in any department to
access information about an employee in other departments, and the system is efficient because most accesses
go to the local copy.

• Because a traditional distributed system runs instances on many physical computers, managing such a system
usually employs a monitoring tool that alerts a human operator when a server application stops responding.

Periodic Monitoring
• To test an instance, many automated monitoring tools send a special management message to which the
application responds. If no response arrives, the tool might retry once and then alert the operator about the
problem.

• To avoid flooding an instance with a continual stream of monitoring requests, most tools check periodically.

• A monitoring tool runs in the background and never terminates. The code is arranged to repeat a set of steps
indefinitely.
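The monitoring behavior described above (probe each instance, retry once, alert the operator, repeat after a delay) can be sketched as follows; `probe` and `alert` are hypothetical placeholders for a real network check and a real paging mechanism.

```python
import time

# Sketch of a periodic monitoring loop for a traditional distributed
# application.  probe() and alert() are stand-ins for real management
# messages and operator notifications.

def monitor(instances, probe, alert, interval=60, cycles=None):
    """Probe each instance periodically; retry once before alerting."""
    n = 0
    while cycles is None or n < cycles:
        for instance in instances:
            # one retry before declaring the instance unresponsive
            if not probe(instance) and not probe(instance):
                alert(instance)
        n += 1
        if cycles is None or n < cycles:
            time.sleep(interval)       # avoid flooding instances

# Example run with stubbed-out probe and alert:
alerts = []
monitor(["db", "web"], probe=lambda i: i == "web",
        alert=alerts.append, interval=0, cycles=1)
print(alerts)   # the unresponsive "db" instance triggers an alert
```

A production tool would run with `cycles=None` so the loop never terminates, exactly as the notes describe.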

Managing Cloud-Native Applications


• Most cloud-native applications use container replication to scale out services.

• Unfortunately, a cloud-native application is much more complex than a traditional distributed system. To
understand the complexity, consider a few differences.

• Difference in instances. A traditional distributed application consists of a fixed set of static instances. By
contrast, cloud-native applications deploy instances dynamically as needed, creating additional instances to scale
out an application. During times of low demand, only a few instances remain active, and during times of high
demand, many instances remain active. Thus, unlike a tool that monitors a traditional distributed system, a tool
that monitors a cloud-native application must avoid sending erroneous alerts about instances that have been shut
down to decrease the scale.

• Difference in instance locations. In a traditional distributed system, the location of each instance is known in
advance and never changes. For a cloud-native application, however, orchestration software uses the current
load on each server and other factors to choose where to deploy an instance.

• The structure of applications. Unlike traditional distributed systems in which each application is constructed
as a monolithic program, cloud-native applications consist of disaggregated programs that run as multiple
microservices. Thus, instead of monitoring multiple copies of a single application, software built to manage a
cloud-native application must monitor multiple instances of multiple microservices.

• Application persistence. A traditional application instance executes as a server process that starts when a
computer boots and remains running until the computer shuts down. In contrast, cloud-native software runs in
containers. Typically, a container is designed to service one request and then terminate; a new container is
created for each request, possibly at a new location. Thus, monitoring a cloud-native service requires handling
extremely rapid changes in the set of active instances.

• Figure 13.2 summarizes differences between a traditional distributed application and a cloud-native application
that make managing a cloud-native application complex.

Traditional: The number of instances does not change.
Cloud-native: The number of instances changes dynamically as an application scales out.

Traditional: The location of each instance is known in advance.
Cloud-native: The locations of instances are chosen dynamically to balance load.

Traditional: Each instance consists of a monolithic application.
Cloud-native: An application is disaggregated into multiple microservices.

Traditional: An instance runs as a process that persists indefinitely.
Cloud-native: An instance runs as a container that exits after handling one request.

Figure 13.2 A comparison of traditional and cloud-native applications.

• Monitoring a cloud-native application involves monitoring multiple microservices that are each composed
of a varying number of instances at changing locations, with each instance only persisting for a short time.

Control Loop Concept


• The term control loop is used in automation systems to refer to a non-terminating conceptual cycle that adjusts a
system to reach a specified state.

• A control loop implements a declarative, intent-based interface in which the user merely specifies the intended
result and the control loop implements the changes needed to achieve the result.

Figure 13.3 The conceptual cycle of a control loop for a thermostat that regulates temperature.
Control Loop Delay, Hysteresis, And Instability
• The delay step in a control loop is optional, but can be important for two reasons. First, taking measurements too
rapidly can waste resources and lead to unexpected behavior.

• Hysteresis refers to the lag between the time a change is initiated and the time it takes effect. Hysteresis is
important because it can cause unexpected results.

• If a control loop takes measurements before a change has time to take effect, the results can be unexpected;
adding a delay to the loop may be necessary to prevent oscillations and guarantee that the system
converges on the desired state.
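A thermostat-style control loop with a tolerance band around the setpoint can be sketched as follows (all numbers are illustrative; the toy environment simply warms one degree per step while the heater is on).

```python
# Sketch of the thermostat control loop of Figure 13.3.  The band
# around the setpoint keeps the heater from switching on every small
# fluctuation, reducing oscillation around the desired state.

def control_loop(read_temp, set_heater, desired, band=1.0, steps=10):
    """Repeatedly measure the actual state and adjust toward `desired`."""
    for _ in range(steps):
        actual = read_temp()               # measure the actual state
        if actual < desired - band:
            set_heater(True)               # too cold: turn the heat on
        elif actual > desired + band:
            set_heater(False)              # too warm: turn the heat off
        # (a real loop would delay here so the change can take effect)

# Toy environment: heating raises the temperature, idling lowers it.
state = {"temp": 15.0, "heater": False}

def read_temp():
    state["temp"] += 1.0 if state["heater"] else -0.5
    return state["temp"]

def set_heater(on):
    state["heater"] = on

control_loop(read_temp, set_heater, desired=20.0, steps=20)
print(round(state["temp"], 1))   # settles near the 20-degree setpoint
```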

The Kubernetes Controller Paradigm And Control Loop


• A programmer who writes traditional control loop code for an IoT device must plan the loop carefully to
prevent unintended behavior that can result from taking measurements too quickly.

• Cloud orchestration systems, including Kubernetes, employ a variant of a control loop that eliminates periodic
measurements.

• Recall from Chapter 10 that Kubernetes uses a set of controllers to automate various management tasks, such
as scale out. We can now understand how a controller can manage container deployments and microservices:
the software employs an intent-based control loop.

• The term controller pattern characterizes the design. Conceptually, each Kubernetes controller:

o Runs a control loop indefinitely


o Compares the actual and desired states of the system
o Makes adjustments to achieve the desired state

• Algorithm 13.2 captures the essence of a Kubernetes controller by showing how it follows a declarative, intent-
based paradigm to manage a microservice.

-------------------------------------------------------------------------------------------------------------------------------------------------------------
Purpose: Act as a controller that manages a microservice consisting of multiple pods of containers

Given: A specification of a desired state of the service

Method: Continually obtain information about the actual state of the service. Compare the actual state to
the desired state, and make adjustments to move the service toward the desired state
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Algorithm 13.2 A Kubernetes controller for a microservice.

An Event-Driven Implementation Of A Control Loop


• Kubernetes can obtain accurate information about the actual state of the system without repeatedly polling to
take measurements by using an event-driven control loop.

• Instead of arranging for the controller to check status periodically, an event-driven control loop configures
components to inform the controller whenever a change occurs.

• Typically, event-driven systems use message passing. When a change occurs, the controller receives a message
about the change.

• For example, when the specification file changes, a file system component sends the controller a message.
Similarly, the orchestration component sends the controller a message when an instance exits.

• In essence, using an event-driven approach reverses the responsibility for obtaining information about the state
of the system. In a traditional control loop, the controller actively polls the system to obtain the information; in
an event-driven control loop, the controller waits passively to be informed when a change occurs.

• Figure 13.4 shows the steps an event-driven controller follows to react to incoming messages.
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Read and parse the specification file
Use the specification to create pods for the system
loop forever {
    wait to receive a message
    if the specification changed, read and parse the file
    if the status of a pod has changed, record the change
    if the change leaves the service outside the desired state, adjust the system to move toward the desired state
}
-------------------------------------------------------------------------------------------------------------------------------------------------------------

Figure 13.4 An event-driven implementation of a Kubernetes controller in which the controller waits
passively to be informed when a change occurs.

• Note that using the event-driven paradigm means that a controller does not perform a delay step as part of the
control loop. From a programmer’s point of view, the approach means that there is no need to choose a delay,
and there is never a problem with a delay that is too small or too large.

• The controller paradigm used by Kubernetes follows a declarative, intent-based approach in which a user
specifies the desired state of a service and a controller continually adjusts the system to move toward the desired
state. Using an event-driven implementation for a controller avoids needless polling.
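The event-driven controller of Figure 13.4 can be sketched as a loop that drains a message queue and reconciles the running set of pods against the desired replica count. The event shapes and pod names below are assumptions for illustration, not Kubernetes APIs.

```python
from collections import deque

# Sketch of an event-driven control loop: instead of polling, the
# controller reacts to messages about specification changes and pod
# exits, then adjusts toward the desired state.

def run_controller(events, desired, running):
    """Process queued events; keep `running` at the desired replica count."""
    created = []
    while events:
        event = events.popleft()             # wait to receive a message
        if event["kind"] == "spec_changed":
            desired = event["replicas"]      # re-read the specification
        elif event["kind"] == "pod_exited":
            running.discard(event["pod"])    # record the status change
        # adjust the system to move toward the desired state
        while len(running) < desired:
            pod = f"pod-{len(created)}"
            created.append(pod)
            running.add(pod)
    return desired, running, created

events = deque([
    {"kind": "spec_changed", "replicas": 3},
    {"kind": "pod_exited", "pod": "pod-0"},
])
desired, running, created = run_controller(events, desired=0, running=set())
print(len(running))   # converges back to 3 replicas
```

Note the absence of any delay or polling step: the loop only runs when a message arrives, which is the point of the event-driven design.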

Components Of A Kubernetes Controller


• Conceptually, Kubernetes runs multiple, independent controllers. In practice, Kubernetes implements all
controllers with a single background daemon process, the kube-controller-manager.

• Kubernetes divides each controller into three pieces:

o Informer or SharedInformer (watcher) - looks for changes in the state of Kubernetes objects and sends
an event to the Workqueue whenever a change occurs.

▪ To avoid polling a list to find new items, Kubernetes includes a Listwatcher component that
generates a notification about the creation of a new instance or a change in a current instance.

▪ An Informer component keeps a local cache of the state of resources. Many microservices use
a set of controllers to handle multiple aspects of the service. In such cases, SharedInformer
provides an alternative that allows cached information about the state of the service to be shared
among a set of controllers.

o Workqueue and Dispatcher – contains a list of changes that have occurred in the state of the service.
New items are added when a change occurs, and a Dispatcher removes items as they are processed.
When an item appears on the Workqueue, it may mean that the current state of the service no longer
adheres to the desired state.

o Workers – When it extracts an item from the Workqueue, the Dispatcher checks the specification. If the
system no longer conforms to the desired state, the Dispatcher invokes a worker to make adjustments to
move toward the desired state (e.g., by creating a new instance to replace one that has completed). As
with most cloud facilities, it is possible to scale a controller by creating multiple workers.

• A programmer must create code that specifies the steps that should be taken to align the state of a computation
with the desired state. The programmer writes a function, Reconcile, that Kubernetes calls. When it runs,
Reconcile will have access to the specification, and can adjust the state of the system accordingly.
• Although the description above implies that the controller operates continuously, Kubernetes does provide a way
for a controller to run periodically. A value known as ResyncPeriod specifies when a controller should
revalidate all items in the cache. In essence, ResyncPeriod specifies the time at which the controller compares
the current state to the desired state and takes action accordingly. Of course, a designer must be careful not to
set ResyncPeriod too low or the controller will incur a high computational load needlessly.

Custom Resources And Custom Controllers


• Kubernetes defines a set of core resources and controllers that cover most of the common tasks associated with
deploying and managing a cluster that includes multiple pods.

• In addition to an extensive set of built-in controllers and pre-defined resources, Kubernetes allows a user to
define custom resources and custom controllers.

• Like built-in controllers, a custom controller employs the event-driven approach. To create a custom
controller, a software engineer must write code for the basic components:

o a Workqueue
o Worker(s) to process events
o and one of the following:
▪ SharedInformer, if the controller maintains information about multiple pods
▪ an Informer/watcher, if the controller has small scope.

• Although creating such components may seem complex, the kube-controller-manager handles many of the
details.

Kubernetes Custom Resource Definition (CRD)


• Kubernetes provides a facility, known as the Custom Resource Definition (CRD), that helps a software
engineer create a custom resource. The facility allows one to define new objects and then integrate them with a
Kubernetes cluster. Once it has been defined, a new object can be used exactly like a native Kubernetes object.

• Various approaches have been used that allow an application to access multiple underlying facilities. For
example, Figure 13.5 illustrates a proxy service.

Figure 13.5 Illustration of a proxy service that allows applications to access multiple types of
databases.

• A proxy service fits between applications and underlying databases. Instead of accessing a database directly, an
application invokes the proxy service.

• Each database technology defines a specific set of commands that must be used to communicate with the database
(i.e., a database-specific API). The proxy accepts a generic set of database requests from applications. To fulfill
a request, the proxy translates the request into database-specific commands (e.g., for Redis or MongoDB), and
issues the commands to the underlying database.

• A proxy service offers a single, generic interface that applications use, and only the proxy needs to understand
the details of underlying databases. The key point is that an application can switch from one database to another
without being modified.
• The CRD facility in Kubernetes offers the same benefits as a proxy without requiring a user to create and manage
a separate service.

• To use CRD, a software engineer creates a Custom Resource Definition that accepts generic database commands
just as a proxy does along with code that translates generic requests into database-specific commands.

• A CRD can function like a proxy service, but instead of running and managing a separate service, Kubernetes
can manage instances of the CRD.
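The translation step that a proxy (or the code behind a CRD) performs can be sketched as a lookup from generic operations to backend-specific commands. The command templates below are illustrative stand-ins, not the exact wire syntax a production Redis or MongoDB client would emit.

```python
# Hypothetical translation tables from generic operations to backend commands.
TRANSLATIONS = {
    "redis":   {"get": "GET {key}",
                "put": "SET {key} {value}"},
    "mongodb": {"get": "db.items.findOne({{_id: '{key}'}})",
                "put": "db.items.updateOne({{_id: '{key}'}}, {{$set: {{v: '{value}'}}}})"},
}

def translate(backend, op, **args):
    """Turn a generic request into a database-specific command string."""
    return TRANSLATIONS[backend][op].format(**args)
```

An application always calls `translate(...)` with the same generic operations, so switching from Redis to MongoDB changes only the `backend` argument, not the application code.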

Service Mesh Management Tools


• Management functions include service discovery, health checking, secure communication among a set of
services, key-value object storage, and support for deploying an application across multiple data centers.

• Most service mesh tools have been designed to work with Kubernetes. Examples include HashiCorp Consul,
Istio, and Linkerd from the Cloud Native Computing Foundation (CNCF) project.

• One of the main arguments for mesh management tools arises from the need for security. Kubernetes provides a
way to deploy services, and a mesh management tool ensures that communication among services remains
secure. In addition, a mesh management tool may offer additional control functions for Kubernetes clusters.

Reactive Or Dynamic Planning


• The term reactive (dynamic) planning refers to the rapid planning required to accommodate new conditions.

• The idea of reactive/dynamic planning is straightforward: adapt quickly to the constant stream of changes in both
the environment and demand by planning a new desired state and moving the service to the desired state. The
controller paradigm and control loops can be used to implement reactive/dynamic planning.

• Interestingly, constant change may mean that a particular service never reaches a steady state. Instead, the
management system adapts to continual changes by constantly revising the desired state of the service. Provided
controllers are designed to make useful decisions and accommodate any hysteresis introduced by queues, the
service can continue to change without becoming unstable.

A Goal: The Operator Pattern


• Most controllers handle routine, repetitive tasks. The term operator pattern describes an envisioned control
system that can handle the remaining management tasks that normally require a human operator, such as
identifying anomalies, diagnosing problems, understanding how and when to apply and deploy software updates,
and how to delete resources associated with a service when the service is shut down. Software that achieves the
envisioned goal will require AIops.

Summary
• Managing a cloud-native application is inherently more complex than managing a traditional distributed system.
In a cloud-native application, the number and location of instances changes, a single application may be
disaggregated into multiple microservices, and instead of persisting, an instance exits after handling one request.

• A control loop is a non-terminating, intent-based computation that continually measures the state of a system,
compares the actual state to the desired state, and makes adjustments to move the system toward a desired state.

• Kubernetes provides a set of built-in controllers that each run a control loop for one particular aspect of a
cluster. A Kubernetes controller consists of three components that implement an event-driven approach: an
Informer/watcher (or SharedInformer) that sends events when the state of the system changes, a Workqueue that
holds a list of events, and a Worker that handles events.

• Kubernetes offers users the ability to define custom resources and custom controllers. A Custom Resource
Definition (CRD) can be used to create a generic interface for applications and allow each instance to use a
specific underlying technology.
PART IV Cloud Programming Paradigms
Chapter 14 Serverless Computing And Event Processing

Introduction
• Previous chapters describe algorithms, platforms, and technologies that can be used to create cloud-native
software systems.

• This chapter focuses on facilities cloud providers offer that enable programmers to create, deploy, and scale
applications quickly, easily, and at low cost.

Traditional Client-Server Architecture


• When application programs communicate over a network, they follow the client-server paradigm, which divides applications into two categories:

o Server: an application that runs first and waits for contact


o Client: an application that contacts a server

Scaling A Traditional Server To Handle Multiple Clients


• A server scales by using concurrent execution to handle multiple clients at the same time.

• The server repeatedly waits for a client to initiate communication and then creates a concurrent process to handle
the client.

• In addition to the processes created for each active client, a master server process waits for new clients to contact it.

• Figure 14.2 illustrates a concurrent server handling three clients.

Figure 14.2 An example of traditional client-server communication.

• The application for a concurrent server integrates two aspects of a server into a single application program:
o Fulfilling the service being offered - The chief function of a server lies in interacting with clients to
fulfill clients’ requests.

o Replicating and scaling the server - A traditional concurrent server uses contact from a new client
to trigger the creation of a separate process to handle the client.

• In a traditional server, a single application contains code to replicate the server to handle multiple
simultaneous clients as well as code to interact with a given client and handle the client’s requests.
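The structure described above, a master loop that spawns a concurrent handler per client, can be sketched with Python's standard library. A trivial echo service stands in for the code that fulfills clients' requests; the function names are illustrative.

```python
import socket
import threading

def handle_client(conn):
    """Per-client concurrent code: fulfill requests (here, echo bytes back)."""
    with conn:
        while data := conn.recv(1024):
            conn.sendall(data)

def master(srv, max_clients):
    """Master loop: wait for a client, spawn a thread to handle it, repeat."""
    for _ in range(max_clients):        # a production server loops forever
        conn, _addr = srv.accept()
        threading.Thread(target=handle_client, args=(conn,)).start()
```

The two aspects the text identifies are visible as the two functions: `master` replicates the server, and `handle_client` performs the service itself, and both run on one physical computer.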
Scaling A Server In A Cloud Environment
• A traditional concurrent server replicates copies of itself automatically to handle multiple clients simultaneously.
However, all copies must run on the same physical computer.

• The approach does not work well in a cloud environment because cloud systems achieve large scale by
replicating instances across many physical machines.

• Complex software systems must be used to run servers in a cloud data center. The systems handle instance
management by deploying copies as needed and must also handle network communication by arranging to
forward traffic from each client through a proxy to the correct instance.

• Figure 14.3 illustrates management software controlling a deployment.

Figure 14.3 Management software controlling a proxy and replicas of a server on multiple physical
machines.

The Economics Of Servers In The Cloud


• A large set of management technologies exist that can be used to deploy and operate services, including
orchestration systems, proxies, load balancers, and service mesh management software.

• Many of the software technologies follow the open-source model, making them free. Customers only need to
have basic server software, lease a set of VMs, and use open-source software to handle deployment and
scaling at little extra cost.

• Unfortunately, two costs can be significant:

o Unused capacity - a customer must allocate sufficient VMs to handle the expected peak load and as a
result, they must pay for VMs that remain idle during off-peak times.

o Expertise and training – Although free, open-source management systems require a significant amount
of expertise to use effectively and safely. And, because cloud technologies continue to evolve, the
customer must pay for training to keep their staff up to date.

The Serverless Computing Approach


• Serverless computing allows cloud customers to build and run software to fulfill users’ requests without thinking
about the deployment and replication of servers, without configuring network names and addresses for servers,
and without leasing VMs to run the servers.

• Also known as Function as a Service (FaaS), the serverless computing approach allows a cloud customer
to avoid dealing with servers by paying a cloud provider to deploy and scale the customer’s servers.

• Avoids charges for unused capacity because the provider only charges fees for the time a server is being used.
• A provider amortizes the cost of experts and their training over all customers, making it less expensive for each
customer.

• Some providers might charge a minimum monthly fee regardless of whether a service is utilized, but others charge a customer nothing if the service is unused over a given period (scale to zero).

• The serverless approach offers great flexibility when accommodating a large number of clients because it provides arbitrary scale (scale to infinity), so a customer does not have to plan for a peak load.

Stateless Servers And Containers


• To make it feasible to deploy servers quickly, serverless technologies use containers (Chapter 6). To deploy and
manage a server, serverless systems use orchestration (Chapter 10), and to handle scale out, a serverless system
uses the controller-based approach (Chapter 13).

• Despite building on extant technologies, the serverless approach introduces two key features that distinguish
it from the traditional server approach:

o The use of stateless servers

o Adherence to an event-driven paradigm.

The use of stateless servers.


• The term state refers to data related to clients that a server stores internally.

• The term stateful server refers to a server that stores state information

• The term stateless server refers to a server that does not store state information.

• Stateful servers store information for two reasons:

o Allows a server to provide continuity across multiple contacts by the client.

o Allows a server to share information among multiple clients.

• A stateful approach works well for a traditional server because the server runs on a single computer. Therefore,
a traditional server can use mechanisms such as shared memory to allow all instances of the server to access
and update the state information. Furthermore, because a traditional server has a long lifetime, state information
usually persists across many client contacts.

• The stateful approach does not work well for a server that runs in a data center and handles large scale because
the orchestration systems deploy instances on multiple physical computers, making shared memory impossible
and to handle microservices, containers are designed with a short lifetime: the container starts, performs one
function, and exits. Thus, state information does not persist for more than one client connection.

• To capture the idea that serverless computing focuses on running a single, stateless function in each container
(i.e., FaaS), some engineers say that serverless computing runs stateless functions.

• Because it uses containers and can run on multiple physical servers, a serverless computing system
requires server code to be stateless.

• Statefulness refers only to the information a server keeps in memory while the server runs; when the server exits, that information disappears. Stored data (e.g., a database, a file on NAS, or an object store) does not count as state information because it is not lost when the server exits.
• Although serverless computing requires servers to follow a stateless design, a server may store and
retrieve data from a database or persistent storage, such as a file on NAS or an object store.
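The stateless requirement can be illustrated with a handler that keeps nothing in memory between invocations. A plain dictionary stands in here for the external database or object store a real function would use; the request fields are invented for the example.

```python
external_store = {}   # stand-in for a database, a file on NAS, or an object store

def handler(request):
    """Stateless handler: everything it needs arrives in the request or is
    read from the external store; nothing survives in local variables."""
    user, amount = request["user"], request["amount"]
    total = external_store.get(user, 0) + amount    # read persisted data
    external_store[user] = total                    # write it back before exiting
    return {"user": user, "total": total}
```

Because the running total lives in the store rather than in the handler, the container running `handler` can exit after every request and a fresh container can continue the computation.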

Adherence to an event-driven paradigm.


• Serverless computing adopts the event-driven paradigm that Kubernetes controllers use and generalizes it.

• The underlying cloud system generates events when changes occur (e.g., a physical server fails). In addition,
serverless systems count each server access as an event.

• Some serverless systems provide a management interface or other programmatic interface components that allow a human user to interact with the system, or allow a computer program to use a protocol other than HTTP. A contact through any of these interfaces counts as an event.

The Architecture Of A Serverless Infrastructure


• Serverless computing adopts the technology Kubernetes uses for controllers and follows the same general
architecture.

• The chief components include an event queue, a set of interface components that insert events into the queue,
and a dispatcher that repeatedly extracts an event and assigns a worker node to process the event. In the case
of serverless computing, worker nodes run the server code.

• Figure 14.4 illustrates the components in a serverless infrastructure.

Figure 14.4 The architecture that providers use for serverless computing.

An Example Of Serverless Processing


• Netflix uses the AWS Lambda event-driven facility for video transcoding, a step taken to prepare each new
video for customers to download. The arrangement has become a canonical example of serverless computing.

• Below are the basic steps taken to transcode a video.


1. A content provider uploads a new video.
2. A serverless function divides the new video into 5-minute video segments.
3. Each segment is given to a separate serverless function for processing.
4. The processed segments are collected, and the video is available for the customers to access.

• Events trigger each of the serverless processing steps.

o When a content provider uploads a new video, the system places the new video in an Amazon S3
bucket. The S3 object storage system generates an event that triggers a serverless function to divide the
video into segments that are each five minutes long.
o When a segment arrives in an S3 bucket, another event triggers a serverless function that processes and
transcodes the segment. Thus, transcoding can proceed in parallel.
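The event-triggered steps can be sketched as Lambda-style handler functions. The event layout loosely follows the general shape of S3 event notifications, and the segment naming, the `duration_minutes` field, and the return values are simplified assumptions rather than Netflix's actual code.

```python
def split_handler(event, context=None):
    """Triggered when a new video lands in the upload bucket: divide it into
    5-minute segments (here we only compute the segment names)."""
    key = event["Records"][0]["s3"]["object"]["key"]
    duration = event.get("duration_minutes", 15)      # assumed metadata field
    return [f"{key}.seg{start}" for start in range(0, duration, 5)]

def transcode_handler(event, context=None):
    """Triggered once per segment arriving in the segment bucket, so many
    copies of this function can run in parallel."""
    key = event["Records"][0]["s3"]["object"]["key"]
    return f"transcoded:{key}"
```

In the real system, writing each segment object would itself generate the event that invokes `transcode_handler`, which is what makes the parallelism implicit rather than programmed.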

• Figure 14.6 illustrates how video data flows through the system and how events trigger processing.

Figure 14.6 The Netflix transcoding system. A new video event causes the video to be divided into segments;
a new segment event causes the segment to be transcoded.

Potential Disadvantages Of Serverless Computing


• Serverless computing offers three unbeatable advantages: the ability to scale arbitrarily, no need to manage
servers, and lower overall cost.

• Despite its advantages, serverless computing does have potential disadvantages.

o Serverless systems introduce latency. A traditional server starts before any client initiates contact and can therefore respond immediately; in a serverless system, management software launches an instance of the server only when one is needed, which adds a small startup delay.

o Serverless systems can generate unexpected costs. When an application is disaggregated into microservices, each microservice runs as its own function, and the functions must be orchestrated together. An error in one microservice can trigger a cascade of failures in the others. Because each invocation incurs a small charge, including the invocations used to issue alerts, such a cascade can produce an unexpectedly large bill.

Summary
• Serverless computing follows the traditional client-server paradigm in which one or more clients initiate contact
with a server.

• Unlike a traditional concurrent server, which is limited to one computer, serverless computing uses cloud technologies that allow it to scale arbitrarily. It also separates server management from the core function of a server, allowing cloud providers to offer services that deploy and operate servers for their customers.

• The chief motivation for serverless computing lies in its economic benefits. Because a cloud provider handles
the details of managing server deployment and scaling, a customer does not need to maintain staff with expertise.
Because a cloud provider only charges for the computation actually used, a customer does not pay for idle VMs
or servers.

• In terms of implementation, serverless computing adopts and extends the architecture used for Kubernetes controller-based systems. Serverless systems use an event-based paradigm in which each change in the cloud system and each contact from a client becomes an event that is added to a queue. A dispatcher repeatedly extracts events and assigns them to worker nodes to handle.

• Despite all the advantages, serverless computing has potential disadvantages. Unlike a conventional server that starts before clients initiate contact, serverless computing creates servers on demand, leading to a small delay. Unexpectedly high costs can arise from cascades of events and from microservices that divide computation into small functions.
AWS Academy Cloud Foundations
Module 08 Student Guide
Version 2.0.12
100-ACCLFO-20-EN-SG
© 2022 Amazon Web Services, Inc. or its affiliates. All rights reserved.

This work may not be reproduced or redistributed, in whole or in part,


without prior written permission from Amazon Web Services, Inc.
Commercial copying, lending, or selling is prohibited.

All trademarks are the property of their owners.


AWS Training and Certification AWS Academy Cloud Foundations

Contents
Module 8: Databases

AWS Training and Certification Module 8: Databases

Module 8: Databases
AWS Academy Cloud Foundations


Welcome to Module 8: Databases


Module overview
Topics:
• Amazon Relational Database Service (Amazon RDS)
• Amazon DynamoDB
• Amazon Redshift
• Amazon Aurora

Lab:
• Lab 5: Build Your DB Server and Interact with Your DB Using an App

Activity:
• Database case studies

Demos:
• Amazon RDS console
• Amazon DynamoDB console

Knowledge check


The business world is constantly changing and evolving. By accurately recording, updating, and
tracking data on an efficient and regular basis, companies can use the immense potential from
the insights that they obtain from their data. Database management systems are the crucial link
for managing this data. Like other cloud services, cloud databases offer significant cost
advantages over traditional database strategies.

In this module, you will learn about Amazon Relational Database Service (or Amazon RDS),
Amazon DynamoDB, Amazon Redshift, and Amazon Aurora.

This module will address the following topics:


• Amazon Relational Database Service (Amazon RDS)
• Amazon DynamoDB
• Amazon Redshift
• Amazon Aurora

The module includes two recorded demonstrations that will show you how to access and interact
with Amazon RDS and Amazon DynamoDB by using the AWS Management Console.

The module also includes a hands-on lab where you will set up an Amazon RDS database
solution.


The module also includes an activity that challenges you to select the appropriate database service
for a business case.

Finally, you will be asked to complete a knowledge check that will test your understanding of the key
concepts that are covered in this module.


Module objectives
After completing this module, you should be able to:

• Explain Amazon Relational Database Service (Amazon RDS)

• Identify the functionality in Amazon RDS

• Explain Amazon DynamoDB

• Identify the functionality in Amazon DynamoDB

• Explain Amazon Redshift

• Explain Amazon Aurora

• Perform tasks in an RDS database, such as launching, configuring, and interacting


In this module, you will learn about key concepts that are related to database solutions,
including:
• Understanding the different database services in the cloud.
• Discovering the differences between unmanaged and managed database solutions.
• Understanding the differences between Structured Query Language (or SQL) and NoSQL
databases.
• Comparing the availability differences of alternative database solutions.

The goal of this module is to help you understand the database resources that are available to
power your solution. You will also review the different service features that are available, so you
can begin to understand how different choices impact things like solution availability.

After completing this module, you should be able to:


• Explain Amazon Relational Database Service (Amazon RDS)
• Identify the functionality in Amazon RDS
• Explain Amazon DynamoDB
• Identify the functionality in Amazon DynamoDB
• Explain Amazon Redshift
• Explain Amazon Aurora
• Perform tasks in an RDS database, such as launching, configuring, and interacting


Section 1: Amazon Relational Database Service
Module 8: Databases


Introducing Section 1: Amazon Relational Database Service.


Amazon Relational Database Service (Amazon RDS)

Welcome to an introduction to the foundational database services that are available on Amazon
Web Services (AWS). This module begins with Amazon Relational Database Service (Amazon
RDS).

This section starts by reviewing the differences between a managed and unmanaged service in
relation to Amazon RDS.


Unmanaged versus managed services

Unmanaged: scaling, fault tolerance, and availability are managed by you.

Managed: scaling, fault tolerance, and availability are typically built into the service.


AWS solutions typically fall into one of two categories: unmanaged or managed.

Unmanaged services are typically provisioned in discrete portions as specified by the user. You
must manage how the service responds to changes in load, errors, and situations where
resources become unavailable. Say that you launch a web server on an Amazon Elastic Compute
Cloud (Amazon EC2) instance. Because Amazon EC2 is an unmanaged solution, that web server
will not scale to handle increased traffic load or replace unhealthy instances with healthy ones
unless you specify that it use a scaling solution, such as AWS Automatic Scaling. The benefit to
using an unmanaged service is that you have more fine-tuned control over how your solution
handles changes in load, errors, and situations where resources become unavailable.

Managed services require the user to configure them. For example, you create an Amazon Simple
Storage Service (Amazon S3) bucket and then set permissions for it. However, managed services
typically require less configuration. Say that you have a static website that you host in a cloud-based storage solution, such as Amazon S3. The static website does not have a web server.
However, because Amazon S3 is a managed solution, features such as scaling, fault-tolerance,
and availability would be handled automatically and internally by Amazon S3.

Now, you will look at the challenges of running an unmanaged, standalone relational database.
Then, you will learn how Amazon RDS addresses these challenges.


Challenges of relational databases


• Server maintenance and energy footprint
• Software installation and patches
• Database backups and high availability
• Limits on scalability
• Data security
• Operating system (OS) installation and patches


When you run your own relational database, you are responsible for several administrative tasks, such as server maintenance and energy footprint, software installation and patching, and database backups. You are also responsible for ensuring high availability, planning for scalability, data security, and operating system (OS) installation and patching. All these tasks take resources from other items on your to-do list and require expertise in several areas.


Amazon RDS

Managed service that sets up and operates a relational database in the cloud.

(Slide diagram: users connect to application servers, which access Amazon RDS in the AWS Cloud.)

Amazon RDS is a managed service that sets up and operates a relational database in the cloud.

To address the challenges of running an unmanaged, standalone relational database, AWS


provides a service that sets up, operates, and scales the relational database without any ongoing
administration. Amazon RDS provides cost-efficient and resizable capacity, while automating
time-consuming administrative tasks.

Amazon RDS enables you to focus on your application, so you can give applications the
performance, high availability, security, and compatibility that they need. With Amazon RDS, your
primary focus is your data and optimizing your application.


From on-premises databases to Amazon RDS


The slide compares three deployment options against a common stack of responsibilities. From top to bottom, the stack is: application optimization, scaling, high availability, database backups, database software patches, database software installs, operating system patches, operating system install, server maintenance, rack and stack servers, and power, HVAC, and network.

• On-premises database: you manage the entire stack.
• Database in Amazon Elastic Compute Cloud (Amazon EC2): AWS provides the bottom layers (server maintenance, rack and stack servers, and power, HVAC, and network); you manage the rest.
• Database in Amazon RDS or Amazon Aurora: AWS provides everything except application optimization, which you manage.


What does the term managed services mean?

When your database is on premises, the database administrator is responsible for everything.
Database administration tasks include optimizing applications and queries; setting up the
hardware; patching the hardware; setting up networking and power; and managing heating,
ventilation, and air conditioning (HVAC).

If you move to a database that runs on an Amazon Elastic Compute Cloud (Amazon EC2)
instance, you no longer need to manage the underlying hardware or handle data center
operations. However, you are still responsible for patching the OS and handling all software and
backup operations.

If you set up your database on Amazon RDS or Amazon Aurora, you reduce your administrative
responsibilities. By moving to the cloud, you can automatically scale your database, enable high
availability, manage backups, and perform patching. Thus, you can focus on what really matters
most—optimizing your application.


Managed services responsibilities


You manage:
• Application optimization

AWS manages:
• OS installation and patches
• Database software installation and patches
• Database backups
• High availability
• Scaling
• Power and racking and stacking servers Amazon RDS
• Server maintenance


With Amazon RDS, you manage your application optimization. AWS manages installing and
patching the operating system, installing and patching the database software, automatic backups,
and high availability.

AWS also scales resources, manages power and servers, and performs maintenance.

Offloading these operations to the managed Amazon RDS service reduces your operational
workload and the costs that are associated with your relational database. You will now go
through a brief overview of the service and a few potential use cases.


Amazon RDS DB instances

An Amazon RDS DB instance is defined by three choices:

• DB instance class: determines CPU, memory, and network performance.
• DB instance storage: Magnetic, General Purpose (solid state drive, or SSD), or Provisioned IOPS.
• DB engine: MySQL, Amazon Aurora, Microsoft SQL Server, PostgreSQL, MariaDB, or Oracle.


The basic building block of Amazon RDS is the database instance. A database instance is an
isolated database environment that can contain multiple user-created databases. It can be
accessed by using the same tools and applications that you use with a standalone database
instance. The resources in a database instance are determined by its database instance class, and
the type of storage is dictated by the type of disks.

Database instances and storage differ in performance characteristics and price, which enable you
to customize your performance and cost to the needs of your database. When you choose to
create a database instance, you must first specify which database engine to run. Amazon RDS
currently supports six databases: MySQL, Amazon Aurora, Microsoft SQL Server, PostgreSQL,
MariaDB, and Oracle.
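As a sketch of launching a DB instance with these choices, the boto3 `create_db_instance` call accepts the engine, instance class, and storage size. The identifier and credentials below are placeholders, and actually running the call would require valid AWS credentials and the boto3 library.

```python
# Parameter names follow the boto3 RDS API; all values are placeholders.
db_params = {
    "DBInstanceIdentifier": "demo-db",
    "Engine": "mysql",                  # one of the six supported engines
    "DBInstanceClass": "db.t3.micro",   # determines CPU, memory, network
    "AllocatedStorage": 20,             # GiB of General Purpose (SSD) storage
    "MasterUsername": "admin",
    "MasterUserPassword": "replace-with-a-secret",
}

def launch_db_instance(params=db_params):
    """Create the DB instance. The import is local so the sketch can be
    inspected without boto3 installed."""
    import boto3
    rds = boto3.client("rds")
    return rds.create_db_instance(**params)
```

Changing the engine or instance class is a one-line edit to `db_params`, which reflects how Amazon RDS lets you trade performance against cost without reinstalling software.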


Amazon RDS in a virtual private cloud (VPC)

(Slide diagram: within the AWS Cloud, a VPC in one Availability Zone contains a public subnet with an Amazon EC2 instance, reached by users through an internet gateway, and a private subnet containing Amazon RDS.)


You can run an instance by using Amazon Virtual Private Cloud (Amazon VPC). When you use a
virtual private cloud (VPC), you have control over your virtual networking environment.

You can select your own IP address range, create subnets, and configure routing and access
control lists (ACLs). The basic functionality of Amazon RDS is the same whether or not it runs in a
VPC. Usually, the database instance is isolated in a private subnet and is only made directly
accessible to indicated application instances. Subnets in a VPC are associated with a single
Availability Zone, so when you select the subnet, you are also choosing the Availability Zone (or
physical location) for your database instance.
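Because subnet placement fixes the Availability Zone, a common setup is a DB subnet group spanning private subnets in two Availability Zones. The following is a minimal sketch of the request such a setup would use; the group name and subnet IDs are hypothetical placeholders, and the request is only built locally rather than sent to AWS:

```python
# Minimal sketch (hypothetical name and subnet IDs): the parameters a
# boto3 RDS client would need to create a DB subnet group that spans
# private subnets in two Availability Zones.
subnet_group_params = {
    "DBSubnetGroupName": "lab-db-subnet-group",           # hypothetical
    "DBSubnetGroupDescription": "Private subnets for the DB instance",
    "SubnetIds": ["subnet-0aaa1111", "subnet-0bbb2222"],  # one per AZ
}

# In a real environment you would submit the request, for example:
#   import boto3
#   boto3.client("rds").create_db_subnet_group(**subnet_group_params)
print(subnet_group_params["DBSubnetGroupName"])
```

Listing at least two subnets in different Availability Zones is what lets RDS place instances in more than one physical location.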


High availability with Multi-AZ deployment (1 of 2)


Diagram: A VPC in the AWS Cloud spanning Availability Zones 1 and 2. An application on Amazon EC2 in a public subnet connects to the main Amazon RDS instance in a private subnet of Availability Zone 1, which synchronously replicates to a standby instance in a private subnet of Availability Zone 2.

One of the most powerful features of Amazon RDS is the ability to configure your database
instance for high availability with a Multi-AZ deployment. After a Multi-AZ deployment is
configured, Amazon RDS automatically generates a standby copy of the database instance in
another Availability Zone within the same VPC. After seeding the database copy, transactions are
synchronously replicated to the standby copy. Running a database instance in a Multi-AZ
deployment can enhance availability during planned system maintenance, and it can help protect
your databases against database instance failure and Availability Zone disruption.
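Multi-AZ is requested with a single flag on the create request; RDS then provisions the standby and handles the synchronous replication itself. A minimal sketch follows, with hypothetical identifiers and a placeholder password, built locally rather than sent to AWS:

```python
# Minimal sketch (hypothetical identifiers): enabling Multi-AZ is one
# parameter on the create-DB-instance request; Amazon RDS provisions the
# standby copy and the synchronous replication automatically.
create_params = {
    "DBInstanceIdentifier": "lab-db",    # hypothetical
    "Engine": "mysql",
    "DBInstanceClass": "db.t3.micro",    # hypothetical instance class
    "AllocatedStorage": 20,              # GiB
    "MultiAZ": True,                     # request the standby copy
    "MasterUsername": "admin",
    "MasterUserPassword": "REPLACE_ME",  # placeholder; never hardcode secrets
}
# Real call: boto3.client("rds").create_db_instance(**create_params)
print(create_params["MultiAZ"])
```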


High availability with Multi-AZ deployment (2 of 2)


Diagram: The same Multi-AZ deployment, showing the synchronous replication path from the main RDS instance in Availability Zone 1 to the standby instance in Availability Zone 2.

Therefore, if the main database instance fails in a Multi-AZ deployment, Amazon RDS
automatically brings the standby database instance online as the new main instance. The
synchronous replication minimizes the potential for data loss. Because your applications
reference the database by name by using the Amazon RDS Domain Name System (DNS) endpoint,
you don't need to change anything in your application code to use the standby copy for failover.


Amazon RDS read replicas


Features:
• Offers asynchronous replication
• Can be promoted to primary if needed

Functionality:
• Use for read-heavy database workloads
• Offload read queries

Diagram: An application on Amazon EC2 connects to the Amazon RDS primary instance in a private subnet; updates are asynchronously replicated to a read replica instance.

Amazon RDS also supports the creation of read replicas for MySQL, MariaDB, PostgreSQL, and
Amazon Aurora. Updates that are made to the source database instance are asynchronously
copied to the read replica instance. You can reduce the load on your source database instance by
routing read queries from your applications to the read replica. Using read replicas, you can also
scale out beyond the capacity constraints of a single database instance for read-heavy database
workloads. Read replicas can also be promoted to become the primary database instance, but
this requires manual action because of asynchronous replication.

Read replicas can be created in a different Region than the primary database. This feature can
help satisfy disaster recovery requirements or reduce latency by directing reads to a read replica
that is closer to the user.
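A read replica is created from an existing source instance; creating it in another Region only requires calling the API in the destination Region. The sketch below uses hypothetical identifiers and builds the request locally rather than sending it to AWS:

```python
# Minimal sketch (hypothetical identifiers): a read replica is created
# from an existing source DB instance.
replica_params = {
    "DBInstanceIdentifier": "lab-db-replica",   # hypothetical replica name
    "SourceDBInstanceIdentifier": "lab-db",     # the primary instance
}
# Real call (cross-Region replica if the client Region differs from the
# source's Region):
#   boto3.client("rds", region_name="us-west-2") \
#       .create_db_instance_read_replica(**replica_params)
print(replica_params["SourceDBInstanceIdentifier"])
```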


Use cases

• Web and mobile applications: high throughput, massive storage scalability, high availability
• Ecommerce applications: low-cost database, data security, fully managed solution
• Mobile and online games: rapidly grow capacity, automatic scaling, database monitoring

Amazon RDS works well for web and mobile applications that need a database with high
throughput, massive storage scalability, and high availability. Because Amazon RDS does not have
any licensing constraints, it fits the variable usage pattern of these applications. For small and
large ecommerce businesses, Amazon RDS provides a flexible, secure, and low-cost database
solution for online sales and retailing. Mobile and online games require a database platform with
high throughput and availability. Amazon RDS manages the database infrastructure, so game
developers do not need to worry about provisioning, scaling, or monitoring database servers.


When to Use Amazon RDS


Use Amazon RDS when your application requires:
• Complex transactions or complex queries
• A medium to high query or write rate – up to 30,000 IOPS (15,000 reads + 15,000 writes)
• No more than a single worker node or shard
• High durability

Do not use Amazon RDS when your application requires:
• Massive read/write rates (for example, 150,000 writes/second)
• Sharding due to high data size or throughput demands
• Simple GET or PUT requests and queries that a NoSQL database can handle
• Relational database management system (RDBMS) customization

Use Amazon RDS when your application requires:


• Complex transactions or complex queries
• A medium to high query or write rate – up to 30,000 IOPS (15,000 reads + 15,000 writes)
• No more than a single worker node or shard
• High durability

Do not use Amazon RDS when your application requires:


• Massive read/write rates (for example, 150,000 writes per second)
• Sharding due to high data size or throughput demands
• Simple GET or PUT requests and queries that a NoSQL database can handle
• Or, relational database management system (RDBMS) customization

For circumstances when you should not use Amazon RDS, consider either using a NoSQL
database solution (such as DynamoDB) or running your relational database engine on Amazon
EC2 instances instead of Amazon RDS (which will provide you with more options for customizing
your database).


Amazon RDS: Clock-hour billing and database characteristics


Clock-hour billing –
• Resources incur charges when running

Database characteristics –
• Physical capacity of database:
• Engine
• Size
• Memory class


When you begin to estimate the cost of Amazon RDS, you must consider the clock hours of
service time, which are resources that incur charges when they are running (for example, from
the time you launch a database instance until you terminate the instance).

Database characteristics should also be considered. The physical capacity of the database you
choose will affect how much you are charged. Database characteristics vary depending on the
database engine, size, and memory class.


Amazon RDS: DB purchase type and multiple DB instances

DB purchase type –
• On-Demand Instances
• Compute capacity by the hour
• Reserved Instances
• Low, one-time, upfront payment for database instances that are
reserved with a 1-year or 3-year term

Number of DB instances –
• Provision multiple DB instances to handle peak loads


Consider the database purchase type. When you use On-Demand Instances, you pay for compute
capacity for each hour that your database instance runs, with no required minimum
commitments. With Reserved Instances, you can make a low, one-time, upfront payment for each
database instance you want to reserve for a 1-year or 3-year term.
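A short worked comparison makes the purchase-type tradeoff concrete. All prices below are hypothetical placeholders, chosen only to illustrate the arithmetic; check current AWS pricing for real figures:

```python
# Worked example with HYPOTHETICAL prices: compare a year of On-Demand
# usage against a 1-year Reserved Instance for an always-on database.
on_demand_hourly = 0.20    # $/hour, hypothetical
reserved_upfront = 900.00  # one-time payment for a 1-year term, hypothetical
reserved_hourly = 0.08     # discounted hourly rate, hypothetical

hours_per_year = 365 * 24  # 8,760 hours
on_demand_total = on_demand_hourly * hours_per_year                   # $1,752.00
reserved_total = reserved_upfront + reserved_hourly * hours_per_year  # $1,600.80

print(f"On-Demand: ${on_demand_total:,.2f}  Reserved: ${reserved_total:,.2f}")
```

With these illustrative rates, the Reserved Instance comes out cheaper for an instance that runs all year; an instance that runs only a few hours a day could favor On-Demand instead.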

Also, you must consider the number of database instances. With Amazon RDS, you can provision
multiple database instances to handle peak loads.


Amazon RDS: Storage


Provisioned storage –
• No charge: backup storage of up to 100 percent of database storage for an active database
• Charge (GB/month): backup storage for terminated DB instances

Additional storage –
• Charge (GB/month): backup storage in addition to provisioned storage

Consider provisioned storage. There is no additional charge for backup storage of up to 100
percent of your provisioned database storage for an active database instance. After the database
instance is terminated, backup storage is billed per GB, per month.

Also consider the amount of backup storage in addition to the provisioned storage amount,
which is billed per GB, per month.


Amazon RDS: Deployment type and data transfer


Requests –
• The number of input and output requests that are made to the
database

Deployment type—Storage and I/O charges vary, depending on whether you deploy to –
• Single Availability Zone
• Multiple Availability Zones

Data transfer –
• No charge for inbound data transfer
• Tiered charges for outbound data transfer


Also consider the number of input and output requests that are made to the database.

Consider the deployment type. You can deploy your DB instance to a single Availability Zone
(which is analogous to a standalone data center) or to multiple Availability Zones (which is
analogous to a secondary data center for enhanced availability and durability). Storage and I/O
charges vary, depending on the number of Availability Zones that you deploy to.

Finally, consider data transfer. Inbound data transfer is free, and outbound data transfer costs are
tiered.

Depending on the needs of your application, it’s possible to optimize your costs for Amazon RDS
database instances by purchasing Reserved Instances. To purchase Reserved Instances, you make
a low, one-time payment for each instance that you want to reserve. As a result, you receive a
significant discount on the hourly usage charge for that instance.


Recorded demo: Amazon RDS console

Now, take a moment to watch the Amazon RDS console demonstration. The demonstration
shows how to perform the following tasks using the AWS Management Console:
• Configure an Amazon RDS installation running the MySQL database engine.
• Connect to the database using a MySQL client.

You can find this video within the module 8 section of the course with the title: Console Demonstration - RDS. If you are unable to locate this video demonstration, please reach out to your educator for assistance.


Build Your DB Server and Interact with Your DB Using an App

You will now complete Lab 5: Build Your DB Server and Interact with Your DB Using an App.


Lab 5: Scenario
This lab is designed to show you how to use an AWS managed
database instance to solve a need for a relational database.
Diagram: A VPC (10.0.0.0/16) in the AWS Cloud spanning Availability Zones A and B, with public subnet 1 (10.0.0.0/24), public subnet 2 (10.0.2.0/24), private subnet 1 (10.0.1.0/24), and private subnet 2 (10.0.3.0/24). Users on the internet reach a web server in a security group through an internet gateway; a NAT gateway also sits in the public subnet.

This lab is designed to show you how to use an AWS managed database instance to solve a need
for a relational database. With Amazon RDS, you can set up, operate, and scale a relational
database in the cloud. It provides cost-efficient and resizable capacity while managing time-
consuming database administration tasks, which enables you to focus on your applications and
your business. Amazon RDS provides six familiar database engines to choose from: Amazon
Aurora, Oracle, Microsoft SQL Server, PostgreSQL, MySQL, and MariaDB.

Amazon RDS Multi-AZ deployments provide enhanced availability and durability for DB instances,
which make them a good fit for production database workloads. When you provision a Multi-AZ
DB instance, Amazon RDS automatically creates a primary DB instance and synchronously
replicates the data to a standby instance in a different Availability Zone.

After completing this lab, you should be able to:


• Launch an Amazon RDS DB instance with high availability.
• Configure the DB instance to permit connections from your web server.
• Open a web application and interact with your database.


Lab 5: Tasks

• Create a VPC security group.
• Create a DB subnet group.
• Create an Amazon RDS DB instance and interact with your database.

Your goal in completing this lab is to:


• Create a VPC security group.
• Create a DB subnet group.
• Create an Amazon RDS DB instance and interact with your database.


Lab 5: Final product

Diagram: The final architecture. A VPC (10.0.0.0/16) spanning Availability Zones A and B. Public subnet 1 (10.0.1.0/24) and public subnet 2 (10.0.2.0/24) contain the internet gateway, NAT gateway, and a web server in a security group. Private subnet 1 (10.0.3.0/24) holds the primary RDS DB instance and private subnet 2 (10.0.4.0/24) holds the secondary, each in its own security group.

In this lab, you:


• Launched an Amazon RDS DB instance with high availability.
• Configured the DB instance to permit connections from your web server.
• Opened a web application and interacted with your database.


~30 minutes

Begin Lab 5: Build your DB server and interact with your DB using an application

It is now time to start the lab.


Lab debrief: key takeaways

In this lab you:


• Created a VPC security group.
• Created a DB subnet group.
• Created an Amazon RDS DB instance.
• Interacted with your database.


Section 1 key takeaways

• With Amazon RDS, you can set up, operate, and scale relational databases in the cloud.
• Features:
  • Managed service
  • Accessible via the console, AWS Command Line Interface (AWS CLI), or application programming interface (API) calls
  • Scalable (compute and storage)
  • Automated redundancy and backup are available
• Supported database engines: Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle, Microsoft SQL Server

Amazon RDS is a web service that makes it easy to set up, operate, and scale a relational
database in the cloud. It provides cost-efficient and resizable capacity while managing time-
consuming database administration tasks so you can focus on your applications and your
business. Features include that it is a managed service, and that it can be accessed via the
console, AWS Command Line Interface (AWS CLI), or application programming interface (API)
calls. Amazon RDS is scalable for compute and storage, and automated redundancy and backup is
available. Supported database engines include Amazon Aurora, PostgreSQL, MySQL, MariaDB,
Oracle, and Microsoft SQL Server.

Amazon RDS supports demanding database applications. You can choose between two solid state
drive (SSD)-backed storage options: one option is optimized for high-performance Online
Transactional Processing (OLTP) applications, and the other option works well for cost-effective,
general-purpose use.

With Amazon RDS, you can scale your database’s compute and storage resources with no downtime. Amazon RDS runs on the same highly reliable infrastructure that is used by other AWS services. It also enables you to run your database instances in Amazon VPC, which is designed to provide you with control and security.


Section 2: Amazon DynamoDB


Module 8: Databases


Welcome to Section 2: Amazon DynamoDB.


Relational versus non-relational databases


              Relational (SQL)    Non-relational
Data storage  Rows and columns    Key-value, document, graph
Schemas       Fixed               Dynamic
Querying      Uses SQL            Focuses on collection of documents
Scalability   Vertical            Horizontal

Relational example:
ISBN            Title              Author           Format
3111111223439   Withering Depths   Jackson, Mateo   Paperback
3122222223439   Wily Willy         Wang, Xiulan     Ebook

Non-relational example:
{
  ISBN: 3111111223439,
  Title: “Withering Depths”,
  Author: ”Jackson, Mateo”,
  Format: “Paperback”
}

With DynamoDB, this module transitions from relational databases to non-relational databases.
Here is a review of the differences between these two types of databases:

• A relational database (RDB) works with structured data that is organized by tables, records,
and columns. RDBs establish a well-defined relationship between database tables. RDBs use
structured query language (SQL), which is a standard user application that provides a
programming interface for database interaction. Relational databases might have difficulties
scaling out horizontally or working with semistructured data, and might also require many
joins for normalized data.

• A non-relational database is any database that does not follow the relational model that is
provided by traditional relational database management systems (RDBMS). Non-relational
databases have grown in popularity because they were designed to overcome the limitations
of relational databases for handling the demands of variable structured data. Non-relational
databases scale out horizontally, and they can work with unstructured and semistructured
data.

Here is a look at what DynamoDB offers.


What is Amazon DynamoDB?

Fast and flexible NoSQL database service for any scale

• NoSQL database tables
• Virtually unlimited storage
• Items can have differing attributes
• Low-latency queries
• Scalable read/write throughput

DynamoDB is a fast and flexible NoSQL database service for all applications that need consistent,
single-digit-millisecond latency at any scale.

Amazon manages all the underlying data infrastructure for this service and redundantly stores data across multiple facilities in an AWS Region as part of the fault-tolerant architecture. With DynamoDB, you can create tables and add items to them. The system automatically partitions your data and allocates table storage to meet workload requirements. There is no practical limit on the number of items that you can store in a table. For instance, some customers have production tables that contain billions of items.

One of the benefits of a NoSQL database is that items in the same table can have different
attributes. This gives you the flexibility to add attributes as your application evolves. You can store
newer format items side by side with older format items in the same table without needing to
perform schema migrations.
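The point about differing attributes can be sketched with two items destined for the same table; the table name and attributes here are hypothetical, and no call to AWS is made:

```python
# Minimal sketch (hypothetical table and attributes): two items in the same
# DynamoDB table can carry different attribute sets, so a new attribute can
# appear on newer items with no schema migration.
older_item = {"ProductId": "p-100", "Title": "Withering Depths"}
newer_item = {"ProductId": "p-200", "Title": "Wily Willy",
              "Format": "Ebook"}  # attribute present only on newer items

# Real writes would use boto3's Table resource, for example:
#   table = boto3.resource("dynamodb").Table("Products")  # hypothetical table
#   table.put_item(Item=older_item)
#   table.put_item(Item=newer_item)
print(sorted(newer_item.keys()))
```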

As your application becomes more popular and as users continue to interact with it, your storage
can grow with your application's needs. All the data in DynamoDB is stored on solid state drives
(SSDs) and its simple query language enables consistent low-latency query performance. In
addition to scaling storage, DynamoDB also enables you to provision the amount of read or write
throughput that you need for your table. As the number of application users grows, DynamoDB
tables can be scaled to handle the increased numbers of read/write requests with manual
provisioning. Alternatively, you can enable automatic scaling so that DynamoDB monitors the
load on the table and automatically increases or decreases the provisioned throughput.


Some additional key features include global tables that enable you to automatically replicate across
your choice of AWS Regions, encryption at rest, and item Time-to-Live (TTL).


Amazon DynamoDB core components


• Tables, items, and attributes are the core DynamoDB components.
• DynamoDB supports two different kinds of primary keys: partition key, and partition key plus sort key.

The core DynamoDB components are tables, items, and attributes.


• A table is a collection of data.
• Items are a group of attributes that is uniquely identifiable among all the other items.
• Attributes are a fundamental data element, something that does not need to be broken down
any further.

DynamoDB supports two different kinds of primary keys.

The partition key is a simple primary key, composed of one attribute: the partition key itself. The partition key and sort key together form a composite primary key, which is composed of two attributes.

To learn more about how DynamoDB works, see table item attributes at
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.CoreComp
onents.html#HowItWorks.CoreComponents.TablesItemsAttributes.
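The key concepts above can be sketched as a create-table request. The table name is hypothetical and the request is only built locally (a real call would go through boto3); the composite key uses Author as the partition key and Title as the sort key, anticipating the books example discussed later:

```python
# Minimal sketch (hypothetical table name): the request shape for a table
# with a composite primary key — Author (partition key) + Title (sort key).
table_params = {
    "TableName": "Books",  # hypothetical
    "KeySchema": [
        {"AttributeName": "Author", "KeyType": "HASH"},   # partition key
        {"AttributeName": "Title", "KeyType": "RANGE"},   # sort key
    ],
    "AttributeDefinitions": [
        {"AttributeName": "Author", "AttributeType": "S"},
        {"AttributeName": "Title", "AttributeType": "S"},
    ],
    "BillingMode": "PAY_PER_REQUEST",
}
# Real call: boto3.client("dynamodb").create_table(**table_params)
print(len(table_params["KeySchema"]))
```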


Partitioning


As data grows, table data is partitioned and indexed by the primary key.

You can retrieve data from a DynamoDB table in two different ways:
• In the first method, the query operation takes advantage of partitioning to effectively locate
items by using the primary key.
• The second method is via a scan, which enables you to locate items in the table by matching
conditions on non-key attributes. The second method gives you the flexibility to locate items
by other attributes. However, the operation is less efficient because DynamoDB will scan
through all the items in the table to find the ones that match your criteria.

For accessibility: Partitioning allows large tables to be scanned and queried quickly. As data grows, the table is partitioned by key. Use QUERY by key to locate items efficiently, or SCAN to find items by any attribute. End of accessibility description.
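The difference between the two retrieval methods shows up directly in the request parameters. The sketch below builds both request shapes locally with hypothetical table and attribute values; real calls would go through boto3's low-level DynamoDB client:

```python
# Minimal sketch (hypothetical values): a query targets a single partition
# via the primary key, while a scan reads every item and filters on
# non-key attributes afterward.
query_kwargs = {  # efficient: narrows to one partition key value
    "KeyConditionExpression": "Author = :a",
    "ExpressionAttributeValues": {":a": {"S": "Jackson, Mateo"}},
}
scan_kwargs = {   # flexible but examines the whole table
    "FilterExpression": "#f = :fmt",
    "ExpressionAttributeNames": {"#f": "Format"},  # Format is a reserved word
    "ExpressionAttributeValues": {":fmt": {"S": "Paperback"}},
}
# Real calls:
#   boto3.client("dynamodb").query(TableName="Books", **query_kwargs)
#   boto3.client("dynamodb").scan(TableName="Books", **scan_kwargs)
print(sorted(query_kwargs.keys()))
```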


Items in a table must have a key


To take full advantage of query operations and DynamoDB, it's important to think about the key
that you use to uniquely identify items in the DynamoDB table. You can set up a simple primary
key that is based on a single attribute of the data values with a uniform distribution, such as the
Globally Unique Identifier (GUID) or other random identifiers.

For example, if you wanted to model a table with products, you could use an attribute like the product ID. Alternatively, you can specify a compound key, which is composed of a partition key and a sort key. In this example, if you had a table with books, you might use the combination of author and title to uniquely identify table items. This method could be useful if you expect to frequently look up books by author because you could then use query.

For accessibility: The two different types of keys. A single key means the data is identified by an
item in the data that uniquely identifies each record. A compound key is made up of a partition
key and a second key that can be used for sorting data. End of accessibility description.


Section 2 key takeaways

Amazon DynamoDB:
• Runs exclusively on SSDs.
• Supports document and key-value store models.
• Replicates your tables automatically across your choice of AWS Regions.
• Works well for mobile, web, gaming, adtech, and Internet of Things (IoT) applications.
• Is accessible via the console, the AWS CLI, and API calls.
• Provides consistent, single-digit millisecond latency at any scale.
• Has no limits on table size or throughput.

DynamoDB runs exclusively on SSDs, and it supports document and key-value store models.

DynamoDB works well for mobile, web, gaming, ad tech, and Internet of Things (IoT) applications.
It’s accessible via the console, the AWS CLI, and API calls.

The ability to scale your tables in terms of both storage and provision throughput makes
DynamoDB a good fit for structured data from the web, mobile, and IoT applications. For
instance, you might have a large number of clients that continuously generate data and make
large numbers of requests per second. In this case, the throughput scaling of DynamoDB enables
consistent performance for your clients. DynamoDB is also used in latency-sensitive applications.
The predictable query performance—even in large tables—makes it useful for cases where
variable latency could cause significant impact to the user experience or to business goals, such
as adtech or gaming.

The DynamoDB Global Tables feature reduces the work of replicating data between Regions and
resolving update conflicts. It replicates your DynamoDB tables automatically across your choice of
AWS Regions. Global Tables can help applications stay available and performant for business
continuity.


Recorded demo: Amazon DynamoDB console

Now, take a moment to watch the Amazon DynamoDB demonstration. The recording runs a little
over 2 minutes, and it reinforces many of the concepts that were discussed in this section of the
module.

The demonstration shows how to create a table running in Amazon DynamoDB by using the AWS
Management Console. It also demonstrates how to interact with the table using the AWS
Command Line Interface. The demonstration shows how you can query the table, and add data to
the table.

You can find this video within the module 8 section of the course with the title: Console Demonstration - DynamoDB. If you are unable to locate this video demonstration, please reach out to your educator for assistance.


Amazon DynamoDB demonstration


Review the demonstration: Amazon DynamoDB console demo.

You can access this recorded demonstration in the learning management system.


Section 3: Amazon Redshift


Module 8: Databases


Welcome to Section 3: Amazon Redshift.


Amazon Redshift


Amazon Redshift is a fast, fully managed data warehouse that makes it simple and cost-effective
to analyze all your data by using standard SQL and your existing business intelligence (BI) tools.
Here is a look at Amazon Redshift and how you can use it for analytic applications.


Introduction to Amazon Redshift


Analytics is important for businesses today, but building a data warehouse is complex and
expensive. Data warehouses can take months and significant financial resources to set up.

Amazon Redshift is a fast and powerful, fully managed data warehouse that is simple and cost-
effective to set up, use, and scale. It enables you to run complex analytic queries against
petabytes of structured data by using sophisticated query optimization, columnar storage on
high-performance local disks, and massively parallel data processing. Most results come back in
seconds.

You will next review a slightly more detailed exploration of key Amazon Redshift features and
some common use cases.


Parallel processing architecture

Diagram: SQL clients and BI tools connect to the Amazon Redshift leader node, which coordinates the dense compute nodes (each with virtual cores, RAM, and local disk). The cluster exchanges data with Amazon S3 and Amazon DynamoDB.

The leader node manages communications with client programs and all communication with
compute nodes. It parses and develops plans to carry out database operations—specifically, the
series of steps that are needed to obtain results for complex queries. The leader node compiles
code for individual elements of the plan and assigns the code to individual compute nodes. The
compute nodes run the compiled code and send intermediate results back to the leader node for
final aggregation.

Like other AWS services, you only pay for what you use. You can get started for as little as 25
cents per hour and, at scale, Amazon Redshift can deliver storage and processing for
approximately $1,000 dollars per terabyte per year (with 3-Year Partial Upfront Reserved Instance
pricing).

The Amazon Redshift Spectrum feature enables you to run queries against exabytes of data
directly in Amazon S3.


Automation and scaling

Manage

Monitor

Scale


It is straightforward to automate most of the common administrative tasks to manage, monitor, and scale your Amazon Redshift cluster—which enables you to focus on your data and your business.

Scalability is intrinsic in Amazon Redshift. Your cluster can be scaled up and down as your needs
change with a few clicks in the console.

Security is the highest priority for AWS. With Amazon Redshift, security is built in, and it is
designed to provide strong encryption of your data both at rest and in transit.


Compatibility


Finally, Amazon Redshift is compatible with the tools that you already know and use. Amazon
Redshift supports standard SQL. It also provides high-performance Java Database Connectivity
(JDBC) and Open Database Connectivity (ODBC) connectors, which enable you to use the SQL
clients and BI tools of your choice.
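Because Amazon Redshift speaks standard SQL through JDBC and ODBC drivers, client code looks like any other SQL client. The sketch below uses Python's built-in sqlite3 module purely as a stand-in DB-API driver so it runs anywhere; against a real cluster you would instead connect through an ODBC/JDBC or PostgreSQL-compatible driver with your cluster's endpoint and credentials (all names here are illustrative):

```python
import sqlite3  # stand-in driver; a Redshift client would use ODBC/JDBC instead

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Standard SQL works unchanged across DB-API style clients.
cur.execute("CREATE TABLE sales (region TEXT, amount REAL)")
cur.executemany("INSERT INTO sales VALUES (?, ?)",
                [("east", 120.0), ("west", 80.0), ("east", 50.0)])
cur.execute("SELECT region, SUM(amount) FROM sales "
            "GROUP BY region ORDER BY region")
rows = cur.fetchall()
print(rows)  # [('east', 170.0), ('west', 80.0)]
conn.close()
```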

Next, you will review some common Amazon Redshift use cases.


Amazon Redshift use cases (1 of 2)

• Enterprise data warehouse (EDW)
o Migrate at a pace that customers are comfortable with
o Experiment without large upfront cost or commitment
o Respond faster to business needs

• Big data
o Low price point for small customers
o Managed service for ease of deployment and maintenance
o Focus more on data and less on database management


This slide discusses some Amazon Redshift use cases.

Many customers migrate their traditional enterprise data warehouses to Amazon Redshift with
the primary goal of agility. Customers can start at whatever scale they want and experiment with
their data without needing to rely on complicated processes with their IT departments to procure
and prepare their software.

Big data customers have one thing in common: massive amounts of data that stretch their
existing systems to a breaking point. Smaller customers might not have the resources to procure
the hardware and expertise that is needed to run these systems. With Amazon Redshift, smaller
customers can quickly set up and use a data warehouse at a comparatively low price point.

As a managed service, Amazon Redshift handles many of the deployment and ongoing
maintenance tasks that often require a database administrator. This enables customers to focus
on querying and analyzing their data.


Amazon Redshift use cases (2 of 2)

• Software as a service (SaaS)
o Scale the data warehouse capacity as demand grows
o Add analytic functionality to applications
o Reduce hardware and software costs


Software as a service (SaaS) customers can take advantage of the scalable, easy-to-manage features that Amazon Redshift provides. Some customers use Amazon Redshift to provide analytic capabilities to their applications. Some users deploy a cluster per customer, and use tagging to simplify and manage their service level agreements (SLAs) and billing. Amazon Redshift can help you reduce hardware and software costs.


Section 3 key takeaways

Amazon Redshift features:
• Fast, fully managed data warehouse service
• Easily scale with no downtime
• Columnar storage and parallel processing architectures
• Automatically and continuously monitors cluster
• Encryption is built in


In summary, Amazon Redshift is a fast, fully managed data warehouse service. As a business
grows, you can easily scale with no downtime by adding more nodes. Amazon Redshift
automatically adds the nodes to your cluster and redistributes the data for maximum
performance.

Amazon Redshift is designed to consistently deliver high performance. Amazon Redshift uses
columnar storage and a massively parallel processing architecture. These features parallelize and
distribute data and queries across multiple nodes. Amazon Redshift also automatically monitors
your cluster and backs up your data so that you can easily restore if needed. Encryption is built
in—you only need to enable it.
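Columnar storage is one reason analytic scans are fast: an aggregate over one column touches only that column's values rather than every field of every row. A language-level illustration of the two layouts (a teaching toy, not Redshift internals):

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"id": 1, "region": "east", "amount": 120.0},
    {"id": 2, "region": "west", "amount": 80.0},
    {"id": 3, "region": "east", "amount": 50.0},
]

# Column-oriented layout: each column is stored contiguously.
columns = {
    "id": [1, 2, 3],
    "region": ["east", "west", "east"],
    "amount": [120.0, 80.0, 50.0],
}

# Row store: the scan must visit every record to pick out one field.
row_total = sum(r["amount"] for r in rows)

# Column store: the scan reads a single contiguous list.
col_total = sum(columns["amount"])

print(row_total, col_total)  # 250.0 250.0
```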

To learn more about Amazon Redshift, see https://aws.amazon.com/redshift/.


Section 4: Amazon Aurora


Module 8: Databases


Introducing Section 4: Amazon Aurora.


Amazon Aurora

• Enterprise-class relational database
• Compatible with MySQL or PostgreSQL
• Automates time-consuming tasks (such as provisioning, patching, backup, recovery, failure detection, and repair)


Amazon Aurora is a MySQL- and PostgreSQL-compatible relational database that is built for the
cloud. It combines the performance and availability of high-end commercial databases with the
simplicity and cost-effectiveness of open-source databases. Using Amazon Aurora can reduce
your database costs while improving the reliability and availability of the database. As a fully
managed service, Aurora is designed to automate time-consuming tasks like provisioning,
patching, backup, recovery, failure detection, and repair.


Amazon Aurora service benefits


This slide covers some of the benefits of Amazon Aurora. It is highly available and it offers a fast,
distributed storage subsystem. Amazon Aurora is straightforward to set up and uses SQL queries.
It is designed to have drop-in compatibility with MySQL and PostgreSQL database engines so that
you can use most of your existing database tools with little or no change.

Amazon Aurora is a pay-as-you-go service, which means that you only pay for the services and
features that you use. It’s a managed service that integrates with features such as AWS Database
Migration Service (AWS DMS) and the AWS Schema Conversion Tool. These features are designed
to help you move your dataset into Amazon Aurora.


High availability


Why might you use Amazon Aurora over other options, like SQL with Amazon RDS? Most of that
decision involves the high availability and resilient design that Amazon Aurora offers.

Amazon Aurora is designed to be highly available: it stores multiple copies of your data across multiple Availability Zones with continuous backups to Amazon S3. Amazon Aurora can use up to 15 read replicas to reduce the possibility of losing your data. Additionally, Amazon Aurora is designed for instant crash recovery if your primary database becomes unhealthy.
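One simplified way to picture this high-availability behavior: if the primary becomes unhealthy, a read replica can be promoted to take its place. The sketch below is a toy state machine for that idea only, not the Aurora implementation:

```python
class ToyCluster:
    """Toy model of a primary database with read replicas (illustration only)."""

    MAX_REPLICAS = 15  # Aurora supports up to 15 read replicas

    def __init__(self, replicas):
        if len(replicas) > self.MAX_REPLICAS:
            raise ValueError("too many replicas")
        self.primary = "primary"
        self.replicas = list(replicas)

    def fail_over(self):
        """Promote the first available replica when the primary fails."""
        self.primary = self.replicas.pop(0)
        return self.primary

cluster = ToyCluster(["replica-1", "replica-2"])
print(cluster.fail_over())  # replica-1
```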


Resilient design


After a database crash, Amazon Aurora does not need to replay the redo log from the last
database checkpoint. Instead, it performs this on every read operation. This reduces the restart
time after a database crash to less than 60 seconds in most cases.

With Amazon Aurora, the buffer cache is moved out of the database process, which makes it
available immediately at restart. This reduces the need for you to throttle access until the cache
is repopulated to avoid brownouts.


Section 4 key takeaways

Amazon Aurora features:
• High performance and scalability
• High availability and durability
• Multiple levels of security
• Compatible with MySQL and PostgreSQL
• Fully managed


In summary, Amazon Aurora is a highly available, performant, and cost-effective managed relational database.

Aurora offers a distributed, high-performance storage subsystem. Using Amazon Aurora can
reduce your database costs while improving the reliability of the database.

Aurora is also designed to be highly available. It has fault-tolerant and self-healing storage built
for the cloud. Aurora replicates multiple copies of your data across multiple Availability Zones,
and it continuously backs up your data to Amazon S3.

Multiple levels of security are available, including network isolation by using Amazon VPC;
encryption at rest by using keys that you create and control through AWS Key Management
Service (AWS KMS); and encryption of data in transit by using Secure Sockets Layer (SSL).

The Amazon Aurora database engine is compatible with existing MySQL and PostgreSQL open
source databases, and adds compatibility for new releases regularly.

Finally, Amazon Aurora is fully managed by Amazon RDS. Aurora automates database
management tasks, such as hardware provisioning, software patching, setup, configuration, or
backups.

To learn more about Amazon Aurora, see https://aws.amazon.com/rds/aurora/.


The right tool for the right job

What are my requirements?
• Enterprise-class relational database: Amazon RDS
• Fast and flexible NoSQL database service for any scale: Amazon DynamoDB
• Operating system access or application features that are not supported by AWS database services: Databases on Amazon EC2
• Specific case-driven requirements (machine learning, data warehouse, graphs): AWS purpose-built database services

As you saw in this module, the cloud continues to drive down the cost of storage and compute. A
new generation of applications has emerged, which created a new set of requirements for
databases. These applications need databases to store terabytes to petabytes of new types of
data, provide access to the data with millisecond latency, process millions of requests per second,
and scale to support millions of users anywhere in the world. To support these requirements, you
need both relational and non-relational databases that are purpose-built to handle the specific
needs of your applications. AWS offers a broad range of databases that are built for your specific
application use cases.
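The requirements-to-service pairings above can be read as a simple lookup. A toy sketch of that mapping (the categories mirror the module's table; this is not an official decision API, and real selections weigh many more factors):

```python
# Requirement categories from the module, mapped to suggested AWS services.
SERVICE_FOR_REQUIREMENT = {
    "enterprise-class relational database": "Amazon RDS",
    "NoSQL database at any scale": "Amazon DynamoDB",
    "OS access or unsupported database features": "Databases on Amazon EC2",
    "case-driven (ML, data warehouse, graphs)": "AWS purpose-built database services",
}

def suggest_service(requirement):
    """Return the suggested service, or a prompt to review requirements."""
    return SERVICE_FOR_REQUIREMENT.get(requirement, "review your requirements")

print(suggest_service("NoSQL database at any scale"))  # Amazon DynamoDB
```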


Database case study activity (1 of 3)


Case 1: A data protection and management company that provides services to enterprises. They must provide database
services for over 55 petabytes of data. They have two types of data that require a database solution. First, they need a
relational database store for configuration data. Second, they need a store for unstructured metadata to support a de-
duplication service. After the data is de-duplicated, it is stored in Amazon S3 for quick retrieval, and eventually moved to
Amazon S3 Glacier for long-term storage. The following diagram illustrates their architecture.

[Architecture diagram: a corporate data center connects to the AWS Cloud, where Amazon EC2 hosts a configuration database and a metadata database (both marked ???); de-duplicated data is stored in Amazon Simple Storage Service (Amazon S3) and eventually moved to Amazon S3 Glacier.]


In this activity, you will review one of three business scenarios that were taken from actual AWS
customers. Break into groups of four or five.

Review the assigned case study. Create a presentation that describes the best database solution
for the organization that is described in your group’s case. Your presentation should include the
key factors that you considered when you selected the database technology, in addition to any
factors that could change your recommendation.


Database case study activity (2 of 3)


Case 2: A commercial shipping company that uses an on-premises legacy data management system. They
must migrate to a serverless ecosystem while they continue to use their existing database system, which is
based on Oracle. They are also in the process of decomposing their highly structured relational data into
semistructured data. The following diagram illustrates their architecture.

[Architecture diagram: a corporate data center with an Oracle database connects to a database (marked ???) in the AWS Cloud; AWS AppSync and AWS Lambda functions front the database, with Amazon Simple Notification Service (Amazon SNS) for messaging.]


Review the assigned case study. Create a presentation that describes the best database solution
for the organization that is described in your group’s case. Your presentation should include the
key factors that you considered when you selected the database technology, in addition to any
factors that could change your recommendation.


Database case study activity (3 of 3)


Case 3: An online payment processing company that processes over 1 million transactions per day. They must
provide services to ecommerce customers who offer flash sales (sales that offer greatly reduced prices for a
limited time), where demand can increase by 30 times in a short time period. They use IAM and AWS KMS to
authenticate transactions with financial institutions. They need high throughput for these peak loads. The
following diagram illustrates their architecture.

[Architecture diagram: internet traffic from banks reaches Elastic Load Balancing and Amazon EC2 instances in the AWS Cloud; the application uses the AWS SDK, AWS Identity and Access Management (IAM), and AWS Key Management Service (AWS KMS), and a database (marked ???) with read replicas.]


Review the assigned case study. Create a presentation that describes the best database solution
for the organization that is described in your group’s case. Your presentation should include the
key factors that you considered when you selected the database technology, in addition to any
factors that could change your recommendation.


Module wrap-up
Module 8: Databases


It’s now time to review the module, and wrap up with a knowledge check and discussion of a
practice certification exam question.


Module summary
In summary, in this module, you learned how to:
• Explain Amazon Relational Database Service (Amazon RDS)

• Identify the functionality in Amazon RDS

• Explain Amazon DynamoDB

• Identify the functionality in Amazon DynamoDB

• Explain Amazon Redshift

• Explain Amazon Aurora

• Perform tasks in an RDS database, such as launching, configuring, and interacting


In summary, in this module, you learned how to:


• Explain Amazon Relational Database Service (Amazon RDS)
• Identify the functionality in Amazon RDS
• Explain Amazon DynamoDB
• Identify the functionality in Amazon DynamoDB
• Explain Amazon Redshift
• Explain Amazon Aurora
• Perform tasks in an RDS database, such as launching, configuring, and interacting


Complete the knowledge check


The instructor might choose to lead a conversation about the key takeaways from the lab after
you complete it.


Sample exam question


Which of the following is a fully-managed NoSQL database service?

Choice Response

A Amazon Relational Database Service (Amazon RDS)

B Amazon DynamoDB

C Amazon Aurora

D Amazon Redshift


Look at the answer choices and rule them out based on the keywords.


Sample exam question answer


Which of the following is a fully-managed NoSQL database service?

The correct answer is B.


The keywords in the question are “NoSQL database service”.


The following are the keywords to recognize: “NoSQL database service”.

The correct answer is B.


Additional resources
• AWS Database page: https://aws.amazon.com/products/databases/
• Amazon RDS page: https://aws.amazon.com/rds/
• Overview of Amazon database services:
https://docs.aws.amazon.com/whitepapers/latest/aws-
overview/database.html
• Getting started with AWS databases:
https://aws.amazon.com/products/databases/learn/


If you want to learn more about the topics covered in this module, you might find the following
additional resources helpful:
• AWS Database page: https://aws.amazon.com/products/databases/
• Amazon RDS page: https://aws.amazon.com/rds/
• Overview of Amazon database services:
https://docs.aws.amazon.com/whitepapers/latest/aws-overview/database.html
• Getting started with AWS databases: https://aws.amazon.com/products/databases/learn/


Thank you

Corrections, feedback, or other questions?


Contact us at https://support.aws.amazon.com/#/contacts/aws-academy.
All trademarks are the property of their owners.


Thank you for completing this module.

AWS Academy Cloud Foundations
Module 06 Student Guide
Version 2.0.12
100-ACCLFO-20-EN-SG
© 2022 Amazon Web Services, Inc. or its affiliates. All rights reserved.

This work may not be reproduced or redistributed, in whole or in part, without prior written permission from Amazon Web Services, Inc. Commercial copying, lending, or selling is prohibited.

All trademarks are the property of their owners.


AWS Training and Certification AWS Academy Cloud Foundations

Contents
Module 6: Compute 4

© 2022 Amazon Web Services, Inc. or its affiliates. All rights reserved. 3
AWS Training and Certification Module 6: Compute

Module 6: Compute
AWS Academy Cloud Foundations


Welcome to Module 6: Compute


Module overview

Topics
• Compute services overview
• Amazon EC2
• Amazon EC2 cost optimization
• Container services
• Introduction to AWS Lambda
• Introduction to AWS Elastic Beanstalk

Activities
• Amazon EC2 versus Managed Service
• Hands-on with AWS Lambda
• Hands-on with AWS Elastic Beanstalk

Demo
• Recorded demonstration of Amazon EC2

Lab
• Introduction to Amazon EC2

Knowledge check


This module will address the following topics:


• Compute services overview
• Amazon EC2
• Amazon EC2 cost optimization
• Container services
• Introduction to AWS Lambda
• Introduction to AWS Elastic Beanstalk

Section 2 includes a recorded Amazon EC2 demonstration. The end of this same section includes
a hands-on lab, where you will practice launching an EC2 instance by using the AWS
Management Console. There is also an activity in this section that has you compare the
advantages and disadvantages of running a database deployment on Amazon EC2, versus running
it on Amazon Relational Database Service (RDS).

Section 5 includes a hands-on AWS Lambda activity and section 6 includes a hands-on Elastic
Beanstalk activity.

Finally, you will be asked to complete a knowledge check that will test your understanding of the
key concepts that are covered in this module.


Module objectives
After completing this module, you should be able to:
• Provide an overview of different AWS compute services in the cloud
• Demonstrate why to use Amazon Elastic Compute Cloud (Amazon EC2)
• Identify the functionality in the EC2 console
• Perform basic functions in Amazon EC2 to build a virtual computing
environment
• Identify Amazon EC2 cost optimization elements
• Demonstrate when to use AWS Elastic Beanstalk
• Demonstrate when to use AWS Lambda
• Identify how to run containerized applications in a cluster of managed servers


After completing this module, you should be able to:


• Provide an overview of different AWS compute services in the cloud
• Demonstrate why to use Amazon Elastic Compute Cloud (Amazon EC2)
• Identify the functionality in the EC2 console
• Perform basic functions in EC2 to build a virtual computing environment
• Identify EC2 cost optimization elements
• Demonstrate when to use AWS Elastic Beanstalk
• Demonstrate when to use AWS Lambda
• Identify how to run containerized applications in a cluster of managed servers


Section 1: Compute services overview
Module 6: Compute


Introducing Section 1: Compute services overview.


AWS compute services

Amazon Web Services (AWS) offers many compute services. This module will discuss the highlighted services.

• Amazon EC2
• Amazon EC2 Auto Scaling
• Amazon Elastic Container Registry (Amazon ECR)
• Amazon Elastic Container Service (Amazon ECS)
• VMware Cloud on AWS
• AWS Elastic Beanstalk
• AWS Lambda
• Amazon Elastic Kubernetes Service (Amazon EKS)
• Amazon Lightsail
• AWS Batch
• AWS Fargate
• AWS Outposts
• AWS Serverless Application Repository


Amazon Web Services (AWS) offers many compute services. Here is a brief summary of what
each compute service offers:
• Amazon Elastic Compute Cloud (Amazon EC2) provides resizable virtual machines.
• Amazon EC2 Auto Scaling supports application availability by allowing you to define conditions
that will automatically launch or terminate EC2 instances.
• Amazon Elastic Container Registry (Amazon ECR) is used to store and retrieve Docker images.
• Amazon Elastic Container Service (Amazon ECS) is a container orchestration service that
supports Docker.
• VMware Cloud on AWS enables you to provision a hybrid cloud without custom hardware.
• AWS Elastic Beanstalk provides a simple way to run and manage web applications.
• AWS Lambda is a serverless compute solution. You pay only for the compute time that you
use.
• Amazon Elastic Kubernetes Service (Amazon EKS) enables you to run managed Kubernetes on
AWS.
• Amazon Lightsail provides a simple-to-use service for building an application or website.
• AWS Batch provides a tool for running batch jobs at any scale.
• AWS Fargate provides a way to run containers that reduce the need for you to manage servers
or clusters.
• AWS Outposts provides a way to run select AWS services in your on-premises data center.
• AWS Serverless Application Repository provides a way to discover, deploy, and publish
serverless applications.

This module will discuss details of the services that are highlighted on the slide.


Categorizing compute services

• Amazon EC2
o Key concepts: infrastructure as a service (IaaS); instance-based; virtual machines
o Characteristics: provision virtual machines that you can manage as you choose
o Ease of use: a familiar concept to many IT professionals

• AWS Lambda
o Key concepts: serverless computing; function-based; low-cost; use when possible (architect for the cloud)
o Characteristics: write and deploy code that runs on a schedule or that can be triggered by events
o Ease of use: a relatively new concept for many IT staff members, but easy to use after you learn how

• Amazon ECS, Amazon EKS, AWS Fargate, Amazon ECR
o Key concepts: container-based computing; instance-based
o Characteristics: spin up and run jobs more quickly
o Ease of use: AWS Fargate reduces administrative overhead, but you can use options that give you more control

• AWS Elastic Beanstalk
o Key concepts: platform as a service (PaaS); for web applications
o Characteristics: focus on your code (building your application); can easily tie into other services—databases, Domain Name System (DNS), etc.
o Ease of use: fast and easy to get started


You can think of each AWS compute service as belonging to one of four broad categories: virtual
machines (VMs) that provide infrastructure as a service (IaaS), serverless, container-based, and
platform as a service (PaaS).

Amazon EC2 provides virtual machines, and you can think of it as infrastructure as a service
(IaaS). IaaS services provide flexibility and leave many of the server management responsibilities
to you. You choose the operating system, and you also choose the size and resource capabilities
of the servers that you launch. For IT professionals who have experience using on-premises
computing, virtual machines are a familiar concept. Amazon EC2 was one of the first AWS
services, and it remains one of the most popular services.

AWS Lambda is a zero-administration compute platform. AWS Lambda enables you to run code
without provisioning or managing servers. You pay only for the compute time that is consumed.
This serverless technology concept is relatively new to many IT professionals. However, it is
becoming more popular because it supports cloud-native architectures, which enable massive
scalability at a lower cost than running servers 24/7 to support the same workloads.
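A Lambda function is just code plus a handler entry point that the service invokes with an event. A minimal Python handler sketch (the event shape here is hypothetical); because it is plain code, it can be tested locally by calling the handler directly, with no AWS involved:

```python
import json

def lambda_handler(event, context):
    """Minimal handler: return a greeting for the supplied name.

    AWS Lambda would invoke this with the triggering event and a
    runtime context object; neither is needed to test it locally.
    """
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }

# Local invocation for testing:
print(lambda_handler({"name": "Ana"}, None))
```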

Container-based services—including Amazon Elastic Container Service, Amazon Elastic


Kubernetes Service, AWS Fargate, and Amazon Elastic Container Registry—enable you to run
multiple workloads on a single operating system (OS). Containers spin up more quickly than
virtual machines, thus offering responsiveness. Container-based solutions continue to grow in
popularity.

Finally, AWS Elastic Beanstalk provides a platform as a service (PaaS). It facilitates the quick deployment of applications that you create by providing all the application services that you need. AWS manages the OS, the application server, and the other infrastructure components so that you can focus on developing your application code.

Choosing the optimal compute service


• The optimal compute service or services that you use will depend on
your use case
• Some aspects to consider –
• What is your application design?
• What are your usage patterns?
• Which configuration settings will you want to manage?
• Selecting the wrong compute solution for an architecture can lead to
lower performance efficiency
• A good starting place—Understand the available compute options


AWS offers many compute services because different use cases benefit from different compute
environments. The optimal compute service or services that you use will depend on your use
case.

Often, the compute architecture that you use is determined by legacy code. However, that does
not mean that you cannot evolve the architecture to take advantage of proven cloud-native
designs.

Best practices include:


• Evaluate the available compute options
• Understand the available compute configuration options
• Collect compute-related metrics
• Use the available elasticity of resources
• Re-evaluate compute needs based on metrics

Sometimes, a customer will start with one compute solution and decide to change the design
based on their analysis of metrics. If you are interested in seeing an example of how a customer
modified their choice of compute services for a particular use case, view this Inventory Tracking
solution video at https://www.youtube.com/watch?v=zr3Kib0i-
OQ&feature=youtu.be&did=ta_card&trk=ta_card.


Section 2: Amazon EC2


Module 6: Compute


Introducing Section 2: Amazon EC2.


Amazon Elastic Compute Cloud (Amazon EC2)

Example uses of Amazon EC2 instances:
✓ Application server
✓ Web server
✓ Database server
✓ Game server
✓ Mail server
✓ Media server
✓ Catalog server
✓ File server
✓ Computing server
✓ Proxy server

(Photos by Taylor Vick on Unsplash and panumas nikhomkhai from Pexels)


Running servers on-premises is an expensive undertaking. Hardware must be procured, and this
procurement can be based on project plans instead of the reality of how the servers are used.
Data centers are expensive to build, staff, and maintain. Organizations also need to permanently
provision a sufficient amount of hardware to handle traffic spikes and peak workloads. After
traditional on-premises deployments are built, server capacity might be unused and idle for a
significant portion of the time that the servers are running, which is wasteful.

Amazon Elastic Compute Cloud (Amazon EC2) provides virtual machines where you can host the
same kinds of applications that you might run on a traditional on-premises server. It provides
secure, resizable compute capacity in the cloud. EC2 instances can support a variety of
workloads. Common uses for EC2 instances include, but are not limited to:
• Application servers
• Web servers
• Database servers
• Game servers
• Mail servers
• Media servers
• Catalog servers
• File servers
• Computing servers
• Proxy servers


Amazon EC2 overview


• Amazon Elastic Compute Cloud (Amazon EC2)
  • Provides virtual machines—referred to as EC2 instances—in the cloud.
  • Gives you full control over the guest operating system (Windows or Linux) on each instance.
  • You can launch instances of any size into an Availability Zone anywhere in the world.
  • Launch instances from Amazon Machine Images (AMIs).
  • Launch instances with a few clicks or a line of code, and they are ready in minutes.
  • You can control traffic to and from instances.


The EC2 in Amazon EC2 stands for Elastic Compute Cloud:


• Elastic refers to the fact that you can easily increase or decrease the number of servers you run
to support an application automatically, and you can also increase or decrease the size of
existing servers.
• Compute refers to the reason why most users run servers in the first place, which is to host
running applications or process data—actions that require compute resources, including
processing power (CPU) and memory (RAM).
• Cloud refers to the fact that the EC2 instances that you run are hosted in the cloud.

Amazon EC2 provides virtual machines in the cloud and gives you full administrative control over
the Windows or Linux operating system that runs on the instance. Most server operating systems
are supported, including Windows Server 2008, 2012, 2016, and 2019; Red Hat; SUSE; Ubuntu; and
Amazon Linux.

An operating system that runs on a virtual machine is often called a guest operating system to
distinguish it from the host operating system. The host operating system is directly installed on
any server hardware that hosts one or more virtual machines.

With Amazon EC2, you can launch any number of instances of any size into any Availability Zone
anywhere in the world in a matter of minutes. Instances launch from Amazon Machine Images
(AMIs), which are effectively virtual machine templates. AMIs are discussed in more detail later
in this module.

You can control traffic to and from instances by using security groups. Also, because the servers
run in the AWS Cloud, you can build solutions that make use of multiple AWS services.
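The launch workflow that the rest of this section walks through in the console can also be driven from the AWS CLI. The following is a minimal sketch; every ID and name shown is a placeholder, not a value from this course, and the commands require configured AWS credentials.

```shell
# Launch a single t3.micro instance from a placeholder AMI ID.
# Substitute your own AMI ID, key pair name, and security group ID.
aws ec2 run-instances \
    --image-id ami-0abcdef1234567890 \
    --instance-type t3.micro \
    --key-name my-key-pair \
    --security-group-ids sg-0123456789abcdef0 \
    --count 1

# List launched instances and their current states.
aws ec2 describe-instances \
    --query 'Reservations[].Instances[].[InstanceId,State.Name]'
```

The same nine decisions covered by the Launch Instance Wizard appear here as command options.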


Launching an Amazon EC2 instance

➢ This section of the module walks through nine key decisions to make when you create an EC2 instance by using the AWS Management Console Launch Instance Wizard.
➢ Along the way, essential Amazon EC2 concepts will be explored.


The first time you launch an Amazon EC2 instance, you will likely use the AWS Management
Console Launch Instance Wizard. You will have the opportunity to experience using the Launch
Wizard in the lab that is in this module.

The Launch Instance Wizard makes it easy to launch an instance. For example, if you choose to
accept all the default settings, you can skip most of the steps that are provided by the wizard and
launch an EC2 instance in as few as six clicks. An example of this process is shown in the
demonstration at the end of this section.

However, for most deployments you will want to modify the default settings so that the servers
you launch are deployed in a way that matches your specific needs.

The next series of slides introduces you to the essential choices that you must make when you
launch an instance. The slides cover essential concepts that are good to know when you make
these choices. These concepts are described to help you understand the options that are
available, and the effects of the decisions that you will make.


1. Select an AMI

Choices made using the Launch Instance Wizard:
1. AMI
2. Instance Type
3. Network settings
4. IAM role
5. User data
6. Storage options
7. Tags
8. Security group
9. Key pair

• Amazon Machine Image (AMI)
  • Is a template that is used to create an EC2 instance (which is a virtual machine, or VM, that runs in the AWS Cloud)
  • Contains a Windows or Linux operating system
  • Often also has some software pre-installed
• AMI choices:
  • Quick Start – Linux and Windows AMIs that are provided by AWS
  • My AMIs – Any AMIs that you created
  • AWS Marketplace – Pre-configured templates from third parties
  • Community AMIs – AMIs shared by others; use at your own risk


An Amazon Machine Image (AMI) provides information that is required to launch an EC2
instance. You must specify a source AMI when you launch an instance. You can use different AMIs
to launch different types of instances. For example, you can choose one AMI to launch an
instance that will become a web server and another AMI to deploy an instance that will host an
application server. You can also launch multiple instances from a single AMI.

An AMI includes the following components:


• A template for the root volume of the instance. A root volume typically contains an operating
system (OS) and everything that was installed in that OS (applications, libraries, etc.). Amazon
EC2 copies the template to the root volume of a new EC2 instance, and then starts it.
• Launch permissions that control which AWS accounts can use the AMI.
• A block device mapping that specifies the volumes to attach to the instance (if any) when it is
launched.

You can choose many AMIs:


• Quick Start – AWS offers a number of pre-built AMIs for launching your instances. These AMIs
include many Linux and Windows options.
• My AMIs – These AMIs are AMIs that you created.
• AWS Marketplace – The AWS Marketplace offers a digital catalog that lists thousands of
software solutions. These AMIs can offer specific use cases to help you get started quickly.
• Community AMIs – These AMIs are created by people all around the world. These AMIs are
not checked by AWS, so use them at your own risk. Community AMIs can offer many different
solutions to various problems, but use them with care. Avoid using them in any production or
corporate environment.
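When scripting, the same AMI catalogs can be searched with the AWS CLI. The sketch below finds the most recent Amazon-owned Amazon Linux 2 AMI in the current Region; the name filter reflects one common AMI naming pattern and may need adjusting for other operating systems.

```shell
# Find the newest available Amazon-owned Amazon Linux 2 AMI.
aws ec2 describe-images \
    --owners amazon \
    --filters "Name=name,Values=amzn2-ami-hvm-*-x86_64-gp2" \
              "Name=state,Values=available" \
    --query 'Images | sort_by(@, &CreationDate)[-1].[ImageId,Name]'
```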


Creating a new AMI: Example


[Diagram: In Region A of the AWS Cloud, start from a Quick Start or other existing starter AMI (or, optionally, import a virtual machine) and (1) launch an unmodified instance. (2) Connect to the instance and manually modify it, or run a script that modifies it (for example, upgrade installed software). (3) Capture the modified instance as a new AMI (MyAMI). (4) Copy the AMI to any other Regions, such as Region B, where you want to use it.]


An AMI is created from an EC2 instance. You can import a virtual machine so that it becomes an
EC2 instance, and then save the EC2 instance as an AMI. You can then launch an EC2 instance
from that AMI. Alternatively, you can start with an existing AMI—such as one of the Quick Start AMIs
provided by AWS—and create an EC2 instance from it.

Regardless of which option you chose (step 1), you will have what the diagram refers to as an
unmodified instance. From that instance, you might then create a golden instance—that is, a
virtual machine that you configured with the specific OS and application settings that you want
(step 2)—and then capture that as a new AMI (step 3). When you create an AMI, Amazon EC2
stops the instance, creates a snapshot of its root volume, and finally registers the snapshot as an
AMI.

After an AMI is registered, the AMI can be used to launch new instances in the same AWS Region.
The new AMI can now be thought of as a new starter AMI. You might want to also copy the AMI
to other Regions (step 4), so that EC2 instances can also be launched in those locations.
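The capture-and-copy steps above can be sketched with the AWS CLI. The instance ID, AMI ID, names, and Regions below are placeholders, and the commands require AWS credentials.

```shell
# (3) Capture a modified instance as a new AMI.
aws ec2 create-image \
    --instance-id i-0123456789abcdef0 \
    --name "my-golden-ami" \
    --description "Instance with upgraded software, captured as an AMI"

# (4) Copy the resulting AMI to another Region so that
#     instances can also be launched there.
aws ec2 copy-image \
    --source-image-id ami-0abcdef1234567890 \
    --source-region us-east-1 \
    --region eu-west-1 \
    --name "my-golden-ami"
```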


2. Select an instance type


Choices made using the Launch Instance Wizard: 1. AMI, 2. Instance Type, 3. Network settings, 4. IAM role, 5. User data, 6. Storage options, 7. Tags, 8. Security group, 9. Key pair

• Consider your use case
  • How will the EC2 instance you create be used?
• The instance type that you choose determines –
  • Memory (RAM)
  • Processing power (CPU)
  • Disk space and disk type (Storage)
  • Network performance
• Instance type categories –
  • General purpose
  • Compute optimized
  • Memory optimized
  • Storage optimized
  • Accelerated computing
• Instance types offer family, generation, and size


After you choose the AMI for launching the instance, you must decide on an instance type.

Amazon EC2 provides a selection of instance types that are optimized to fit different use cases.
Instance types comprise varying combinations of CPU, memory, storage, and networking capacity.
The different instance types give you the flexibility to choose the appropriate mix of resources for
your applications. Each instance type includes one or more instance sizes, which enable you to
scale your resources to the requirements of your target workload.

Instance type categories include general purpose, compute optimized, memory optimized,
storage optimized, and accelerated computing instances. Each instance type category offers many
instance types to choose from.


EC2 instance type naming and sizes

Instance type naming
• Example: t3.large
  • T is the family name
  • 3 is the generation number
  • Large is the size

Example instance sizes:

Instance Name   vCPU   Memory (GB)   Storage
t3.nano          2      0.5          EBS-Only
t3.micro         2      1            EBS-Only
t3.small         2      2            EBS-Only
t3.medium        2      4            EBS-Only
t3.large         2      8            EBS-Only
t3.xlarge        4      16           EBS-Only
t3.2xlarge       8      32           EBS-Only


When you look at an EC2 instance type, you will see that its name has several parts. For example,
consider the T type.

T is the family name, which is then followed by a number. Here, that number is 3.

The number is the generation number of that type. So, a t3 instance is the third generation of the
T family. In general, instance types that are of a higher generation are more powerful and provide
a better value for the price.

The next part of the name is the size portion of the instance. When you compare sizes, it is
important to look at the coefficient portion of the size category.

For example, a t3.2xlarge has twice the vCPU and memory of a t3.xlarge. The t3.xlarge has, in
turn, twice the vCPU and memory of a t3.large.

It is also important to note that network bandwidth is tied to the size of the Amazon EC2
instance. If you will run jobs that will be very network-intensive, you might be required to
increase the instance specifications to meet your needs.
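The doubling pattern can be checked with a little shell arithmetic, starting from the t3.large values in the size table:

```shell
# Values for t3.large, taken from the size table.
large_vcpu=2
large_mem_gb=8

# Each size step above t3.large doubles both vCPU and memory.
xlarge_vcpu=$((large_vcpu * 2));    xlarge_mem_gb=$((large_mem_gb * 2))
xxlarge_vcpu=$((xlarge_vcpu * 2));  xxlarge_mem_gb=$((xlarge_mem_gb * 2))

echo "t3.xlarge:  ${xlarge_vcpu} vCPU, ${xlarge_mem_gb} GB"   # 4 vCPU, 16 GB
echo "t3.2xlarge: ${xxlarge_vcpu} vCPU, ${xxlarge_mem_gb} GB" # 8 vCPU, 32 GB
```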


Select instance type: Based on use case

Category                  Instance Types       Use Case
General Purpose           a1, m4, m5, t2, t3   Broad
Compute Optimized         c4, c5               High performance
Memory Optimized          r4, r5, x1, z1       In-memory databases
Accelerated Computing     f1, g3, g4, p2, p3   Machine learning
Storage Optimized         d2, h1, i3           Distributed file systems


Instance types vary in several ways, including: CPU type, CPU or core count, storage type, storage
amount, memory amount, and network performance. The chart provides a high-level view of the
different instance categories, and which instance type families and generation numbers fit into
each category type. Consider a few of the instance types in more detail:
• T3 instances provide burstable performance general purpose instances that provide a baseline
level of CPU performance with the ability to burst above the baseline. Use cases for this type
of instance include websites and web applications, development environments, build servers,
code repositories, microservices, test and staging environments, and line-of-business
applications.
• C5 instances are optimized for compute-intensive workloads, and deliver cost-effective high
performance at a low price per compute ratio. Use cases include scientific modeling, batch
processing, ad serving, highly scalable multiplayer gaming, and video encoding.
• R5 instances are optimized for memory-intensive applications. Use cases include high-
performance databases, data mining and analysis, in-memory databases, distributed web-scale
in-memory caches, applications that perform real-time processing of unstructured big data,
Apache Hadoop or Apache Spark clusters, and other enterprise applications.

To learn more about each instance type, see the Amazon EC2 Instance Types documentation at
https://aws.amazon.com/ec2/instance-types/.


Instance types: Networking features


• The network bandwidth (Gbps) varies by instance type.
• See Amazon EC2 Instance Types to compare.
• To maximize networking and bandwidth performance of your instance type:
• If you have interdependent instances, launch them into a cluster placement group.
• Enable enhanced networking.
• Enhanced networking types are supported on most instance types.
• See the Networking and Storage Features documentation for details.
• Enhanced networking types –
• Elastic Network Adapter (ENA): Supports network speeds of up to 100 Gbps.
• Intel 82599 Virtual Function interface: Supports network speeds of up to 10 Gbps.


In addition to considering the CPU, RAM, and storage needs of your workloads, it is also
important to consider your network bandwidth requirements.

Each instance type provides a documented network performance level. For example, an
a1.medium instance will provide up to 10 Gbps, but a p3dn.24xlarge instance provides up to 100
Gbps. Choose an instance type that meets your requirements.

When you launch multiple new EC2 instances, Amazon EC2 attempts to place the instances so
that they are spread out across the underlying hardware by default. It does this to minimize
correlated failures. However, if you want to apply specific placement criteria, you can
use placement groups to influence the placement of a group of interdependent instances to
meet the needs of your workload. For example, you might specify that three instances should all
be deployed in the same Availability Zone to ensure lower network latency and higher network
throughput between instances. See the Placement Group documentation at
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html for details.

Many instance types also enable you to configure enhanced networking to get significantly higher
packet per second (PPS) performance, lower delay variation in the arrival of packets over the
network (network jitter), and lower latencies. See the Elastic Network Adapter (ENA)
documentation at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-
networking-ena.html for details.


3. Specify network settings


Choices made by using the Launch Instance Wizard: 1. AMI, 2. Instance Type, 3. Network settings, 4. IAM role, 5. User data, 6. Storage options, 7. Tags, 8. Security group, 9. Key pair

• Where should the instance be deployed?
  • Identify the VPC and optionally the subnet
• Should a public IP address be automatically assigned?
  • To make it internet-accessible

[Diagram: an AWS Cloud Region with two Availability Zones; a VPC spans both and contains a public subnet and a private subnet. The example specifies deploying the instance into the public subnet in Availability Zone 1.]


After you have chosen an AMI and an instance type, you must specify the network location
where the EC2 instance will be deployed. The choice of Region must be made before you start
the Launch Instance Wizard. Verify that you are in the correct Region page of the Amazon EC2
console before you choose Launch Instance.

When you launch an instance in a default VPC, AWS will assign it a public IP address by default.
When you launch an instance into a nondefault VPC, the subnet has an attribute that determines
whether instances launched into that subnet receive a public IP address from the public IPv4
address pool. By default, AWS will not assign a public IP address to instances that are launched in
a nondefault subnet. You can control whether your instance receives a public IP address by either
modifying the public IP addressing attribute of your subnet, or by enabling or disabling the public
IP addressing feature during launch (which overrides the subnet's public IP addressing attribute).
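Overriding the subnet's public IP addressing attribute at launch can be sketched with the AWS CLI; the AMI and subnet IDs below are placeholders, and the command requires AWS credentials.

```shell
# Launch into a specific subnet and request a public IPv4 address,
# overriding the subnet's default public IP addressing attribute.
aws ec2 run-instances \
    --image-id ami-0abcdef1234567890 \
    --instance-type t3.micro \
    --subnet-id subnet-0123456789abcdef0 \
    --associate-public-ip-address
```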


4. Attach IAM role (optional)


Choices made by using the Launch Instance Wizard: 1. AMI, 2. Instance Type, 3. Network settings, 4. IAM role, 5. User data, 6. Storage options, 7. Tags, 8. Security group, 9. Key pair

• Will software on the EC2 instance need to interact with other AWS services?
  • If yes, attach an appropriate IAM Role.
• An AWS Identity and Access Management (IAM) role that is attached to an EC2 instance is kept in an instance profile.
• You are not restricted to attaching a role only at instance launch.
  • You can also attach a role to an instance that already exists.

[Diagram example: a role that grants Amazon Simple Storage Service (Amazon S3) bucket access permissions is attached to an instance; the application on the instance can then access an S3 bucket with objects.]


It is common to use EC2 instances to run an application that must make secure API calls to other
AWS services. To support these use cases, AWS enables you to attach an AWS Identity and
Access Management (IAM) role to an EC2 instance. Without this feature, you might be tempted
to place AWS credentials on an EC2 instance so that an application that runs on that instance can
use them. However, you should never store AWS credentials on an EC2 instance; it is highly
insecure. Instead, attach an IAM role to the EC2 instance. The IAM role then grants the
applications that run on the EC2 instance permission to make application programming interface
(API) requests.

An instance profile is a container for an IAM role. If you use the AWS Management Console to
create a role for Amazon EC2, the console automatically creates an instance profile and gives it
the same name as the role. When you then use the Amazon EC2 console to launch an instance
with an IAM role, you can select a role to associate with the instance. In the console, the list that
displays is actually a list of instance profile names.

In the example, you see that an IAM role is used to grant permissions to an application that runs
on an EC2 instance. The application must access a bucket in Amazon S3.

You can attach an IAM role when you launch the instance, but you can also attach a role to an
already running EC2 instance. When you define a role that can be used by an EC2 instance, you
define which accounts or AWS services can assume the role. You also define which API actions

and resources the application can use after it assumes the role. If you change a role, the change is
propagated to all instances that have the role attached to them.
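Attaching a role to an already running instance can be sketched with the AWS CLI; the instance ID and instance profile name below are placeholders, and the command requires AWS credentials.

```shell
# Attach an IAM role (via its instance profile) to a running instance.
aws ec2 associate-iam-instance-profile \
    --instance-id i-0123456789abcdef0 \
    --iam-instance-profile Name=my-s3-access-profile
```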


5. User data script (optional)

Choices made by using the Launch Instance Wizard: 1. AMI, 2. Instance Type, 3. Network settings, 4. IAM role, 5. User data, 6. Storage options, 7. Tags, 8. Security group, 9. Key pair

User data example:
#!/bin/bash
yum update -y
yum install -y wget

[Diagram: an AMI plus the user data script launches into a running EC2 instance.]

• Optionally specify a user data script at instance launch
• Use user data scripts to customize the runtime environment of your instance
  • Script runs the first time the instance starts
• Can be used strategically
  • For example, reduce the number of custom AMIs that you build and maintain


When you create your EC2 instances, you have the option of passing user data to the instance.
User data can automate the completion of installations and configurations at instance launch. For
example, a user data script might patch and update the instance's operating system, fetch and
install software license keys, or install additional software.

In the example user data script, you see a simple three-line Linux Bash shell script. The first line
indicates that the script should be run by the Bash shell. The second line invokes the Yellowdog
Updater, Modified (YUM) utility, which is commonly used in many Linux distributions—such as
Amazon Linux, CentOS, and Red Hat Linux—to retrieve software from an online repository and
install it. In line two of the example, that command tells YUM to update all installed packages to
the latest versions that are known to the software repository that it is configured to access. Line
three of the script indicates that the Wget utility should be installed. Wget is a common utility for
downloading files from the web.

For a Windows instance, the user data script should be written in a format that is compatible
with a Command Prompt window (batch commands) or with Windows PowerShell. See the
Windows User Data Scripts documentation for details at
https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/ec2-windows-user-data.html.

When the EC2 instance is created, the user data script will run with root privileges during the
final phases of the boot process. On Linux instances, it is run by the cloud-init service. On
Windows instances, it is run by the EC2Config or EC2Launch utility. By default, user data only
runs the first time that the instance starts up. However, if you would like your user data script to
run every time the instance is booted, you can create a Multipurpose Internet Mail Extensions
(MIME) multipart file user data script (this process is not commonly done). See
https://aws.amazon.com/premiumsupport/knowledge-center/execute-user-data-ec2/ for more
information.
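The user data flow can be exercised locally before any launch. The sketch below reproduces the module's example script; the file name is arbitrary, the launch command is shown only as a comment because it needs AWS credentials, and its IDs are placeholders.

```shell
# Write the module's example user data script to a local file.
cat > userdata.sh <<'EOF'
#!/bin/bash
yum update -y
yum install -y wget
EOF

# Check the script's syntax locally before using it at launch.
bash -n userdata.sh && echo "syntax OK"

# Pass it when launching (requires AWS credentials; IDs are placeholders):
#   aws ec2 run-instances --image-id ami-0abcdef1234567890 \
#       --instance-type t3.micro --user-data file://userdata.sh
```

Validating the script locally catches shell syntax errors that would otherwise only surface in the instance's cloud-init logs.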


6. Specify storage
Choices made by using the Launch Instance Wizard: 1. AMI, 2. Instance Type, 3. Network settings, 4. IAM role, 5. User data, 6. Storage options, 7. Tags, 8. Security group, 9. Key pair

• Configure the root volume
  • Where the guest operating system is installed
• Attach additional storage volumes (optional)
  • AMI might already include more than one volume
• For each volume, specify:
  • The size of the disk (in GB)
  • The volume type
    • Different types of solid state drives (SSDs) and hard disk drives (HDDs) are available
  • If the volume will be deleted when the instance is terminated
  • If encryption should be used


When you launch an EC2 instance, you can configure storage options. For example, you can
configure the size of the root volume where the guest operating system is installed. You can also
attach additional storage volumes when you launch the instance. Some AMIs are also configured
to launch more than one storage volume by default to provide storage that is separate from the
root volume.

For each volume that your instance will have, you can specify the size of the disks, the volume
types, and whether the storage will be retained if the instance is terminated. You can also specify
if encryption should be used.
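The per-volume choices listed above map onto a block device mapping at launch. A sketch with the AWS CLI follows; the AMI ID and device names are placeholders, the volume type shown is one common EBS option, and the command requires AWS credentials.

```shell
# Launch with a 20-GB encrypted root volume that is deleted on
# termination, plus a second 500-GB encrypted data volume.
aws ec2 run-instances \
    --image-id ami-0abcdef1234567890 \
    --instance-type t3.micro \
    --block-device-mappings '[
      {"DeviceName": "/dev/xvda",
       "Ebs": {"VolumeSize": 20, "VolumeType": "gp3",
               "Encrypted": true, "DeleteOnTermination": true}},
      {"DeviceName": "/dev/xvdb",
       "Ebs": {"VolumeSize": 500, "VolumeType": "gp3",
               "Encrypted": true}}
    ]'
```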


Amazon EC2 storage options


• Amazon Elastic Block Store (Amazon EBS) –
• Durable, block-level storage volumes.
• You can stop the instance and start it again, and the data will still be there.
• Amazon EC2 Instance Store –
• Ephemeral storage is provided on disks that are attached to the host computer where the EC2
instance is running.
• If the instance stops, data stored here is deleted.
• Other options for storage (not for the root volume) –
• Mount an Amazon Elastic File System (Amazon EFS) file system.
• Connect to Amazon Simple Storage Service (Amazon S3).


Amazon Elastic Block Store (Amazon EBS) is an easy-to-use, high-performance durable block
storage service that is designed to be used with Amazon EC2 for both throughput- and
transaction-intensive workloads. With Amazon EBS, you can choose from four different volume
types to balance the optimal price and performance. You can change volume types or increase
volume size without disrupting your critical applications, so you can have cost-effective storage
when you need it.

Amazon EC2 Instance Store provides ephemeral, or temporary, block-level storage for your
instance. This storage is located on disks that are physically attached to the host computer.
Instance Store works well when you must temporarily store information that changes frequently,
such as buffers, caches, scratch data, and other temporary content. You can also use Instance
Store for data that is replicated across a fleet of instances, such as a load balanced pool of web
servers. If the instances are stopped—either because of user error or a malfunction—the data on
the instance store will be deleted.

Amazon Elastic File System (Amazon EFS) provides a simple, scalable, fully managed elastic
Network File System (NFS) file system for use with AWS Cloud services and on-premises
resources. It is built to scale on-demand to petabytes without disrupting applications. It grows
and shrinks automatically as you add and remove files, which reduces the need to provision and
manage capacity to accommodate growth.

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers scalability,
data availability, security, and performance. You can store and protect any amount of data for a
variety of use cases, such as websites, mobile apps, backup and restore, archive, enterprise
applications, Internet of Things (IoT) devices, and big data analytics.


Example storage options


• Instance 1 characteristics –
  • It has an Amazon EBS root volume type for the operating system.
  • What will happen if the instance is stopped and then started again?
• Instance 2 characteristics –
  • It has an Instance Store root volume type for the operating system.
  • What will happen if the instance stops (because of user error or a system malfunction)?

[Diagram: a host computer runs two instances. Instance 1 has a 20-GB Amazon Elastic Block Store (Amazon EBS) volume attached as its root volume, a 500-GB Amazon EBS volume attached as a storage volume, and Ephemeral volume 1 (Instance Store) attached as a storage volume. Instance 2 has Ephemeral volume 2 (Instance Store) attached as its root volume.]


Here, you see two examples of how storage options could be configured for EC2 instances.

The Instance 1 example shows that the root volume—which contains the OS and possibly other
data—is stored on Amazon EBS. This instance also has two attached volumes. One volume is a
500-GB Amazon EBS storage volume, and the other volume is an Instance Store volume. If this
instance was stopped and then started again, the OS would survive and any data that was stored
on either the 20-GB Amazon EBS volume or the 500-GB Amazon EBS volume would remain intact.
However, any data that was stored on Ephemeral volume 1 would be permanently lost. Instance
Store works well for temporarily storing information that changes frequently, such as buffers,
caches, scratch data, and other temporary content.

The Instance 2 example shows that the root volume is on an instance store (Ephemeral volume
2). An instance with an Instance Store root volume cannot be stopped by an Amazon EC2 API
call. It can only be terminated. However, it could be stopped from within the instance's OS (for
example, by issuing a shutdown command)—or it could stop because of OS or disk failure—which
would cause the instance to be terminated. If the instance was terminated, all the data that was
stored on Ephemeral volume 2 would be lost, including the OS. You would not be able to start the
instance again. Therefore, do not rely on Instance Store for valuable, long-term data. Instead, use
more durable data storage, such as Amazon EBS, Amazon EFS, or Amazon S3.

If an instance reboots (intentionally or unintentionally), data on the instance store root volume
does persist.


7. Add tags
Choices made by using the Launch Instance Wizard: 1. AMI, 2. Instance Type, 3. Network settings, 4. IAM role, 5. User data, 6. Storage options, 7. Tags, 8. Security group, 9. Key pair

• A tag is a label that you can assign to an AWS resource.
  • Consists of a key and an optional value.
• Tagging is how you can attach metadata to an EC2 instance.
• Potential benefits of tagging—Filtering, automation, cost allocation, and access control.

[Example tag shown on the slide: a key-value pair such as Name = My Web Server.]


A tag is a label that you assign to an AWS resource. Each tag consists of a key and an
optional value, both of which you define. Tags enable you to categorize AWS resources, such as
EC2 instances, in different ways. For example, you might tag instances by purpose, owner, or
environment.

Tagging is how you can attach metadata to an EC2 instance.

Tag keys and tag values are case-sensitive. For example, a commonly used tag for EC2 instances is
a tag key that is called Name and a tag value that describes the instance, such as My Web Server.
The Name tag is exposed by default in the Amazon EC2 console Instances page. However, if you
create a key that is called name (with lower-case n), it will not appear in the Name column for
the list of instances (though it will still appear in the instance details panel in the Tags tab).

It is a best practice to develop tagging strategies. Using a consistent set of tag keys makes it
easier for you to manage your resources. You can also search and filter the resources based on
the tags that you add. See https://d1.awsstatic.com/whitepapers/aws-tagging-best-practices.pdf
for more information.
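Tagging and tag-based filtering can be sketched with the AWS CLI; the instance ID below is a placeholder, and the commands require AWS credentials.

```shell
# Add a Name tag to an existing instance.
aws ec2 create-tags \
    --resources i-0123456789abcdef0 \
    --tags Key=Name,Value="My Web Server"

# Filter instances by that tag. Keys are case-sensitive:
# filtering on the key "name" would not match the "Name" key.
aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=My Web Server"
```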


8. Security group settings


Choices made by using the Launch Instance Wizard: 1. AMI, 2. Instance Type, 3. Network settings, 4. IAM role, 5. User data, 6. Storage options, 7. Tags, 8. Security group, 9. Key pair

• A security group is a set of firewall rules that control traffic to the instance.
  • It exists outside of the instance's guest OS.
• Create rules that specify the source and which ports network communications can use.
  • Specify the port number and the protocol, such as Transmission Control Protocol (TCP), User Datagram Protocol (UDP), or Internet Control Message Protocol (ICMP).
  • Specify the source (for example, an IP address or another security group) that is allowed to use the rule.

[Example rule shown on the slide: allow SSH traffic over TCP port 22 from My IP.]


A security group acts as a virtual firewall that controls network traffic for one or more instances.
When you launch an instance, you can specify one or more security groups; otherwise, the
default security group is used.

You can add rules to each security group. Rules allow traffic to or from its associated instances.
You can modify the rules for a security group at any time, and the new rules will be automatically
applied to all instances that are associated with the security group. When AWS decides whether
to allow traffic to reach an instance, all the rules from all the security groups that are associated
with the instance are evaluated. When you launch an instance in a virtual private cloud (VPC), you
must either create a new security group or use one that already exists in that VPC. After you
launch an instance, you can change its security groups.

When you define a rule, you can specify the allowable source of the network communication
(inbound rules) or destination (outbound rules). The source can be an IP address, an IP address
range, another security group, a gateway VPC endpoint, or anywhere (which means that all
sources will be allowed). By default, a security group includes an outbound rule that allows all
outbound traffic. You can remove the rule and add outbound rules that only allow specific
outbound traffic. If your security group has no outbound rules, no outbound traffic that
originates from your instance is allowed.
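The evaluation rule above — traffic is allowed if any rule in any associated security group matches — can be sketched with a simplified model. This is an illustration only: real security groups also support a source that is another security group or a gateway VPC endpoint, which is omitted here, and all rule data below is made up:

```python
import ipaddress

def rule_matches(rule, protocol, port, source_ip):
    """A rule matches when protocol, port range, and source CIDR all match."""
    return (rule["protocol"] == protocol
            and rule["from_port"] <= port <= rule["to_port"]
            and ipaddress.ip_address(source_ip) in ipaddress.ip_network(rule["cidr"]))

def inbound_allowed(groups, protocol, port, source_ip):
    """Traffic is allowed if ANY rule in ANY associated group matches."""
    return any(rule_matches(r, protocol, port, source_ip)
               for g in groups for r in g["inbound"])

web_sg = {"inbound": [
    {"protocol": "tcp", "from_port": 80, "to_port": 80, "cidr": "0.0.0.0/0"},
    {"protocol": "tcp", "from_port": 22, "to_port": 22, "cidr": "203.0.113.25/32"},  # "My IP"
]}

print(inbound_allowed([web_sg], "tcp", 80, "198.51.100.7"))  # True: HTTP is open to anywhere
print(inbound_allowed([web_sg], "tcp", 22, "198.51.100.7"))  # False: SSH only from My IP
```

Note the permissive-union semantics: there are no "deny" rules; adding a second security group to an instance can only allow more traffic, never less.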


In the example rule, the rule allows Secure Shell (SSH) traffic over Transmission Control Protocol
(TCP) port 22 if the source of the request is My IP. The My IP IP address is calculated by determining
what IP address you are currently connected to the AWS Cloud from when you define the rule.

Network access control lists (network ACLs) can also be used as firewalls to protect subnets in a
VPC.

For accessibility: Screenshot of the EC2 console screen where you can define a security group rule. It
shows a rule with type SSH, protocol TCP, port range 22, source My IP, and a CIDR block that shows
an example My IP address. End of accessibility description.


9. Identify or create the key pair

Choices made by using the Launch Instance Wizard:
1. AMI
2. Instance Type
3. Network settings
4. IAM role
5. User data
6. Storage options
7. Tags
8. Security group
9. Key pair

• At instance launch, you specify an existing key pair or create a new key pair.
• A key pair consists of:
  • A public key that AWS stores.
  • A private key file that you store (for example, mykey.pem).
• It enables secure connections to the instance.
• For Windows AMIs: use the private key to obtain the administrator password that you need to log in to your instance.
• For Linux AMIs: use the private key to use SSH to securely connect to your instance.

After you specify all the required configurations to launch an EC2 instance, and after you
customize any optional EC2 launch wizard configuration settings, you are presented with a
Review Instance Launch window. If you then choose Launch, a dialog asks you to choose an
existing key pair, proceed without a key pair, or create a new key pair before you can choose
Launch Instances and create the EC2 instance.

Amazon EC2 uses public–key cryptography to encrypt and decrypt login information. The
technology uses a public key to encrypt a piece of data, and then the recipient uses the private
key to decrypt the data. The public and private keys are known as a key pair. Public-key
cryptography enables you to securely access your instances by using a private key instead of a
password.

When you launch an instance, you specify a key pair. You can specify an existing key pair or a new
key pair that you create at launch. If you create a new key pair, download it and save it in a safe
location. This is your only opportunity to save the private key file.

To connect to a Windows instance, use the private key to obtain the administrator password, and
then log in to the EC2 instance's Windows Desktop by using Remote Desktop Protocol (RDP). To
establish an SSH connection from a Windows machine to an Amazon EC2 instance, you can use a
tool such as PuTTY, which will require the same private key.

With Linux instances, at boot time, the public key content is placed on the instance in an entry
within ~/.ssh/authorized_keys. To log in to your Linux instance (for
example, by using SSH), you must provide the private key when you establish the connection.
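One practical detail when using the downloaded private key: SSH clients on Linux and macOS refuse a private key file whose permissions allow group or other access, so the file is commonly tightened to mode 400 first. A small sketch of that check, using a temporary file as a stand-in for mykey.pem (the helper name is hypothetical):

```python
import os
import stat
import tempfile

def secure_key_file(path):
    """Tighten a private key file to owner read-only if it is too permissive."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:              # any group/other permission bits set?
        os.chmod(path, 0o400)     # equivalent of: chmod 400 mykey.pem
    return oct(stat.S_IMODE(os.stat(path).st_mode))

with tempfile.NamedTemporaryFile(delete=False) as f:
    key_path = f.name             # stand-in for the downloaded mykey.pem
os.chmod(key_path, 0o644)         # deliberately too permissive
print(secure_key_file(key_path))  # '0o400' on POSIX systems
os.unlink(key_path)
```

After that, `ssh -i mykey.pem ec2-user@<public-dns-name>` (user name varies by AMI) will accept the key rather than rejecting it with an "unprotected private key file" warning.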


Amazon EC2 console view of a running EC2 instance


After you choose Launch Instances and then choose View Instances, you will be presented with a
screen that looks similar to the example.

Many of the settings that you specified during launch are visible in the Description panel.

Information about the available instance includes IP address and DNS address information, the
instance type, the unique instance ID that was assigned to the instance, the AMI ID of the AMI
that you used to launch the instance, the VPC ID, the subnet ID, and more.

Many of these details provide hyperlinks that you can choose to learn more information about
the resources that are relevant to the EC2 instance you launched.


Another option: Launch an EC2 instance with the AWS Command Line Interface (AWS CLI)

• EC2 instances can also be created programmatically.
• This example shows how simple the command can be.
• This command assumes that the key pair and security group already exist.
• More options could be specified. See the AWS CLI Command Reference for details.

Example command:

aws ec2 run-instances \
    --image-id ami-1a2b3c4d \
    --count 1 \
    --instance-type c3.large \
    --key-name MyKeyPair \
    --security-groups MySecurityGroup \
    --region us-east-1


You can also launch EC2 instances programmatically, either by using the AWS Command Line
Interface (AWS CLI) or one of the AWS software development kits (SDKs).

In the example AWS CLI command, you see a single command that specifies the minimal
information that is needed to launch an instance. The command includes the following
information:
• aws – Specifies an invocation of the aws command line utility.
• ec2 – Specifies an invocation of the ec2 service command.
• run-instances – Is the subcommand that is being invoked.

The rest of the command specifies several parameters, including:


• image-id – This parameter is followed by an AMI ID. All AMIs have a unique AMI ID.
• count – The number of instances to launch; you can specify more than one.
• instance-type – The instance type to create (for example, a c3.large instance).
• key-name – In the example, assume that MyKeyPair already exists.
• security-groups – In this example, assume that MySecurityGroup already exists.
• region – AMIs exist in an AWS Region, so you must specify the Region where the AWS CLI will
find the AMI and launch the EC2 instance.

The command should successfully create an EC2 instance if:


• The command is properly formed
• The resources that the command needs already exist
• You have sufficient permissions to run the command
• You have sufficient capacity in the AWS account
If the command is successful, the API responds to the command with the instance ID and other
relevant data for your application to use in subsequent API requests.
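The same invocation can be assembled programmatically. The sketch below only builds the argument list for the `aws ec2 run-instances` call shown above — it does not contact AWS; actually executing it (for example, with `subprocess.run`) would require the AWS CLI, valid credentials, and the pre-existing key pair and security group:

```python
def build_run_instances_argv(params):
    """Build the argv for `aws ec2 run-instances` from a dict of CLI parameters."""
    argv = ["aws", "ec2", "run-instances"]
    for flag, value in params.items():
        argv += [f"--{flag}", str(value)]
    return argv

params = {
    "image-id": "ami-1a2b3c4d",
    "count": 1,
    "instance-type": "c3.large",
    "key-name": "MyKeyPair",
    "security-groups": "MySecurityGroup",
    "region": "us-east-1",
}

print(" ".join(build_run_instances_argv(params)))
```

In production code you would more likely call an AWS SDK (such as boto3 for Python) directly instead of shelling out to the CLI, but the parameter names map one-to-one either way.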

Amazon EC2 instance lifecycle

(Stop, Stop-Hibernate, and the stopped state apply only to instances backed by Amazon EBS.)

• Launch: AMI → pending → running
• Reboot: running → rebooting → running
• Stop / Stop-Hibernate: running → stopping → stopped
• Start: stopped → pending → running
• Terminate: running or stopped → shutting-down → terminated

Here, you see the lifecycle of an instance. The arrows show actions that you can take and the
boxes show the state the instance will enter after that action. An instance can be in one of the
following states:
• Pending – When an instance is first launched from an AMI, or when you start a stopped
instance, it enters the pending state when the instance is booted and deployed to a host
computer. The instance type that you specified at launch determines the hardware of the host
computer for your instance.
• Running – When the instance is fully booted and ready, it exits the pending state and enters
the running state. You can connect over the internet to your running instance.
• Rebooting – AWS recommends you reboot an instance by using the Amazon EC2 console, AWS
CLI, or AWS SDKs instead of invoking a reboot from within the guest operating system (OS). A
rebooted instance stays on the same physical host, maintains the same public DNS name and
public IP address, and if it has instance store volumes, it retains the data on those volumes.
• Shutting down – This state is an intermediary state between running and terminated.
• Terminated – A terminated instance remains visible in the Amazon EC2 console for a while
before the virtual machine is deleted. However, you can’t connect to or recover a terminated
instance.
• Stopping – Instances that are backed by Amazon EBS can be stopped. They enter the stopping
state before they attain the fully stopped state.
• Stopped – A stopped instance will not incur the same cost as a running instance. Starting a
stopped instance puts it back into the pending state, which moves the instance to a new host
machine.
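The states and actions above can be summarized as a small transition table. This is a simplified sketch (reboot is collapsed into a single step through the rebooting state, and hibernate is treated like stop); it encodes the rule that only EBS-backed instances can be stopped:

```python
# (current_state, action) -> next_state; action None means "wait for AWS to finish".
TRANSITIONS = {
    ("pending", None): "running",            # boot completes
    ("running", "reboot"): "running",        # passes through the rebooting state
    ("running", "stop"): "stopping",         # EBS-backed instances only
    ("stopping", None): "stopped",
    ("stopped", "start"): "pending",
    ("running", "terminate"): "shutting-down",
    ("stopped", "terminate"): "shutting-down",
    ("shutting-down", None): "terminated",
}

def next_state(state, action=None, ebs_backed=True):
    """Return the state an instance enters after an action (or after AWS finishes)."""
    if action == "stop" and not ebs_backed:
        raise ValueError("only instances backed by Amazon EBS can be stopped")
    return TRANSITIONS[(state, action)]

print(next_state("running", "stop"))   # stopping
print(next_state("stopping"))          # stopped
print(next_state("stopped", "start"))  # pending
```

Modeling the lifecycle this way also makes the asymmetry visible: there is a path from stopped back to running, but no path out of terminated.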


Consider using an Elastic IP address

• Rebooting an instance will not change any IP addresses or DNS hostnames.
• When an instance is stopped and then started again:
  • The public IPv4 address and external DNS hostname will change.
  • The private IPv4 address and internal DNS hostname do not change.
• If you require a persistent public IP address, associate an Elastic IP address with the instance.
• Elastic IP address characteristics:
  • Can be associated with instances in the Region as needed.
  • Remains allocated to your account until you choose to release it.


A public IP address is an IPv4 address that is reachable from the internet. Each instance that
receives a public IP address is also given an external DNS hostname. For example, if the public IP
address assigned to the instance is 203.0.113.25, then the external DNS hostname might
be ec2-203-0-113-25.compute-1.amazonaws.com.

If you specify that a public IP address should be assigned to your instance, it is assigned from the
AWS pool of public IPv4 addresses. The public IP address is not associated with your AWS
account. When a public IP address is disassociated from your instance, it is released back into the
public IPv4 address pool, and you will not be able to specify that you want to reuse it. AWS
releases your instance's public IP address when the instance is stopped or terminated. Your
stopped instance receives a new public IP address when it is restarted.

If you require a persistent public IP address, you might want to associate an Elastic IP address
with the instance. To associate an Elastic IP address, you must first allocate a new Elastic IP
address in the Region where the instance exists. After the Elastic IP address is allocated, you can
associate the Elastic IP address with an EC2 instance.

By default, all AWS accounts are limited to five (5) Elastic IP addresses per Region because public
(IPv4) internet addresses are a scarce public resource. However, this is a soft limit, and you can
request a limit increase (which might be approved).
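The behavior described above — an auto-assigned public IP is released on stop and replaced on restart, while an Elastic IP persists — can be simulated in a few lines. The addresses here are drawn from a fake pool purely for illustration:

```python
import itertools

# Fake pool standing in for the AWS public IPv4 address pool.
pool = (f"203.0.113.{n}" for n in itertools.count(1))

class Instance:
    """Toy model of public-IP assignment across a stop/start cycle."""
    def __init__(self, elastic_ip=None):
        self.elastic_ip = elastic_ip
        self.public_ip = elastic_ip or next(pool)

    def stop_and_start(self):
        if self.elastic_ip is None:
            self.public_ip = next(pool)   # old address went back to the pool
        return self.public_ip

plain = Instance()
eip = Instance(elastic_ip="198.51.100.10")

before = plain.public_ip
print(before != plain.stop_and_start())   # True: auto-assigned address changed
print(eip.stop_and_start())               # 198.51.100.10 -- the Elastic IP persists
```

This is why DNS records or firewall allow-lists that point at an instance should reference an Elastic IP (or a DNS name), never the ephemeral auto-assigned address.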


EC2 instance metadata


• Instance metadata is data about your instance.
• While you are connected to the instance, you can view it –
• In a browser: http://169.254.169.254/latest/meta-data/
• In a terminal window: curl http://169.254.169.254/latest/meta-data/
• Example retrievable values –
• Public IP address, private IP address, public hostname, instance ID, security groups, Region,
Availability Zone.
• Any user data specified at instance launch can also be accessed at:
http://169.254.169.254/latest/user-data/
• It can be used to configure or manage a running instance.
• For example, author a configuration script that reads the metadata and uses it to configure
applications or OS settings.


Instance metadata is data about your instance. You can view it while you are connected to the
instance. To access it in a browser, go to the following URL:
http://169.254.169.254/latest/meta-data/. The data can also be read
programmatically, such as from a terminal window that has the cURL utility. In the terminal
window, run curl http://169.254.169.254/latest/meta-data/ to retrieve it.
The IP address 169.254.169.254 is a link-local address and it is valid only from the instance.
Instance metadata provides much of the same information about the running instance that you
can find in the AWS Management Console. For example, you can discover the public IP address,
private IP address, public hostname, instance ID, security groups, Region, Availability Zone, and
more.
Any user data that is specified at instance launch can also be accessed at the following URL:
http://169.254.169.254/latest/user-data.
EC2 instance metadata can be used to configure or manage a running instance. For example, you
can author a configuration script that accesses the metadata information and uses it to configure
applications or OS settings.
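The metadata service returns one key per line, with a trailing slash marking a subtree (for example, `placement/`). A sketch of fetching and parsing that listing follows; the fetch only works from inside an EC2 instance because 169.254.169.254 is link-local, so the parsing is demonstrated with a canned sample response:

```python
from urllib.request import urlopen

METADATA_URL = "http://169.254.169.254/latest/meta-data/"

def parse_listing(text):
    """The service returns one metadata key per line; '/' suffix marks a subtree."""
    return [line.strip() for line in text.splitlines() if line.strip()]

def fetch_metadata_keys():
    """Only works when run on an EC2 instance (link-local address)."""
    with urlopen(METADATA_URL, timeout=2) as resp:
        return parse_listing(resp.read().decode())

# Canned sample, shaped like a real top-level listing:
sample = "instance-id\nlocal-ipv4\npublic-ipv4\nplacement/\nsecurity-groups\n"
print(parse_listing(sample))
```

A configuration script running on the instance could call `fetch_metadata_keys()` and then request individual values (for example, `latest/meta-data/instance-id`) to tailor application or OS settings at boot.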


Amazon CloudWatch for monitoring

• Use Amazon CloudWatch to monitor EC2 instances:
  • Provides near-real-time metrics.
  • Provides charts in the Amazon EC2 console Monitoring tab that you can view.
  • Maintains 15 months of historical data.
• Basic monitoring:
  • Default, no additional cost.
  • Metric data sent to CloudWatch every 5 minutes.
• Detailed monitoring:
  • Fixed monthly rate for seven pre-selected metrics.
  • Metric data delivered every 1 minute.

You can monitor your instances by using Amazon CloudWatch, which collects and processes raw
data from Amazon EC2 into readable, near-real-time metrics. These statistics are recorded for a
period of 15 months, so you can access historical information and gain a better perspective on
how your web application or service is performing.

By default, Amazon EC2 provides basic monitoring, which sends metric data to CloudWatch in 5-
minute periods. To send metric data for your instance to CloudWatch in 1-minute periods, you
can enable detailed monitoring on the instance. For more information, see Enable or Disable
Detailed Monitoring for Your Instances at
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-cloudwatch-new.html.

The Amazon EC2 console displays a series of graphs based on the raw data from Amazon
CloudWatch. Depending on your needs, you might prefer to get data for your instances from
Amazon CloudWatch instead of through the graphs in the console. By default, Amazon
CloudWatch does not provide RAM metrics for EC2 instances, though that is an option that you
can configure if you want CloudWatch to collect that data.


Section 2 key takeaways

• Amazon EC2 enables you to run Windows and Linux virtual machines in the cloud.
• You launch EC2 instances from an AMI template into a VPC in your account.
• You can choose from many instance types. Each instance type offers different combinations of CPU, RAM, storage, and networking capabilities.
• You can configure security groups to control access to instances (specify allowed ports and sources).
• User data enables you to specify a script to run the first time that an instance launches.
• Only instances that are backed by Amazon EBS can be stopped.
• You can use Amazon CloudWatch to capture and review metrics on EC2 instances.




Recorded Amazon EC2 demonstration


Now, take a moment to watch the EC2 Demo at https://aws-tc-largeobjects.s3-us-west-2.amazonaws.com/ILT-TF-100-ACFNDS-20-EN/Module_6_EC2+v2.0.mp4. The recording runs just over 3 minutes and reinforces some of the concepts that were discussed in this section of the module.

The demonstration shows:


• How to use the AWS Management Console to launch an Amazon EC2 instance (with all the
default instance settings accepted).
• How to connect to the Windows instance by using a Remote Desktop client and the key pair
that was identified during instance launch to decrypt the Windows password for login.
• How to terminate the instance after it is no longer needed.


Lab 3: Introduction to Amazon EC2


Introducing Lab 3: Introduction to Amazon EC2. This lab provides hands-on practice with
launching, resizing, managing, and monitoring an Amazon EC2 instance.


Lab 3 scenario

In this lab, you will launch and configure your first virtual machine that runs on Amazon EC2.

Diagram: a web server instance in a public subnet of the Lab VPC, within Availability Zone 1 of a Region in the AWS Cloud.

Introducing Lab 3: Introduction to Amazon EC2.

In this lab, you will launch and configure a virtual machine that runs on Amazon EC2.


Lab 3: Tasks
• Task 1 – Launch Your Amazon EC2 Instance

• Task 2 – Monitor Your Instance

• Task 3 – Update Your Security Group and Access the Web Server

• Task 4 – Resize Your Instance: Instance Type and EBS Volume

• Task 5 – Explore EC2 Limits

• Task 6 – Test Termination Protection




Lab 3: Final product

By the end of the lab, you will have:
1. Launched an instance that is configured as a web server
2. Viewed the instance system log
3. Reconfigured a security group
4. Modified the instance type and root volume size

Diagram: the AMI is launched into a VPC with a security group; the instance type changes from t2.micro to t2.small, and the Amazon Elastic Block Store (Amazon EBS) root volume grows from 8 GB to 10 GB.




~ 35 minutes

Begin Lab 3:
Introduction to Amazon
EC2


It is now time to start the lab.


Lab debrief:
Key takeaways


The instructor will lead a conversation about the key takeaways from the lab after you have
completed it.


Activity: Amazon
EC2



In this educator-led activity, you will discuss the advantages and disadvantages of using Amazon
EC2 versus using a managed service like Amazon Relational Database Service (Amazon RDS).


Activity: Gather information — Amazon EC2 versus Amazon RDS

Diagram: in the AWS Cloud, a Microsoft SQL Server primary DB instance in Availability Zone 1 uses always-on mirroring to a secondary DB instance in Availability Zone 2; each instance's volumes have replicas.

The objective of this activity is to demonstrate that you understand the differences between
building a deployment that uses Amazon EC2 and using a fully managed service, such as Amazon
RDS, to deploy your solution. At the end of this activity, you should be prepared to discuss the
advantages and disadvantages of deploying Microsoft SQL Server on Amazon EC2 versus
deploying it on Amazon RDS.

The educator will ask you to:

1. Watch an 8-minute video at https://www.youtube.com/watch?v=UYy-UeQ29jo&did=ta_card&trk=ta_card that explains the benefits of deploying Microsoft SQL Server on Amazon EC2 by using the AWS Quick Start – SQL Server Reference Architecture deployment (https://aws.amazon.com/quickstart/architecture/sql/). You are encouraged to take notes.
2. Read a blog post at https://aws.amazon.com/blogs/publicsector/the-scoop-on-moving-your-microsoft-sql-server-to-aws/ about the benefits of running Microsoft SQL Server on Amazon RDS. You are again encouraged to take notes.
3. Participate in the class conversation about the questions posed on the next slide.


Activity: Check your understanding


1. Between Amazon EC2 and Amazon RDS, which provides a managed service? What does managed
service mean?
• ANSWER: Amazon RDS provides a managed service. Amazon RDS handles provisioning, installation and
patching, automated backups, restoring snapshots from points in time, high availability, and monitoring.
2. Name at least one advantage of deploying Microsoft SQL Server on Amazon EC2 instead of Amazon
RDS.
• ANSWER: Amazon EC2 offers complete control over every configuration, the OS, and the software stack.
3. What advantage does the Quick Start provide over a manual installation on Amazon EC2?
• ANSWER: The Quick Start is a reference architecture with proven best practices built into the design.
4. Which deployment option offers the best approach for all use cases?
• ANSWER: Neither. The correct deployment option depends on your specific needs.
5. Which approach costs more: using Amazon EC2 or using Amazon RDS?
• ANSWER: It depends. Managing the database deployment on Amazon EC2 requires more customer oversight
and time. If time is your priority, then Amazon RDS might be less expensive. If you have in-house expertise,
Amazon EC2 might be more cost-effective.


The educator will lead the class in a conversation as each question is revealed. Then, the
educator will display the written suggested responses and you can discuss these points further.

Regarding question 5, the answer was based on the information that is listed on the AWS Pricing
pages as of October 2019.
For Amazon RDS, you pay $0.977 per hour if you run Microsoft SQL Server based on these
parameters:
• Instance – Standard (Single-AZ) instance
• Instance size – db.m5.large
• Region – US East (Ohio)
• Pricing – On-Demand Instance
For Amazon EC2, you pay $0.668 per hour if you run Microsoft SQL Server based on these
parameters:
• Instance – Windows instance
• Instance size – m5.large
• Region – US East (Ohio)
• Pricing – On-Demand Instance


As you consider cost, do not forget to include the cost of labor. For example, keep in mind that with
a standard Single-AZ Amazon RDS deployment—which is the basis of the example price reference—
automated backups are provided. With Amazon RDS, if a DB instance component failed and a user-
initiated restore operation is required, you would have a restorable backup that you could use. If you
run the database on Amazon EC2, you could configure an equally robust backup procedure for
Microsoft SQL Server. However, it would take time, knowledge, and technical skill to build the
solution. You would also need to pre-configure the solution before you encounter the situation
where you need it. For these reasons, when you consider the needs of your deployments holistically,
you might find that using Amazon RDS is less expensive than using Amazon EC2. However, if you
have skilled database administrators on staff—and you also have very specific deployment
requirements that make it preferable for you to have total control over all aspects of the
deployment—you could use Amazon EC2. In this case, you might find Amazon EC2 to be the more
cost-effective solution.
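The trade-off above — lower infrastructure cost on EC2 versus lower management cost on RDS — can be made concrete with rough arithmetic. The hourly rates are the October 2019 figures quoted earlier; the administration hours and labor rate below are hypothetical, chosen only to show how management effort can flip the comparison:

```python
HOURS_PER_MONTH = 730

# Infrastructure-only monthly cost from the quoted On-Demand hourly rates
# (db.m5.large on RDS vs. m5.large Windows on EC2, US East (Ohio)):
rds_infra = 0.977 * HOURS_PER_MONTH   # ~713.21 USD
ec2_infra = 0.668 * HOURS_PER_MONTH   # ~487.64 USD

def total_cost(infra, admin_hours, hourly_labor_rate=75):
    """Infrastructure cost plus the (hypothetical) cost of DBA labor."""
    return infra + admin_hours * hourly_labor_rate

# Assume self-managing SQL Server on EC2 takes ~10 admin hours/month
# (backups, patching, failover setup), versus ~1 hour/month on RDS:
print(round(total_cost(ec2_infra, admin_hours=10), 2))  # 1237.64
print(round(total_cost(rds_infra, admin_hours=1), 2))   # 788.21
```

With these assumptions, the managed RDS deployment comes out cheaper overall despite its higher hourly rate; with skilled in-house staff (fewer billable admin hours), the EC2 deployment can win instead.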


Section 3: Amazon EC2 cost optimization

Module 6: Compute

Introducing Section 3: Amazon EC2 cost optimization.


Amazon EC2 pricing models

On-Demand Instances
• Pay by the hour.
• No long-term commitments.
• Eligible for the AWS Free Tier.

Reserved Instances
• Full, partial, or no upfront payment for the instance you reserve.
• Discount on the hourly charge for that instance.
• 1-year or 3-year term.

Scheduled Reserved Instances
• Purchase a capacity reservation that is always available on a recurring schedule you specify.
• 1-year term.

Spot Instances
• Instances run as long as they are available and your bid is above the Spot Instance price.
• They can be interrupted by AWS with a 2-minute notification.
• Interruption options include terminated, stopped, or hibernated.
• Prices can be significantly less expensive compared to On-Demand Instances.
• Good choice when you have flexibility in when your applications can run.

Dedicated Hosts
• A physical server with EC2 instance capacity fully dedicated to your use.

Dedicated Instances
• Instances that run in a VPC on hardware that is dedicated to a single customer.

Per-second billing is available for On-Demand Instances, Reserved Instances, and Spot Instances that run Amazon Linux or Ubuntu.

Amazon offers different pricing models to choose from when you want to run EC2 instances.
• Per second billing is only available for On-Demand Instances, Reserved Instances, and Spot
Instances that run Amazon Linux or Ubuntu.
• On-Demand Instances are eligible for the AWS Free Tier (https://aws.amazon.com/free/). They
have the lowest upfront cost and the most flexibility. There are no upfront commitments or
long-term contracts. It is a good choice for applications with short-term, spiky, or
unpredictable workloads.
• Dedicated Hosts are physical servers with instance capacity that is dedicated to your use. They
enable you to use your existing per-socket, per-core, or per-VM software licenses, such as for
Microsoft Windows or Microsoft SQL Server.
• Dedicated Instances are instances that run in a virtual private cloud (VPC) on hardware that’s
dedicated to a single customer. They are physically isolated at the host hardware level from
instances that belong to other AWS accounts.
• Reserved Instances enable you to reserve computing capacity for a 1-year or 3-year term with
lower hourly running costs. The discounted usage price is fixed for as long as you own the
Reserved Instance. If you expect consistent, heavy use, they can provide substantial savings
compared to On-Demand Instances.
• Scheduled Reserved Instances enable you to purchase capacity reservations that recur on a
daily, weekly, or monthly basis, with a specified duration, for a 1-year term. You pay for the
time that the instances are scheduled, even if you do not use them.
• Spot Instances enable you to bid on unused EC2 instances, which can lower your costs. The
hourly price for a Spot Instance fluctuates depending on supply and demand. Your Spot
Instance runs whenever your bid exceeds the current market price.
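The bid-versus-market-price rule can be sketched in a couple of lines. The prices below are made-up sample data, purely to illustrate the mechanic:

```python
def spot_hours(bid, market_prices):
    """Count the hours in which the instance runs: bid meets or exceeds market price."""
    return sum(1 for price in market_prices if bid >= price)

# Hypothetical hourly market prices (USD) for one instance type:
market = [0.021, 0.019, 0.035, 0.024, 0.018, 0.041]

print(spot_hours(bid=0.025, market_prices=market))  # runs 4 of the 6 hours
```

In the remaining hours the instance would be interrupted (terminated, stopped, or hibernated, per the configured interruption behavior), which is why Spot suits workloads that tolerate gaps.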


Amazon EC2 pricing models: Benefits

• On-Demand Instances: low cost and flexibility.
• Spot Instances: large scale, dynamic workloads.
• Reserved Instances: predictability ensures compute capacity is available when needed.
• Dedicated Hosts: save money on licensing costs; help meet compliance and regulatory requirements.

Each Amazon EC2 pricing model provides a different set of benefits.

On-Demand Instances offer the most flexibility, with no long-term contract and low rates.

Spot Instances provide large scale at a significantly discounted price.

Reserved Instances are a good choice if you have predictable or steady-state compute needs (for
example, an instance that you know you want to keep running most or all of the time for months
or years).

Dedicated Hosts are a good choice when you have licensing restrictions for the software you
want to run on Amazon EC2, or when you have specific compliance or regulatory requirements
that preclude you from using the other deployment options.


Amazon EC2 pricing models: Use cases

• On-Demand Instances (spiky workloads) –
  • Short-term, spiky, or unpredictable workloads
  • Application development or testing
  • Users with urgent computing needs for large amounts of additional capacity
• Spot Instances (time-insensitive workloads) –
  • Applications with flexible start and end times
  • Applications only feasible at very low compute prices
• Reserved Instances (steady-state workloads) –
  • Steady state or predictable usage workloads
  • Applications that require reserved capacity, including disaster recovery
  • Users able to make upfront payments to reduce total computing costs even further
• Dedicated Hosts (highly sensitive workloads) –
  • Bring your own license (BYOL)
  • Compliance and regulatory restrictions
  • Usage and licensing tracking
  • Control instance placement


Here is a review of some use cases for the various pricing options.

On-Demand Instance pricing works well for spiky workloads or if you only need to test or run an
application for a short time (for example, during application development or testing). Sometimes,
your workloads are unpredictable, and On-Demand Instances are a good choice for these cases.

Spot Instances are a good choice if your applications can tolerate interruption with a 2-minute
warning notification. By default, instances are terminated, but you can configure them to stop or
hibernate instead. Common use cases include fault-tolerant applications such as web servers, API
backends, and big data processing. Workloads that constantly save data to persistent storage
(such as Amazon S3) are also good candidates.
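
Because Spot Instances can be reclaimed, fault-tolerant workloads typically watch for the interruption notice, which the instance metadata service exposes at `/latest/meta-data/spot/instance-action`. The helper below is a sketch that only parses the documented response body; the sample payload values are made up.

```python
import json
from datetime import datetime, timezone

# Sketch of handling the Spot interruption notice. On a real Spot
# Instance you would poll the instance metadata service, e.g.:
#   http://169.254.169.254/latest/meta-data/spot/instance-action
# which returns a small JSON document once an interruption is scheduled.

def parse_instance_action(body: str):
    """Return (action, time) parsed from a spot/instance-action body."""
    doc = json.loads(body)
    when = datetime.strptime(doc["time"], "%Y-%m-%dT%H:%M:%SZ")
    return doc["action"], when.replace(tzinfo=timezone.utc)

# Example payload in the documented format (values invented):
sample = '{"action": "terminate", "time": "2022-01-05T18:02:00Z"}'
action, when = parse_instance_action(sample)
print(action, when.isoformat())

# With roughly two minutes of warning, a fault-tolerant worker would now
# checkpoint its state to durable storage such as Amazon S3 and drain.
```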

Reserved Instances are a good choice when you have long-term workloads with predictable
usage patterns, such as servers that you know you will want to run in a consistent way over many
months.

Dedicated Hosts are a good choice when you have existing per-socket, per-core, or per-VM
software licenses, or when you must address specific corporate compliance and regulatory
requirements.


The four pillars of cost optimization

Cost Optimization:
• Right size
• Increase elasticity
• Optimal pricing model
• Optimize storage choices


To optimize costs, you must consider four consistent, powerful drivers:


• Right-size – Choose the right balance of instance types. Notice when servers can be either
sized down or turned off, and still meet your performance requirements.
• Increase elasticity – Design your deployments to reduce the amount of server capacity that is
idle by implementing deployments that are elastic, such as deployments that use automatic
scaling to handle peak loads.
• Optimal pricing model – Recognize the available pricing options. Analyze your usage patterns
so that you can run EC2 instances with the right mix of pricing options.
• Optimize storage choices – Analyze the storage requirements of your deployments. Reduce
unused storage overhead when possible, and choose less expensive storage options if they can
still meet your requirements for storage performance.


Pillar 1: Right size

✓ Provision instances to match the need
  • CPU, memory, storage, and network throughput
  • Select appropriate instance types for your use

✓ Use Amazon CloudWatch metrics
  • How idle are instances? When?
  • Downsize instances

✓ Best practice: Right size, then reserve


First, consider right-sizing. AWS offers approximately 60 instance types and sizes. The wide choice
of options enables customers to select the instance that best fits their workload. It can be
difficult to know where to start and what instance choice will prove to be the best, from both a
technical perspective and a cost perspective. Right-sizing is the process of reviewing deployed
resources and looking for opportunities to downsize when possible.

To right-size:
• Select the cheapest instance available that still meets your performance requirements.
• Review CPU, RAM, storage, and network utilization to identify instances that could be
downsized. You might want to provision a variety of instance types and sizes in a test
environment, and then test your application on those different test deployments to identify
which instances offer the best performance-to-cost ratio. For right-sizing, use techniques such
as load testing to your advantage.
• Use Amazon CloudWatch metrics and set up custom metrics. A metric represents a time-
ordered set of values that are published to CloudWatch (for example, the CPU usage of a
particular EC2 instance). Data points can come from any application or business activity for
which you collect data.
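
As an illustration of the kind of check that CloudWatch utilization data supports, the sketch below flags an instance as a downsizing candidate when even its busiest CPU samples stay well below capacity. The 20 percent threshold is an arbitrary example, not an AWS recommendation.

```python
# Hypothetical right-sizing check: given CloudWatch CPUUtilization data
# points (percent) collected for an instance, flag it as a downsizing
# candidate when even its busiest samples stay far below capacity.
# The 20 percent peak threshold is an illustrative choice.

def is_downsize_candidate(cpu_samples, peak_threshold=20.0):
    """True if the highest observed CPU utilization stays under the threshold."""
    return bool(cpu_samples) and max(cpu_samples) < peak_threshold

mostly_idle = [3.1, 4.8, 2.2, 9.5, 6.0]   # percent CPU, made-up samples
busy = [35.0, 62.5, 48.1, 71.3]

print(is_downsize_candidate(mostly_idle))  # True: consider a smaller type
print(is_downsize_candidate(busy))         # False: leave it right-sized
```

In practice you would look at more than peak CPU (memory, storage, and network throughput as well, as the slide notes), but the decision logic follows this pattern.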


Pillar 2: Increase elasticity


✓ Stop or hibernate Amazon EBS-backed instances that are not actively in use
  • Example: non-production development or test instances

✓ Use automatic scaling to match needs based on usage
  • Automated and time-based elasticity


One form of elasticity is to create, start, or use EC2 instances when they are needed, but then to
turn them off when they are not in use. Elasticity is one of the central tenets of the cloud, but
customers often go through a learning process to operationalize elasticity to drive cost savings.

The easiest way for large customers to embrace elasticity is to look for resources that look like
good candidates for stopping or hibernating, such as non-production environments, development
workloads, or test workloads. For example, if you run development or test workloads in a single
time zone, you can easily turn off those instances outside of business hours and thus reduce
runtime costs by perhaps 65 percent. The concept is similar to why there is a light switch next to
the door, and why most offices encourage employees to turn off the lights on their way out of the
office each night.
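
The arithmetic behind the "perhaps 65 percent" figure is simple: running instances only during weekday business hours instead of around the clock eliminates roughly two-thirds of the runtime.

```python
# Arithmetic behind the "perhaps 65 percent" savings claim: run dev/test
# instances only during business hours on weekdays instead of 24/7.
# The 12-hour business day is an illustrative assumption.

hours_per_week = 24 * 7        # 168 hours if always on
business_hours = 12 * 5        # 12 hours/day, 5 days/week = 60 hours

savings = 1 - business_hours / hours_per_week
print(f"Runtime cost reduction: {savings:.0%}")  # about 64 percent
```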

For production workloads, configuring more precise and granular automatic scaling policies can
help you take advantage of horizontal scaling to meet peak capacity needs and to not pay for
peak capacity all the time.

As a rule of thumb, you should target 20–30 percent of your Amazon EC2 instances to run as On-
Demand Instances or Spot Instances, and you should also actively look for ways to maximize
elasticity.


Pillar 3: Optimal pricing model


✓ Leverage the right pricing model for your use case
  • Consider your usage patterns

✓ Optimize and combine purchase types

✓ Examples:
  • Use On-Demand Instances and Spot Instances for variable workloads
  • Use Reserved Instances for predictable workloads

✓ Consider serverless solutions (AWS Lambda)


AWS provides a number of pricing models for Amazon EC2 to help customers save money. The
models available were discussed in detail earlier in this module. Customers can combine multiple
purchase types to optimize pricing based on their current and forecast capacity needs.

Customers are also encouraged to consider their application architecture. For example, does the
functionality provided by your application need to run on an EC2 virtual machine? Perhaps by
making use of the AWS Lambda service instead, you could significantly decrease your costs.

AWS Lambda is discussed later in this module.


Pillar 4: Optimize storage choices


✓ Reduce costs while maintaining storage performance and availability

✓ Resize EBS volumes

✓ Change EBS volume types
  • Can you meet performance requirements with less expensive storage?
  • Example: Amazon EBS Throughput Optimized HDD (st1) storage typically costs half as much as the default General Purpose SSD (gp2) storage option.

✓ Delete EBS snapshots that are no longer needed

✓ Identify the most appropriate destination for specific types of data
  • Does the application need the instance to reside on Amazon EBS?
  • Amazon S3 storage options with lifecycle policies can reduce costs


Customers can also reduce storage costs. When you launch EC2 instances, different instance
types offer different storage options. It is a best practice to try to reduce costs while also
maintaining storage performance and availability.

One way you can accomplish this is by resizing EBS volumes. For example, if you originally
provisioned a 500-GB volume for an EC2 instance that will only need a maximum of 20 GB of
storage space, you can reduce the size of the volume and save on costs.

There are also a variety of EBS volume types. Choose the least expensive type that still meets
your performance requirements. For example, Amazon EBS Throughput Optimized HDD (st1)
storage typically costs half as much as the default General Purpose SSD (gp2) storage option. If an
st1 drive will meet the needs of your workload, take advantage of the cost savings.

Customers often use EBS snapshots to create data backups. However, some customers forget to
delete snapshots that are no longer needed. Delete these unneeded snapshots to save on costs.

Finally, try to identify the most appropriate destination for specific types of data. Does your
application need the data it uses to reside on Amazon EBS? Would the application run equally as
well if it used Amazon S3 for storage instead? Configuring data lifecycle policies can also reduce
costs. For example, you might automate the migration of older infrequently accessed data to
cheaper storage locations, such as Amazon Simple Storage Service Glacier.
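
The following sketch works through the storage example numerically. The per-GB-month prices are assumptions chosen to match the "roughly half" claim; actual Amazon EBS prices vary by Region.

```python
# Illustrative EBS cost comparison for the example in the text. The
# per-GB-month prices below are assumptions for the sake of arithmetic,
# not published AWS prices; check the Amazon EBS pricing page.

GP2_PRICE = 0.10   # assumed General Purpose SSD (gp2), $/GB-month
ST1_PRICE = 0.045  # assumed Throughput Optimized HDD (st1), $/GB-month

def monthly_storage_cost(size_gb, price_per_gb):
    """Monthly cost of one provisioned EBS volume."""
    return size_gb * price_per_gb

volume_gb = 500
gp2 = monthly_storage_cost(volume_gb, GP2_PRICE)
st1 = monthly_storage_cost(volume_gb, ST1_PRICE)
print(f"gp2: ${gp2:.2f}/month, st1: ${st1:.2f}/month")

# Right-sizing the volume compounds the savings: the 500-GB volume that
# only ever holds 20 GB of data could also simply be provisioned smaller.
print(f"20-GB gp2 volume: ${monthly_storage_cost(20, GP2_PRICE):.2f}/month")
```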


Measure, monitor, and improve


• Cost optimization is an ongoing process.

• Recommendations –
  • Define and enforce cost allocation tagging.
  • Define metrics, set targets, and review regularly.
  • Encourage teams to architect for cost.
  • Assign the responsibility of optimization to an individual or to a team.


If it is done correctly, cost optimization is not a one-time process that a customer completes.
Instead, by routinely measuring and analyzing your systems, you can continually improve and
adjust your costs.

Tagging helps provide information about what resources are being used by whom and for what
purpose. You can activate cost allocation tags in the Billing and Cost Management console, and
AWS can generate a cost allocation report with usage and costs grouped by your active tags.
Apply tags that represent business categories (such as cost centers, application names, or
owners) to organize your costs across multiple services.
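
A small sketch of why tagging pays off: once resources carry a tag such as a cost center, spend can be rolled up per business category. The records below are invented sample data, not actual AWS billing output.

```python
from collections import defaultdict

# Hypothetical billing records tagged with a "cost-center" key.
# One untagged record shows why enforcing tagging matters.
records = [
    {"service": "Amazon EC2", "cost": 120.0, "tags": {"cost-center": "web"}},
    {"service": "Amazon S3",  "cost": 15.5,  "tags": {"cost-center": "web"}},
    {"service": "Amazon EC2", "cost": 80.0,  "tags": {"cost-center": "analytics"}},
    {"service": "Amazon EC2", "cost": 9.9,   "tags": {}},  # untagged!
]

def costs_by_tag(records, key="cost-center"):
    """Sum costs per tag value, collecting untagged spend separately."""
    totals = defaultdict(float)
    for record in records:
        totals[record["tags"].get(key, "(untagged)")] += record["cost"]
    return dict(totals)

print(costs_by_tag(records))
# {'web': 135.5, 'analytics': 80.0, '(untagged)': 9.9}
```

This is the same grouping that the cost allocation report performs for your active tags, just reduced to a few lines.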

Encourage teams to architect for cost. AWS Cost Explorer is a free tool that you can use to view
graphs of your costs. You can use Cost Explorer to see patterns in how much you spend on AWS
resources over time, identify areas that need further inquiry, and see trends that you can use to
understand your costs.

Use AWS services such as AWS Trusted Advisor, which provides real-time guidance to help you
provision resources that follow AWS best practices.

Cost-optimization efforts are typically more successful when the responsibility for cost
optimization is assigned to an individual or to a team.


Section 3 key takeaways

• Amazon EC2 pricing models include On-Demand Instances, Reserved Instances, Spot Instances, Dedicated Instances, and Dedicated Hosts.
• Spot Instances can be interrupted with a 2-minute notification. However, they can offer significant cost savings over On-Demand Instances.
• The four pillars of cost optimization are:
  • Right size
  • Increase elasticity
  • Optimal pricing model
  • Optimize storage choices


Some key takeaways from this section of the module are:


• Amazon EC2 pricing models include On-Demand Instances, Reserved Instances, Spot
Instances, Dedicated Instances, and Dedicated Hosts. Per-second billing is available for On-Demand Instances, Reserved Instances, and Spot Instances that run Amazon Linux or Ubuntu.
• Spot Instances can be interrupted with a 2-minute notification. However, they can offer
significant cost savings over On-Demand Instances.
• The four pillars of cost optimization are –
• Right size
• Increase elasticity
• Optimal pricing model
• Optimize storage choices


Section 4: Container services


Module 6: Compute


Introducing Section 4: Container services.


Container basics
• Containers are a method of operating system virtualization.

• Benefits –
  • Repeatable.
  • Self-contained environments.
  • Software runs the same in different environments (developer's laptop, test, production).
  • Faster to launch and stop or terminate than virtual machines.

Diagram: Your container holds your application, its dependencies, its configurations, and hooks into the OS.

Containers are a method of operating system virtualization that enables you to run an
application and its dependencies in resource-isolated processes. By using containers, you can
easily package an application's code, configurations, and dependencies into easy-to-use building
blocks that deliver environmental consistency, operational efficiency, developer productivity, and
version control.

Containers are smaller than virtual machines, and do not contain an entire operating system.
Instead, containers share a virtualized operating system and run as resource-isolated processes,
which ensure quick, reliable, and consistent deployments. Containers hold everything that the
software needs to run, such as libraries, system tools, code, and the runtime.

Containers deliver environmental consistency because the application’s code, configurations,


and dependencies are packaged into a single object.

In terms of space, container images are usually an order of magnitude smaller than virtual machines. Spinning up a container happens in hundreds of milliseconds. Thus, containers give you fast, portable, and infrastructure-agnostic environments.

Containers can help ensure that applications deploy quickly, reliably, and consistently, regardless
of deployment environment. Containers also give you more granular control over resources,
which gives your infrastructure improved efficiency.


What is Docker?
• Docker is a software platform that enables you to build, test, and deploy applications quickly.

• You run containers on Docker.

• Containers are created from a template called an image.

• A container has everything a software application needs to run: libraries, system tools, code, and the runtime.


Docker is a software platform that packages software (such as applications) into containers.

Docker is installed on each server that will host containers, and it provides simple commands that
you can use to build, start, or stop containers.

By using Docker, you can quickly deploy and scale applications into any environment.

Docker is best used as a solution when you want to:


• Standardize environments
• Reduce conflicts between language stacks and versions
• Use containers as a service
• Run microservices using standardized code deployments
• Require portability for data processing
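
The packaging step described in these notes is usually driven by a Dockerfile, the build recipe for an image. The following is a minimal, hypothetical example for a small Python web application; the base image, file names, and port are illustrative assumptions, not a required layout.

```dockerfile
# Hypothetical Dockerfile for a small Python web app. The base image,
# file names, and port are assumptions for illustration only.
FROM python:3.9-slim

WORKDIR /app

# Bake the dependencies into the image so it runs the same everywhere:
# developer's laptop, test, and production.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Add the application code and its configuration.
COPY . .

EXPOSE 8080
CMD ["python", "app.py"]
```

Running `docker build -t my-app .` produces an image from this recipe, and `docker run -p 8080:8080 my-app` starts a container from it.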


Containers versus virtual machines


Diagram: Three containers on one EC2 instance (left) versus three virtual machines on three EC2 instances (right). On the left, a single EC2 instance runs a guest OS and the Docker engine, and each of the three apps runs with its bins/libs in its own container. On the right, each app runs with its bins/libs on its own guest OS in a separate EC2 instance. In both cases, the instances run on a hypervisor, host operating system, and physical server that are part of the AWS Global Infrastructure.


Many people who are first introduced to the concept of a container think that containers are
exactly like virtual machines. However, the differences are in the details. One significant
difference is that virtual machines run directly on a hypervisor, but containers can run on any
Linux OS if they have the appropriate kernel feature support and the Docker daemon is present.
This makes containers very portable. Your laptop, your VM, your EC2 instance, and your bare
metal server are all potential hosts where you can run a container.

The right side of the diagram shows a virtual machine (VM)-based deployment. Each of the three EC2
instances runs directly on the hypervisor that is provided by the AWS Global Infrastructure. Each
EC2 instance runs a virtual machine. In this VM-based deployment, each of the three apps runs
on its own VM, which provides process isolation.

The left side of the diagram shows a container-based deployment. There is only one EC2 instance that
runs a virtual machine. The Docker engine is installed on the Linux guest OS of the EC2 instance,
and there are three containers. In this container-based deployment, each app runs in its own
container (which provides process isolation), but all the containers run on a single EC2 instance.
The processes that run in the containers communicate directly to the kernel in the Linux guest OS
and are largely unaware of their container silo. The Docker engine is present to manage how the
containers run on the Linux guest OS, and it also provides essential management functions
throughout the container lifecycle.

In an actual container-based deployment, a large EC2 instance could run hundreds of containers.


Amazon Elastic Container Service (Amazon ECS)


• Amazon Elastic Container Service (Amazon ECS) –
  • A highly scalable, fast container management service

• Key benefits –
  • Orchestrates the running of Docker containers
  • Maintains and scales the fleet of nodes that run your containers
  • Removes the complexity of standing up the infrastructure

• Integrated with features that are familiar to Amazon EC2 service users –
  • Elastic Load Balancing
  • Amazon EC2 security groups
  • Amazon EBS volumes
  • IAM roles

Given what you now know about containers, you might think that you could launch one or more
Amazon EC2 instances, install Docker on each instance, and manage and run the Docker
containers on those Amazon EC2 instances yourself. While that is an option, AWS provides a
service called Amazon Elastic Container Service (Amazon ECS) that simplifies container
management.

Amazon Elastic Container Service (Amazon ECS) is a highly scalable, high-performance container
management service that supports Docker containers. Amazon ECS enables you to easily run
applications on a managed cluster of Amazon EC2 instances.

Essential Amazon ECS features include the ability to:


• Launch up to tens of thousands of Docker containers in seconds
• Monitor container deployment
• Manage the state of the cluster that runs the containers
• Schedule containers by using a built-in scheduler or a third-party scheduler (for example,
Apache Mesos or Blox)

Amazon ECS clusters can also use Spot Instances and Reserved Instances.


Amazon ECS orchestrates containers

Diagram: Amazon Elastic Container Service (Amazon ECS) receives requests to run containers (three copies of Container A and two copies of Container B) and places them on the EC2 instances in an ECS cluster.

To prepare your application to run on Amazon ECS, you create a task definition, which is a text file that describes one or more containers, up to a maximum of ten, that form your application. It
can be thought of as a blueprint for your application. Task definitions specify parameters for your
application, for example which containers to use, which ports should be opened for your
application, and what data volumes should be used with the containers in the task.

A task is the instantiation of a task definition within a cluster. You can specify the number of tasks
that will run on your cluster. The Amazon ECS task scheduler is responsible for placing tasks
within your cluster. A task will run anywhere from one to ten containers, depending on the task
definition you defined.
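
An abbreviated sketch of what such a task definition might look like follows. The family name, container name, and image are hypothetical, and only a few of the available parameters are shown.

```json
{
  "family": "web-app",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "nginx:latest",
      "cpu": 256,
      "memory": 512,
      "essential": true,
      "portMappings": [
        { "containerPort": 80, "hostPort": 80 }
      ]
    }
  ]
}
```

Registering this blueprint and then running it as a task tells Amazon ECS which image to pull, how much CPU and memory to reserve, and which port to open.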

When Amazon ECS runs the containers that make up your task, it places them on an ECS cluster.
The cluster (when you choose the EC2 launch type) consists of a group of EC2 instances, each of
which is running an Amazon ECS container agent.

Amazon ECS provides multiple scheduling strategies that will place containers across your clusters
based on your resource needs (for example, CPU or RAM) and availability requirements.


Amazon ECS cluster options


• Key question: Do you want to manage the Amazon ECS cluster that runs the containers?
  • If yes, create an Amazon ECS cluster backed by Amazon EC2 (provides more granular control over infrastructure).
  • If no, create an Amazon ECS cluster backed by AWS Fargate (easier to maintain; focus on your applications).

Diagram: With an Amazon ECS cluster backed by Amazon EC2, you manage the container instances, the Docker engines (one per OS in the cluster), and the VM guest operating systems. With an Amazon ECS cluster backed by Fargate, AWS manages all of these for you.


When you create an Amazon ECS cluster, you have three options:
• A Networking Only cluster (powered by AWS Fargate)
• An EC2 Linux + Networking cluster
• An EC2 Windows + Networking cluster

If you choose one of the two EC2 launch type options, you will then be prompted to choose
whether the cluster EC2 instances will run as On-Demand Instances or Spot Instances. In addition,
you will need to specify many details about the EC2 instances that will make up your cluster—the
same details that you must specify when you launch a standalone EC2 instance. In this way, the
EC2 launch type provides more granular control over the infrastructure that runs your container
applications because you manage the EC2 instances that make up the cluster.
Amazon ECS keeps track of all the CPU, memory, and other resources in your cluster. Amazon ECS
also finds the best server for your container based on your specified resource requirements.

If you choose the networking-only Fargate launch type, then the cluster that will run your
containers will be managed by AWS. With this option, you only need to package your application
in containers, specify the CPU and memory requirements, define networking and IAM policies,
and launch the application. You do not need to provision, configure, or scale the cluster. It
removes the need to choose server types, decide when to scale your clusters, or optimize cluster
packing. The Fargate option enables you to focus on designing and building your applications.


What is Kubernetes?
• Kubernetes is open source software for container orchestration.
  • Deploy and manage containerized applications at scale.
  • The same toolset can be used on premises and in the cloud.
• Complements Docker.
  • Docker enables you to run multiple containers on a single OS host.
  • Kubernetes orchestrates multiple Docker hosts (nodes).
• Automates –
  • Container provisioning.
  • Networking.
  • Load distribution.
  • Scaling.

Kubernetes is open source software for container orchestration. Kubernetes can work with many
containerization technologies, including Docker. Because it is a popular open source project, a
large community of developers and companies build extensions, integrations, and plugins that
keep the software relevant, and new and in-demand features are added frequently.

Kubernetes enables you to deploy and manage containerized applications at scale. With
Kubernetes, you can run any type of containerized application by using the same toolset in both
on-premises data centers and the cloud. Kubernetes manages a cluster of compute instances
(called nodes). It runs containers on the cluster, which are based on where compute resources
are available and the resource requirements of each container. Containers are run in logical
groupings called pods. You can run and scale one or many containers together as a pod. Each pod
is given an IP address and a single Domain Name System (DNS) name, which Kubernetes uses to
connect your services with each other and external traffic.

A key advantage of Kubernetes is that you can use it to run your containerized applications
anywhere without needing to change your operational tooling. For example, applications can be
moved from local on-premises development machines to production deployments in the cloud by
using the same operational tooling.
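
A minimal, hypothetical manifest illustrates the pod grouping described above: two containers scheduled together in one pod, sharing the pod's IP address. The names and images are illustrative assumptions.

```yaml
# Hypothetical Pod manifest: two containers run and scale together as
# one pod and share the pod's IP address. Names/images are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: web-pod
spec:
  containers:
    - name: web
      image: nginx:1.21
      ports:
        - containerPort: 80
    - name: log-sidecar
      image: busybox:1.35
      command: ["sh", "-c", "tail -f /dev/null"]
```

Applying a manifest like this (for example, with `kubectl apply -f pod.yaml`) works the same way on a laptop cluster, an on-premises data center, or the cloud, which is the portability advantage described above.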


Amazon Elastic Kubernetes Service (Amazon EKS)


• Amazon Elastic Kubernetes Service (Amazon EKS) –
  • Enables you to run Kubernetes on AWS
  • Certified Kubernetes conformant (supports easy migration)
  • Supports Linux and Windows containers
  • Compatible with Kubernetes community tools and supports popular Kubernetes add-ons

• Use Amazon EKS to –
  • Manage clusters of Amazon EC2 compute instances
  • Run containers that are orchestrated by Kubernetes on those instances

You might think that you could launch one or more Amazon EC2 instances, install Docker on each
instance, install Kubernetes on the cluster, and manage and run Kubernetes yourself. While that is
an option, AWS provides a service called Amazon Elastic Kubernetes Service (Amazon EKS) that
simplifies the management of Kubernetes clusters.

Amazon Elastic Kubernetes Service (Amazon EKS) is a managed Kubernetes service that makes it
easy for you to run Kubernetes on AWS without needing to install, operate, and maintain your
own Kubernetes control plane. It is certified Kubernetes conformant, so existing applications that
run on upstream Kubernetes are compatible with Amazon EKS.

Amazon EKS automatically manages the availability and scalability of the cluster nodes that are
responsible for starting and stopping containers, scheduling containers on virtual machines,
storing cluster data, and other tasks. It automatically detects and replaces unhealthy control
plane nodes for each cluster. You can take advantage of the performance, scale, reliability, and
availability of the AWS Cloud, which includes AWS networking and security services like
Application Load Balancers for load distribution, IAM for role-based access control, and VPC for
pod networking.

You may be wondering why Amazon offers both Amazon ECS and Amazon EKS, since they are
both capable of orchestrating Docker containers. The reason that both services exist is to provide
customers with flexible options. You can decide which option best matches your needs.


Amazon Elastic Container Registry (Amazon ECR)


Amazon ECR is a fully managed Docker container registry that makes it easy for developers to store, manage, and deploy Docker container images.

Features:
• Amazon ECS integration
• Docker support
• Team collaboration
• Access control
• Third-party integrations


Amazon Elastic Container Registry (Amazon ECR) is a fully managed Docker container registry
that makes it easy for developers to store, manage, and deploy Docker container images. It is
integrated with Amazon ECS, so you can store, run, and manage container images for
applications that run on Amazon ECS. Specify the Amazon ECR repository in your task definition,
and Amazon ECS will retrieve the appropriate images for your applications.

Amazon ECR supports Docker Registry HTTP API version 2, which enables you to interact with
Amazon ECR by using Docker CLI commands or your preferred Docker tools. Thus, you can
maintain your existing development workflow and access Amazon ECR from any Docker
environment—whether it is in the cloud, on premises, or on your local machine.

You can transfer your container images to and from Amazon ECR via HTTPS. Your images are also
automatically encrypted at rest using Amazon S3 server-side encryption.

It is also possible to use Amazon ECR images with Amazon EKS. See the Using Amazon ECR
Images with Amazon EKS documentation at
https://docs.aws.amazon.com/AmazonECR/latest/userguide/ECR_on_EKS.html for details.


Section 4 key takeaways

• Containers can hold everything that an application needs to run.
• Docker is a software platform that packages software into containers.
• A single application can span multiple containers.
• Amazon Elastic Container Service (Amazon ECS) orchestrates the running of Docker containers.
• Kubernetes is open source software for container orchestration.
• Amazon Elastic Kubernetes Service (Amazon EKS) enables you to run Kubernetes on AWS.
• Amazon Elastic Container Registry (Amazon ECR) enables you to store, manage, and deploy your Docker containers.

© 2022, Amazon Web Services, Inc. or its affiliates. All rights reserved. 66

Some key takeaways from this section include:


• Containers can hold everything that an application needs to run.
• Docker is a software platform that packages software into containers.
• A single application can span multiple containers.
• Amazon Elastic Container Service (Amazon ECS) orchestrates the running of Docker containers.
• Kubernetes is open source software for container orchestration.
• Amazon Elastic Kubernetes Service (Amazon EKS) enables you to run Kubernetes on AWS.
• Amazon Elastic Container Registry (Amazon ECR) enables you to store, manage, and deploy
your Docker containers.


Section 5: Introduction to AWS Lambda
Module 6: Compute

Introducing Section 5: Introduction to AWS Lambda.


AWS Lambda: Run code without servers

AWS Lambda is a serverless compute service.
• The code you run is a Lambda function.
• Upload your code; it runs only when it is triggered by AWS services, HTTP endpoints, mobile apps, a schedule, or other events.
• Pay only for the compute time that you use.

As you saw in the earlier sections of this module, AWS offers many compute options. For
example, Amazon EC2 provides virtual machines. As another example, Amazon ECS and Amazon
EKS are container-based compute services.

However, there is another approach to compute that does not require you to provision or
manage servers. This third approach is often referred to as serverless computing.

AWS Lambda is an event-driven, serverless compute service. Lambda enables you to run code
without provisioning or managing servers.

You create a Lambda function, which is the AWS resource that contains the code that you
upload. You then set the Lambda function to be triggered, either on a scheduled basis or in
response to an event. Your code only runs when it is triggered.

You pay only for the compute time you consume—you are not charged when your code is not
running.
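The code that Lambda runs takes the form of a handler function that receives an event and a context object. The minimal Python sketch below shows the shape; the handler name, the `name` field, and the response structure are illustrative choices, not anything Lambda requires beyond the `(event, context)` signature.

```python
import json

# Minimal Lambda handler sketch. Lambda calls handler(event, context)
# each time the function is triggered; 'event' carries the trigger data
# and 'context' carries runtime metadata. The 'name' field and the
# response shape here are illustrative.
def handler(event, context):
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }

# The same function can be exercised locally with a fake event:
response = handler({"name": "Lambda"}, None)
```

Because the handler is an ordinary function, it can be unit tested locally with synthetic events before it is ever deployed.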


Benefits of Lambda
• It supports multiple programming languages
• Completely automated administration
• Built-in fault tolerance
• It supports the orchestration of multiple functions
• Pay-per-use pricing

With Lambda, there are no new languages, tools, or frameworks to learn. Lambda supports
multiple programming languages, including Java, Go, PowerShell, Node.js, C#, Python, and Ruby.
Your code can use any library, either native or third-party.

Lambda completely automates the administration. It manages all the infrastructure to run your
code on highly available, fault-tolerant infrastructure, which enables you to focus on building
differentiated backend services. Lambda seamlessly deploys your code; does all the
administration, maintenance, and security patches; and provides built-in logging and monitoring
through Amazon CloudWatch.

Lambda provides built-in fault tolerance. It maintains compute capacity across multiple
Availability Zones in each Region to help protect your code against individual machine failures or
data center failures. There are no maintenance windows or scheduled downtimes.

You can orchestrate multiple Lambda functions for complex or long-running tasks by building
workflows with AWS Step Functions. Use Step Functions to define workflows. These workflows
trigger a collection of Lambda functions by using sequential, parallel, branching, and error-
handling steps. With Step Functions and Lambda, you can build stateful, long-running processes
for applications and backends.

With Lambda, you pay only for the requests that are served and the compute time that is
required to run your code. Billing is metered in increments of 100 milliseconds, which makes it
cost-effective and easy to scale automatically from a few requests per day to thousands of
requests per second.
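To make the pay-per-use model concrete, the sketch below estimates a bill from the two metered quantities, request count and compute time, rounding each invocation up to the 100-millisecond increments described above. Both prices are illustrative placeholders, not current AWS rates.

```python
import math

# Illustrative Lambda cost model: a per-request charge plus a charge
# per GB-second of compute, with each invocation's duration rounded up
# to a 100 ms increment as described above. Both prices below are
# placeholders, not current AWS rates.
PRICE_PER_REQUEST = 0.0000002       # placeholder USD per request
PRICE_PER_GB_SECOND = 0.0000166667  # placeholder USD per GB-second

def estimate_cost(requests: int, duration_ms: float, memory_mb: int) -> float:
    billed_ms = math.ceil(duration_ms / 100) * 100   # round up to 100 ms
    gb_seconds = requests * (billed_ms / 1000) * (memory_mb / 1024)
    return requests * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND

# One million 120 ms invocations at 512 MB are each billed as 200 ms:
monthly = estimate_cost(1_000_000, 120, 512)
```

The key point the model captures is that cost scales with both memory allocation and duration; when your code is not running, the compute term is zero.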


AWS Lambda event sources

Configure other AWS services as event sources to invoke your Lambda function:
• Amazon S3
• Amazon DynamoDB
• Amazon Simple Notification Service (Amazon SNS)
• Amazon Simple Queue Service (Amazon SQS)
• Amazon API Gateway
• Application Load Balancer
• Many more

Alternatively, invoke a Lambda function from the Lambda console, AWS SDK, or AWS CLI. Your code runs only when it is triggered, with logging, monitoring, and metrics provided through Amazon CloudWatch.

An event source is an AWS service or a developer-created application that produces events that
trigger an AWS Lambda function to run.

Some services publish events to Lambda by invoking the Lambda function directly. Services
that invoke Lambda functions asynchronously include, but are not limited to, Amazon S3,
Amazon Simple Notification Service (Amazon SNS), and Amazon CloudWatch Events.

Lambda can also poll resources in other services that do not publish events to Lambda. For
example, Lambda can pull records from an Amazon Simple Queue Service (Amazon SQS) queue
and run a Lambda function for each fetched message. Lambda can similarly read events from
Amazon DynamoDB.

Some services, such as Elastic Load Balancing (Application Load Balancer) and Amazon API
Gateway, can invoke your Lambda function directly.

You can invoke Lambda functions directly with the Lambda console, the Lambda API, the AWS
software development kit (SDK), the AWS CLI, and AWS toolkits. The direct invocation approach
can be useful, such as when you are developing a mobile app and want the app to call Lambda
functions. See the Using Lambda with Other Services documentation at
https://docs.aws.amazon.com/lambda/latest/dg/lambda-services.html for further details about
all supported services.

AWS Lambda automatically monitors Lambda functions by using Amazon CloudWatch. To help
you troubleshoot failures in a function, Lambda logs all requests that are handled by your
function. It also automatically stores logs that are generated by your code through Amazon
CloudWatch Logs.

AWS Lambda function configuration

A Lambda function configuration includes:
• Function code
• Dependencies (code libraries, etc.)
• An execution role

AWS Lambda runs your code only when it is triggered, with logging, monitoring, and metrics provided through Amazon CloudWatch.

Remember that a Lambda function is the custom code that you write to process events, and that
Lambda runs the Lambda function on your behalf.

When you use the AWS Management Console to create a Lambda function, you first give the
function a name. Then, you specify:
• The runtime environment the function will use (for example, a version of Python or Node.js)
• An execution role (to grant IAM permission to the function so that it can interact with other
AWS services as necessary)

Next, after you click Create Function, you configure the function. Configurations include:
• Add a trigger (specify one of the available event sources from the previous slide)
• Add your function code (use the provided code editor or upload a file that contains your code)
• Specify the memory in MB to allocate to your function (128 MB to 10,240 MB)
• Optionally specify environment variables, description, timeout, the specific virtual private
cloud (VPC) to run the function in, tags you would like to use, and other settings. For more
information, see Configuring functions in the AWS Lambda console
https://docs.aws.amazon.com/lambda/latest/dg/configuration-console.html in the AWS
Documentation.

All of these settings end up in a Lambda deployment package, which is a ZIP archive that
contains your function code and dependencies. When you use the Lambda console to author
your function, the console manages the package for you. However, you must create the
deployment package yourself if you use the Lambda API to manage functions.


Schedule-based Lambda function example: Start and stop EC2 instances

Stop instances example:
1. A time-based CloudWatch event occurs.
2. The Lambda function (using its IAM role) is triggered.
3. The EC2 instances are stopped.

Start instances example:
4. A time-based CloudWatch event occurs.
5. The Lambda function (using its IAM role) is triggered.
6. The EC2 instances are started.

Consider an example use case for a schedule-based Lambda function. Say that you are in a
situation where you want to reduce your Amazon EC2 usage. You decide that you want to stop
instances at a predefined time (for example, at night when no one is accessing them) and then
you want to start the instances back up in the morning (before the workday starts).

In this situation, you could configure AWS Lambda and Amazon CloudWatch Events to help you
accomplish these actions automatically.

Here is what happens at each step in the example:


1. A CloudWatch event is scheduled to run a Lambda function to stop your EC2 instances at (for
example) 22:00 GMT.
2. The Lambda function is triggered and runs with the IAM role that gives the function
permission to stop the EC2 instances.
3. The EC2 instances enter the stopped state.
4. Later, at (for example) 05:00 GMT, a CloudWatch event is scheduled to run a Lambda
function to start the EC2 instances.
5. The Lambda function is triggered and runs with the IAM role that gives it permission to start
the EC2 instances.
6. The EC2 instances enter the running state.
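A hedged sketch of the stop-instances function from steps 1 through 3 follows. The Region and instance IDs are placeholders; boto3 (the AWS SDK for Python) is preinstalled in the Lambda Python runtime, and the function's IAM execution role must grant the `ec2:StopInstances` permission for the call to succeed.

```python
# Sketch of the scheduled stop-instances Lambda described above.
# REGION and INSTANCE_IDS are placeholders; the function's execution
# role must grant ec2:StopInstances permission.
REGION = "us-east-1"                     # placeholder Region
INSTANCE_IDS = ["i-0123456789abcdef0"]   # placeholder instance ID

def handler(event, context):
    import boto3  # AWS SDK for Python; preinstalled in the Lambda runtime
    ec2 = boto3.client("ec2", region_name=REGION)
    ec2.stop_instances(InstanceIds=INSTANCE_IDS)
    return {"stopped": INSTANCE_IDS}
```

The matching start function (steps 4 through 6) is identical except that it calls `start_instances` and its role grants `ec2:StartInstances`.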


Event-based Lambda function example: Create thumbnail images

The diagram shows a user uploading an image to a source S3 bucket, which invokes a Lambda function; the function runs with its execution role, reads the image, and saves a thumbnail to a target bucket.

Now, consider an example use case for an event-based Lambda function. Suppose that you want
to create a thumbnail for each image (.jpg or .png object) that is uploaded to an S3 bucket.

To build a solution, you can create a Lambda function that Amazon S3 invokes when objects are
uploaded. Then, the Lambda function reads the image object from the source bucket and creates
a thumbnail image in a target bucket. Here’s how it works:
1. A user uploads an object to the source bucket in Amazon S3 (object-created event).
2. Amazon S3 detects the object-created event.
3. Amazon S3 publishes the object-created event to Lambda by invoking the Lambda function
and passing event data.
4. Lambda runs the Lambda function by assuming the execution role that you specified when
you created the Lambda function.
5. Based on the event data that the Lambda function receives, it knows the source bucket name
and object key name. The Lambda function reads the object, creates a thumbnail by using
graphics libraries, and saves the thumbnail to the target bucket.
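The first part of step 5, reading the bucket name and object key out of the event data, can be sketched in Python as follows. The event shape matches the S3 notification format; the actual image resizing (for example, with a graphics library such as Pillow) is omitted from this sketch.

```python
import urllib.parse

# Sketch of extracting the source bucket and object key from the event
# data that Amazon S3 passes to the Lambda function (step 5 above).
# Object keys arrive URL-encoded, with spaces encoded as '+'.
def get_source_object(event):
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
    return bucket, key

# A trimmed-down example of the S3 object-created event structure:
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "source-bucket"},
                "object": {"key": "photos/my+cat.jpg"}}}
    ]
}
```

Decoding the key with `unquote_plus` matters because an upload named "my cat.jpg" arrives in the event as "my+cat.jpg".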


AWS Lambda quotas

Soft limits per Region:
• Concurrent executions = 1,000
• Function and layer storage = 75 GB

Hard limits for individual functions:
• Maximum function memory allocation = 10,240 MB
• Function timeout = 15 minutes
• Deployment package size = 250 MB unzipped, including layers
• Container image code package size = 10 GB

Additional limits also exist. Details are in the AWS Lambda quotas documentation at
https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html.

AWS Lambda does have some quotas that you should know about when you create and deploy
Lambda functions.

AWS Lambda limits the amount of compute and storage resources that you can use to run and
store functions. For example, as of this writing, the maximum memory allocation for a single
Lambda function is 10,240 MB. There is also a limit of 1,000 concurrent executions per Region.
Lambda functions can be configured to run up to 15 minutes per run. You can set the timeout to
any value between 1 second and 15 minutes. If you are troubleshooting a Lambda deployment,
keep these limits in mind.

There are limits on the deployment package size of a function (250 MB). A layer is a ZIP archive
that contains libraries, a custom runtime, or other dependencies. With layers, you can use
libraries in your function without needing to include them in your deployment package. Using
layers can help you avoid reaching the deployment package size limit. Layers are also a good way
to share code and data between Lambda functions.

For larger workloads that rely on sizable dependencies, such as machine learning or data-
intensive workloads, you can deploy your Lambda function as a container image of up to 10 GB
in size.

Limits are either soft or hard. Soft limits on an account can potentially be relaxed by submitting a
support ticket and providing justification for the request. Hard limits cannot be increased.

For the details on current AWS Lambda quotas, refer to the AWS Lambda quotas documentation
at https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html.
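As a quick illustration of how the per-function hard limits constrain a configuration, the sketch below checks a proposed memory and timeout setting against the values listed above. The limit values are as of this writing; the quotas documentation remains the authoritative source.

```python
# Check a proposed function configuration against the per-function
# limits listed above. Values are as of this writing; consult the
# AWS Lambda quotas documentation for current numbers.
MIN_MEMORY_MB, MAX_MEMORY_MB = 128, 10_240
MIN_TIMEOUT_S, MAX_TIMEOUT_S = 1, 15 * 60

def validate_config(memory_mb: int, timeout_seconds: int) -> list:
    errors = []
    if not MIN_MEMORY_MB <= memory_mb <= MAX_MEMORY_MB:
        errors.append(f"memory must be {MIN_MEMORY_MB}-{MAX_MEMORY_MB} MB")
    if not MIN_TIMEOUT_S <= timeout_seconds <= MAX_TIMEOUT_S:
        errors.append(f"timeout must be {MIN_TIMEOUT_S}-{MAX_TIMEOUT_S} s")
    return errors
```

A check like this catches, for example, a one-hour batch job mistakenly targeted at Lambda: a 3,600-second timeout exceeds the 15-minute hard limit, so that workload belongs on a different compute service.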


Section 5 key takeaways
• Serverless computing enables you to build and run applications and services without provisioning or managing servers.
• AWS Lambda is a serverless compute service that provides built-in fault tolerance and automatic scaling.
• An event source is an AWS service or developer-created application that triggers a Lambda function to run.
• The maximum memory allocation for a single Lambda function is 10,240 MB.
• The maximum run time for a Lambda function is 15 minutes.

Some key takeaways from this section of the module include:


• Serverless computing enables you to build and run applications and services without
provisioning or managing servers.
• AWS Lambda is a serverless compute service that provides built-in fault tolerance and
automatic scaling.
• An event source is an AWS service or developer-created application that triggers a Lambda
function to run.
• The maximum memory allocation for a single Lambda function is 10,240 MB.
• The maximum run time for a Lambda function is 15 minutes.


Activity: Create an AWS Lambda Stopinator Function

To complete this activity:
• Go to the hands-on lab environment and launch the AWS Lambda activity.
• Follow the instructions that are provided in the hands-on lab environment.

In this hands-on activity, you will create a basic Lambda function that stops an EC2 instance.


Activity debrief:
key takeaways


The instructor will lead a conversation about the key takeaways from the activity after students
have completed it.


Section 6: Introduction to AWS Elastic Beanstalk
Module 6: Compute

Introducing Section 6: Introduction to AWS Elastic Beanstalk.


AWS Elastic Beanstalk

• An easy way to get web applications up and running
• A managed service that automatically handles:
  • Infrastructure provisioning and configuration
  • Deployment
  • Load balancing
  • Automatic scaling
  • Health monitoring
  • Analysis and debugging
  • Logging
• No additional charge for Elastic Beanstalk; pay only for the underlying resources that are used

AWS Elastic Beanstalk is another AWS compute service option. It is a platform as a service (or
PaaS) that facilitates the quick deployment, scaling, and management of your web applications
and services.

You remain in control. The entire platform is already built, and you only need to upload your
code. Choose your instance type, your database, set and adjust automatic scaling, update your
application, access the server log files, and enable HTTPS on the load balancer.

You upload your code and Elastic Beanstalk automatically handles the deployment, from capacity
provisioning and load balancing to automatic scaling and monitoring application health. At the
same time, you retain full control over the AWS resources that power your application, and you
can access the underlying resources at any time.

There is no additional charge for AWS Elastic Beanstalk. You pay for the AWS resources (for
example, EC2 instances or S3 buckets) you create to store and run your application. You only pay
for what you use, as you use it. There are no minimum fees and no upfront commitments.


AWS Elastic Beanstalk deployments

• It supports web applications written for common platforms: Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker
• You upload your code, and Elastic Beanstalk automatically handles the deployment
  • It deploys on servers such as Apache, NGINX, Passenger, Puma, and Microsoft Internet Information Services (IIS)
• You manage your code; AWS manages the HTTP server, application server, language interpreter, operating system, and host

AWS Elastic Beanstalk enables you to deploy your code through the AWS Management Console,
the AWS Command Line Interface (AWS CLI), Visual Studio, and Eclipse. It provides all the
application services that you need for your application. The only thing you must create is your
code. Elastic Beanstalk is designed to make deploying your application a quick and easy process.

Elastic Beanstalk supports a broad range of platforms. Supported platforms include Docker, Go,
Java, .NET, Node.js, PHP, Python, and Ruby.

AWS Elastic Beanstalk deploys your code on Apache Tomcat for Java applications; Apache HTTP
Server for PHP and Python applications; NGINX or Apache HTTP Server for Node.js applications;
Passenger or Puma for Ruby applications; and Microsoft Internet Information Services (IIS) for
.NET applications, Java SE, Docker, and Go.


Benefits of Elastic Beanstalk
• Fast and simple to start using
• Developer productivity
• Difficult to outgrow
• Complete resource control

Elastic Beanstalk is fast and simple to start using. Use the AWS Management Console, a Git
repository, or an integrated development environment (IDE) such as Eclipse or Visual Studio to
upload your application. Elastic Beanstalk automatically handles the deployment details of
capacity provisioning, load balancing, automatic scaling, and monitoring application health.

You can improve your developer productivity by focusing on writing code instead of managing
and configuring servers, databases, load balancers, firewalls, and networks. AWS updates the
underlying platform that runs your application with patches and updates.

Elastic Beanstalk is difficult to outgrow. With Elastic Beanstalk, your application can handle peaks
in workload or traffic while minimizing your costs. It automatically scales your application up or
down based on your application's specific needs by using easily adjustable automatic scaling
settings. You can use CPU utilization metrics to trigger automatic scaling actions.

You have the freedom to select the AWS resources—such as Amazon EC2 instance type—that
are optimal for your application. Elastic Beanstalk enables you to retain full control over the AWS
resources that power your application. If you decide that you want to take over some (or all) of
the elements of your infrastructure, you can do so seamlessly by using the management
capabilities that are provided by Elastic Beanstalk.


Activity: AWS Elastic Beanstalk

To complete this activity:
• Go to the hands-on lab environment and launch the AWS Elastic Beanstalk activity.
• Follow the instructions that are provided in the hands-on lab environment.

In this hands-on activity, you will gain an understanding of why you might want to use Elastic
Beanstalk to deploy a web application on AWS.


Activity debrief:
Key takeaways


The instructor might choose to lead a conversation about the key takeaways from the activity
after you have completed it.


Section 6 key takeaways
• AWS Elastic Beanstalk enhances developer productivity.
  • Simplifies the process of deploying your application.
  • Reduces management complexity.
• Elastic Beanstalk supports Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker.
• There is no charge for Elastic Beanstalk. Pay only for the AWS resources that you use.

Some key takeaways from this section of the module include:


• AWS Elastic Beanstalk enhances developer productivity.
• Simplifies the process of deploying your application.
• Reduces management complexity.
• Elastic Beanstalk supports Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker.
• There is no charge for Elastic Beanstalk. Pay only for the AWS resources you use.


Module wrap-up
Module 6: Compute


It’s now time to review the module and wrap up with a knowledge check and discussion of a
practice certification exam question.


Module summary
In summary, in this module, you learned how to:
• Provide an overview of different AWS compute services in the cloud
• Demonstrate why to use Amazon Elastic Compute Cloud (Amazon EC2)
• Identify the functionality in the Amazon EC2 console
• Perform basic functions in Amazon EC2 to build a virtual computing
environment
• Identify Amazon EC2 cost optimization elements
• Demonstrate when to use AWS Elastic Beanstalk
• Demonstrate when to use AWS Lambda
• Identify how to run containerized applications in a cluster of managed servers


In summary, in this module, you learned how to:


• Provide an overview of different AWS compute services in the cloud
• Demonstrate why to use Amazon Elastic Compute Cloud (Amazon EC2)
• Identify the functionality in the Amazon EC2 console
• Perform basic functions in Amazon EC2 to build a virtual computing environment
• Identify Amazon EC2 cost optimization elements
• Demonstrate when to use AWS Elastic Beanstalk
• Demonstrate when to use AWS Lambda
• Identify how to run containerized applications in a cluster of managed servers


Complete the knowledge check


It is now time to complete the knowledge check for this module.


Sample exam question

Which AWS service helps developers quickly deploy resources that can make use of different
programming languages, such as .NET and Java?

Choice  Response
A       AWS CloudFormation
B       AWS SQS
C       AWS Elastic Beanstalk
D       Amazon Elastic Compute Cloud (Amazon EC2)

Look at the answer choices and rule them out based on the keywords.


Sample exam question answer

Which AWS service helps developers quickly deploy resources that can make use of different
programming languages, such as .NET and Java?

The correct answer is C.

The keywords in the question are developers quickly deploy resources and different
programming languages.

The following are the keywords to recognize: developers quickly deploy resources and different
programming languages.

The correct answer is C. AWS Elastic Beanstalk

Incorrect answers:
Answer A: AWS CloudFormation
Answer B: AWS SQS
Answer D: Amazon Elastic Compute Cloud (Amazon EC2)


Additional resources
• Amazon EC2 Documentation: https://docs.aws.amazon.com/ec2/
• Amazon EC2 Pricing: https://aws.amazon.com/ec2/pricing/
• Amazon ECS Workshop: https://ecsworkshop.com/
• Running Containers on AWS: https://containersonaws.com/
• Amazon EKS Workshop: https://www.eksworkshop.com/
• AWS Lambda Documentation: https://docs.aws.amazon.com/lambda/
• AWS Elastic Beanstalk Documentation: https://docs.aws.amazon.com/elastic-beanstalk/
• Cost Optimization Playbook:
https://d1.awsstatic.com/pricing/AWS_CO_Playbook_Final.pdf


Compute services on AWS is a large topic, and this module only provided an introduction to the
subject. The following resources provide more detail:
• Amazon EC2 Documentation: https://docs.aws.amazon.com/ec2/
• Amazon EC2 Pricing: https://aws.amazon.com/ec2/pricing/
• Amazon ECS Workshop: https://ecsworkshop.com/
• Running Containers on AWS: https://containersonaws.com/
• Amazon EKS Workshop: https://www.eksworkshop.com/
• AWS Lambda Documentation: https://docs.aws.amazon.com/lambda/
• AWS Elastic Beanstalk Documentation: https://docs.aws.amazon.com/elastic-beanstalk/
• Cost Optimization Playbook: https://d1.awsstatic.com/pricing/AWS_CO_Playbook_Final.pdf


Thank you

Corrections, feedback, or other questions?


Contact us at https://support.aws.amazon.com/#/contacts/aws-academy.
All trademarks are the property of their owners.


Thank you for completing this module.

AWS Academy Cloud Foundations
Module 07 Student Guide
Version 2.0.12
100-ACCLFO-20-EN-SG
© 2022 Amazon Web Services, Inc. or its affiliates. All rights reserved.

This work may not be reproduced or redistributed, in whole or in part,


without prior written permission from Amazon Web Services, Inc.
Commercial copying, lending, or selling is prohibited.

All trademarks are the property of their owners.


AWS Training and Certification AWS Academy Cloud Foundations

Contents
Module 7: Storage 4

© 2022 Amazon Web Services, Inc. or its affiliates. All rights reserved. 3
AWS Training and Certification Module 7: Storage

Module 7: Storage
AWS Academy Cloud Foundations


Welcome to Module 7: Storage.


Module overview

Topics
• Amazon Elastic Block Store (Amazon EBS)
• Amazon Simple Storage Service (Amazon S3)
• Amazon Elastic File System (Amazon EFS)
• Amazon Simple Storage Service Glacier

Demos
• Amazon EBS console
• Amazon S3 console
• Amazon EFS console
• Amazon S3 Glacier console

Lab
• Working with Amazon EBS

Activities
• Storage solution case study

Knowledge check

Cloud storage is typically more reliable, scalable, and secure than traditional on-premises storage
systems. Cloud storage is a critical component of cloud computing because it holds the
information that applications use. Big data analytics, data warehouses, the Internet of Things
(IoT), databases, and backup and archive applications all rely on some form of data storage
architecture.

This module addresses the following topics:


• Amazon Elastic Block Store (Amazon EBS)
• Amazon Simple Storage Service (Amazon S3)
• Amazon Elastic File System (Amazon EFS)
• Amazon Simple Storage Service Glacier

This module includes four recorded demonstrations that show you how to use the AWS
Management Console to create storage solutions.

This module includes a hands-on lab where you create an Amazon EBS volume, and then attach it
to an Amazon Elastic Compute Cloud (Amazon EC2) instance. You also create a snapshot of your
volume and then use the snapshot to create a new volume.

This module includes an activity that challenges you to determine the best storage solution for a
business case.

Finally, you are asked to complete a knowledge check that tests your understanding of the key
concepts in this module.


Module objectives
After completing this module, you should be able to:
• Identify the different types of storage
• Explain Amazon S3
• Identify the functionality in Amazon S3
• Explain Amazon EBS
• Identify the functionality in Amazon EBS
• Perform functions in Amazon EBS to build an Amazon EC2 storage solution
• Explain Amazon EFS
• Identify the functionality in Amazon EFS
• Explain Amazon S3 Glacier
• Identify the functionality in Amazon S3 Glacier
• Differentiate between Amazon EBS, Amazon S3, Amazon EFS, and Amazon S3 Glacier


The goal of this module is to discover key concepts that relate to storage. You will learn about the
different types of storage resources that are available and review the different pricing options so
that you can understand how different choices affect your solution cost.

After completing this module, you should be able to:


• Identify the different types of storage
• Explain Amazon S3
• Identify the functionality in Amazon S3
• Explain Amazon EBS
• Identify the functionality in Amazon EBS
• Perform functions in Amazon EBS to build an Amazon EC2 storage solution
• Explain Amazon EFS
• Identify the functionality in Amazon EFS
• Explain Amazon S3 Glacier
• Identify the functionality in Amazon S3 Glacier
• Differentiate between Amazon EBS, Amazon S3, Amazon EFS, and Amazon S3 Glacier


Core AWS services

[Diagram: core AWS service icons. Storage: Amazon S3, Amazon EBS, Amazon EFS, Amazon S3 Glacier. Networking: Amazon Virtual Private Cloud (Amazon VPC). Compute: Amazon Elastic Compute Cloud (Amazon EC2). Security: AWS Identity and Access Management (IAM). Database: Amazon Relational Database Service, Amazon DynamoDB.]

Storage is another AWS core service category. Some broad categories of storage include: instance
store (ephemeral storage), Amazon EBS, Amazon EFS, Amazon S3, and Amazon S3 Glacier.
• Instance store, or ephemeral storage, is temporary storage that is added to your Amazon EC2
instance.
• Amazon EBS is persistent, mountable storage that can be mounted as a device to an Amazon
EC2 instance. Amazon EBS can be mounted to an Amazon EC2 instance only within the same
Availability Zone. Only one Amazon EC2 instance at a time can mount an Amazon EBS volume.
• Amazon EFS is a shared file system that multiple Amazon EC2 instances can mount at the same
time.
• Amazon S3 is persistent storage where each file becomes an object and is available through a
Uniform Resource Locator (URL); it can be accessed from anywhere.
• Amazon S3 Glacier is for cold storage for data that is not accessed frequently (for example,
when you need long-term data storage for archival or compliance reasons).
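The distinctions above can be captured in a small decision helper. This is an illustrative sketch only, not an AWS API; the function name and flags are invented for this example:

```python
def pick_storage(shared_fs=False, object_access=False, archive=False, persistent=True):
    """Rough decision helper mirroring the storage categories above (illustrative)."""
    if archive:
        return "Amazon S3 Glacier"   # cold storage for infrequently accessed data
    if object_access:
        return "Amazon S3"           # objects accessible through a URL, from anywhere
    if shared_fs:
        return "Amazon EFS"          # shared file system, mountable by many EC2 instances
    if persistent:
        return "Amazon EBS"          # block device mounted by one EC2 instance at a time
    return "Instance store"          # ephemeral storage tied to the EC2 instance

print(pick_storage(archive=True))    # Amazon S3 Glacier
print(pick_storage(shared_fs=True))  # Amazon EFS
```

A helper like this only encodes the coarse categories; real designs weigh latency, durability, and cost as discussed in the rest of this module.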


Section 1: Amazon Elastic Block Store (Amazon EBS)
Module 7: Storage

Introduce Section 1: Amazon Elastic Block Store (Amazon EBS).


Storage: Amazon Elastic Block Store (Amazon EBS)

Amazon EBS provides persistent block storage volumes for use with Amazon EC2 instances.
Persistent storage is any data storage device that retains data after power to that device is shut
off. It is also sometimes called non-volatile storage.

Each Amazon EBS volume is automatically replicated within its Availability Zone to protect you
from component failure. It is designed for high availability and durability. Amazon EBS volumes
provide the consistent and low-latency performance that is needed to run your workloads.

With Amazon EBS, you can scale your usage up or down within minutes, while paying a low price
for only what you provision.


AWS storage options: Block storage versus object storage

What if you want to change one character in a 1-GB file?
• Block storage – Change only the block (the piece of the file) that contains the character.
• Object storage – The entire file must be updated.

What happens if you want to change one character in a 1-GB file? With block storage, you change
only the block that contains the character. With object storage, the entire file must be updated.

One critical difference between some storage types is whether they offer block-level storage or
object-level storage.

This difference has a major effect on the throughput, latency, and cost of your storage solution.
Block storage solutions are typically faster and use less bandwidth, but they can cost more than
object-level storage.
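The write-amplification difference can be modeled in a few lines. This is a simplified illustration; the 4-KiB block size is an assumption for the example, not an AWS parameter:

```python
GiB = 1024 ** 3

def bytes_rewritten(file_size, storage="block", block_size=4096):
    """Bytes that must be written to change one character (simplified model)."""
    if storage == "block":
        return block_size  # only the block containing the character is rewritten
    return file_size       # object storage: the whole object is re-uploaded

one_gb_file = 1 * GiB
print(bytes_rewritten(one_gb_file, "block"))   # 4096
print(bytes_rewritten(one_gb_file, "object"))  # 1073741824
```

The roughly 260,000x difference in this toy model is why block storage suits small, frequent updates while object storage suits whole-file reads and writes.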


Amazon EBS
Amazon EBS enables you to create individual storage volumes and
attach them to an Amazon EC2 instance:
• Amazon EBS offers block-level storage.
• Volumes are automatically replicated within their Availability Zone.
• It can be backed up automatically to Amazon S3 through snapshots.
• Uses include –
• Boot volumes and storage for Amazon Elastic Compute Cloud (Amazon EC2)
instances
• Data storage with a file system
• Database hosts
• Enterprise applications

Amazon EBS enables you to create individual storage volumes and attach them to an Amazon EC2
instance. Amazon EBS offers block-level storage, and each volume is automatically replicated
within its Availability Zone. Amazon EBS is designed to provide durable, detachable, block-level
storage (like an external hard drive) for your Amazon EC2 instances. Because volumes are
directly attached to the instances, they provide low latency between where the data is stored
and where it is used on the instance.

For this reason, they can be used to run a database with an Amazon EC2 instance. Amazon EBS
volumes are included as part of the backup of your instances into Amazon Machine Images (or
AMIs). AMIs are stored in Amazon S3 and can be reused to create new Amazon EC2 instances
later.

A backup of an Amazon EBS volume is called a snapshot. The first snapshot is called the baseline
snapshot. Any other snapshot after the baseline captures only what is different from the previous
snapshot.
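The incremental snapshot model lends itself to a quick back-of-the-envelope calculation. The function below is illustrative only, not an AWS pricing tool:

```python
def snapshot_storage(volume_gb, changed_gb_per_snapshot, n_snapshots):
    """Total snapshot storage consumed: the baseline snapshot captures the full
    volume; each later snapshot stores only the blocks changed since the previous one."""
    if n_snapshots == 0:
        return 0
    return volume_gb + (n_snapshots - 1) * changed_gb_per_snapshot

# 100-GB volume, roughly 2 GB of changed blocks between snapshots, 5 snapshots kept:
print(snapshot_storage(100, 2, 5))  # 108
```

Keeping five snapshots of a 100-GB volume therefore consumes about 108 GB in this model, not 500 GB, which is why frequent snapshots are usually affordable.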

Uses for Amazon EBS volumes include:


• Boot volumes and storage for Amazon EC2 instances
• Data storage with a file system
• Database hosts
• Enterprise applications


Amazon EBS volume types

Solid State Drives (SSD):
• General Purpose – Maximum volume size: 16 TiB; maximum IOPS/volume: 16,000; maximum throughput/volume: 250 MiB/s
• Provisioned IOPS – Maximum volume size: 16 TiB; maximum IOPS/volume: 64,000; maximum throughput/volume: 1,000 MiB/s

Hard Disk Drives (HDD):
• Throughput-Optimized – Maximum volume size: 16 TiB; maximum IOPS/volume: 500; maximum throughput/volume: 500 MiB/s
• Cold – Maximum volume size: 16 TiB; maximum IOPS/volume: 250; maximum throughput/volume: 250 MiB/s

Matching the correct technology to your workload is a best practice for reducing storage costs.
Provisioned IOPS SSD-backed Amazon EBS volumes can give you the highest performance.
However, if your application doesn't require or won't use performance that high, General
Purpose SSD is usually sufficient. Only SSDs can be used as boot volumes for EC2 instances. The
lower-cost options might be a solution for additional storage or use cases other than boot
volumes.

To learn more about Amazon EBS volume types, see


https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html.


Amazon EBS volume type use cases

Solid State Drives (SSD):
• General Purpose – Recommended for most workloads: system boot volumes, virtual desktops, low-latency interactive applications, and development and test environments
• Provisioned IOPS – Critical business applications that require sustained IOPS performance, or more than 16,000 IOPS or 250 MiB/second of throughput per volume; large database workloads

Hard Disk Drives (HDD):
• Throughput-Optimized – Streaming workloads that require consistent, fast throughput at a low price: big data, data warehouses, and log processing; it cannot be a boot volume
• Cold – Throughput-oriented storage for large volumes of data that is infrequently accessed; scenarios where the lowest storage cost is important; it cannot be a boot volume

As mentioned previously, an Amazon EBS volume is a durable, block-level storage device that you
can attach to a single EC2 instance. You can use Amazon EBS volumes as primary storage for data
that requires frequent updates, such as the system drive for an instance or storage for a database
application. You can also use them for throughput-intensive applications that perform continuous
disk scans. Amazon EBS volumes persist independently from the running life of an EC2 instance.

Use cases for Amazon EBS vary by the storage type used and whether you are using General Purpose or
Provisioned IOPS.


Amazon EBS features


• Snapshots –
• Point-in-time snapshots
• Recreate a new volume at any time
• Encryption –
• Encrypted Amazon EBS volumes
• No additional cost
• Elasticity –
• Increase capacity
• Change to different types


To provide an even higher level of data durability, Amazon EBS enables you to create point-in-time
snapshots of your volumes, and you can re-create a new volume from a snapshot at any
time. You can also share snapshots or even copy snapshots to different AWS Regions for even
greater disaster recovery (DR) protection. For example, you can encrypt and share your
snapshots from Virginia in the US to Tokyo, Japan.

You can also have encrypted Amazon EBS volumes at no additional cost, so the data that moves
between the EC2 instance and the EBS volume inside AWS data centers is encrypted in transit.

As your company grows, the amount of data that is stored on your Amazon EBS volumes is also
likely to grow. Amazon EBS volumes can increase capacity and change to different types, so you
can change from hard disk drives (HDDs) to solid state drives (SSDs) or increase from a 50-GB
volume to a 16-TB volume. For example, you can do this resize operation dynamically without
needing to stop the instances.


Amazon EBS: Volumes, IOPS, and pricing


1. Volumes –
• Amazon EBS volumes persist independently from the instance.
• All volume types are charged by the amount that is provisioned per
month.
2. IOPS –
• General Purpose SSD:
• Charged by the amount that you provision in GB per month until storage is
released.
• Magnetic:
• Charged by the number of requests to the volume.
• Provisioned IOPS SSD:
• Charged by the amount that you provision in IOPS (multiplied by the
percentage of days that you provision for the month).


When you begin to estimate the cost for Amazon EBS, you must consider the following:
1. Volumes – Volume storage for all Amazon EBS volume types is charged by the amount you
provision in GB per month, until you release the storage.
2. IOPS – I/O is included in the price of General Purpose SSD volumes. However, for Amazon EBS
magnetic volumes, I/O is charged by the number of requests that you make to your volume.
With Provisioned IOPS SSD volumes, you are also charged by the amount you provision in
IOPS (multiplied by the percentage of days that you provision for the month).

The pricing and provisioning of Amazon EBS are complex. In general, you pay for the size of the
volume and its usage. To learn more about Amazon EBS pricing and provisioning, see
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html.
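As a rough sketch of the charges described above, with hypothetical unit prices (real prices vary by Region and change over time, so treat the numbers as placeholders):

```python
# Hypothetical unit prices for illustration only; check current AWS pricing pages.
PRICE_PER_GB_MONTH = {"gp": 0.10, "io": 0.125}  # General Purpose vs. Provisioned IOPS SSD
PRICE_PER_PIOPS_MONTH = 0.065                   # extra charge per provisioned IOPS

def ebs_monthly_cost(size_gb, volume_type="gp", provisioned_iops=0, days_fraction=1.0):
    """Estimate one month of Amazon EBS cost: GB-month for the provisioned size,
    plus an IOPS charge for Provisioned IOPS volumes, scaled by the fraction of
    the month the IOPS were provisioned."""
    cost = size_gb * PRICE_PER_GB_MONTH[volume_type]
    if volume_type == "io":
        cost += provisioned_iops * PRICE_PER_PIOPS_MONTH * days_fraction
    return round(cost, 2)

print(ebs_monthly_cost(100, "gp"))             # 10.0
print(ebs_monthly_cost(100, "io", 1000, 0.5))  # 45.0
```

The second call shows the IOPS term dominating the size term, which is why matching the volume type to the workload matters for cost.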


Amazon EBS: Snapshots and data transfer


3. Snapshots –
• Added cost of Amazon EBS snapshots to Amazon S3 is per GB-month of data stored.

4. Data transfer –
• Inbound data transfer is free.
• Outbound data transfer across Regions incurs charges.


3. Snapshots – Amazon EBS enables you to back up snapshots of your data to Amazon S3 for
durable recovery. If you opt for Amazon EBS snapshots, the added cost is per GB-month of
data stored.
4. Data transfer – When you copy Amazon EBS snapshots, you are charged for the data that is
transferred across Regions. After the snapshot is copied, standard Amazon EBS snapshot
charges apply for storage in the destination Region.


Section 1 key takeaways

Amazon EBS features:
• Persistent and customizable block storage for Amazon EC2
• HDD and SSD types
• Replicated in the same Availability Zone
• Easy and transparent encryption
• Elastic volumes
• Back up by using snapshots

Amazon EBS provides block-level storage volumes for use with Amazon EC2 instances. Amazon
EBS volumes are off-instance storage that persists independently from the life of an instance.
They are analogous to virtual disks in the cloud. Amazon EBS provides three volume types:
General Purpose SSD, Provisioned IOPS SSD, and magnetic.

The three volume types differ in performance characteristics and cost, so you can choose the
right storage performance and price for the needs of your applications.

Additional benefits include replication in the same Availability Zone, easy and transparent
encryption, elastic volumes, and backup by using snapshots.

To learn more about Amazon EBS, see: https://aws.amazon.com/ebs/.


Recorded demo: Amazon Elastic Block Store

Now, take a moment to watch the Elastic Block Store demo at https://aws-tc-largeobjects.s3-us-west-2.amazonaws.com/ILT-TF-100-ACFNDS-20-EN/Module_7_EBS+v2.0.mp4. The recording runs
a little over 5 minutes, and it reinforces many of the concepts that were discussed in this section
of the module.

The demonstration shows how to use the AWS Management Console to:
• Create a General Purpose (SSD) Amazon EBS volume
• Attach the EBS volume to an EC2 instance

The demonstration also shows how to interact with the EBS volume by using the AWS Command
Line Interface (AWS CLI) and how to mount the EBS volume on the EC2 instance.


Lab 4: Working with Amazon EBS

You will now work on Lab 4: Working with Amazon EBS.


Lab 4: Scenario

This lab is designed to show you how to create an Amazon EBS volume. After you create the
volume, you will attach it to an Amazon EC2 instance, configure the instance to use a virtual
disk, create a snapshot, and then restore from the snapshot.

[Diagram: Amazon EC2 instance with an attached Amazon EBS volume and a created snapshot]

This lab is designed to show you how to create an Amazon EBS volume. After you create the
volume, you will attach the volume to an Amazon EC2 instance, configure the instance to use a
virtual disk, create a snapshot and then restore from the snapshot.

After completing this lab, you should be able to:


• Create an Amazon EBS volume
• Attach that volume to an instance
• Configure the instance to use the virtual disk
• Create an Amazon EBS snapshot
• Restore the snapshot


Lab 4: Final product

[Diagram: Amazon EC2 instance with an attached Amazon EBS volume and a created snapshot]

In this lab, you:


• Created an Amazon EBS volume
• Attached that volume to an instance
• Configured the instance to use the virtual disk
• Created an Amazon EBS snapshot
• Restored the snapshot


Begin Lab 4: Working with Amazon EBS (~30 minutes)

It is now time to start the lab.


Lab debrief: Key takeaways

In this lab, you:


• Created an Amazon EBS volume
• Attached the volume to an instance
• Configured the instance to use the virtual disk
• Created an Amazon EBS snapshot
• Restored the snapshot


Section 2: Amazon Simple Storage Service (Amazon S3)
Module 7: Storage

Introduce Section 2: Amazon Simple Storage Service (Amazon S3).

Companies need the ability to simply and securely collect, store, and analyze their data on a
massive scale. Amazon S3 is object storage that is built to store and retrieve any amount of data
from anywhere: websites and mobile apps, corporate applications, and data from Internet of
Things (IoT) sensors or devices.


Storage: Amazon Simple Storage Service (Amazon S3)

Amazon S3 is object-level storage, which means that if you want to change a part of a file, you
must make the change and then re-upload the entire modified file. Amazon S3 stores data as
objects within resources that are called buckets.

You will now learn more about Amazon S3.


Amazon S3 overview

• Data is stored as objects in buckets


• Virtually unlimited storage
• Single object is limited to 5 TB
• Designed for 11 9s of durability
• Granular access to bucket and objects


Amazon S3 is a managed cloud storage solution that is designed to scale seamlessly and provide
11 9s of durability. You can store virtually as many objects as you want in a bucket, and you can
write, read, and delete objects in your bucket. Bucket names are universal and must
be unique across all existing bucket names in Amazon S3. Objects can be up to 5 TB in size. By
default, data in Amazon S3 is stored redundantly across multiple facilities and multiple devices in
each facility.

The data that you store in Amazon S3 is not associated with any particular server, and you do not
need to manage any infrastructure yourself. You can put as many objects into Amazon S3 as you
want. Amazon S3 holds trillions of objects and regularly peaks at millions of requests per second.

Objects can be almost any data file, such as images, videos, or server logs. Because Amazon S3
supports objects as large as several terabytes in size, you can even store database snapshots as
objects. Amazon S3 also provides low-latency access to the data over the internet by Hypertext
Transfer Protocol (HTTP) or Secure HTTP (HTTPS), so you can retrieve data anytime from
anywhere. You can also access Amazon S3 privately through a virtual private cloud (VPC)
endpoint. You get fine-grained control over who can access your data by using AWS Identity and
Access Management (IAM) policies, Amazon S3 bucket policies, and even per-object access
control lists.

By default, none of your data is shared publicly. You can also encrypt your data in transit and
choose to enable server-side encryption on your objects.
You can access Amazon S3 through the web-based AWS Management Console; programmatically
through the API and SDKs; or with third-party solutions, which use the API or the SDKs.


Amazon S3 includes event notifications that enable you to set up automatic notifications when
certain events occur, such as when an object is uploaded to a bucket or deleted from a specific
bucket. Those notifications can be sent to you, or they can be used to trigger other processes, such
as AWS Lambda functions.
With storage class analysis, you can analyze storage access patterns and transition the right data to
the right storage class. The Amazon S3 Analytics feature automatically identifies the optimal lifecycle
policy to transition less frequently accessed storage to Amazon S3 Standard – Infrequent Access
(Amazon S3 Standard-IA). You can configure a storage class analysis policy to monitor an entire
bucket, a prefix, or an object tag.
When an infrequent access pattern is observed, you can easily create a new lifecycle age policy that
is based on the results. Storage class analysis also provides daily visualizations of your storage usage
in the AWS Management Console. You can export them to an Amazon S3 bucket to analyze by using
the business intelligence (BI) tools of your choice, such as Amazon QuickSight.


Amazon S3 storage classes


Amazon S3 offers a range of object-level storage classes that are
designed for different use cases:
• Amazon S3 Standard

• Amazon S3 Intelligent-Tiering

• Amazon S3 Standard-Infrequent Access (Amazon S3 Standard-IA)

• Amazon S3 One Zone-Infrequent Access (Amazon S3 One Zone-IA)

• Amazon S3 Glacier

• Amazon S3 Glacier Deep Archive


Amazon S3 offers a range of object-level storage classes that are designed for different use cases.
These classes include:
• Amazon S3 Standard – Amazon S3 Standard is designed for high durability, availability, and
performance object storage for frequently accessed data. Because it delivers low latency and
high throughput, Amazon S3 Standard is appropriate for a variety of use cases, including cloud
applications, dynamic websites, content distribution, mobile and gaming applications, and big
data analytics.
• Amazon S3 Intelligent-Tiering – The Amazon S3 Intelligent-Tiering storage class is designed to
optimize costs by automatically moving data to the most cost-effective access tier, without
performance impact or operational overhead. For a small monthly monitoring and automation
fee per object, Amazon S3 monitors access patterns of the objects in Amazon S3 Intelligent-
Tiering, and moves the objects that have not been accessed for 30 consecutive days to the
infrequent access tier. If an object in the infrequent access tier is accessed, it is automatically
moved back to the frequent access tier. There are no retrieval fees when you use the Amazon
S3 Intelligent-Tiering storage class, and no additional fees when objects are moved between
access tiers. It works well for long-lived data with access patterns that are unknown or
unpredictable.
• Amazon S3 Standard-Infrequent Access (Amazon S3 Standard-IA) – The Amazon S3 Standard-
IA storage class is used for data that is accessed less frequently, but requires rapid access when
needed. Amazon S3 Standard-IA is designed to provide the high durability, high throughput,
and low latency of Amazon S3 Standard, with a low per-GB storage price and per-GB retrieval
fee. This combination of low cost and high performance makes Amazon S3 Standard-IA good
for long-term storage and backups, and as a data store for disaster recovery files.
• Amazon S3 One Zone-Infrequent Access (Amazon S3 One Zone-IA) – Amazon S3 One Zone-IA
is for data that is accessed less frequently, but requires rapid access when needed. Unlike


other Amazon S3 storage classes, which store data in a minimum of three Availability Zones,
Amazon S3 One Zone-IA stores data in a single Availability Zone and it costs less than Amazon S3
Standard-IA. Amazon S3 One Zone-IA works well for customers who want a lower-cost option for
infrequently accessed data, but do not require the availability and resilience of Amazon S3
Standard or Amazon S3 Standard-IA. It is a good choice for storing secondary backup copies of on-
premises data or easily re-creatable data. You can also use it as cost-effective storage for data
that is replicated from another AWS Region by using Amazon S3 Cross-Region Replication.

• Amazon S3 Glacier – Amazon S3 Glacier is a secure, durable, and low-cost storage class for data
archiving. You can reliably store any amount of data at costs that are competitive with—or
cheaper than—on-premises solutions. To keep costs low yet suitable for varying needs, Amazon
S3 Glacier provides three retrieval options that range from a few minutes to hours. You can
upload objects directly to Amazon S3 Glacier, or use Amazon S3 lifecycle policies to transfer data
between any of the Amazon S3 storage classes for active data (Amazon S3 Standard, Amazon S3
Intelligent-Tiering, Amazon S3 Standard-IA, and Amazon S3 One Zone-IA) and Amazon S3 Glacier.

• Amazon S3 Glacier Deep Archive – Amazon S3 Glacier Deep Archive is the lowest-cost storage
class for Amazon S3. It supports long-term retention and digital preservation for data that might
be accessed once or twice in a year. It is designed for customers — particularly customers in
highly regulated industries, such as financial services, healthcare, and public sectors — that retain
datasets for 7–10 years (or more) to meet regulatory compliance requirements. Amazon S3
Glacier Deep Archive can also be used for backup and disaster recovery use cases. It is a cost-
effective and easy-to-manage alternative to magnetic tape systems, whether these tape systems
are on-premises libraries or off-premises services. Amazon S3 Glacier Deep Archive complements
Amazon S3 Glacier, and it is also designed to provide 11 9s of durability. All objects that are stored
in Amazon S3 Glacier Deep Archive are replicated and stored across at least three geographically
dispersed Availability Zones, and these objects can be restored within 12 hours.

For more information about Amazon S3 storage classes, see


https://docs.aws.amazon.com/AmazonS3/latest/dev/storage-class-intro.html.
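A simplified heuristic for mapping these use cases to a storage class might look like the following sketch. It is illustrative only, not an AWS decision tool, and it ignores factors such as retrieval fees and minimum storage durations:

```python
def pick_s3_class(access_pattern, resilient=True):
    """Map an access pattern to an Amazon S3 storage class (simplified heuristic).
    resilient=False accepts single-Availability-Zone storage for lower cost."""
    if access_pattern == "frequent":
        return "S3 Standard"
    if access_pattern == "unknown":
        return "S3 Intelligent-Tiering"     # unpredictable, long-lived data
    if access_pattern == "infrequent":
        return "S3 Standard-IA" if resilient else "S3 One Zone-IA"
    if access_pattern == "archive":
        return "S3 Glacier"                 # retrieval in minutes to hours
    if access_pattern == "deep-archive":
        return "S3 Glacier Deep Archive"    # accessed once or twice a year
    raise ValueError(f"unknown access pattern: {access_pattern}")

print(pick_s3_class("infrequent", resilient=False))  # S3 One Zone-IA
```

In practice, lifecycle policies automate these transitions rather than application code choosing a class per object.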


Amazon S3 bucket URLs (two styles)

To upload your data:
1. Create a bucket in an AWS Region.
2. Upload almost any number of objects to the bucket.

Example: the object Preview2.mp4 in a bucket in the Tokyo Region (ap-northeast-1).

Bucket path-style URL endpoint (Region code, then bucket name):
https://s3.ap-northeast-1.amazonaws.com/bucket-name

Bucket virtual hosted-style URL endpoint (bucket name, then Region code):
https://bucket-name.s3-ap-northeast-1.amazonaws.com

To use Amazon S3 effectively, you must understand a few simple concepts. First, Amazon S3
stores data inside buckets. Buckets are essentially the prefix for a set of files, and must be
uniquely named across all of Amazon S3 globally. Buckets are logical containers for objects. You
can have one or more buckets in your account. You can control access for each bucket—who can
create, delete, and list objects in the bucket. You can also view access logs for the bucket and its
objects, and choose the geographical region where Amazon S3 stores the bucket and its contents.

To upload your data (such as photos, videos, or documents), create a bucket in an AWS Region,
and then upload almost any number of objects to the bucket.

In the example, Amazon S3 was used to create a bucket in the Tokyo Region, which is identified
within AWS formally by its Region code: ap-northeast-1

The URL for a bucket is structured like the examples. You can use two different URL styles to refer
to buckets.

Amazon S3 refers to files as objects. As soon as you have a bucket, you can store almost any
number of objects inside it. An object is composed of data and any metadata that describes that
file, including a URL. To store an object in Amazon S3, you upload the file that you want to store
to a bucket.

When you upload a file, you can set permissions on the data and any metadata.

In this example, the object Preview2.mp4 is stored inside the bucket. The URL for the file includes
the object name at the end.


Data is redundantly stored in the Region

[Diagram: an object (media/welcome.mp4) in bucket my-bucket-name stored redundantly across Facility 1, Facility 2, and Facility 3 within a Region]

When you create a bucket in Amazon S3, it is associated with a specific AWS Region. When you
store data in the bucket, it is redundantly stored across multiple AWS facilities within your
selected Region.

Amazon S3 is designed to durably store your data, even if there is concurrent data loss in two
AWS facilities.


Designed for seamless scaling

[Diagram: bucket my-bucket-name holding a growing number of objects, from media/welcome.mp4 and prod2.mp4 through prod12.mp4]

Amazon S3 automatically manages the storage behind your bucket while your data grows. You
can get started immediately, and your data storage will grow with your application needs.

Amazon S3 also scales to handle a high volume of requests. You do not need to provision the
storage or throughput, and you are billed only for what you use.


Access the data anywhere

[Icons: AWS Management Console, AWS Command Line Interface, SDK]

You can access Amazon S3 through the console, AWS Command Line Interface (AWS CLI), or AWS
SDK. You can also access the data in your bucket directly by using REST-based endpoints.

The endpoints support HTTP or HTTPS access. To support this type of URL-based access, Amazon
S3 bucket names must be globally unique and Domain Name Server (DNS)-compliant.

Also, object keys should use characters that are safe for URLs.


Common use cases


• Storing application assets
• Static web hosting
• Backup and disaster recovery (DR)
• Staging area for big data
• Many more….


This flexibility to store a virtually unlimited amount of data—and to access that data from
anywhere—means that Amazon S3 is suitable for a variety of scenarios. You will now consider
some use cases for Amazon S3:
• As a location for any application data, Amazon S3 buckets provide a shared location for storing
objects that any instances of your application can access—including applications on Amazon
EC2 or even traditional servers. This feature can be useful for user-generated media files,
server logs, or other files that your application must store in a common location. Also, because
the content can be fetched directly over the internet, you can offload serving that content
from your application and enable clients to directly fetch the data from Amazon S3
themselves.
• For static web hosting, Amazon S3 buckets can serve the static contents of your website,
including HTML, CSS, JavaScript, and other files.
• The high durability of Amazon S3 makes it a good candidate for storing backups of your data.
For greater availability and disaster recovery capability, Amazon S3 can even be configured to
support cross-Region replication so that data in an Amazon S3 bucket in one Region can be
automatically replicated to another Amazon S3 Region.


Amazon S3 common scenarios

• Backup and storage
• Application hosting
• Media hosting
• Software delivery

[Diagram: a corporate data center and Amazon EC2 instances both store data in Amazon S3 buckets.]

Backup and storage – Provide data backup and storage services for others

Application hosting – Provide services that deploy, install, and manage web applications

Media hosting – Build a redundant, scalable, and highly available infrastructure that hosts video,
photo, or music uploads and downloads

Software delivery – Host your software applications that customers can download


Amazon S3 pricing
• Pay only for what you use, including –
• GBs per month
• Transfer OUT to other Regions
• PUT, COPY, POST, LIST, and GET requests

• You do not pay for –


• Transfers IN to Amazon S3
• Transfers OUT from Amazon S3 to Amazon CloudFront or Amazon EC2 in
the same Region


With Amazon S3, specific costs vary depending on the Region and the specific requests that were
made. You pay only for what you use, including gigabytes per month; transfer out of other
Regions; and PUT, COPY, POST, LIST, and GET requests.

As a general rule, you pay only for transfers that cross the boundary of your Region, which means
you do not pay for transfers in to Amazon S3 or transfers out from Amazon S3 to Amazon
CloudFront edge locations within that same Region.


Amazon S3: Storage pricing (1 of 2)


To estimate Amazon S3 costs, consider the following:
1. Storage class type –
• Standard storage is designed for:
• 11 9s of durability
• Four 9s of availability
• S3 Standard-Infrequent Access (S-IA) is designed for:
• 11 9s of durability
• Three 9s of availability
2. Amount of storage –
• The number and size of objects


When you begin to estimate the costs of Amazon S3, you must consider the following:
1. Storage class type –
• Standard storage is designed to provide 11 9s of durability and four 9s of availability.
• S3 Standard – Infrequent Access (S-IA) is a storage option within Amazon S3 that you
can use to reduce your costs by storing less frequently accessed data at slightly lower
levels of redundancy than Amazon S3 standard storage. Standard – Infrequent Access
is designed to provide the same 11 9s of durability as Amazon S3, with three 9s of
availability in a given year. Each class has different rates.
2. Amount of storage – The number and size of objects stored in your Amazon S3 buckets.
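To make those durability and availability figures concrete, a quick back-of-the-envelope calculation (purely illustrative) shows what they imply over a year:

```python
# Annualized reading of the figures above: "11 9s" of durability and
# "four 9s" of availability for S3 Standard. Illustrative only.
durability = 0.99999999999   # 99.999999999% per object per year
availability = 0.9999        # 99.99% per year

objects = 10_000_000
expected_losses_per_year = objects * (1 - durability)
downtime_minutes_per_year = (1 - availability) * 365 * 24 * 60

# Even with 10 million objects, the expected loss is about 0.0001
# objects per year; four 9s allows roughly 53 minutes of downtime.
print(f"{expected_losses_per_year:.4f}")
print(f"{downtime_minutes_per_year:.0f}")
```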


Amazon S3: Storage pricing (2 of 2)


3. Requests –
• The number and type of requests (GET, PUT, COPY)
• Type of requests:
• Different rates for GET requests than other requests.
4. Data transfer –
• Pricing is based on the amount of data that is transferred out of
the Amazon S3 Region
• Data transfer in is free, but you incur charges for data that is transferred
out.


3. Requests – Consider the number and type of requests. GET requests incur charges at different
rates than other requests, such as PUT and COPY requests.
• GET – Retrieves an object from Amazon S3. You must have READ access to use this
operation.
• PUT – Adds an object to a bucket. You must have WRITE permissions on a bucket to
add an object to it.
• COPY – Creates a copy of an object that is already stored in Amazon S3. A COPY
operation is the same as performing a GET and then a PUT.
4. Data transfer – Consider the amount of data that is transferred out of the Amazon S3 Region.
Remember that data transfer in is free, but there is a charge for data transfer out.
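A sketch of how those billed dimensions combine into a monthly estimate. All rates below are placeholder values, not actual AWS prices, which vary by Region and storage class:

```python
# Rough monthly S3 cost model over the billed dimensions described
# above. Every rate here is a hypothetical placeholder -- check the
# current pricing page for your Region before relying on any number.
STORAGE_RATE_PER_GB = 0.023     # USD per GB-month (assumed)
PUT_RATE_PER_1000 = 0.005       # USD per 1,000 PUT/COPY/POST/LIST requests (assumed)
GET_RATE_PER_1000 = 0.0004      # USD per 1,000 GET requests (assumed)
TRANSFER_OUT_PER_GB = 0.09      # USD per GB transferred out of the Region (assumed)

def estimate_monthly_cost(gb_stored, put_requests, get_requests, gb_out):
    """Sum the storage, request, and data-transfer-out charges."""
    return (gb_stored * STORAGE_RATE_PER_GB
            + put_requests / 1000 * PUT_RATE_PER_1000
            + get_requests / 1000 * GET_RATE_PER_1000
            + gb_out * TRANSFER_OUT_PER_GB)

# 100 GB stored, 10,000 PUTs, 1,000,000 GETs, 50 GB transferred out
print(round(estimate_monthly_cost(100, 10_000, 1_000_000, 50), 2))
```

Note that transfer in and same-Region transfer out to CloudFront or EC2 contribute nothing, which is why they do not appear as terms.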


Section 2 key takeaways

• Amazon S3 is a fully managed cloud storage service.
• You can store a virtually unlimited number of objects.
• You pay for only what you use.
• You can access Amazon S3 at any time from anywhere through a URL.
• Amazon S3 offers rich security controls.

You have completed an introduction to Amazon S3, including key features and some common use
cases.

To learn more about Amazon S3, see https://aws.amazon.com/s3/.


Recorded demo: Amazon Simple Storage Service (Amazon S3)

Now, take a moment to watch the Amazon S3 demo at https://aws-tc-largeobjects.s3-us-west-2.amazonaws.com/ILT-TF-100-ACFNDS-20-EN/Module_7_S3+v2.0.mp4. The recording runs a little over 4 minutes, and it reinforces many of the concepts that were discussed in this section of the module.

The demonstration shows how to configure the following resources by using the AWS
Management Console. The demonstration shows how to:
• Create an Amazon S3 bucket
• Upload files and create folders
• Change bucket settings

The demonstration also reviews some of the more commonly used settings for an S3 bucket.


Section 3: Amazon Elastic File System (Amazon EFS)
Module 7: Storage

Introducing Section 3: Amazon Elastic File System (Amazon EFS)

Amazon EFS implements storage for EC2 instances that multiple virtual machines can access at
the same time. It is implemented as a shared file system that uses the Network File System (NFS)
protocol.


Storage

Amazon Elastic File System (Amazon EFS)

Amazon Elastic File System (Amazon EFS) provides simple, scalable, elastic file storage for use
with AWS services and on-premises resources. It offers a simple interface that enables you to
create and configure file systems quickly and easily.

Amazon EFS is built to dynamically scale on demand without disrupting applications—it will grow
and shrink automatically as you add and remove files. It is designed so that your applications
have the storage they need, when they need it.


Amazon EFS features


• File storage in the AWS Cloud
• Works well for big data and analytics, media processing workflows,
content management, web serving, and home directories
• Petabyte-scale, low-latency file system
• Shared storage
• Elastic capacity
• Supports Network File System (NFS) versions 4.0 and 4.1 (NFSv4)
• Compatible with all Linux-based AMIs for Amazon EC2


Amazon EFS is a fully managed service that makes it easy to set up and scale file storage in the
AWS Cloud. You can use Amazon EFS to build a file system for big data and analytics, media
processing workflows, content management, web serving, and home directories.

You can create file systems that are accessible to Amazon EC2 instances through a file system
interface (using standard operating system file I/O APIs). These file systems support full file
system access semantics, such as strong consistency and file locking.

Amazon EFS file systems can automatically scale from gigabytes to petabytes of data without the
need to provision storage. Thousands of Amazon EC2 instances can access an Amazon EFS file
system at the same time, and Amazon EFS is designed to provide consistent performance to each
Amazon EC2 instance. Amazon EFS is also designed to be highly durable and highly available.
Amazon EFS requires no minimum fee or setup costs, and you pay only for the storage that you
use.


Amazon EFS architecture

[Diagram: a VPC with three Availability Zones; each Availability Zone contains a private subnet whose network interface serves as a mount target for the Elastic File System. One Availability Zone has a second private subnet without a mount target.]

Amazon EFS provides file storage in the cloud. With Amazon EFS, you can create a file system,
mount the file system on an Amazon EC2 instance, and then read and write data to and from your file system. You can mount an Amazon EFS file system in your VPC through NFS
versions 4.0 and 4.1 (NFSv4).

You can access your Amazon EFS file system concurrently from Amazon EC2 instances in your
VPC, so applications that scale beyond a single connection can access a file system. Amazon EC2
instances that run in multiple Availability Zones within the same AWS Region can access the file
system, so many users can access and share a common data source.

In the diagram, the VPC has three Availability Zones, and each Availability Zone has one mount
target that was created in it. We recommend that you access the file system from a mount target
within the same Availability Zone. One of the Availability Zones has two subnets. However, a
mount target is created in only one of the subnets.


Amazon EFS implementation


1 Create your Amazon EC2 resources and launch your Amazon EC2 instance.

2 Create your Amazon EFS file system.

3 Create your mount targets in the appropriate subnets.

4 Connect your Amazon EC2 instances to the mount targets.

5 Clean up your resources and protect your AWS account.


You must complete five steps to create and use your first Amazon EFS file system, mount it on an
Amazon EC2 instance in your VPC, and test the end-to-end setup:
1. Create your Amazon EC2 resources and launch your instance. (Before you can launch and
connect to an Amazon EC2 instance, you must create a key pair, unless you already have one.)
2. Create your Amazon EFS file system.
3. In the appropriate subnets, create your mount targets.
4. Next, connect to your Amazon EC2 instance and mount the Amazon EFS file system.
5. Finally, clean up your resources and protect your AWS account.
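For step 4, an instance typically reaches the file system through a per-Region DNS name. The helper below is hypothetical, but the name pattern follows the EFS documentation (the file system ID is made up):

```python
# Hypothetical helper that builds the DNS name used when mounting an
# EFS file system over NFS. The pattern
# <file-system-id>.efs.<region>.amazonaws.com follows the EFS docs;
# the file system ID below is invented for illustration.
def efs_mount_dns(file_system_id: str, region: str) -> str:
    return f"{file_system_id}.efs.{region}.amazonaws.com"

# An EC2 instance in the same VPC could then mount it with something like:
#   sudo mount -t nfs4 -o nfsvers=4.1 <dns-name>:/ /mnt/efs
print(efs_mount_dns("fs-12345678", "us-west-2"))
```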


Amazon EFS resources


File system (the primary resource)
• Mount targets
  • One or more per file system
  • Created in a VPC subnet; one per Availability Zone
  • Must be in the same VPC as the file system
  • Properties include a subnet ID and security groups
• Tags
  • Key-value pairs

In Amazon EFS, a file system is the primary resource. Each file system has properties such as:
• ID
• Creation token
• Creation time
• File system size in bytes
• Number of mount targets that are created for the file system
• File system state

Amazon EFS also supports other resources to configure the primary resource. These resources
include mount targets and tags.

Mount target: To access your file system, you must create mount targets in your VPC.
Each mount target has the following properties:
• The mount target ID
• The subnet ID for the subnet where it was created
• The file system ID for the file system where it was created
• An IP address where the file system can be mounted
• The mount target state

You can use the IP address or the Domain Name System (DNS) name in your mount command.

Tags: To help organize your file systems, you can assign your own metadata to each of the file
systems that you create. Each tag is a key-value pair.

Think of mount targets and tags as subresources that do not exist unless they are associated with
a file system.

Section 3 key takeaways

• Amazon EFS provides file storage over a network.
• Perfect for big data and analytics, media processing workflows, content management, web serving, and home directories.
• Fully managed service that eliminates storage administration tasks.
• Accessible from the console, an API, or the CLI.
• Scales up or down as files are added or removed, and you pay for what you use.

You have completed an introduction to Amazon EFS, including key features and key resources.
Amazon EFS provides file storage in the cloud that works well for big data and analytics, media
processing workflows, content management, web serving, and home directories.

Amazon EFS scales up or down when files are added or removed, and you pay for only what you
are using.

Amazon EFS is a fully managed service that is accessible from the console, an API, or the AWS CLI.

To learn more about Amazon EFS, see https://aws.amazon.com/efs/


Recorded demo: Amazon Elastic File System

Now, take a moment to watch the Amazon EFS demo at https://aws-tc-largeobjects.s3-us-west-2.amazonaws.com/ILT-TF-100-ACFNDS-20-EN/Module_7_EFS+v2.0.mp4. The recording runs a little over 6 minutes, and it reinforces many of the concepts that were discussed in this section of the module.

The demonstration shows how to configure the following resources by using the AWS
Management Console. The demonstration shows how to:
• Create an Elastic File System (EFS) implementation in a Virtual Private Cloud
• Attach the EFS
• Configure security and performance settings for the EFS implementation

The demonstration also reviews how to get specific instructions for validating your EFS installation so that you can connect to it from EC2 instances.


Section 4: Amazon S3 Glacier
Module 7: Storage

Introducing Section 4: Amazon S3 Glacier

Amazon S3 Glacier is a secure, durable, and extremely low-cost cloud storage service for data
archiving and long-term backup.


Storage

Amazon S3 Glacier


This section covers Amazon S3 Glacier.


Amazon S3 Glacier review


Amazon S3 Glacier is a data archiving service that is designed for
security, durability, and an extremely low cost.
• Amazon S3 Glacier is designed to provide 11 9s of durability for objects.

• It supports the encryption of data in transit and at rest through Secure Sockets
Layer (SSL) or Transport Layer Security (TLS).

• The Vault Lock feature enforces compliance through a policy.

• Extremely low-cost design works well for long-term archiving.


• Provides three options for access to archives—expedited, standard, and bulk—
retrieval times range from a few minutes to several hours.


When you use Amazon S3 Glacier to archive data, you can store your data at an extremely low
cost (even in comparison to Amazon S3), but you cannot retrieve your data immediately when
you want it.

Data that is stored in Amazon S3 Glacier can take several hours to retrieve, which is why it works
well for archiving.

There are three key Amazon S3 Glacier terms you should be familiar with:
• Archive – Any object (such as a photo, video, file, or document) that you store in Amazon S3
Glacier. It is the base unit of storage in Amazon S3 Glacier. Each archive has its own unique ID
and it can also have a description.
• Vault – A container for storing archives. When you create a vault, you specify the vault name
and the Region where you want to locate the vault.
• Vault access policy – Determine who can and cannot access the data that is stored in the
vault, and what operations users can and cannot perform. One vault access permissions policy
can be created for each vault to manage access permissions for that vault. You can also use a
vault lock policy to make sure that a vault cannot be altered. Each vault can have one vault
access policy and one vault lock policy that are attached to it.

You have three options for retrieving data, each with varying access times and cost:
• Expedited retrievals are typically made available within 1–5 minutes (highest cost).
• Standard retrievals typically complete within 3–5 hours (less time than expedited, more time
than bulk).
• Bulk retrievals typically complete within 5–12 hours (lowest cost).

You might compare these options to choosing the most economical shipping method for your package.
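The cheapest-option-that-meets-the-deadline idea can be sketched as a small helper. The helper itself is hypothetical; the hour figures are the typical upper bounds quoted above:

```python
# Pick the lowest-cost S3 Glacier retrieval option that still meets a
# deadline, using the typical upper-bound times listed above (bulk up
# to 12 hours, standard up to 5 hours, expedited about 5 minutes).
# TIERS is ordered cheapest-first, so the first option that fits wins.
TIERS = [("bulk", 12.0), ("standard", 5.0), ("expedited", 5 / 60)]

def cheapest_tier(deadline_hours: float) -> str:
    for name, max_hours in TIERS:
        if max_hours <= deadline_hours:
            return name
    raise ValueError("no retrieval option meets this deadline")

print(cheapest_tier(24))  # bulk
print(cheapest_tier(6))   # standard
print(cheapest_tier(1))   # expedited
```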

Amazon S3 Glacier

• Storage service for low-cost data archiving and long-term backup
• You can configure lifecycle archiving of Amazon S3 content to Amazon S3 Glacier
• Retrieval options:
  • Standard: 3–5 hours
  • Bulk: 5–12 hours
  • Expedited: 1–5 minutes

[Diagram: objects in an Amazon S3 bucket are archived to Amazon S3 Glacier after 30 days and deleted after 5 years.]


Amazon S3 Glacier use cases

Media asset archiving

Healthcare information archiving

Regulatory and compliance archiving

Scientific data archiving

Digital preservation

Magnetic tape replacement


Media asset archiving


Media assets—such as video and news footage—require durable storage and can grow to many
petabytes over time. Amazon S3 Glacier enables you to archive older media content affordably
and then move it to Amazon S3 for distribution when it is needed.

Healthcare information archiving


To meet regulatory requirements, hospital systems must retain petabytes of patient records—
such as Low-Income Subsidy (LIS) information, picture archiving and communication system
(PACS) data, or Electronic Health Records (EHR)—for decades. Amazon S3 Glacier can help you
reliably archive patient record data securely at a very low cost.

Regulatory and compliance archiving


Many enterprises, like those in financial services and healthcare, must retain regulatory and
compliance archives for extended durations. Amazon S3 Glacier Vault Lock can help you set
compliance controls so you can work towards meeting your compliance objectives, such as the
U.S. Securities and Exchange Commission (SEC) Rule 17a-4(f).

Scientific data archiving


Research organizations generate, analyze, and archive large amounts of data. By using Amazon S3
Glacier, you can reduce the complexities of hardware and facility management and capacity
planning.

Digital preservation
Libraries and government agencies must handle data integrity challenges in their digital
preservation efforts. Unlike traditional systems—which can require laborious data verification
and manual repair—Amazon S3 Glacier performs regular, systematic data integrity checks, and it is designed to be automatically self-healing.

Using Amazon S3 Glacier

• RESTful web services
• Java or .NET SDKs
• Amazon S3 with lifecycle policies

To store and access data in Amazon S3 Glacier, you can use the AWS Management Console.
However, only a few operations—such as creating and deleting vaults, and creating and managing
archive policies—are available in the console.

For almost all other operations and interactions with Amazon S3 Glacier, you must use either the
Amazon S3 Glacier REST APIs, the AWS Java or .NET SDKs, or the AWS CLI.

You can also use lifecycle policies to archive data into Amazon S3 Glacier. Next, you will learn
about lifecycle policies.


Lifecycle policies
Amazon S3 lifecycle policies enable you to delete or move objects
based on age.

[Diagram: an object (Preview2.mp4) moves from Amazon S3 Standard to Amazon S3 Standard-Infrequent Access after 30 days, to Amazon S3 Glacier after 60 days, and is deleted after 365 days.]

You should automate the lifecycle of the data that you store in Amazon S3. By using lifecycle
policies, you can cycle data at regular intervals between different Amazon S3 storage types. This
automation reduces your overall cost, because you pay less for data as it becomes less important
with time.

In addition to setting lifecycle rules per object, you can also set lifecycle rules per bucket.

Consider an example of a lifecycle policy that moves data as it ages from Amazon S3 Standard to
Amazon S3 Standard – Infrequent Access, and finally, into Amazon S3 Glacier before it is
deleted. Suppose that a user uploads a video to your application and your application generates a
thumbnail preview of the video. This video preview is stored to Amazon S3 Standard, because it is
likely that the user wants to access it right away.

Your usage data indicates that most thumbnail previews are not accessed after 30 days. Your
lifecycle policy takes these previews and moves them to Amazon S3 – Infrequent Access after 30
days. After another 30 days elapse, the preview is unlikely to be accessed again. The preview is
then moved to Amazon S3 Glacier, where it remains for 1 year. After 1 year, the preview is
deleted. The important thing is that the lifecycle policy manages all this movement automatically.
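The example policy above can be sketched in the JSON shape that the S3 lifecycle-configuration API accepts. The field names follow the S3 documentation, but verify them against the current API reference; the rule ID is made up:

```python
import json

# Lifecycle rule for the thumbnail-preview example: transition to
# Standard-IA at 30 days, to Glacier at 60 days, delete at 365 days.
# Field names follow the S3 lifecycle-configuration API; confirm
# against the current API reference before use.
lifecycle_rule = {
    "ID": "archive-video-previews",   # hypothetical rule name
    "Status": "Enabled",
    "Filter": {"Prefix": ""},         # apply to every object in the bucket
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 60, "StorageClass": "GLACIER"},
    ],
    "Expiration": {"Days": 365},      # delete one year after creation
}

print(json.dumps(lifecycle_rule["Transitions"][0]))
```

A rule like this would be attached to a bucket as one entry in the configuration's "Rules" list; S3 then performs the moves without any action from your application.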

To learn more about object lifecycle management, see


http://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html


Storage comparison

                     Amazon S3                    Amazon S3 Glacier
Data volume          No limit                     No limit
Average latency      Milliseconds                 Minutes to hours
Item size            5 TB maximum                 40 TB maximum
Cost per GB-month    Higher cost                  Lower cost
Billed requests      PUT, COPY, POST, LIST, GET   UPLOAD and retrieval
Retrieval pricing    Per request                  Per request and per GB

While Amazon S3 and Amazon S3 Glacier are both object storage solutions that enable you to
store a virtually unlimited amount of data, they have some critical differences between them. The
chart outlines some of these differences.

1. Be careful when you decide which storage solution is correct for your needs. These two
services serve very different storage needs. Amazon S3 is designed for frequent, low-latency
access to your data, but Amazon S3 Glacier is designed for low-cost, long-term storage of
infrequently accessed data.
2. The maximum item size in Amazon S3 is 5 TB, but Amazon S3 Glacier can store items that are
up to 40 TB.
3. Because Amazon S3 gives you faster access to your data, the storage cost per gigabyte is
higher than it is with Amazon S3 Glacier.
4. While both services have per-request charges, Amazon S3 charges for PUT, COPY, POST, LIST,
GET operations. In contrast, Amazon S3 Glacier charges for UPLOAD and retrieval operations.
5. Because Amazon S3 Glacier was designed for less-frequent access to data, it costs more for
each retrieval request than Amazon S3.


Server-side encryption

[Diagram: applications in a corporate data center and on Amazon EC2 transfer data over HTTPS. Data archived in Amazon S3 Glacier is encrypted by default; for Amazon S3, your application must enable server-side encryption.]

Another important difference between Amazon S3 and Amazon S3 Glacier is how data is
encrypted. Server-side encryption is focused on protecting data at rest. With both solutions, you
can securely transfer your data over HTTPS. Any data that is archived in Amazon S3 Glacier is
encrypted by default. With Amazon S3, your application must initiate server-side encryption. You
can accomplish server-side encryption in Amazon S3 in several ways:
• Server-side encryption with Amazon S3-managed encryption keys (SSE-S3) employs strong
multi-factor encryption. Amazon S3 encrypts each object with a unique key. As an additional
safeguard, it encrypts the key with a main key that it regularly rotates. Amazon S3 server-side
encryption uses one of the strongest block ciphers available, 256-bit Advanced Encryption
Standard (AES-256), to encrypt your data.
• Using server-side encryption with Customer-provided Encryption Keys (SSE-C) enables you to
set your own encryption keys. You include the encryption key as part of your request, and
Amazon S3 manages both encryption (as it writes to disks), and decryption (when you access
your objects).
• Server-side encryption with AWS Key Management Service (SSE-KMS) uses AWS KMS, a service that combines secure, highly available hardware and software to provide a key management system that is scaled for the cloud. AWS KMS uses Customer Master Keys (CMKs) to encrypt
your Amazon S3 objects. You use AWS KMS through the Encryption Keys section in the IAM
console. You can also access AWS KMS through the API to centrally create encryption keys,
define the policies that control how keys can be used, and audit key usage to prove that they
are being used correctly. You can use these keys to protect your data in Amazon S3 buckets.
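At the API level, a client selects server-side encryption by sending a request header with the upload. The header name and values below come from the S3 REST API ("AES256" for SSE-S3, "aws:kms" for SSE-KMS); SSE-C uses separate customer-key headers not shown here, and the helper itself is just a sketch:

```python
# Sketch: the request header that asks Amazon S3 to apply server-side
# encryption to an object being uploaded. Header name and values are
# from the S3 REST API; the helper and its mode labels are our own.
def sse_header(mode: str) -> dict:
    values = {"SSE-S3": "AES256", "SSE-KMS": "aws:kms"}
    return {"x-amz-server-side-encryption": values[mode]}

print(sse_header("SSE-S3"))
print(sse_header("SSE-KMS"))
```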


Security with Amazon S3 Glacier

• Control access with IAM
• Amazon S3 Glacier encrypts your data with AES-256
• Amazon S3 Glacier manages your keys for you

By default, only you can access your data. You can enable and control access to your data in
Amazon S3 Glacier by using IAM. You set up an IAM policy that specifies user access.


Section 4 key takeaways

• Amazon S3 Glacier is a data archiving service that is designed for security, durability, and an extremely low cost.
• Amazon S3 Glacier pricing is based on Region.
• Its extremely low-cost design works well for long-term archiving.
• The service is designed to provide 11 9s of durability for objects.

You have completed an introduction to Amazon S3 Glacier, which included key differences
between Amazon S3 and Amazon S3 Glacier.

To learn more about Amazon S3 Glacier, see the Amazon S3 Glacier product page.


Recorded demo: Amazon S3 Glacier

Now, take a moment to watch the Amazon Glacier demo. The recording runs a little over 2
minutes, and it reinforces many of the concepts that were discussed in this section of the
module.

The demonstration shows how to configure the following resources by using the AWS
Management Console. The demonstration shows how to:
• Create an Amazon Glacier vault.
• Upload archived items to the vault using a third-party graphical interface tool.


Activity: Storage Case Studies

Photo by Pixabay from Pexels.


In this educator-led activity, you will be asked to log in to the AWS Management Console. The
activity instructions are on the next slide. You will be challenged to answer five questions. The
educator will lead the class in a discussion of each question, and reveal the correct answers.


Storage case study activity (1 of 3)


Case 1: A data analytics company for travel sites must store billions of customer events per day.
They use the data analytics services that are in the diagram. The following diagram illustrates their
architecture.

[Architecture diagram: Amazon API Gateway, Amazon Kinesis, AWS Lambda, Amazon Kinesis Data Firehose, and Amazon Elastic Container Service feed a storage layer marked "Storage ??"]


Break into groups of four or five people. Review the assigned case study. Create a presentation
that describes the best storage solution for the organization that is described in your group’s
case. Your presentation should include the key factors that you considered when you selected the
storage technology, and any factors that might change your recommendation.


Storage case study activity (2 of 3)


Case 2: A collaboration software company processes email for enterprise customers. They have
more than 250 enterprise customers and more than half a million users. They must store petabytes
of data for their customers. The following diagram illustrates their architecture.

[Architecture diagram: a corporate data center and Elastic Load Balancing in front of Amazon EC2 instances, connected to a storage layer marked "Storage ??"]


Break into groups of four or five people. Review the assigned case study. Create a presentation
that describes the best storage solution for the organization that is described in your group’s
case. Your presentation should include the key factors that you considered when you selected the
storage technology, and any factors that might change your recommendation.


Storage case study activity (3 of 3)


Case 3: A data protection company must be able to ingest and store large amounts of customer
data and help their customers meet compliance requirements. They use Amazon EC2 for scalable
compute and Amazon DynamoDB for duplicate data and metadata lookups. The following diagram
illustrates their architecture.

[Architecture diagram: clients connect to Amazon EC2 and Amazon DynamoDB, backed by a storage layer marked "Storage ??"]


Break into groups of four or five people. Review the assigned case study. Create a presentation
that describes the best storage solution for the organization that is described in your group’s
case. Your presentation should include the key factors that you considered when you selected the
storage technology, and any factors that might change your recommendation.


Module wrap-up
Module 7: Storage


It’s now time to review the module, and wrap up with a knowledge check and a discussion of a
practice certification exam question.


Module summary
In summary, in this module, you learned how to:
• Identify the different types of storage
• Explain Amazon S3
• Identify the functionality in Amazon S3
• Explain Amazon EBS
• Identify the functionality in Amazon EBS
• Perform functions in Amazon EBS to build an Amazon EC2 storage solution
• Explain Amazon EFS
• Identify the functionality in Amazon EFS
• Explain Amazon S3 Glacier
• Identify the functionality in Amazon S3 Glacier
• Differentiate between Amazon EBS, Amazon S3, Amazon EFS, and Amazon S3 Glacier


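The last objective, differentiating the four storage services, can be condensed into a small lookup sketch; the one-line characterizations paraphrase this module, and the helper function is purely illustrative:

```python
# Illustrative one-line summaries of the module's four storage services.
STORAGE_SERVICES = {
    "Amazon EBS": "block storage volumes attached to an Amazon EC2 instance",
    "Amazon S3": "durable object storage for frequently accessed data",
    "Amazon EFS": "shared file storage mountable by many EC2 instances",
    "Amazon S3 Glacier": "extremely low-cost archival storage for infrequent access",
}


def pick_service(keyword: str) -> str:
    """Return the first service whose description mentions the keyword."""
    for service, description in STORAGE_SERVICES.items():
        if keyword in description:
            return service
    raise KeyError(keyword)
```

For example, `pick_service("archival")` points at Amazon S3 Glacier, while `pick_service("block")` points at Amazon EBS, mirroring the distinctions drawn throughout the module.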


Complete the knowledge check


Complete the knowledge check for this module.


Sample exam question


A company wants to store data that is not frequently accessed. Which is the most cost-effective
solution?

Choice Response

A AWS Storage Gateway

B Amazon Simple Storage Service Glacier

C Amazon Elastic Block Store (Amazon EBS)

D Amazon Simple Storage Service (Amazon S3)


Look at the answer choices and rule them out based on the keywords.


Sample exam question answer


A company wants to store data that is not frequently accessed. Which is the most cost-effective
solution?

The correct answer is B. Amazon Simple Storage Service Glacier


The keywords in the question are “not frequently accessed” and “cost-effective solution.”


The following are the keywords to recognize: “not frequently accessed” and “cost-effective
solution.”

The correct answer is B. Amazon Simple Storage Service Glacier.

Incorrect answers:
Answer A: AWS Storage Gateway
Answer C: Amazon Elastic Block Store (Amazon EBS)
Answer D: Amazon Simple Storage Service (Amazon S3)


Additional resources
• AWS Storage page: https://aws.amazon.com/products/storage/
• Storage Overview: https://docs.aws.amazon.com/whitepapers/latest/aws-overview/storage-services.html
• Recovering files from an Amazon EBS volume backup: https://aws.amazon.com/blogs/compute/recovering-files-from-an-amazon-ebs-volume-backup/
• Confused by AWS Storage Options? S3, EFS, EBS Explained: https://dzone.com/articles/confused-by-aws-storage-options-s3-ebs-amp-efs-explained


If you want to learn more about the topics covered in this module, you might find the following
additional resources helpful:
• AWS Storage page: https://aws.amazon.com/products/storage/
• Storage Overview: https://docs.aws.amazon.com/whitepapers/latest/aws-overview/storage-services.html
• Recovering files from an Amazon EBS volume backup: https://aws.amazon.com/blogs/compute/recovering-files-from-an-amazon-ebs-volume-backup/
• Confused by AWS Storage Options? S3, EFS, EBS Explained: https://dzone.com/articles/confused-by-aws-storage-options-s3-ebs-amp-efs-explained


Thank you

Corrections, feedback, or other questions?


Contact us at https://support.aws.amazon.com/#/contacts/aws-academy.
All trademarks are the property of their owners.


Thank you for completing this module.
