Serverless Architectural
Patterns and Best Practices
Sascha Möllering
29.03.2017
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Agenda
Serverless characteristics and practices
3-tier web application
Batch processing
Stream processing
Operations automation
Wrap-up/Q&A
Spectrum of AWS offerings
“On EC2”: Amazon EC2
Managed: Amazon EMR, Amazon Elasticsearch Service, Amazon ElastiCache, Amazon Redshift, Amazon RDS
Serverless: AWS Lambda, Amazon Cognito, Amazon Kinesis, Amazon S3, Amazon DynamoDB, Amazon SQS, Amazon API Gateway, Amazon CloudWatch, AWS IoT
Serverless patterns built with functions
Functions are the unit of deployment and scale
Scales per request; users cannot over- or under-provision
Never pay for idle
Skip the boring parts; skip the hard parts
Lambda considerations and best practices
AWS Lambda is stateless—architect accordingly
• Assume no affinity with underlying compute infrastructure
• Local filesystem access and child processes may not extend beyond the lifetime of the Lambda request
Lambda considerations and best practices
Can your Lambda functions survive the cold?
• Instantiate AWS clients and database clients outside the scope of the handler to take advantage of connection re-use.
• Schedule with CloudWatch Events for warmth
• ENIs for VPC support are attached during cold start

Executes during cold start:
    import sys
    import logging
    import rds_config
    import pymysql

    rds_host = "rds-instance"
    db_name = rds_config.db_name
    try:
        conn = pymysql.connect( …
    except:
        logger.error("ERROR: …

Executes with each invocation:
    def handler(event, context):
        with conn.cursor() as cur:
            …
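
The split between cold-start code and per-invocation code can be sketched end to end. This is a minimal illustration, assuming the standard-library `sqlite3` module as a stand-in for the `pymysql`/RDS connection on the slide; `INIT_COUNT`, `_connect` and the `hits` table are hypothetical names used only to make the reuse visible.

```python
import sqlite3

# Module scope: this part runs once per container, during the cold start.
INIT_COUNT = 0


def _connect():
    """Open the (stand-in) database connection and count how often we do so."""
    global INIT_COUNT
    INIT_COUNT += 1
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE hits (request_id TEXT)")
    return connection


conn = _connect()


def handler(event, context):
    """Runs on every invocation and reuses the module-level connection."""
    conn.execute("INSERT INTO hits VALUES (?)", (event["request_id"],))
    conn.commit()
    (rows,) = conn.execute("SELECT COUNT(*) FROM hits").fetchone()
    return {"rows": rows, "connections_opened": INIT_COUNT}
```

Calling `handler` repeatedly in the same process keeps `connections_opened` at 1, which is the connection re-use the slide recommends.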
Lambda considerations and best practices
How about a file system?
• Don’t forget about /tmp (512 MB scratch space)

    exports.ffmpeg = function(event, context)
    {
      new ffmpeg('./thumb.MP4', function (err, video)
      {
        if (!err) {
          video.fnExtractFrameToJPG('/tmp')
          function (error, files) { … }
          …
          if (!error)
            console.log(files);
          context.done();
          ...
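
A Python counterpart to the scratch-space idea might look like the sketch below. It assumes the function only needs a private working directory under /tmp and cleans up afterwards, since a warm container may keep files from earlier invocations; the `payload` event field and the `frame.txt` file name are hypothetical.

```python
import os
import tempfile

# Lambda's writable scratch space; fall back to the platform default elsewhere.
SCRATCH_DIR = "/tmp" if os.path.isdir("/tmp") else tempfile.gettempdir()


def handler(event, context):
    """Write intermediate data to scratch space and remove it when done."""
    # TemporaryDirectory gives each invocation its own directory and deletes
    # it on exit, so repeated invocations do not fill up the scratch space.
    with tempfile.TemporaryDirectory(dir=SCRATCH_DIR) as workdir:
        path = os.path.join(workdir, "frame.txt")
        with open(path, "w") as scratch_file:
            scratch_file.write(event["payload"])
        size = os.path.getsize(path)
    return {"bytes_written": size, "cleaned_up": not os.path.exists(workdir)}
```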
Lambda considerations and best practices
Custom CloudWatch metrics
• 40 KB per POST
• Default Acct Limit of 150 TPS
• Consider aggregating with Kinesis

    def put_cstate ( iid, state ):
        response = cwclient.put_metric_data(
            Namespace='AWSx/DirectConnect',
            MetricData=[
                {
                    'MetricName':'ConnectionState',
                    'Dimensions': [
                        {
                            'Name': 'ConnectionId',
                            'Value': iid
                        },
                    ],
                    'Value': state,
                    'Unit': 'None'
                    …
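
One way to stay under the request-size and rate limits is to pre-aggregate samples before publishing. The sketch below swaps in simple in-process aggregation rather than the Kinesis-based aggregation the slide suggests, and only builds the datum: it uses the `StatisticValues` form that `put_metric_data` accepts in place of a single `Value`. The helper name `aggregate_datum` and the example connection ID are hypothetical.

```python
def aggregate_datum(metric_name, connection_id, samples):
    """Collapse many raw samples into one CloudWatch statistic-set datum."""
    if not samples:
        raise ValueError("at least one sample is required")
    return {
        "MetricName": metric_name,
        "Dimensions": [{"Name": "ConnectionId", "Value": connection_id}],
        # One statistic set replaces len(samples) individual data points.
        "StatisticValues": {
            "SampleCount": float(len(samples)),
            "Sum": float(sum(samples)),
            "Minimum": float(min(samples)),
            "Maximum": float(max(samples)),
        },
        "Unit": "None",
    }
```

The returned dictionary would be passed as one element of `MetricData`, so four samples cost one data point instead of four.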
Pattern 1: 3-Tier Web Application
Web application
Browser → Amazon CloudFront → Amazon S3
Browser → Amazon API Gateway → Dynamic content in AWS Lambda → Data stored in Amazon DynamoDB
Serverless web app security
Browser → Amazon CloudFront → Amazon S3
Browser → Amazon API Gateway → AWS Lambda → Amazon DynamoDB

Amazon CloudFront
• OAI
• Geo-Restriction
• Signed Cookies
• Signed URLs
• DDoS
Amazon S3
• Bucket Policies
• ACLs
Amazon API Gateway
• AuthZ
• Throttling
• Caching
• Usage Plans
AWS Lambda
• IAM
Amazon DynamoDB
• IAM
Serverless web app security
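Browser → Amazon CloudFront → Amazon S3
Browser → Amazon CloudFront → AWS WAF → Amazon API Gateway → AWS Lambda → Amazon DynamoDB

Amazon CloudFront (in front of Amazon S3)
• OAI
• Geo-Restriction
• Signed Cookies
• Signed URLs
• DDoS
Amazon S3
• Bucket Policies
• ACLs
Amazon CloudFront (in front of Amazon API Gateway)
• HTTPS
• Disable Host Header Forwarding
Amazon API Gateway
• AuthZ
• Throttling
• Caching
• Usage Plans
AWS Lambda
• IAM
Amazon DynamoDB
• IAM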
Serverless web app monitoring
AWS CloudTrail
Custom CloudWatch Metrics & Alarms
Amazon CloudFront
• Access Logs in S3 Bucket
• CloudWatch Metrics - https://aws.amazon.com/cloudfront/reporting/
Amazon S3
• Access Logs in S3 Bucket
AWS WAF
• WebACL Testing
• Total Requests
• Allowed/Blocked Requests by ACL
Amazon API Gateway (logs)
• Latency
• Count
• Cache Hit/Miss
• 4XX/5XX Errors
AWS Lambda (logs)
• Invocations
• Invocation Errors
• Duration
• Throttled Invocations
Amazon DynamoDB (Streams)
• Latency
• Throughput
• Throttled Reqs
• Returned Bytes
• Documentation
Serverless web app lifecycle management
AWS SAM (Serverless Application Model) - blog
Code/Packages/Swagger + Serverless Template → Package & Deploy (CI/CD Tools)
package: artifacts uploaded to Amazon S3 → Serverless Template w/ CodeUri
deploy: Serverless Template w/ CodeUri → AWS CloudFormation → AWS Lambda, Amazon DynamoDB, Amazon API Gateway
Amazon API Gateway best practices
Use mock integrations
Signed URL from API Gateway for large or binary file
uploads to S3
Asynchronous calls for Lambda > 30s
Greedy variable, ANY method, proxy integration
Root /
  /{proxy+} → ANY → Your Node.js Express app
Simple yet very powerful:
• Automatically scale to meet demand
• Only pay for the requests you receive
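
With a greedy `/{proxy+}` resource and the ANY method, a single function receives every request and does its own routing. The sketch below shows that idea in Python rather than the Node.js Express app on the slide. It assumes the Lambda proxy integration event fields `httpMethod` and `path` and the `statusCode`/`headers`/`body` response shape; the `/items` route and `list_items` are hypothetical.

```python
import json


def list_items(event):
    """Hypothetical route handler; a real one would query a data store."""
    return 200, {"items": []}


# Routing table keyed by (HTTP method, request path).
ROUTES = {("GET", "/items"): list_items}


def handler(event, context):
    """Dispatch a proxied API Gateway request to the matching route."""
    route = ROUTES.get((event["httpMethod"], event["path"]))
    if route is None:
        status, payload = 404, {"message": "Not Found"}
    else:
        status, payload = route(event)
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }
```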
Pattern 2: Batch Processing
Characteristics
Large data sets
Periodic or scheduled tasks
Extract, Transform, Load (ETL) jobs
Usually non-interactive and long running
Many problems fit the MapReduce programming model
Serverless batch processing
Amazon S3 Object → AWS Lambda: Splitter → AWS Lambda: Mappers (…) → Amazon DynamoDB: Mapper Results → AWS Lambda: Reducer → Amazon S3 Results
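
The splitter, mapper and reducer roles in the diagram can be illustrated locally. This is a sketch under loud assumptions: a thread pool stands in for concurrent Lambda invocations, an in-memory list stands in for the DynamoDB mapper results, and word counting is just a convenient example job. All function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def splitter(text, chunk_lines):
    """Split the input object into chunks of at most chunk_lines lines."""
    lines = text.splitlines()
    return [lines[i:i + chunk_lines] for i in range(0, len(lines), chunk_lines)]


def mapper(chunk):
    """Count words in one chunk; each call models one mapper invocation."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reducer(partials):
    """Merge the intermediate mapper results into the final result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_job(text, chunk_lines=2, max_workers=4):
    """Run splitter, concurrent mappers and reducer in sequence."""
    chunks = splitter(text, chunk_lines)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        mapper_results = list(pool.map(mapper, chunks))
    return reducer(mapper_results)
```

Raising `max_workers` plays the role of the concurrent Lambda function limit: more mappers run at once, so the job finishes sooner.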
Considerations and best practices
Cascade mapper functions
Lambda languages vs. SQL
Speed is directly proportional to the concurrent Lambda
function limit
Use DynamoDB/ElastiCache/S3 for intermediate state of
mapper functions
Lambda MapReduce Reference Architecture
Cost of serverless batch processing
200 GB normalized Google Ngram data-set
Serverless:
• 1000 concurrent Lambda invocations
• Processing time: 9 minutes
• Cost: $7.06
Pattern 3: Stream Processing
Stream processing characteristics
• High ingest rate
• Near real-time processing (low latency from ingest to
process)
• Spiky traffic (lots of devices with intermittent network
connections)
• Message durability
• Message ordering
Serverless stream processing architecture
Sensors → KPL: Producer → Amazon Kinesis: Stream → Lambda: Stream Processor → S3: Intermediate Aggregated Data
CloudWatch Events: Trigger every 5 minutes → Lambda: Scheduled Dispatcher → Lambda: Periodic Dump to S3 → S3: Final Aggregated Output
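
The stream processor step might look like the following sketch. It relies on the Kinesis event shape that Lambda delivers, with base64-encoded data under `Records[].kinesis.data`, and assumes each record is a JSON document with `device_id` and `measurement` fields. Writing the intermediate aggregate to S3 is left out.

```python
import base64
import json
from collections import defaultdict


def handler(event, context):
    """Aggregate one batch of Kinesis records per device."""
    aggregates = defaultdict(lambda: {"sample_sum": 0.0, "sample_count": 0})
    for record in event["Records"]:
        # Kinesis payloads arrive base64-encoded inside the Lambda event.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        entry = aggregates[payload["device_id"]]
        entry["sample_sum"] += payload["measurement"]
        entry["sample_count"] += 1
    return dict(aggregates)
```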
Fan-out pattern
• Number of Amazon Kinesis Streams shards corresponds to concurrent
Lambda invocations
• Trade higher throughput & lower latency vs. strict message ordering
Sensors → KPL: Producer → Amazon Kinesis: Stream → Lambda: Dispatcher → Lambda: Processors
Increase throughput, reduce processing latency
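
The trade-off in the fan-out pattern can be seen in a small local model. In this sketch a thread pool stands in for the dispatcher invoking processor functions asynchronously, and records are spread round-robin; `dispatcher`, `processor` and the `measurement` field are hypothetical. Because groups are processed concurrently, records in different groups are no longer handled in their original order.

```python
from concurrent.futures import ThreadPoolExecutor


def processor(records):
    """Process one group of records; models one processor invocation."""
    return sum(record["measurement"] for record in records)


def dispatcher(records, fan_out):
    """Spread a batch across up to fan_out processors running concurrently."""
    # Round-robin split: record i goes to group i % fan_out.
    groups = [records[i::fan_out] for i in range(fan_out)]
    groups = [group for group in groups if group]
    with ThreadPoolExecutor(max_workers=fan_out) as pool:
        return list(pool.map(processor, groups))
```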
Best practices
• Tune batch size when Lambda is triggered by Amazon
Kinesis Streams – reduce number of Lambda
invocations
• Tune memory setting for your Lambda function – shorten
execution time
• Use KPL to batch messages and saturate Amazon
Kinesis Stream capacity
Monitoring
Amazon Kinesis Stream metric GetRecords.IteratorAgeMilliseconds maximum
Amazon Kinesis Analytics
Sensors → Amazon Kinesis Streams Producer → Amazon Kinesis: Stream → Amazon Kinesis Analytics: Window Aggregation → S3: Aggregated Output

CREATE OR REPLACE PUMP "STREAM_PUMP" AS INSERT INTO "DESTINATION_SQL_STREAM"
SELECT STREAM "device_id",
    FLOOR("SOURCE_SQL_STREAM_001".ROWTIME TO MINUTE) as "round_ts",
    -- Aggregation
    SUM("measurement") as "sample_sum",
    COUNT(*) AS "sample_count"
FROM "SOURCE_SQL_STREAM_001"
-- Time Window
GROUP BY "device_id", FLOOR("SOURCE_SQL_STREAM_001".ROWTIME TO MINUTE);
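
For readers more at home in Python, the grouping done by the query above can be approximated as follows. This is a sketch, assuming each row carries an epoch-seconds timestamp in a hypothetical `ts` field in place of `ROWTIME`; it mirrors the one-minute window and the `sample_sum` and `sample_count` columns.

```python
from collections import defaultdict


def tumbling_window(rows, window_seconds=60):
    """Group rows by device and by the start of their time window."""
    result = defaultdict(lambda: {"sample_sum": 0, "sample_count": 0})
    for row in rows:
        # Equivalent of FLOOR(ROWTIME TO MINUTE) for epoch seconds.
        round_ts = row["ts"] - row["ts"] % window_seconds
        entry = result[(row["device_id"], round_ts)]
        entry["sample_sum"] += row["measurement"]
        entry["sample_count"] += 1
    return dict(result)
```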
Cost comparison - assumptions
• Variable message rate over 6 hours
• Costs extrapolated over 30 days
[Chart: message rate (messages/sec) per hour for hours 1 to 6; bar labels 50,000, 20,000, 20,000, 20,000, 10,000 and 10,000]
Cost comparison
Serverless
• Amazon Kinesis Stream with 5 shards

Service                          Monthly Cost
Amazon Kinesis Streams           $ 58.04
AWS Lambda                       $259.85
Amazon S3 (Intermediate Files)   $ 84.40
Amazon CloudWatch                $  4.72
Total                            $407.01

Server-based on EC2
• Kafka cluster (3 x m3.large)
• Zookeeper cluster (3 x m3.large)
• Consumer (1 x c4.xlarge)

Service                          Monthly Cost
EC2 Kafka Cluster                $292.08
EC2 Zookeeper Cluster            $292.08
EC2 Consumer                     $152.99
Total On-Demand                  $737.15
1-year All Upfront RI            $452.42
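
The totals in the comparison can be checked with a few lines of arithmetic. The sketch below only re-adds the monthly figures quoted on the slide and expresses the difference as a percentage; the helper names are hypothetical and no pricing beyond the slide's numbers is assumed.

```python
from decimal import Decimal

# Monthly costs as quoted on the slide.
SERVERLESS = {
    "Amazon Kinesis Streams": Decimal("58.04"),
    "AWS Lambda": Decimal("259.85"),
    "Amazon S3 (Intermediate Files)": Decimal("84.40"),
    "Amazon CloudWatch": Decimal("4.72"),
}
SERVER_BASED = {
    "EC2 Kafka Cluster": Decimal("292.08"),
    "EC2 Zookeeper Cluster": Decimal("292.08"),
    "EC2 Consumer": Decimal("152.99"),
}
RESERVED_TOTAL = Decimal("452.42")  # 1-year All Upfront RI


def total(costs):
    """Sum a dictionary of monthly costs."""
    return sum(costs.values(), Decimal("0"))


def savings_pct(cheaper, baseline):
    """Percentage by which cheaper undercuts baseline, to one decimal place."""
    return ((baseline - cheaper) / baseline * 100).quantize(Decimal("0.1"))
```

With the slide's figures, the serverless total comes out roughly 45% below the on-demand total and roughly 10% below the reserved-instance price.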
Lambda architecture
Data Sources: Sensors, Amazon S3
Batch Layer: Amazon S3 Object → AWS Lambda: Splitter → AWS Lambda: Mappers (…) → Amazon DynamoDB: Mapper Results → AWS Lambda: Reducer → Amazon S3 Results
Speed Layer: KPL: Producer → Amazon Kinesis: Stream → Lambda: Stream Processor → S3: Intermediate Aggregated Data; CloudWatch Events: Trigger every 5 minutes → Lambda: Scheduled Dispatcher → Lambda: Periodic Dump to S3 → S3: Final Aggregated Output
Serving Layer
Pattern 4: Automation
Automation characteristics
• Respond to alarms or events
• Periodic jobs
• Auditing and Notification
• Extend AWS functionality
…All while being Highly Available and Scalable
Automation: dynamic DNS for EC2 instances
Amazon EC2 Instance State Changes → Amazon CloudWatch Events: Rule Triggered → AWS Lambda: Update Route 53 → Amazon Route 53: Private Hosted Zone
Instance tag: CNAME = ‘xyz.example.com’
Resulting record: xyz.example.com A 10.2.0.134
Amazon DynamoDB: EC2 Instance Properties
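
The Lambda step in this flow has to turn an instance state change into a DNS change. The sketch below builds only the change batch, in the structure Route 53's `ChangeResourceRecordSets` call expects. It assumes the event carries the new state under `detail.state`, and that the hostname (from the tag) and private IP have already been looked up, for example from the stored instance properties. `build_change_batch` and the TTL default are hypothetical.

```python
def build_change_batch(event, hostname, private_ip, ttl=60):
    """Map an EC2 state-change event to a Route 53 change batch, or None."""
    state = event["detail"]["state"]
    if state == "running":
        action = "UPSERT"  # create or update the A record
    elif state in ("stopped", "terminated"):
        # The caller should tolerate the record already being gone.
        action = "DELETE"
    else:
        return None  # transitional states need no DNS change
    return {
        "Changes": [
            {
                "Action": action,
                "ResourceRecordSet": {
                    "Name": hostname,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": private_ip}],
                },
            }
        ]
    }
```

Keeping the instance properties in a table matters for the delete case, since a terminated instance may no longer be describable when the event arrives.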
Automation: image thumbnail creation from S3
Users upload photos → S3: Source Bucket → (triggered on PUTs) AWS Lambda: Resize Images → S3: Destination Bucket
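
The first thing the resize function has to do is work out which object triggered it. The sketch below covers only that part: it reads the bucket name and URL-encoded object key from the S3 event notification and plans a destination key. The image resizing itself is omitted, and `thumbnail_jobs` and the `thumbnails/` prefix are hypothetical.

```python
from urllib.parse import unquote_plus


def thumbnail_jobs(event, dest_bucket, prefix="thumbnails/"):
    """List (source, destination) pairs for every object in an S3 event."""
    jobs = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        jobs.append({
            "source": (bucket, key),
            "destination": (dest_bucket, prefix + key),
        })
    return jobs
```

Writing to a separate destination bucket, as in the diagram, avoids the function being triggered again by its own output.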
Capital One Cloud Custodian
AWS CloudTrail: Events → Amazon CloudWatch Events: Rules Triggered → AWS Lambda: Policy & Compliance Rules → Amazon SNS: Alert Notifications
Amazon CloudWatch Logs: Logs
Read more here: http://www.capitalone.io/cloud-custodian/docs/index.html
Best practices
• Document how to disable event triggers for your automation when troubleshooting
• Gracefully handle API throttling by retrying with an exponential back-off algorithm (AWS SDKs do this for you)
• Publish custom metrics from your Lambda function that are meaningful for operations (e.g. number of EBS volumes snapshotted)
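
For calls the SDK does not retry for you, exponential back-off can be written by hand. This is a minimal sketch with random jitter added, not the AWS SDKs' own implementation; `ThrottlingError`, `call_with_backoff` and the delay defaults are hypothetical, and the `sleep` parameter exists only so the delays can be observed without waiting.

```python
import random
import time


class ThrottlingError(Exception):
    """Stand-in for whatever throttling exception the called API raises."""


def call_with_backoff(fn, max_attempts=5, base_delay=0.1, max_delay=5.0,
                      sleep=time.sleep):
    """Call fn, retrying on throttling with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottlingError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            # The delay cap doubles each attempt: base, 2*base, 4*base, ...
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads retries out so clients do not retry in lockstep.
            sleep(random.uniform(0, cap))
```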
Thank you!