Preparing for Your Professional Cloud Architect Journey
Module 4: Analyzing and Optimizing Technical and Business Processes
Week 5 agenda

● QUIZ
● Optimizing Cymbal Direct’s technical and business processes and procedures
● Data services (Filestore, Firestore & Firebase, Memorystore, Spanner, BigQuery, Bigtable)
● Mountkirk Games case study analysis
● Diagnostic Questions for exam guide Section 4: Analyzing and optimizing technical and business processes
QUIZ time!
Optimizing Cymbal Direct’s technical and business processes and procedures
Your role in analyzing and optimizing business and technical processes
● Analyzing and defining technical processes
● Analyzing and defining business processes
● Developing procedures to ensure reliability of solutions in production
Business Requirements

● Cymbal Direct’s management wants to make sure that they can easily scale to handle additional demand when needed, so they can feel comfortable with expanding to more test markets.
● Streamline development for application modernization and new features/products.
● Ensure that developers spend as much time on core business functionality as possible, and do not have to worry about scalability wherever possible.
● Allow partners to order directly via an API.
● Get a production version of the social media highlighting service up and running, and ensure no inappropriate content.
Technical Requirements

● Move to managed services wherever possible.
● Ensure that developers can deploy container-based workloads to testing and production environments in a highly scalable environment.
● Standardize on containers where possible, but also allow for existing virtualization infrastructure to run as-is without a rewrite, so it can be slowly refactored over time.
● Securely allow partner integration.
● Allow for streaming of IoT data from drones.
Process optimization
The current build process at Cymbal Direct is:
● Package monolithic application with its dependencies
● Check it in and notify the QA team they need to test it
● Stress test the application to ensure it performs well
● Build a VM image for deployment

Target pipeline stages:
● Plan: Stakeholder
● Code: Check in
● Build: Docker image
● Test: Unit, Integration (on failure, loop back)
● Release: Tag, Artifact available
● Deploy: VM, Kubernetes, Cloud Run; red/black, canary (on failure, loop back)
● Operate + Monitor: Scale, Ensure availability

The earlier stages (code, build, test) form the Continuous Integration part of the pipeline; the later stages (release, deploy, operate) form the Continuous Deployment part.


Process optimization

Requirements not met:


● Development is streamlined
● Developers focus on core business functionality
● Move to managed services wherever possible
● Deploy container-based workloads
Example of end-to-end CI/CD pipeline

[Diagram] Code is built and pushed to Artifact Registry, where Vulnerability Scanning inspects each image. Binary Authorization, backed by an attestor (attestor-demo) and a signing key in Key Management Service (attest-key), admits only trusted, signed images (hello-world-signed) into the gke-demo cluster; untrusted or unsigned images (hello-world-not-signed) are rejected and recorded in the Audit Log.

Container Registry vs Artifact Registry

● Container Registry is currently in maintenance mode. Although it is still available and supported as a Google Enterprise API, it won’t see any new features.
● Artifact Registry is the successor and the recommended solution: prefer it for new projects.
● Artifact Registry covers all use cases of Container Registry and can also be used for additional package formats like Maven, npm, Python, etc.
● See more info on how to migrate here: https://cloud.google.com/artifact-registry/docs/transition/transition-from-gcr
Process optimization
New Process
● New features are implemented as
microservices in Docker containers
● Code check-in triggers CI/CD
pipeline w/ automatic test & release
● Code is deployed to Cloud Run
Developing Cymbal Direct's procedures to ensure solution reliability

Chaos Engineering
● Creates a culture of reliability
● Crashes systems intentionally to build resiliency
● Service Mesh can help you here!

Penetration testing
● Mimics the behavior of hackers to attack your own environment

“If you plan to evaluate the security of your Cloud Platform infrastructure
with penetration testing, you are not required to contact us.”
Filestore

Filestore
Managed NFS, NOT a database

Filestore Basic (GA):
● Workloads: file sharing, software dev, and web hosting
● Capacity: 1 to 64 TiB; scale-up; capacity management: grow only
● Max performance: 1.2 GiB/s throughput | 60k IOPS
● Data protection: backups; availability SLA: 99.9%

Filestore High Scale (Public Preview):
● Workloads: HPC, financial modeling, pharma, and analytics
● Capacity: 10 to 100 TiB; scale-out; capacity management: grow & shrink
● Max performance: 26 GiB/s throughput | 920k IOPS
● Data protection: none; availability SLA: 99.9%

Filestore Enterprise (GA):
● Workloads: SAP, GKE, and “lift & shift” apps
● Capacity: 1 to 10 TiB; scale-out; capacity management: grow & shrink
● Max performance: 1.2 GiB/s throughput | 120k IOPS
● Data protection: snapshots; availability SLA: 99.99%


Firestore
Firestore: When to use?

Firestore is ideal for applications that rely on highly available structured data at scale.

Ideal Use Cases:

● Product catalogs that provide real-time inventory and product details for a retailer.
● User profiles that deliver a customized experience based on the user’s past activities and preferences.
● Transactions based on ACID properties

Non-Ideal Use Cases:

● OLTP relational database with full SQL support. Consider: Cloud SQL
● Data isn’t highly structured or no need for ACID transactions. Consider: Cloud Bigtable
● Interactive querying in an online analytical processing (OLAP) system. Consider: BigQuery
● Unstructured data such as images or movies. Consider: Cloud Storage
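To make the “product catalog” use case concrete, here is a minimal sketch with the Firestore Python client; the project, collection, and field names are hypothetical:

```python
from google.cloud import firestore

# Firestore in Native mode, accessed through the official Python client.
db = firestore.Client(project="my-project")

# Upsert a product document into a "products" collection.
db.collection("products").document("sku-123").set(
    {"name": "Drone propeller", "stock": 42, "price": 9.99}
)

# Query in-stock products; Firestore serves this with strong consistency at scale.
for doc in db.collection("products").where("stock", ">", 0).stream():
    print(doc.id, doc.to_dict())
```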
Firestore: Datastore mode vs Firestore (Native) mode

● Data model: strong consistency (both); documents and collections (Native mode only); entities, kinds, ancestor queries/results (Datastore mode only)
● Performance limits: no read limits (both); 10K writes/sec and 500 documents per transaction (Native mode only)
● API: Firestore (documents) in Native mode; Datastore (entities) in Datastore mode
● Security: IAM (both); Firebase Rules (Native mode only)
● Offline data persistence: Native mode only
● Real-time updates: Native mode only

Firestore or Datastore - comparison


Firestore vs Filestore

… vs Firebase
Exam Tip: Firestore is a NoSQL Database, but Firebase
is a development platform with a ton of additional
features that uses Firestore. Make sure to differentiate
between them!
Firebase
*** Platform, NOT a database ***

Firebase is Google’s complete app development platform.
Complete = it provides different products to:
● Build apps
● Test apps
● Implement authentication (Firebase Authentication can be a part of the PCA exam on a very high level!)
● Run apps
● Run analytics
● Personalize apps
● And more…

[Diagram] Develop / Run / Engage product areas: backend compute, data + authentication, testing, release management, crash reporting, analytics, messaging, experimentation, and personalization, with SDKs for iOS, Android, Web, C++, and Unity.

Exam Tip: Firestore is usually part of a Firebase-based app (for storing and syncing data)
Memorystore
Spanner
What workloads fit Cloud Spanner best?

01 Sharded RDBMS: Manually sharding is difficult. People do it to achieve scale. Cloud Spanner gives you relational data and scale.
02 Scalable relational data: Instead of moving to NoSQL, move from one relational database to a more scalable relational database.
03 Manageability/HA: Highly automated. Online schema changes and patching. No planned downtime, and comes with up to a 99.999% availability SLA.
04 Multi-region: Write once and automatically replicate your data to multiple regions. Most customers use regional instances, but multi-region is there if you need it.
When Cloud Spanner fits less well

TIP
It’s NOT a straightforward thing to migrate a different RDBMS to Cloud Spanner. Be familiar with the challenges at a high level.

1. Lift and shift
2. Lots of in-database business logic (triggers, stored procedures)
3. Compatibility needed
4. App is very sensitive to very low latency (micro/nano/low single-digit ms)
Also: lots of analytics / OLAP-type queries / workloads
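For reference, reading from Spanner with the Python client looks like the following minimal sketch; it assumes an existing instance, database, and leaderboard table, and all names are hypothetical:

```python
from google.cloud import spanner

# Connect to an existing Spanner instance and database.
client = spanner.Client(project="my-project")
database = client.instance("game-instance").database("leaderboard-db")

# Strongly consistent read using a read-only snapshot.
with database.snapshot() as snapshot:
    rows = snapshot.execute_sql(
        "SELECT PlayerId, Score FROM Leaderboard ORDER BY Score DESC LIMIT 10"
    )
    for player_id, score in rows:
        print(player_id, score)
```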
BigQuery
BigQuery hierarchy
Project -> Dataset -> Tables (-> Partitions)
● For each query, BigQuery executes a full-column scan.
● BigQuery performance and query costs are based on the
amount of data scanned.
● You can set the geographic location of a Dataset at creation
time only.
● All tables that are referenced in a query must be stored in
datasets in the same location.
● When you copy a table (bq cp), the datasets that contain the
source table and destination table must reside in the same
location.
○ You can copy a dataset (NOT with bq cp, but with BigQuery
Data Transfer Service) within a region or from one region to
another
● Dataset names are case-sensitive
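A short sketch of the “location is set at creation time” point, using the Python client; the project and dataset names are hypothetical:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# The dataset location can only be chosen here, at creation time.
dataset = bigquery.Dataset("my-project.analytics_eu")
dataset.location = "EU"
client.create_dataset(dataset, exists_ok=True)

# Every table referenced in a query must live in datasets in the same location.
job = client.query("SELECT COUNT(*) FROM `my-project.analytics_eu.events`")
print(list(job.result()))
```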
BigQuery: Controlling access to datasets

Exam Tip: It’s a common practice to have a dataset in one project and perform queries from another one (split billing!).

Common BigQuery predefined roles:
● Admin: full access to all datasets
● Data Editor: access to edit all contents of the datasets
● Data Owner: full access to datasets and all of their contents
● Data Viewer: access to view datasets and all of their contents
● Job User: access to run jobs
● Metadata Viewer: access to view table and dataset metadata
● User: access to run queries and create datasets
● Read Sessions User: access to create and use read sessions
BigQuery: Controlling access to datasets

You can grant access at the following BigQuery resource levels:


● organization or Google Cloud project level
● dataset level
● table or view level
a. Authorized Views
● You can also restrict access to data at a more granular level by using the following methods:
a. column-level access control
b. dynamic data masking (aka “some columns may be hidden, depending on privileges”)
i. Works together with column-level security.
ii. no need to modify existing queries by excluding the columns that the user cannot access
c. row-level security (aka “some rows may be hidden, depending on privileges”)
i. One table can have multiple row-level access policies. Row-level access policies can coexist on a
table with column-level security as well as dataset-level, table-level, and project-level access
controls.
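As an illustration of row-level security, the DDL below (run here through the Python client) creates a row access policy; the table, group, and column names are hypothetical:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Only members of the US sales group will see rows where region = "US";
# other principals simply see no rows from this table.
client.query(
    """
    CREATE ROW ACCESS POLICY us_sales_only
    ON `my-project.sales.orders`
    GRANT TO ("group:us-sales@example.com")
    FILTER USING (region = "US")
    """
).result()
```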
BigQuery: Controlling access to datasets
Authorized Views
1. View: View is a virtual table defined by a SQL query. When you
create a view, you query it in the same way you query a table

2. Query: When a user queries the view, the query results


contain data only from the tables and fields specified in the
query that defines the view.

3. Authorized Views: An authorized view allows you to share


query results with particular users and groups without giving
them access to the underlying tables.

Exam Tip: Authorized Views were especially useful when there were no table/column-level permissions. However, they’re still an often-used way to selectively share access to datasets (and they pop up on the exam!).
MAKE SURE TO UNDERSTAND HOW TO CREATE AND SHARE SUCH A VIEW.
BigQuery: Controlling access to datasets
Authorized Views

[Diagram] A consumer account is granted the BigQuery Data Viewer role.
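A sketch of creating and sharing an authorized view with the Python client. It assumes a private source dataset and a separate dataset for shared views (all names hypothetical): consumers are granted access only on the view’s dataset, while the view itself is authorized on the source dataset.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# 1. Create the view in a dataset that analysts are allowed to read.
view = bigquery.Table("my-project.shared_views.daily_orders_v")
view.view_query = "SELECT order_id, order_total FROM `my-project.private_data.orders`"
view = client.create_table(view)

# 2. Authorize the view on the *source* dataset so it can read the underlying
#    table even though the analysts themselves cannot.
source = client.get_dataset("my-project.private_data")
entries = list(source.access_entries)
entries.append(bigquery.AccessEntry(None, "view", view.reference.to_api_repr()))
source.access_entries = entries
client.update_dataset(source, ["access_entries"])
```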
BigQuery - Data Transfer Service
Mostly useful for regular data transfers to BigQuery

● BigQuery Data Transfer Service


automates data movement from
various sources into BigQuery on a
scheduled, managed basis.
● You can initiate data backfills to
recover from any outages or gaps.
BigQuery - Batch vs Streaming inserts
Most common architectures

Exam Tip: There is additional cost for streaming (both inserts and reads) in BigQuery.
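To make the trade-off concrete, here is a minimal sketch of both ingestion paths with the Python client (table, bucket, and field names are hypothetical): streamed rows become queryable within seconds but incur streaming charges, while batch loads from Cloud Storage are free of loading charges but are not real-time.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
table_id = "my-project.analytics.events"

# Streaming insert: rows are available for querying almost immediately.
errors = client.insert_rows_json(
    table_id, [{"userId": 52, "eventDate": "2024-01-03", "action": "login"}]
)
print("streaming errors:", errors)  # [] on success

# Batch load: bulk-load newline-delimited JSON files from Cloud Storage.
load_job = client.load_table_from_uri(
    "gs://my-bucket/events/*.json",
    table_id,
    job_config=bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
    ),
)
load_job.result()  # waits for the load job to complete
```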
BigQuery: Sharing Datasets with others
AllAuthenticatedUsers

The special setting allAuthenticatedUsers makes


a dataset public. Authenticated users must use
BigQuery within their own project and have
access to run BigQuery jobs so that they can
query the Public Dataset. The billing for the
query goes to their project, even though the
query is using public or shared data. In summary,
the cost of a query is always assigned to the
active project from where the query is executed.
BigQuery: Sharing Queries with others
Mostly for collaboration

● Query needs to be saved first, before it’s shared;


● Can share incomplete / invalid queries -> collaboration;
● Project-level saved queries are visible to principals with the required permissions;
● Public saved queries are visible to anyone with a link to the query;
BigQuery: Scheduling queries
Mostly useful for regular execution

● Scheduled queries use features of


BigQuery Data Transfer Service.
● If the destination table for your
results doesn't exist when you set up
the scheduled query, BigQuery
attempts to create the table for you.
● You can set up a scheduled query to
authenticate as a service account.
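A sketch of creating a scheduled query through the BigQuery Data Transfer Service Python client, following the pattern shown in the public documentation; the project, dataset, and query are hypothetical:

```python
from google.cloud import bigquery_datatransfer

transfer_client = bigquery_datatransfer.DataTransferServiceClient()

transfer_config = bigquery_datatransfer.TransferConfig(
    destination_dataset_id="reporting",
    display_name="Nightly orders rollup",
    data_source_id="scheduled_query",  # marks this transfer as a scheduled query
    params={
        "query": "SELECT CURRENT_DATE() AS run_date, COUNT(*) AS orders "
                 "FROM `my-project.sales.orders`",
        "destination_table_name_template": "orders_rollup_{run_date}",
        "write_disposition": "WRITE_TRUNCATE",
    },
    schedule="every 24 hours",
)

transfer_config = transfer_client.create_transfer_config(
    parent=transfer_client.common_project_path("my-project"),
    transfer_config=transfer_config,
)
print("Created scheduled query:", transfer_config.name)
```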
BigQuery: Query results caching

● Query results are cached to improve performance and reduce costs


for repeated queries
● Cache is per user
● Still subject to quota policies
● Cache results have a size limit of 128 MB compressed
● No charge for queries that use cached results
● Results are cached for approximately 24 hours
● Lifetime extended when a query returns a cached result
● Use of cached results can be turned off (useful for benchmarking)
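For benchmarking, the cache can be switched off per query; a minimal sketch with the Python client, run here against a public sample table:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Disable the results cache so the query is always executed (and billed).
job_config = bigquery.QueryJobConfig(use_query_cache=False)
job = client.query(
    "SELECT corpus, COUNT(*) FROM `bigquery-public-data.samples.shakespeare` "
    "GROUP BY corpus",
    job_config=job_config,
)
job.result()
print("served from cache:", job.cache_hit)  # False when the cache was bypassed
```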
BigQuery: table/partition (automatic) data expiration
Can be set for dataset / table / partition

Best practice for data lifecycle management.


Expiration in BigQuery automatically implements
retention policy.

● Dataset expiration
○ = “default table expiration time” for a dataset

● Table expiration
○ If Dataset expiration is set, each table inherits this setting by default

● Partition expiration:
○ The setting applies to all partitions in the table, but is calculated
independently for each partition based on the partition time.
○ At any point after a table is created, you can update the table's
partition expiration
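A sketch of setting these expirations with the Python client; the dataset and table names are hypothetical, and the partition expiration only applies to a table that is already time-partitioned:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# Dataset level: default expiration for newly created tables (here, 30 days).
dataset = client.get_dataset("my-project.staging")
dataset.default_table_expiration_ms = 30 * 24 * 60 * 60 * 1000
client.update_dataset(dataset, ["default_table_expiration_ms"])

# Partition level: each partition is deleted 90 days after its partition time.
table = client.get_table("my-project.staging.events")
table.time_partitioning.expiration_ms = 90 * 24 * 60 * 60 * 1000
client.update_table(table, ["time_partitioning"])
```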
BigQuery: Table Partitioning

Partitioning versus sharding:
● Table sharding is the practice of storing data in multiple tables, using a naming prefix such as [PREFIX]_YYYYMMDD. Partitioning is recommended over table sharding, because partitioned tables perform better.

You can partition BigQuery tables by:
● Time-unit column: Tables are partitioned based on a TIMESTAMP, DATE, or DATETIME column in the table.
● Ingestion time: Tables are partitioned based on the timestamp when BigQuery ingests the data.
● Integer range: Tables are partitioned based on an integer column.

[Diagram] A table partitioned by an eventDate column (partitions for 2018-01-01 through 2018-01-05): a query such as SELECT * FROM ... WHERE eventDate BETWEEN “2018-01-03” AND “2018-01-04” only scans the matching partitions.
BigQuery: Table Clustering

[Diagram] Within each eventDate partition, rows are ordered by the clustering column (userId): a query such as SELECT c1, c3 FROM ... WHERE userId BETWEEN 52 AND 63 AND eventDate BETWEEN “2018-01-03” AND “2018-01-04” reads only the matching partitions and clustered blocks.
BigQuery - table partitioning vs clustering
Decision making

● Clustering gives you more granularity than partitioning alone allows


● Use clustering if your queries commonly use filters or aggregation against multiple particular
columns.
BigQuery: table partitioning AND clustering
Both partitioning and clustering can improve performance and reduce query cost

Exam Tip: You can combine partitioning with


clustering. Data is first partitioned and then data in
each partition is clustered by the clustering columns.
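A sketch of creating such a table with the Python client, matching the eventDate/userId example above; the project and dataset names are hypothetical:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

schema = [
    bigquery.SchemaField("userId", "INTEGER"),
    bigquery.SchemaField("eventDate", "DATE"),
    bigquery.SchemaField("payload", "STRING"),
]

table = bigquery.Table("my-project.analytics.events", schema=schema)
# Partition by the eventDate column (daily partitions by default)...
table.time_partitioning = bigquery.TimePartitioning(field="eventDate")
# ...then cluster rows within each partition by userId.
table.clustering_fields = ["userId"]

client.create_table(table)
```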
BigQuery: Storage Pricing

Storage pricing is the cost to store data that you load into BigQuery. You pay for active storage and
long-term storage.

● Active storage includes any table or table partition that has been modified in the last 90 days.
● Long-term storage includes any table or table partition that has not been modified for 90
consecutive days. The price of storage for that table automatically drops by approximately
50%. There is no difference in performance, durability, or availability between active and long-term
storage.
Bigtable

Bigtable is a common migration target for key-value, wide-column and time-series databases
● Petabyte-scale
● Fully managed NoSQL database service for use cases where low-latency random data access, scalability and reliability are critical
● Scales seamlessly
● Integrates with the Apache® ecosystem and supports the HBase™ API
What is Bigtable good for?

Use case examples:
● Time-series data, such as CPU and memory usage over time for multiple servers.
● Marketing data, such as purchase histories and customer preferences.
● Financial data, such as transaction histories, stock prices, and currency exchange rates.
● Internet of Things data, such as usage reports from energy meters and home appliances.
● Graph data, such as information about how users are connected to one another.

Applications that need:
● Very high throughput
● Scalability
● Non-structured key/value data where each value is no larger than 10 MB

Storage engine for:
● Batch MapReduce
● Stream processing/analytics
● ML applications

Exam Tip: types of apps where you’d consider using Bigtable: recommendation engines, personalizing user experience, Internet of Things, real-time analytics, fraud detection, migrating from HBase or Cassandra, Fintech, gaming, high-throughput data streaming for creating / improving ML models.
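A minimal sketch of the Bigtable access pattern with the Python client, assuming an existing instance and table (the names and the column family are hypothetical); note how the row key starts with the device id so that readings for one device sort together:

```python
from google.cloud import bigtable

client = bigtable.Client(project="my-project")
table = client.instance("iot-instance").table("sensor-readings")

# Write one cell: row key = device id + timestamp for read locality.
row = table.direct_row("device42#2024-01-03T12:00:00Z")
row.set_cell("metrics", "temperature", "21.5")
row.commit()

# Low-latency point read of the same row.
data = table.read_row("device42#2024-01-03T12:00:00Z")
print(data.cells["metrics"][b"temperature"][0].value)
```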
Bigtable for analytics… ?
Bigtable vs BigQuery

Cloud Bigtable:
● NoSQL wide-column database
● Low-latency per-entry access
● Heavy read/write events
● Optimized for: point read/write
● Typical target: user/entity level

BigQuery:
● Enterprise data warehouse for relational structured data
● Large-scale, ad-hoc SQL-based OLAP analysis; organizational insights
● Analyze data from a Cloud Bigtable database
● Optimized for: ad-hoc analysis and reporting
● Typical target: cohort/population level

Exam Tip: Bigtable might be optimal for “real-time analytics”, when you need to make decisions on events as they’re happening.
Bigtable: Hadoop migration and modernization

“Before”: Apache Hadoop/HBase data ecosystem (simplified Hadoop stack)
● Stream Processing: Kafka, Spark
● Database: HBase
● Scripting & Querying: HIVE, Impala, Pig, Mahout
● Distributed Processing: Spark, MapReduce
● Distributed Storage: HDFS

“After”: Google Cloud storage and databases
● Stream Processing: Kafka, Spark
● Database: Cloud Bigtable
● Compute: Dataproc, running Scripting & Querying (HIVE, Impala, Pig, Mahout) and Distributed Processing (Spark, MapReduce)
● Storage: Cloud Storage

Exam Tip: Main goal: decoupling of storage & compute. As a consequence, you can treat Dataproc clusters as job-specific / ephemeral.
What is Bigtable not good for?

Not good for:
● Not a relational database
● No SQL queries or joins
● No multi-row transactions

Considerations:
● You need full SQL support for OLTP → consider Spanner or Cloud SQL
● Interactive querying for OLAP → consider BigQuery
● Need to store immutable blobs larger than 10 MB (e.g. movies, images) → consider Cloud Storage
Comparing GCP
storage solutions

SQL vs NoSQL

SQL (aka ‘Relational’):
● “Traditional” table-based RDBMSes
● Strongly typed, fixed schemas
● Almost all ACID-compliant
● Considerable percentage of logic can be done in the database layer
● Default choice for most monoliths
● Performance capped at some point (vertical scaling only, plus sharding, offloading read-only, etc.)
● In GCP: Cloud SQL, Cloud Spanner
● Outside of GCP: MySQL, Oracle, PostgreSQL, Microsoft SQL Server

NoSQL (aka ‘Non-relational’):
● Key-value, wide column, document
● Dynamic schemas
● Mostly BASE
● Most of the logic needs to be offloaded to the application
● Suitable for some microservices
● Processing nodes often separate from storage nodes (if the network is fast enough)
● In GCP: Firestore, Bigtable
● Outside of GCP: MongoDB, Redis, Cassandra, HBase, CouchDB

OLTP vs OLAP

OLTP (OnLine Transactional Processing):
● For processing data in transaction-oriented apps
● Large amounts of transactions
● A mix of inserts, updates, deletes on individual records
● Tables are normalized
● ACID & (mostly) SQL
● Cloud SQL, Cloud Spanner

OLAP (OnLine Analytical Processing):
● Multi-dimensional, analytical queries used in BI, reporting, data mining, etc.
● Large volume of data
● Loading data from source + selects; optimized for high-throughput reads on a large number of records
● Tables are not normalized
● SQL (sometimes NoSQL)
● BigQuery

Exam Tip: Here you’ll find a GREAT decision tree for database choices on AWS, Microsoft Azure, Google Cloud Platform, and cloud-agnostic.
Cloud Storage

Overview:
● Fully managed, highly reliable
● Cost-efficient, scalable object/blob store
● Objects accessed via HTTP requests
● Object name is the only key

Ideal for:
● Images and videos
● Objects and blobs
● Unstructured data
● Static website hosting
Cloud Datastore

Overview:
● Fully managed NoSQL
● Scalable

Ideal for:
● Semi-structured application data
● Durable key-value data
● Hierarchical data
● Managing multiple indexes
● Transactions
Cloud Firestore

Overview:
● Fully managed, serverless, NoSQL
● Scalable
● Native mobile and web client libraries
● Real-time updates

Ideal for:
● Document-oriented data
● Large collections of small documents
● Native mobile and web clients
● Durable key-value data
● Hierarchical data
● Managing multiple indexes
● Transactions
Cloud Bigtable

Overview:
● High-performance wide-column NoSQL database service
● Sparsely populated table
● Can scale to billions of rows and thousands of columns
● Can store TB to PB of data

Ideal for:
● Operational applications
● Analytical applications
● Storing large amounts of single-keyed data
● MapReduce operations
Cloud SQL

Overview:
● Managed service
  ○ Replication
  ○ Failover
  ○ Backups
● MySQL, PostgreSQL, and SQL Server
● Relational database service
● Proxy allows for secure access to your Cloud SQL Second Generation instances without whitelisting

Ideal for:
● Web frameworks
● Structured data
● OLTP workloads
● Applications using MySQL/PGS
Cloud Spanner

Overview:
● Mission-critical relational database service
● Transactional consistency
● Global scale
● High availability
● Multi-region replication
● 99.999% SLA

Ideal for:
● Mission-critical applications
● High transactions
● Scale and consistency requirements
BigQuery

Overview:
● Low-cost enterprise data warehouse for analytics
● Fully managed
● Petabyte scale
● Fast response times
● Serverless

Ideal for:
● Online Analytical Processing (OLAP) workloads
● Big data exploration and processing
● Reporting via Business Intelligence (BI) tools
Comparing storage and database

● In memory: App Engine Memcache. Good for: web/mobile apps, gaming. Such as: game state, user sessions.
● Relational: Cloud SQL. Good for: web frameworks. Such as: CMS, eCommerce.
● Relational: Cloud Spanner. Good for: RDBMS+scale, HA, HTAP. Such as: user metadata, Ad/Fin/MarTech.
● Non-relational: Cloud Firestore. Good for: hierarchical, mobile, web. Such as: user profiles, game state.
● Non-relational: Cloud Bigtable. Good for: heavy read + write, events. Such as: AdTech, financial, IoT.
● Object: Cloud Storage. Good for: binary or object data. Such as: images, media serving, backups.
● Warehouse: BigQuery. Good for: enterprise data warehouse. Such as: analytics, dashboards.

TIP
Try to read from the bottom up (what’s the most appropriate storage for analytics workloads? What’s good for a global, horizontally scalable RDBMS?)
Comparing storage options: Technical details

● Firestore: type: NoSQL document; transactions: yes; complex queries: yes; capacity: terabytes+; unit size: 1 MB/entity
● Bigtable: type: NoSQL wide column; transactions: single-row; complex queries: no; capacity: petabytes+; unit size: ~10 MB/cell, ~100 MB/row
● Cloud Storage: type: blobstore; transactions: no; complex queries: no; capacity: petabytes+; unit size: 5 TB/object
● Cloud SQL: type: relational, SQL for OLTP; transactions: yes; complex queries: yes; capacity: 10,230 GB; unit size: determined by DB engine
● Cloud Spanner: type: relational, SQL for OLTP; transactions: yes; complex queries: yes; capacity: petabytes; unit size: 10,240 MiB/row
● BigQuery: type: relational, SQL for OLAP; transactions: no; complex queries: yes; capacity: petabytes+; unit size: 10 MB/row
GCP: storage service decision tree
GCP: storage service decision tree (version #2)

[Diagram] Start: Is your data structured?
● No → Do you need mobile SDKs? If yes, Cloud Storage for Firebase; if no, Cloud Storage.
● Yes → Is your workload analytics?
  ○ Yes → Do you need updates or low latency? If yes, Cloud Bigtable (high throughput); if no, BigQuery (data warehouse, tabular data).
  ○ No → Is your data relational?
    ○ Yes → Do you need horizontal scalability? If yes, Cloud Spanner; if no, Cloud SQL.
    ○ No → Do you need mobile SDKs? If yes, Firebase Realtime DB; if no, Firestore (transactions).
Mountkirk Games case study analysis

MountKirk Games: Analytics pipeline
[Diagram]

Proposed Technical Solutions (MountKirk Games)
● Containers
○ GKE with multiple regional clusters and Workload Identity
○ Services exposed via global load balancers
○ (possibly) Connect the clusters with Anthos (which gives additional benefits: control and encryption of the traffic,
centralized management etc).
○ Cluster and workload autoscaling -> either configure GKE autoscalers, or just deploy GKE clusters in AutoPilot mode.
○ Additional Node Pools with preemptible instances.
○ Additional Node Pools with GPUs.
● Cloud Spanner as database for leaderboards.
○ Deployed in multi-region setup to minimize latency from GKE clusters to Spanner.
● CI / CD pipeline for rapid deployment:
○ Cloud Source Repositories to store and work on source code
○ Cloud Build
○ Artifact Registry (previously: Container Registry, focused only on containers) for storing artifacts after they're built
○ Cloud Deploy; Alternatively, 3rd party software (Jenkins, Spinnaker etc)
● Migrate for GKE/Anthos -> migrating VM-based workloads to Kubernetes (GKE).
● Cloud Operations Suite for monitoring / telemetry.
○ GCS buckets / BigQuery to store logs for longer periods of time.
● Advanced analytics: source (GKE game servers or GCS buckets) -> Pub/Sub -> Dataflow -> BigQuery + Data Studio / Looker
● GCP Game Servers: possibly, but the architecture above will handle it just fine as well.
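For the advanced-analytics path (game servers -> Pub/Sub -> Dataflow -> BigQuery), the producer side can be as small as the sketch below; the topic and project names are hypothetical, and the Dataflow and BigQuery stages are assumed to exist separately:

```python
import json
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "game-events")

# A game server publishes one event; a Dataflow pipeline subscribed to the
# topic transforms events and streams them into BigQuery for analysis.
event = {"player": "p1", "score": 1200, "event": "level_complete"}
future = publisher.publish(topic_path, json.dumps(event).encode("utf-8"))
print("published message id:", future.result())
```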
[Mountkirk Games case study] Diagnostic Question #1

For this question, refer to the Mountkirk Games case study. Mountkirk Games wants to migrate from their current analytics and statistics reporting model to one that meets their technical requirements on Google Cloud Platform.

Which two steps should be part of their migration plan? (Choose two.)

A. Evaluate the impact of migrating their current batch ETL code to Cloud Dataflow.
B. Write a schema migration plan to denormalize data for better performance in BigQuery.
C. Draw an architecture diagram that shows how to move from a single MySQL database to a MySQL cluster.
D. Load 10 TB of analytics data from a previous game into a Cloud SQL instance, and run test queries against the full dataset to confirm that they complete successfully.
E. Integrate Cloud Armor to defend against possible SQL injection attacks in analytics files uploaded to Cloud Storage.
[Mountkirk Games case study] Diagnostic Question #2

Mountkirk Games wants to limit the physical location of resources to their operating Google Cloud regions.

What should you do?

A. Configure an organizational policy which constrains where resources can be deployed.
B. Configure IAM conditions to limit what resources can be configured.
C. Configure the quotas for resources in the regions not being used to 0.
D. Configure a custom alert in Cloud Monitoring so you can disable resources as they are created in other regions.
[Mountkirk Games case study] Diagnostic Question #3

Mountkirk Games wants you to design their new testing strategy. How should the test coverage differ from their existing backends on the other platforms?

A. Tests should scale well beyond the prior approaches.
B. Unit tests are no longer required, only end-to-end tests.
C. Tests should be applied after the release is in the production environment.
D. Tests should include directly testing the Google Cloud Platform (GCP) infrastructure.
[optional] Links to useful materials

Optional materials 1
[ READING ]
● Make sure you know the differences between BigQuery and BigTable.
● Be aware how BigQuery table partitioning works.

[ VIDEOS ]
● Cloud Networking 104 (Load Balancers): Cloud OnAir: Networking 104 - Everything You Need to Know About Load
Balancers on GCP
● Querying external data with BigQuery
● BigQuery: What is BigQuery?
● [IMPORTANT TO KNOW] Sharing BigQuery data with others: Protect data with authorized views
● BigTable: What is Cloud Bigtable?
● Data Studio introduction: Data Studio in a minute
● BigTable: What can you do with Bigtable?
● Cloud Spanner [5 min]: What is Cloud Spanner | Cloud Spanner Explained | Cloud Native Relational Database
● Cloud Spanner [2x5min]: How to set up a Cloud Spanner instance & Cloud Spanner: Database deep dive
● Introduction to Firestore: Introduction to Firestore | NoSQL Document Database
● What is Dataprep? (do not confuse with Dataproc, Dataflow and other Data<service>) No code data wrangling with
Dataprep #GCPSketchnote
● Decision tree to migrate Apache Hadoop workloads to Dataproc: Decision tree to migrate Apache Hadoop
workloads to Dataproc #GCPSketchnote

Optional materials 2
● Creating a large Dataproc Cluster with preemptible VMs: Creating a large Dataproc Cluster with preemptible VMs
● What is Cloud Build?: What is Cloud Build? #GCPSketchnote
● Three ways to improve CI/CD in your serverless app
● How to protect secrets with Secret Manager: Level Up - Secret Manager
● What is Cloud Armor?: What is Cloud Armor? #GCPSketchnote

[ PODCASTS ]
● BigQuery Admin Reference Guides
● Firebase (not to be mixed up with Firestore!)
● Cloud Functions
● Cloud BigTable

[ DEEP DIVES ]
● BigQuery and Cloud Spanner deep dive: Under the hood of Google Cloud data technologies: BigQuery and Cloud
Spanner
● (~20 mins) BigQuery lab that will familiarize you with basics and show interesting insights at the same time.
Diagnostic Questions for Exam Guide Section 4: Analyzing and optimizing technical and business processes
PCA Exam Guide Section 4:
Analyzing and optimizing technical and business processes

4.1 Analyzing and defining technical processes

4.2 Analyzing and defining business processes

4.3 Developing procedures to ensure reliability


of solutions in production
4.1 Analyzing and defining technical processes

Considerations include:
● Software development life cycle (SDLC)
● Continuous integration / continuous deployment
● Troubleshooting / root cause analysis best practices
● Testing and validation of software and infrastructure
● Service catalog and provisioning
● Business continuity and disaster recovery
4.1 Diagnostic Question 01 Discussion

You are asked to implement a lift and shift operation for Cymbal Direct’s Social Media Highlighting service. You compose a Terraform configuration file to build all the necessary Google Cloud resources.

What is the next step in the Terraform workflow for this effort?

A. Commit the configuration file to your software repository.
B. Run terraform plan to verify the contents of the Terraform configuration file.
C. Run terraform apply to deploy the resources described in the configuration file.
D. Run terraform init to download the necessary provider modules.
4.1 Diagnostic Question 02 Discussion

You have implemented a manual CI/CD process for the container services required for the next implementation of Cymbal Direct’s Drone Delivery project. You want to automate the process.

What should you do?

A. Implement and reference a source repository in your Cloud Build configuration file.
B. Implement a build trigger that applies your build configuration when a new software update is committed to Cloud Source Repositories.
C. Specify the name of your Container Registry in your Cloud Build configuration.
D. Configure and push a manifest file into an environment repository in Cloud Source Repositories.
4.1 Diagnostic Question 03 Discussion

You have an application implemented on Compute Engine. You want to increase the durability of your application.

What should you do?

A. Implement a scheduled snapshot on your Compute Engine instances.
B. Implement a regional managed instance group.
C. Monitor your application’s usage metrics and implement autoscaling.
D. Perform health checks on your Compute Engine instances.
4.1 Diagnostic Question 04 Discussion

Developers on your team frequently write new versions of the code for one of your applications. You want to automate the build process when updates are pushed to Cloud Source Repositories.

What should you do?

A. Implement a Cloud Build configuration file with build steps.
B. Implement a build trigger that references your repository and branch.
C. Set proper permissions for Cloud Build to access deployment resources.
D. Upload application updates and Cloud Build configuration files to Cloud Source Repositories.
4.1 Diagnostic Question 05 Discussion

Your development team used Cloud Source Repositories, Cloud Build, and Artifact Registry to successfully implement the build portion of an application's CI/CD process. However, the deployment process is erroring out. Initial troubleshooting shows that the runtime environment does not have access to the build images. You need to advise the team on how to resolve the issue.

What could cause this problem?

A. The runtime environment does not have permissions to the Artifact Registry in your current project.
B. The runtime environment does not have permissions to Cloud Source Repositories in your current project.
C. The Artifact Registry might be in a different project.
D. You need to specify the Artifact Registry image by name.
4.1 Diagnostic Question 06 Discussion

You are implementing a disaster recovery plan for the cloud version of your drone solution. Sending videos to the pilots is crucial from an operational perspective.

What design pattern should you choose for this part of your architecture?

A. Hot with a low recovery time objective (RTO)
B. Warm with a high recovery time objective (RTO)
C. Cold with a low recovery time objective (RTO)
D. Hot with a high recovery time objective (RTO)
4.1 Diagnostic Question 07 Discussion

The number of requests received by your application is nearing the maximum specified in your design. You want to limit the number of incoming requests until the system can handle the workload.

What design pattern does this situation describe?

A. Applying a circuit breaker
B. Applying exponential backoff
C. Increasing jitter
D. Applying graceful degradation
4.1 Diagnostic Question 08 Discussion

The pilot subsystem in your Delivery by Drone service is critical to your service. You want to ensure that connections to the pilots can survive a VM outage without affecting connectivity.

What should you do?

A. Configure proper startup scripts for your VMs.
B. Deploy a load balancer to distribute traffic across multiple machines.
C. Create persistent disk snapshots.
D. Implement a managed instance group and load balancer.
4.1 Diagnostic Question 09 Discussion

Cymbal Direct wants to improve its drone pilot interface. You want to collect feedback on proposed changes from the community of pilots before rolling out updates systemwide.

What type of deployment pattern should you implement?

A. You should implement canary testing.
B. You should implement A/B testing.
C. You should implement a blue/green deployment.
D. You should implement an in-place release.
4.1 Analyzing and defining technical processes

Resources to start your journey

Securing the software development lifecycle with Cloud Build and SLSA
CI/CD with Google Cloud
Site Reliability Engineering
DevOps tech: Continuous testing | Google Cloud
Application deployment and testing strategies | Cloud Architecture Center
Chapter 17 - Testing for Reliability
Service Catalog documentation | Google Cloud
What is Disaster Recovery? | Google Cloud
API design guide
4.2 Analyzing and defining business processes

Considerations include:
● Stakeholder management (e.g. influencing and facilitation)
● Change management
● Team assessment / skills readiness
● Decision-making processes
● Customer success management
● Cost optimization / resource optimization (capex / opex)
4.2 Analyzing and defining business processes

Resources to start your journey

What is Digital Transformation?


Cloud Cost Optimization: Principles for Lasting Success
Cost Optimization on Google Cloud for Developers and Operators
Certification solutions for Team Readiness
4.3 Developing procedures to ensure reliability of solutions in production

● Chaos engineering
● Penetration testing
4.3 Diagnostic Question 10 Discussion

You want to establish procedures for testing the resilience of the delivery-by-drone solution.

How would you simulate a scalability issue?

A. Block access to storage assets in one of your zones.
B. Inject a bad health check for one or more of your resources.
C. Load test your application to see how it responds.
D. Block access to all resources in a zone.
4.3 Developing procedures to ensure reliability of solutions in production

Resources to start your journey

Site Reliability Engineering


Site Reliability Engineering (SRE) | Google Cloud
Patterns for scalable and resilient apps | Cloud Architecture Center
How to achieve a resilient IT strategy with Google Cloud
Disaster recovery planning guide | Cloud Architecture Center
Make sure to…
Enjoy the journey as much as the destination!
