SNOWPRO® ADVANCED:
DATA ANALYST
DAA-C01 AND DAA-R01 EXAM
STUDY GUIDE
Last Updated: September 28, 2024
SNOWPRO ADVANCED: DATA ANALYST STUDY GUIDE
OVERVIEW
This is a self-learning study guide that highlights concepts that may be covered on
Snowflake’s SnowPro Advanced: Data Analyst Certification exam.
This study guide does not guarantee certification success.
Holding the SnowPro Core certification in good standing is a prerequisite for taking the
Advanced: Data Analyst certification.
For an overview and more information on the SnowPro Core Certification exam or SnowPro
Advanced Certification series, please navigate here.
RECOMMENDATIONS FOR USING THE GUIDE
This guide will show the Snowflake topics and subtopics covered on the exam. Following
the topics will be additional resources consisting of videos, documentation, blogs, or
exercises to help you understand the Data Analyst role on the Snowflake Data Cloud.
Estimated length of study guide: 10 – 13 hours
Some links may have more value than others depending on your experience, so the
same amount of time does not need to be spent on each link. Some links may appear in
more than one domain.
TABLE OF CONTENTS
SNOWPRO ADVANCED: DATA ANALYST STUDY GUIDE OVERVIEW
RECOMMENDATIONS FOR USING THE GUIDE
SNOWPRO ADVANCED: DATA ANALYST CERTIFICATION OVERVIEW
SNOWPRO ADVANCED: DATA ANALYST PREREQUISITE
SNOWPRO ADVANCED: DATA ANALYST SUBJECT AREA BREAKDOWN
SNOWPRO ADVANCED: DATA ANALYST DOMAINS & OBJECTIVES
Domain 1.0: Data Ingestion and Data Preparation
Domain 1.0: Data Ingestion and Data Preparation Study Resources
Domain 2.0: Data Transformation and Data Modeling
Domain 2.0: Data Transformation and Data Modeling Study Resources
Domain 3.0: Data Analysis
Domain 3.0: Data Analysis Study Resources
Domain 4.0: Data Presentation and Data Visualization
Domain 4.0: Data Presentation and Data Visualization Study Resources
SNOWPRO ADVANCED: DATA ANALYST SAMPLE QUESTIONS
SNOWPRO ADVANCED: DATA ANALYST CERTIFICATION
OVERVIEW
The SnowPro Advanced: Data Analyst exam tests advanced knowledge and skills to apply
comprehensive data analysis principles using Snowflake and its components. The exam will
assess skills through scenario-based questions and real-world examples.
This certification will test the ability to:
● Prepare and load data
● Perform simple data transformations for data analysis
● Build and troubleshoot advanced SQL queries in Snowflake
● Use Snowflake built-in functions and create User-Defined Functions (UDFs)
● Perform descriptive and diagnostic data analyses
● Perform data forecasting
● Prepare and present data to meet business requirements
Target Audience:
Candidates with 1+ years of Snowflake Data Cloud analytics experience, including
practical, hands-on use of the Snowflake Data Cloud. In addition, successful
candidates may have:
● Fluency with advanced SQL
Knowledge of an additional computer language is a plus but not a requirement.
This exam is designed for:
● Snowflake Data Analysts
● ELT Developers
● BI Specialists
SNOWPRO ADVANCED: DATA ANALYST PREREQUISITE
Eligible individuals must hold an active SnowPro Core Certified credential. If you feel you
need more guidance on the Snowflake fundamentals, please see the SnowPro Core Exam
Study Guide.
STEPS TO SUCCESS
1. Review the Data Analyst Exam Guide
2. Attend Snowflake’s Instructor-Led Data Analyst Snowflake Training
3. Review and study applicable white papers and documentation
4. Get hands-on practical experience with relevant business requirements using
Snowflake
5. Attend Snowflake Webinars
6. Attend Snowflake Virtual Hands-on Labs for hands-on practical experience
7. Schedule your exam
8. Take your exam!
Additional Snowflake Assets to check out for Advanced: Data Analyst
Snowflake for Dummies Guide Series Books
EXAM FORMAT FOR DATA ANALYST CERTIFICATION
Exam Version: DAA-C01
Total Number of Questions: 65
Question Types: Multiple Select, Multiple Choice
Time Limit: 115 minutes
Languages: English
Passing Score: 750 (scaled scoring from 0 - 1000)
EXAM FORMAT FOR DATA ANALYST RECERTIFICATION
The SnowPro Advanced: Data Analyst Recertification exam offers candidates a route to
maintain their certification status with Snowflake. The Recertification exam shares the same
outline as the regular Data Analyst Certification exam, but is shorter and is offered at a
reduced rate. You must hold the Data Analyst Certification in good standing to take the Data
Analyst Recertification exam.
Exam Version: DAA-R01
Total Number of Questions: 40
Question Types: Multiple Select, Multiple Choice
Time Limit: 85 minutes
Languages: English
Passing Score: 750 (scaled scoring from 0 - 1000)
Prerequisites: SnowPro Advanced: Data Analyst Certified
SNOWPRO ADVANCED: DATA ANALYST SUBJECT AREA
BREAKDOWN
This exam guide includes test domains, weightings, and objectives. It is not a comprehensive
listing of all the content that will be presented on this examination. The table below lists the
main content domains and their weightings.
Domain | Weighting on Exam
1.0 Data Ingestion and Data Preparation | 15-20%
2.0 Data Transformation and Data Modeling | 20-25%
3.0 Data Analysis | 30-35%
4.0 Data Presentation and Data Visualization | 25-30%
SNOWPRO ADVANCED: DATA ANALYST DOMAINS &
OBJECTIVES
Domain 1.0: Data Ingestion and Data Preparation
1.1 Use a collection system to retrieve data.
● Assess how often data needs to be collected
● Identify the volume of data to be collected
● Identify data sources
● Retrieve data from a source
1.2 Perform data discovery to identify what is needed from the available datasets.
● Query tables in Snowflake
● Evaluate which transformations are required
1.3 Enrich data by identifying and accessing relevant data from the Snowflake
Marketplace.
● Find external data sets that correlate with available data
● Use data shares to join data with existing data sets
● Create tables and views
1.4 Outline and use best practice considerations relating to data integrity structures.
● Primary keys for tables
● Perform table joins between parent/child tables
● Constraints
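As a quick illustration of these integrity structures, here is a minimal sketch (table and column names are hypothetical; note that Snowflake records primary and foreign key constraints as metadata but enforces only NOT NULL):

```sql
-- Hypothetical parent/child tables. PRIMARY KEY and FOREIGN KEY are
-- declarative metadata used by modeling and BI tools; only NOT NULL
-- is enforced by Snowflake.
CREATE TABLE region (
    region_id NUMBER PRIMARY KEY,
    name      VARCHAR NOT NULL
);

CREATE TABLE customer (
    customer_id NUMBER PRIMARY KEY,
    region_id   NUMBER REFERENCES region (region_id),
    name        VARCHAR
);

-- Join child rows to the parent on the declared key
SELECT c.name, r.name AS region_name
FROM customer c
JOIN region r ON r.region_id = c.region_id;
```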
1.5 Implement data processing solutions.
● Aggregate and enrich data
● Automate and implement data processing
● Respond to processing failures
● Use logging and monitoring solutions
1.6 Given a scenario, prepare data and load into Snowflake.
● Load files using Snowsight
● Load data from external/internal stages into a Snowflake table
● Load different types of data
● Perform general DML (insert, update, delete)
● Identify and resolve data import errors
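The loading workflow above can be sketched as follows (stage, table, and file names are hypothetical):

```sql
-- Hypothetical target table; @%sales is its table stage
CREATE OR REPLACE TABLE sales (id NUMBER, amount NUMBER, sold_at DATE);

-- PUT runs from a local client such as SnowSQL, not from a worksheet
PUT file:///tmp/sales.csv @%sales;

-- Load the staged file, skipping bad rows instead of aborting
COPY INTO sales
  FROM @%sales
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
  ON_ERROR = 'CONTINUE';

-- Inspect rows rejected by the most recent COPY into this table
SELECT * FROM TABLE(VALIDATE(sales, JOB_ID => '_last'));
```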
1.7 Given a scenario, use Snowflake functions.
● Scalar functions
● Aggregate functions
● Window functions
● Table functions
● System functions
● Geospatial functions
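A single query can exercise several of these function classes at once; a sketch (table and column names are hypothetical):

```sql
-- DATE_TRUNC is a scalar function, SUM is an aggregate, RANK is a
-- window function, and QUALIFY filters on the window result
SELECT
    DATE_TRUNC('month', sold_at) AS sale_month,
    region_id,
    SUM(amount)                  AS monthly_sales,
    RANK() OVER (PARTITION BY DATE_TRUNC('month', sold_at)
                 ORDER BY SUM(amount) DESC) AS region_rank
FROM sales
GROUP BY sale_month, region_id
QUALIFY region_rank <= 3;   -- keep the top 3 regions per month
```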
Domain 1.0: Data Ingestion and Data Preparation Study Resources
Additional Assets
NULL handling in Snowflake (article)
Snowflake Documentation Links
Access History
Account Usage
Bulk Loading Using COPY
COPY INTO <table>
COPY_HISTORY
CREATE TABLE
DATEDIFF
Data Consumers
INFER_SCHEMA
Introduction to External Tables
Introduction to Unstructured Data Support
Lateral Join
Loading Data into Snowflake
Loading Using the Web Interface (Limited)
Modifying Constraints
Object Dependencies
Overview of Constraints
Overview of Data Loading
PARSE_JSON
PIVOT
Preparing Your Data Files
PUT
Querying Data Using Worksheets
Querying Metadata for Staged Files
QUALIFY
SAMPLE / TABLESAMPLE
Semi-structured Data Types
TOP <n>
TO_TIMESTAMP / TO_TIMESTAMP_*
Transforming Data During a Load
Working with Joins
Working with Subqueries
Domain 2.0: Data Transformation and Data Modeling
2.1 Prepare different data types into a consumable format.
● CSV
● JSON (query and parse)
● Parquet
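For staged Parquet files, INFER_SCHEMA can derive the column definitions automatically; a sketch (stage and file format names are hypothetical):

```sql
CREATE FILE FORMAT parquet_ff TYPE = PARQUET;

-- Inspect the detected column names and types
SELECT *
FROM TABLE(INFER_SCHEMA(
    LOCATION    => '@raw_stage/sales/',
    FILE_FORMAT => 'parquet_ff'));

-- Create a table whose columns match the detected schema
CREATE TABLE sales_parquet
  USING TEMPLATE (
    SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*))
    FROM TABLE(INFER_SCHEMA(
        LOCATION    => '@raw_stage/sales/',
        FILE_FORMAT => 'parquet_ff')));
```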
2.2 Given a dataset, clean the data.
● Identify and analyze data anomalies
● Handle erroneous data
● Validate data types
● Use clones as required by specific use-cases
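A common cleaning pattern combines a zero-copy clone as a sandbox with TRY_CAST for type validation (object names are hypothetical):

```sql
-- The clone is zero-copy: experiments here do not touch the source table
CREATE TABLE sales_clean CLONE sales;

UPDATE sales_clean
SET amount = NULL
WHERE amount < 0;           -- treat negative amounts as erroneous

-- TRY_CAST returns NULL instead of failing on bad values,
-- which makes invalid rows easy to isolate
SELECT id, raw_amount
FROM staging_sales
WHERE TRY_CAST(raw_amount AS NUMBER) IS NULL;
```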
2.3 Given a dataset or scenario, work with and query the data.
● Aggregate and validate the data
● Apply analytic functions
● Perform pre-math calculations (for example, randomization, ranking, grouping,
min/max)
● Perform classifications
● Perform casting - change data types to ensure data can be presented consistently
● Enrich the data
● Leverage partition pruning
● Use Time Travel and cloning features
● Use built-in functions for traversing, flattening, and nesting semi-structured data
● Use native data types
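As one illustration of traversing and flattening semi-structured data (the car_sales table and src VARIANT column are hypothetical):

```sql
-- Path traversal into a VARIANT column, plus LATERAL FLATTEN to turn
-- array elements into rows, with casting for consistent presentation
SELECT
    src:vehicle[0].make::STRING AS first_make,  -- direct path + cast
    f.value:model::STRING       AS model        -- one row per array element
FROM car_sales,
     LATERAL FLATTEN(input => src:vehicle) f;
```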
2.4 Use data modeling to manipulate the data to meet BI requirements.
● Select and implement an effective data model
● Identify when to use a data model and when to use a flattened data set
● Use different modeling techniques for the consumption layer (for example,
dimensional, Data Vault)
2.5 Optimize query performance.
● Understand the attributes of the Query Profile
● Understand how to view and analyze the query execution plan
● Troubleshoot query performance
● Leverage result, metadata, and virtual warehouse caching
● Use of different types of database objects, such as materialized views
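A sketch of two of these levers (object names are hypothetical; materialized views are an Enterprise Edition feature):

```sql
-- Precompute an expensive aggregation; Snowflake keeps the
-- materialized view in sync with the base table
CREATE MATERIALIZED VIEW monthly_sales_mv AS
SELECT DATE_TRUNC('month', sold_at) AS sale_month,
       SUM(amount)                  AS total
FROM sales
GROUP BY sale_month;

-- EXPLAIN shows the execution plan (e.g., partitions scanned)
-- without running the query
EXPLAIN
SELECT * FROM monthly_sales_mv
WHERE sale_month >= '2024-01-01';
```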
Domain 2.0: Data Transformation and Data Modeling Study Resources
Additional Assets
NULL handling in Snowflake (article)
Performance impact from local and remote disk spilling (article)
How to Analyze JSON with SQL | Snowflake (PDF)
Snowflake Documentation Links
Analyzing Queries Using Query Profile
COUNT
Data Type Conversion
DATEDIFF
FLATTEN
LAG
Numeric Data Types
OBJECT_AGG
Querying Semi-structured Data
REGEXP_LIKE
SAMPLE / TABLESAMPLE
SPLIT_TO_TABLE
TRIM
Understanding & Using Time Travel
Using Persisted Query Results
Warehouse Considerations
Window Functions
Working with Secure Views
Working with Temporary and Transient Tables
Domain 3.0: Data Analysis
3.1 Use SQL extensibility features.
● User-Defined Functions (UDFs)
● Stored procedures
● Regular, secure, and materialized views
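A minimal sketch combining two of these extensibility features (function, view, and table names are hypothetical):

```sql
-- A SQL UDF encapsulating a reusable calculation
CREATE OR REPLACE FUNCTION net_amount(amount NUMBER, tax_rate NUMBER)
RETURNS NUMBER
AS
$$
    amount * (1 - tax_rate)
$$;

-- A secure view hides its definition and underlying data details
-- from consumers
CREATE OR REPLACE SECURE VIEW sales_net AS
SELECT id, net_amount(amount, 0.08) AS net
FROM sales;
```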
3.2 Perform a descriptive analysis.
● Summarize large data sets using Snowsight dashboards
● Perform exploratory ad-hoc analyses
3.3 Perform a diagnostic analysis.
● Find reasons/causes of anomalies or patterns in historical data
● Collect related data
● Identify demographics and relationships
● Analyze statistics and trends
3.4 Perform forecasting.
● Use statistics and built-in functions
● Make predictions based on data
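One lightweight forecasting approach uses the built-in linear regression aggregates (table and column names are hypothetical); REGR_SLOPE and REGR_INTERCEPT fit y = slope * x + intercept over the observed data:

```sql
WITH daily AS (
    -- Number each day relative to an arbitrary start date
    SELECT DATEDIFF('day', '2024-01-01', sold_at) AS day_num,
           SUM(amount)                            AS total
    FROM sales
    GROUP BY day_num
)
SELECT
    REGR_SLOPE(total, day_num)     AS slope,
    REGR_INTERCEPT(total, day_num) AS intercept,
    -- Project the fitted line out to day 60
    REGR_SLOPE(total, day_num) * 60
      + REGR_INTERCEPT(total, day_num) AS projected_day_60
FROM daily;
```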
Domain 3.0: Data Analysis Study Resources
Snowflake Documentation Links
APPROX_COUNT_DISTINCT
AVG
Constraints
DENSE_RANK
Estimating the Number of Distinct Values
EXPLAIN
FLATTEN
Handling Exceptions
HLL
NTILE
Overview of Stored Procedures
Querying Data Using Worksheets
RANK
REGR_INTERCEPT
REGR_SLOPE
ROW_NUMBER
SHA2 , SHA2_HEX
STDDEV_SAMP
Visualizing Worksheet Data
Working with Materialized Views
Domain 4.0: Data Presentation and Data Visualization
4.1 Given a use case, create reports and dashboards to meet business requirements.
● Evaluate and select the data for building dashboards
● Understand the effects of row access policies and Dynamic Data Masking
● Compare and contrast different chart types (for example, bar charts, scatter plots,
heat grids, scorecards)
● Understand what is required to connect BI tools to Snowflake
● Create charts and dashboards in Snowsight
4.2 Given a use case, maintain reports and dashboards to meet business requirements.
● Build automated and repeatable tasks
● Operationalize data
● Store and update data
● Manage and share Snowsight dashboards
● Configure subscriptions and updates
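One way to build the automated, repeatable refresh described above is a scheduled task (warehouse and object names are hypothetical; assumes a monthly_sales reporting table already exists):

```sql
-- Rebuild the reporting table every morning so dashboards stay current
CREATE OR REPLACE TASK refresh_monthly_sales
  WAREHOUSE = analyst_wh
  SCHEDULE  = 'USING CRON 0 6 * * * UTC'   -- daily at 06:00 UTC
AS
  INSERT OVERWRITE INTO monthly_sales
  SELECT DATE_TRUNC('month', sold_at) AS sale_month,
         SUM(amount)                  AS total
  FROM sales
  GROUP BY sale_month;

-- Tasks are created suspended; resume to start the schedule
ALTER TASK refresh_monthly_sales RESUME;
```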
4.3 Given a use case, incorporate visualizations for dashboards and reports.
● Present data for business use analyses
● Identify patterns and trends
● Identify correlations among variables
● Customize data presentations using filtering and editing techniques
Domain 4.0: Data Presentation and Data Visualization Study Resources
Snowflake Documentation Links
Account Usage
CORR
GENERATOR
Querying Data Using Worksheets
ROW_NUMBER
SEQ1 / SEQ2 / SEQ4 / SEQ8
Using Row Access Policies
Visualizing Data With Dashboards
Visualizing Worksheet Data
Ready to register for an exam? Navigate here to get started.
SNOWPRO ADVANCED: DATA ANALYST SAMPLE QUESTIONS
1. A retail company needs to run a marketing campaign targeting customers in specific
sales regions. A Data Analyst needs to support this campaign using the necessary
data in Snowflake.
How can the Analyst meet this requirement?
A. Use Snowsight to load the region table (region_id, name, comment).
B. Use Snowsight to load the customer table (id, name, address, region_id,
phone_number).
C. Use the COPY command to load the region table (region_id, name, comment)
D. Use the COPY command to load the customer table (id, name, address,
region_id, phone_number)
2. The following JSON object is stored in a VARIANT column called src in a table
called car_sales:
{"vehicle": [
  {"make": "Honda", "model": "Civic", "year": "2019", "price": "20275",
   "extras": ["ext warranty", "paint protection"]},
  {"make": "Toyota", "model": "Camry", "year": "2021", "price": "28375",
   "extras": ["ext warranty", "paint protection", "rust proofing"]}
]}
Which query would return the following result?
Make Model Extras
Honda Civic ext warranty
Honda Civic paint protection
Toyota Camry ext warranty
Toyota Camry paint protection
Toyota Camry rust proofing
A. SELECT
src:vehicle.make::string AS make,
src:vehicle.model::string AS model,
src:vehicle.extras::string AS extras
FROM car_sales
ORDER BY make, model, extras;
B. SELECT
vm.value:make::string AS make,
vm.value:model::string AS model,
ve.value::string AS extras
FROM car_sales
,lateral flatten(input => src:vehicle) AS vm
,lateral flatten(input => vm.value:extras) AS ve
ORDER BY make, model, extras;
C. SELECT
vm.value:make::string AS make,
vm.value:model::string AS model,
vm.value:extras::string AS extras
FROM car_sales
,lateral flatten(input => src:vehicle) AS vm
ORDER BY make, model, extras;
D. SELECT
src:vehicle.make::string AS make,
src:vehicle.model::string AS model,
vm.value::string AS extras
FROM car_sales
,lateral flatten(input => src:vehicle.extras) AS vm
ORDER BY make, model, extras;
3. A Data Analyst created a schema named PUBLIC. This schema contains two
permanent tables as shown below:
CREATE TABLE TABLE1 (NAME VARCHAR)
DATA_RETENTION_TIME_IN_DAYS = 10;
CREATE TABLE TABLE2 (NAME VARCHAR);
The following command is run:
ALTER SCHEMA PUBLIC SET DATA_RETENTION_TIME_IN_DAYS = 15;
What will be the result of running the command?
A. The attempt to set the data retention limit at the schema level will cause the
statement to fail.
B. The retention time on TABLE1 does not change. The retention time on
TABLE2 will be set to 15 days.
C. The retention time on both tables will be set to 15 days.
D. The retention time will be unchanged for both tables.
4. A Data Analyst has a sequence of numeric values that could represent quantities
or amounts. The Analyst tries to label the values using ranking window
functions as shown below:
What are the hidden values (as indicated by the green circles) in the SQL query
result grid? (Select TWO).
A. 2 for the hidden NTILE cell
B. 3 for the hidden RANK cell
C. 4 for the hidden RANK cell
D. 3 for the hidden DENSE_RANK cell
E. 1 for the hidden NTILE cell
5. A Data Analyst has been asked to produce a tile in a dashboard using Snowsight.
The chart should always show orders for the last 30 days excluding partial days
based on the order_date field.
Which notation will meet this requirement?
A. where order_date = :databucket
B. where order_date = :datebucket
C. where order_date = :date_picker
D. where order_date = :daterange
Correct responses for sample questions:
1: D, 2: B, 3: B, 4: C and E, 5: D
The information provided in this study guide is provided for your purposes only and may
not be provided to third parties.
IN ADDITION, THIS STUDY GUIDE IS PROVIDED “AS IS”. NEITHER SNOWFLAKE
NOR ITS SUPPLIERS MAKES ANY OTHER WARRANTIES, EXPRESS OR IMPLIED,
STATUTORY OR OTHERWISE, INCLUDING BUT NOT LIMITED TO WARRANTIES
OF MERCHANTABILITY, TITLE, FITNESS FOR A PARTICULAR PURPOSE OR
NONINFRINGEMENT.