Week 2
Intermediate SQL I: Aggregates
SQL for Business Users
Module Description
Data aggregation is the next step towards being able to
do complex queries for deeper data analysis. This module
will present how SQL aggregations are done, particularly
in uncovering basic descriptive statistics about the data.
Module Objective
At the end of this week, learners will be able to
(1) Understand basic concepts of data aggregation
(2) Understand and apply basic SQL aggregation functions such
as SUM, MIN/MAX, AVG, etc.
(3) Describe data comfortably using SQL aggregations
Practice Dataset
For this part of the
course, we’ll be working
with Apple (AAPL) stock
price data, taken from
Google Finance.
Aggregate functions in SQL
• Overview: SQL aggregate functions do for a table what pivot
tables do in Excel
• COUNT - counts how many rows are in a particular column
• SUM - adds together all the values in a particular column
• MIN/MAX - return the lowest and the highest values in a
particular column, respectively
• AVG - calculates the average of a group of selected values
COUNT
• Counting all rows
• COUNT(*) counts the number of rows in a table
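A minimal sketch of the row count described above. The table name is the one given on the quiz slide at the end of this module; adjust it if your copy of the dataset lives elsewhere.

```sql
-- Count every row in the table, regardless of nulls in any column
SELECT COUNT(*)
  FROM tutorial.aapl_historical_stock_price;
```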
The result is 3555.
COUNT
• Counting individual columns
• COUNT(high) counts the number of non-null rows in the
column high
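One way to write this column count, assuming the table is tutorial.aapl_historical_stock_price (the name given on the quiz slide) and that it has a column named high:

```sql
-- Count only the rows where high is not null
SELECT COUNT(high)
  FROM tutorial.aapl_historical_stock_price;
```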
What is the result?
How does it compare to the previous example?
COUNT
• Counting non-numerical columns
• You can use COUNT even on non-numerical (in this case, date)
columns
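A sketch of the same count applied to a date column, assuming the table is tutorial.aapl_historical_stock_price and that its date column is named date:

```sql
-- COUNT works on any data type; it only checks whether a value is null
SELECT COUNT(date)
  FROM tutorial.aapl_historical_stock_price;
```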
What is the result?
COUNT DISTINCT
• Counting distinct values in a column
• You can use COUNT DISTINCT to count only the unique values
in a column
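As an illustration, the sketch below counts unique values in a column other than date, so the date version is left for you to try. It assumes the table is tutorial.aapl_historical_stock_price and that it has a column named month.

```sql
-- Each distinct month value is counted once, however many rows share it
SELECT COUNT(DISTINCT month)
  FROM tutorial.aapl_historical_stock_price;
```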
Count the number of distinct dates in the table.
How many do you get? Is it equal to the value from
our last slide? Why?
SUM
• SUM totals the values in a given column
• You can only use SUM on numerical columns
• Reminder: aggregate functions only aggregate vertically (down a
column), not across a row.
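A sketch of a column total, assuming the table is tutorial.aapl_historical_stock_price and that it has a numerical column named volume:

```sql
-- Add up volume across every row in the table
SELECT SUM(volume)
  FROM tutorial.aapl_historical_stock_price;
```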
The answer should be a huge number
MIN/MAX
• MIN and MAX return the lowest and highest values of a column
• You can use MIN and MAX even on non-numerical columns
• MIN returns the lowest number, the earliest date, or the
non-numeric value closest alphabetically to the letter “A”;
MAX does the opposite
MIN/MAX
• MIN and MAX return the lowest and highest values of a column
• As an exercise, try replacing “volume” with “date” to find the
minimum and maximum dates in our table
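The exercise starts from a query on volume. A sketch of what that query might look like, assuming the table is tutorial.aapl_historical_stock_price; the aliases min_volume and max_volume are illustrative names, not part of the original slide:

```sql
-- Lowest and highest daily volume in the table
SELECT MIN(volume) AS min_volume,
       MAX(volume) AS max_volume
  FROM tutorial.aapl_historical_stock_price;
```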
AVG
• AVG calculates the average of a selected column
• You can only use AVG on numeric columns
• It also ignores nulls/empty values
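A sketch of an average, assuming the table is tutorial.aapl_historical_stock_price and that it has a numeric column named high:

```sql
-- Average of high; rows where high is null are left out of
-- both the sum and the row count used for the average
SELECT AVG(high)
  FROM tutorial.aapl_historical_stock_price;
```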
What is the result?
GROUP BY
• GROUP BY allows aggregation on a part of a table, like
counting the number of rows (trading days) per year
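A sketch of a per-year count, assuming the table is tutorial.aapl_historical_stock_price and that it has a column named year. The alias row_count is an illustrative name.

```sql
-- One output row per year, with the number of table rows in that year
SELECT year,
       COUNT(*) AS row_count
  FROM tutorial.aapl_historical_stock_price
 GROUP BY year;
```

Every column in the SELECT list that is not inside an aggregate function has to appear in GROUP BY.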
GROUP BY
• GROUP BY allows aggregation on multiple columns (separated
by commas)
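A sketch of grouping on two columns, assuming the table is tutorial.aapl_historical_stock_price and that it has columns named year and month. The alias row_count is an illustrative name.

```sql
-- One output row per (year, month) combination
SELECT year,
       month,
       COUNT(*) AS row_count
  FROM tutorial.aapl_historical_stock_price
 GROUP BY year, month;
```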
GROUP BY
• GROUP BY also accepts numbers in place of column names; each
number refers to a column's position in the SELECT list
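A sketch of the positional form, assuming the table is tutorial.aapl_historical_stock_price with columns year and month. Positional GROUP BY is supported by PostgreSQL and several other databases, but not by every SQL dialect.

```sql
-- 1 refers to year and 2 refers to month,
-- the first and second items in the SELECT list
SELECT year,
       month,
       COUNT(*) AS row_count
  FROM tutorial.aapl_historical_stock_price
 GROUP BY 1, 2;
```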
GROUP BY
• ORDER BY following a GROUP BY clause controls the order in which
the grouped results are returned (ascending or descending)
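A sketch of sorting grouped results, assuming the table is tutorial.aapl_historical_stock_price with columns year and month. The sort directions are chosen only for illustration.

```sql
-- Most recent year first; within each year, months in ascending order
SELECT year,
       month,
       COUNT(*) AS row_count
  FROM tutorial.aapl_historical_stock_price
 GROUP BY year, month
 ORDER BY year DESC, month ASC;
```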
GROUP BY
• NOTE: GROUP BY is evaluated BEFORE the LIMIT clause
• This means that the table is grouped on the specified columns
first, and only then is the result cut down to the number of
rows specified by LIMIT
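A sketch showing the effect of this ordering, assuming the table is tutorial.aapl_historical_stock_price with a column named year. The limit of 5 is an arbitrary choice.

```sql
-- The counts are computed over the whole table first;
-- LIMIT then keeps 5 of the grouped rows, not 5 of the raw rows
SELECT year,
       COUNT(*) AS row_count
  FROM tutorial.aapl_historical_stock_price
 GROUP BY year
 ORDER BY year
 LIMIT 5;
```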
HAVING
• HAVING filters on aggregated values after GROUP BY is executed
• Example: filtering for months where AAPL stock reached a high
price of over $400/share
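A sketch of such a filter, assuming the table is tutorial.aapl_historical_stock_price with columns year, month, and high. The alias month_high is an illustrative name.

```sql
-- Keep only the (year, month) groups whose highest price exceeded 400
SELECT year,
       month,
       MAX(high) AS month_high
  FROM tutorial.aapl_historical_stock_price
 GROUP BY year, month
HAVING MAX(high) > 400
 ORDER BY year, month;
```

WHERE cannot do this job because it filters individual rows before they are grouped, so it has no access to MAX(high).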
QUIZ: Use table tutorial.aapl_historical_stock_price
• Count the number of unique dates.
• What is the maximum close value?
• What is the total volume in month 3?
• What is the average high price?
• Write a query that finds the maximum close price per month,
ordered by maximum close price, descending (no aliases).
• Write a query that finds the total volume per month for months
whose maximum close price is greater than $500, ordered by month,
ascending (alias the total volume as “total_volume”).
Week 3
Intermediate SQL II: Joins
SQL for Business Users