Thanks to visit codestin.com
Credit goes to www.scribd.com

0% found this document useful (0 votes)
14 views5 pages

Windowed Aggergations

Uploaded by

thirupathin
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views5 pages

Windowed Aggergations

Uploaded by

thirupathin
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 5

Windowed Aggregations

An introduction to windowed aggregations.

Windowed aggregations partition the results from a SQL query into groups in order to
perform calculations across adjacent rows of the query result. Currently windowed
aggregations cannot be combined in the same SELECT statement with GROUP BY,
HAVING, or any other aggregations. They are placed before an ORDER BY clause if one is
used. To invoke a WINDOW function you use a special syntax with the OVER clause to
specify the WINDOW. There are three parts in a windowed aggregation:

 PARTITION BY - The partition specification works similarly to the GROUP BY


clause and indicates how the query results are divided into groups. It is always
optional.
 ORDER BY - The ordering specification determines in what order the aggregations
will be applied. It is required for the new functions and optional when using standard
aggregations as windowed functions.
 ROWS BETWEENor RANGE BETWEEN - The window frame designates a subset
of consecutive rows adjacent to the current row in the window that will be evaluated
as a group. The difference between ROWS BETWEEN and RANGE BETWEEN is
that ROWS BETWEEN specifies the distance in number of rows and RANGE
BETWEEN specifies the distance in the values. Options for both are:
o N PRECEDING AND M FOLLOWING - where N and M are positive
integers.
o N PRECEDING AND M PRECEDING - legal but uncommon
o N FOLLOWING AND M FOLLOWING - legal but uncommon
o CURRENT - can replace either a PRECEDING value or a FOLLOWING
value as in ROWS BETWEEN CURRENT AND 7 FOLLOWING or ROWS
BETWEEN 7 PRECEDING AND CURRENT
o UNBOUNDED- can replace either a PRECEDING value or a FOLLOWING
value as in ROWS BETWEEN 7 PRECEDING AND UNBOUNDED or
ROWS BETWEEN UNBOUNDED and CURRENT

If not specified, the default range is the beginning of the partition through the current row. It
is always optional.

Per the SQL standard, the newly supported windowed functions do NOT work with
WINDOW frames. WINDOW frames are only for other aggregate functions.

Windowed functions also take WINDOW clauses which are similar to WITH clauses
elsewhere in SQL. They allow you to define a WINDOW in the clause and then finish it’s
definition in the aggregation, but you can’t actually override anything.That is to say, you can
specify one or two of the three parts of the aggregation (PARTITION BY, ORDER BY, or
ROWS BETWEEN) but they can’t also be specified in the aggregation. WINDOW clauses
can also be nested. In other words you can define one, call it from another, and then use the
second one in the aggregation. Though there are many ways you can use WINDOW clauses,
the most common one is to fully define a window within the clause and then use it multiple
times in different aggregations.

Following are some examples of common uses of windowed functions:

39
AVG OVER

If we wanted to know the average order value for a customer ordered from oldest to newest
and grouped into threes (an average of the current order, the preceding order, and the
following order) the query would look like this:

SELECT account, create_date, AVG(order_value) OVER (PARTITION BY account


ORDER BY create_date ROWS BETWEEN 1 preceding and 1 following)
FROM jun_2017_orders

And a subset of the results would look like this:

account create_date avg


Betasoloin 6/3/17 3993.5
Betasoloin 6/4/17 4159.666667
Betasoloin 6/6/17 4187
Betasoloin 6/9/17 3570
Betasoloin 6/24/17 3109
Gekko & Co 6/2/17 3189.5
Gekko & Co 6/11/17 2480.333333
Gekko & Co 6/12/17 878
Gekko & Co 6/14/17 542

ROW_NUMBER

This is a useful function for knowing what row you are on within your query results:

SELECT name, regional_office, ROW_NUMBER() OVER ( PARTITION BY


regional_office ORDER BY name)
FROM employees

Results show the row numbers based on the order by constraint:

name regional_office row_number


Carl Lin West 1
Carol Thompson West 2
Elease Gluck West 3
Hayden Neloms West 4
James Ascencio West 5
Kami Bicknell West 6
Kary Hendrixson West 7
Markita Hansen West 8
Maureen Marcano West 9

RANK

The are a few things to note about rank on a range of values in a window partition:

40
 If ranked values are the same they get the same rank.
 A row that doesn’t have the same value as the preceding row has a rank equal to its
row number.
 Because of the way it handles ties, RANK can have gaps in the sequence.

Here is an example of a query that returns the home run ranking of baseball players in 2000
from the Lahman Sabermetrics dataset:

SELECT playerid, HR, RANK() OVER ( ORDER BY HR DESC)


FROM batting
WHERE yearId = 2000
ORDER BY HR DESC, playerid

And here are the results:

playerid HR rank
sosasa01 50 1
bondsba01 49 2
bagweje01 47 3
glaustr01 47 3
guerrvl01 44 5
hidalri01 44 5
giambja01 43 7
sheffga01 43 7
thomafr04 43 7
edmonji01 42 10
heltoto01 42 10
batisto01 41 12
delgaca01 41 12
rodrial01 41 12
griffke02 40 15

FIRST_VALUE

The FIRST_VALUE function returns the first value for each partition, as in this query which
returns the first order value for each account in the jun_2017_ orders table:

SELECT account, create_date, order_value, FIRST_VALUE(order_value) OVER


(PARTITION BY account ORDER BY create_date)
FROM jun_2017_orders
account create_date order_value first_value
Betasoloin 6/3/17 5100 5100
Betasoloin 6/4/17 2887 5100
Betasoloin 6/9/17 5182 5100
Betasoloin 6/24/17 1036 5100
Gekko & Co 6/2/17 5304 5304
Gekko & Co 6/11/17 1075 5304

41
account create_date order_value first_value
Gekko & Co 6/12/17 1062 5304
Gekko & Co 6/14/17 497 5304
Gekko & Co 6/19/17 67 5304
Gekko & Co 6/25/17 5031 5304
Gekko & Co 6/26/17 5622 5304
Plexzap 6/11/17 3850 3850

LEAD

If you want to know something like both the current order value and the next order value for
a customer you could use the LEAD function. LEAD takes three arguments:

 The name of the column to evaluate


 The number of rows to offset
 The value to put if there are no more rows that meet the partition requirement

Here is an example of a query that aggregates the results by account and either returns the
next row’s value for the company or returns -1 if it’s the last order for the account:

SELECT account, create_date, order_value, LEAD(order_value, 1, -1) OVER (PARTITION


BY account ORDER BY create_date)
FROM jun_2017_orders
account create_date order_value lead
Betasoloin 6/3/17 5100 2887
Betasoloin 6/4/17 2887 4492
Betasoloin 6/6/17 4492 5182
Betasoloin 6/9/17 5182 1036
Betasoloin 6/24/17 1036 -1
Gekko & Co 6/2/17 5304 1075
Gekko & Co 6/11/17 1075 1062
Gekko & Co 6/12/17 1062 497

LAG

The LAG function works like the LEAD function except that it returns a value from any
preceding row:

SELECT account, create_date, order_value, LAG(order_value, 1, -1) OVER (PARTITION


BY account ORDER BY create_date)
FROM jun_2017_orders
count create_date order_value lag
Betatech 2017-06-02 3666 -1
Betatech 2017-06-05 460 3666
Betatech 2017-06-06 41 460
Betatech 2017-06-11 1067 41
Betatech 2017-06-12 5370 1067

42
count create_date order_value lag
Betatech 2017-06-17 3822 5370
Betatech 2017-06-24 4410 3822
Betatech 2017-06-25 619 4410
Ganjaflex 2017-06-02 3780 -1

Here is a list of all the new windowed functions with links to their reference pages:

 CUME_DIST
 DENSE_RANK
 FIRST_VALUE
 LAST_VALUE
 LEAD
 LAG
 NTH_VALUE
 NTILE
 PERCENT_RANK
 RANK
 ROW_NUMBER

43

You might also like