Window Functions

Uploaded by

vlearning365

Data Engineering

Data Transformation

WINDOW FUNCTIONS

DHANESH SARPALE
BASIC EXAMPLE

In the above example, the average salary is calculated by aggregating salaries by the employees' job titles.
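
A minimal pandas sketch of this kind of calculation, assuming a small made-up employee table. The column names (name, job_title, salary) and all values are illustrative and are not taken from the original example.

```python
import pandas as pd

# Hypothetical employee data; names, titles, and salaries are made up.
employees = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen", "Dana"],
    "job_title": ["Analyst", "Analyst", "Engineer", "Engineer"],
    "salary": [50000, 60000, 80000, 100000],
})

# Comparable in spirit to: AVG(salary) OVER (PARTITION BY job_title)
# transform("mean") returns one value per input row, so no rows are collapsed.
employees["avg_salary_by_title"] = (
    employees.groupby("job_title")["salary"].transform("mean")
)

print(employees)
```

Unlike a plain GROUP BY, every employee row is kept and simply gains the average for its job title.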

GENERIC SYNTAX
MySQL

SELECT <column_list>,
<aggregate_function>(<column_expression>) OVER
(
PARTITION BY <partition_expression>
ORDER BY <order_expression>
ROWS <window_frame>
) AS <alias>
FROM <table_name>;

Python

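import pandas as pd

# Sort first when the window depends on row order (the ORDER BY part)
df = df.sort_values('<order_expression>')

# Group by the partition column, then broadcast the aggregate back to every row
df['<alias>'] = (
    df.groupby('<partition_expression>')['<column_expression>']
      .transform('<aggregate_function>')
)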

LIST OF SQL WINDOW FUNCTIONS

DATA ENGINEERING COMMON OPERATIONS WITH WINDOW FUNCTIONS
Window functions support aggregating, transforming, and analyzing data within precise partitions or windows:

1. Data Aggregation
2. Data Cleansing
3. Data Enrichment
4. Data Partitioning
5. Data Ordering

1. DATA AGGREGATION

To perform aggregations over subsets of data within a given window.
To calculate aggregated values such as cumulative sums, averages, counts, or percentages.
To perform these aggregations efficiently and flexibly, so that data can be aggregated at different levels of granularity.
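
The cumulative sums, counts, and percentages mentioned above can be sketched in pandas as follows. This is only an illustration on made-up data; the DataFrame name sales and the columns category, sale_date, and sales are assumptions.

```python
import pandas as pd

# Hypothetical sales rows, invented for illustration.
sales = pd.DataFrame({
    "category": ["A", "A", "A", "B", "B"],
    "sale_date": pd.to_datetime(
        ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-01", "2024-01-02"]
    ),
    "sales": [10, 30, 60, 20, 20],
})

# Order rows inside each category so the running total is well defined.
sales = sales.sort_values(["category", "sale_date"])
by_category = sales.groupby("category")["sales"]

# Roughly: SUM(sales) OVER (PARTITION BY category ORDER BY sale_date)
sales["category_running_total"] = by_category.cumsum()

# Roughly: COUNT(*) OVER (PARTITION BY category)
sales["category_row_count"] = by_category.transform("count")

# Roughly: 100 * sales / SUM(sales) OVER (PARTITION BY category)
sales["pct_of_category"] = 100 * sales["sales"] / by_category.transform("sum")

print(sales)
```

Each derived column sits next to the original rows, so the same frame carries row-level and category-level granularity at once.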

1. DATA AGGREGATION
MySQL

-- No GROUP BY: window functions keep one output row per input row
SELECT product_id, category, sales,
       SUM(sales) OVER (PARTITION BY category) AS category_total_sales,
       AVG(sales) OVER (PARTITION BY category) AS category_avg_sales,
       SUM(sales) OVER () AS overall_total_sales,
       AVG(sales) OVER () AS overall_avg_sales
FROM sales_data;

Python
import pandas as pd

# Assume the data is already loaded into a pandas DataFrame called 'df'

# Sum and average of sales per category, plus the overall sum and average,
# each repeated on every row
df['category_total_sales'] = df.groupby('category')['sales'].transform('sum')
df['category_avg_sales'] = df.groupby('category')['sales'].transform('mean')
df['overall_total_sales'] = df['sales'].sum()
df['overall_avg_sales'] = df['sales'].mean()

# Display the DataFrame
print(df)

2. DATA CLEANSING

To assist in data cleansing tasks by identifying and handling duplicates, missing values, or outliers within specific windows.
To rank rows based on certain criteria and identify duplicate records.
To calculate statistical measures within windows and identify outliers that need to be handled or removed during the ETL process.
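
As a rough sketch of the duplicate and outlier handling described above, the pandas snippet below numbers rows within each key to keep only the latest copy, then flags outliers per group. The table, its column names, and the 1.5 x IQR rule are assumptions chosen for illustration, not something the slides prescribe.

```python
import pandas as pd

# Hypothetical orders; order 1 was loaded twice with different timestamps.
orders = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 4, 5, 6, 7],
    "region": ["east"] * 6 + ["west"] * 2,
    "updated_at": pd.to_datetime(
        ["2024-01-01", "2024-01-02"] + ["2024-01-01"] * 6
    ),
    "amount": [10.0, 11.0, 12.0, 13.0, 10.0, 100.0, 50.0, 52.0],
})

# Roughly: ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY updated_at DESC)
newest_first = orders.sort_values("updated_at", ascending=False)
orders["row_num"] = newest_first.groupby("order_id").cumcount() + 1

# Keep only the most recent version of each order.
deduped = orders[orders["row_num"] == 1].copy()

# Per-region quartiles, broadcast back to every row of the region.
by_region = deduped.groupby("region")["amount"]
q1 = by_region.transform(lambda s: s.quantile(0.25))
q3 = by_region.transform(lambda s: s.quantile(0.75))
iqr = q3 - q1

# Assumed rule: flag values more than 1.5 * IQR outside the quartiles.
deduped["is_outlier"] = (deduped["amount"] < q1 - 1.5 * iqr) | (
    deduped["amount"] > q3 + 1.5 * iqr
)

print(deduped)
```

Flagging rather than deleting outliers leaves the decision to a later ETL step, which is usually easier to audit.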

2. DATA CLEANSING
MySQL

-- RANK is a reserved word in MySQL 8.0, so the alias is quoted with backticks
SELECT name, score,
       RANK() OVER (ORDER BY score DESC) AS `rank`
FROM students;

Python

import pandas as pd

# Assume the data is already loaded into a pandas DataFrame called 'df'

# Assign ranks to students based on their exam scores;
# method='min' gives tied scores the same rank, like SQL RANK()
df['rank'] = df['score'].rank(ascending=False, method='min')

# Display the DataFrame
print(df)

3. DATA ENRICHMENT

Window functions can enrich data by computing values from a subset of related records within a window.
This makes it possible to derive new information or generate additional features for your dataset.
For instance, moving averages, running totals, or cumulative sums calculated within a window provide insight into trends or patterns in the data.

3. DATA ENRICHMENT
MySQL
SELECT product_id, sales,
       AVG(sales) OVER (ORDER BY date_column
                        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_average,
       SUM(sales) OVER (ORDER BY date_column) AS running_total,
       SUM(sales) OVER (ORDER BY date_column) AS cumulative_sum
FROM sales_data;

Python
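import pandas as pd

# Assume the data is already loaded into a pandas DataFrame called 'df'.
# Sort by date so that row order matches ORDER BY date_column.
df = df.sort_values('date_column')

# 3-row moving average of sales (current row and the 2 preceding rows);
# this equals a 3-day average only if there is exactly one row per day
df['moving_average'] = df['sales'].rolling(window=3, min_periods=1).mean()

# Running total (cumulative sum) of sales in date order
df['running_total'] = df['sales'].cumsum()
df['cumulative_sum'] = df['sales'].cumsum()

# Display the DataFrame
print(df)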

4. DATA PARTITIONING

Window functions make it possible to partition data into logical groups based on one or more columns.
This is particularly helpful during the transformation phase of ETL, when calculations or aggregations must be performed separately for different partitions.
For example, data can be partitioned by region, time period, or any other relevant attribute, and window functions applied within each partition to obtain partition-specific results.
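
Partitioning on more than one column can be sketched in pandas as below. The orders table, its columns, and the choice of region plus calendar month as the partition are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical orders, invented for illustration.
orders = pd.DataFrame({
    "region": ["north", "north", "north", "south", "south"],
    "order_date": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-03", "2024-01-10", "2024-02-15"]
    ),
    "amount": [100.0, 300.0, 50.0, 80.0, 120.0],
})

# Derive the time-period part of the partition key (calendar month).
orders["month"] = orders["order_date"].dt.to_period("M").astype(str)

# Roughly: ... OVER (PARTITION BY region, month)
partition = orders.groupby(["region", "month"])["amount"]

# Total per partition, repeated on each row of that partition.
orders["partition_total"] = partition.transform("sum")

# Rank of each order inside its own partition, largest amount first.
orders["rank_in_partition"] = (
    partition.rank(method="min", ascending=False).astype(int)
)

print(orders)
```

Because each partition is evaluated independently, a rank of 1 appears once per region and month rather than once for the whole table.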

5. DATA ORDERING

Window functions can order data within each partition based on specified criteria.
This is useful when calculations or aggregations must be performed in a specific order.
For example, time series data can be ordered by timestamp, and window functions then used to calculate moving averages or detect trends over a specified window size.
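
The effect of ordering can be sketched in pandas as follows: rows are sorted by timestamp within each partition before an order-dependent calculation runs. The readings table, the sensor and ts columns, and the window size of 3 are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sensor readings that arrive out of order.
readings = pd.DataFrame({
    "sensor": ["a", "a", "a", "a", "b", "b"],
    "ts": pd.to_datetime(
        ["2024-01-03", "2024-01-01", "2024-01-02", "2024-01-04",
         "2024-01-02", "2024-01-01"]
    ),
    "value": [30.0, 10.0, 20.0, 40.0, 5.0, 15.0],
})

# Roughly the ORDER BY ts inside OVER (PARTITION BY sensor ORDER BY ts).
readings = readings.sort_values(["sensor", "ts"]).reset_index(drop=True)
by_sensor = readings.groupby("sensor")["value"]

# Moving average over the current and up to 2 preceding rows, per sensor.
readings["moving_avg"] = by_sensor.transform(
    lambda s: s.rolling(window=3, min_periods=1).mean()
)

# Roughly: value - LAG(value) OVER (PARTITION BY sensor ORDER BY ts)
# The first row of each sensor has no predecessor, so its change is NaN.
readings["change"] = readings["value"] - by_sensor.shift(1)

print(readings)
```

Without the sort, both derived columns would follow arrival order instead of time order and would describe a different series.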

Thank you for taking the time to read this document! If you found it valuable, I would greatly appreciate it if you could show your support by liking and sharing it with your network. I am eager to connect with you on LinkedIn. Let's connect and collaborate to foster growth together!

