Understanding SQL ranking functions:
ROW_NUMBER(), RANK(), and DENSE_RANK()
SQL provides several ranking functions that allow you to assign ranks to rows within a result set
based on specific criteria. These are particularly useful in scenarios like leaderboards, top-N queries,
and data analysis where you need to order and categorize your data.
Here's an explanation of three common ranking functions: ROW_NUMBER(), RANK(),
and DENSE_RANK(), with illustrative examples.
1. ROW_NUMBER()
This function assigns a unique sequential number to each row within a partition of the result set,
starting with 1 for the first row in each partition. If no PARTITION BY clause is given, the whole
result set is treated as a single partition. ROW_NUMBER() ignores duplicate values: two rows with
the same value in the ORDER BY column still receive different row numbers, and which of them comes
first is arbitrary. The order of tied rows is not guaranteed to be the same each time you run the
query unless you add further columns to the `ORDER BY` clause to break ties.
SELECT
    product_name,
    sales,
    ROW_NUMBER() OVER (ORDER BY sales DESC) AS RowNum
FROM
    products;
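To make the numbering repeatable when sales values tie, a second sort key can be added. The following is a minimal sketch: the rows in the `products` CTE are made-up sample data, and using `product_name` as the tie-breaker is an assumption chosen only for illustration.

```sql
-- Hypothetical sample data; the two rows with sales = 300 are tied.
WITH products AS (
    SELECT 'Widget' AS product_name, 500 AS sales
    UNION ALL SELECT 'Gadget', 300
    UNION ALL SELECT 'Doohickey', 300
    UNION ALL SELECT 'Gizmo', 100
)
SELECT
    product_name,
    sales,
    -- product_name breaks ties, so the numbering is the same on every run
    ROW_NUMBER() OVER (ORDER BY sales DESC, product_name ASC) AS RowNum
FROM
    products
ORDER BY
    RowNum;

-- Result:
-- Widget     | 500 | 1
-- Doohickey  | 300 | 2
-- Gadget     | 300 | 3
-- Gizmo      | 100 | 4
```

Without the second sort key, Doohickey and Gadget could swap positions 2 and 3 between runs.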
2. RANK()
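Assigns a rank to each row based on the ORDER BY clause inside OVER.
When rows tie on the ORDER BY column, RANK() assigns the same rank to the tied rows
and then skips the ranks those extra rows would have taken. For example, if two rows tie for
rank 2, the next row receives rank 4, not 3.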
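The gap that RANK() leaves after a tie can be seen in a small sketch. The `products` rows below are made-up sample data, and the alias `SalesRank` is a name introduced here for illustration.

```sql
-- Hypothetical sample data; the two rows with sales = 300 are tied.
WITH products AS (
    SELECT 'Widget' AS product_name, 500 AS sales
    UNION ALL SELECT 'Gadget', 300
    UNION ALL SELECT 'Doohickey', 300
    UNION ALL SELECT 'Gizmo', 100
)
SELECT
    product_name,
    sales,
    RANK() OVER (ORDER BY sales DESC) AS SalesRank
FROM
    products
ORDER BY
    SalesRank, product_name;

-- Result (rank 3 is skipped because two rows share rank 2):
-- Widget     | 500 | 1
-- Doohickey  | 300 | 2
-- Gadget     | 300 | 2
-- Gizmo      | 100 | 4
```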
3. DENSE_RANK()
Like RANK(), DENSE_RANK() assigns ranks based on the ORDER BY clause and gives tied rows the
same rank. However, DENSE_RANK() does not skip ranks after a tie; the ranks are always
consecutive. For example, if two rows tie for rank 2, the next row receives rank 3.
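A minimal sketch of DENSE_RANK() on tied data follows. The `products` rows are made-up sample data, and the alias `DenseSalesRank` is a name introduced here for illustration.

```sql
-- Hypothetical sample data; the two rows with sales = 300 are tied.
WITH products AS (
    SELECT 'Widget' AS product_name, 500 AS sales
    UNION ALL SELECT 'Gadget', 300
    UNION ALL SELECT 'Doohickey', 300
    UNION ALL SELECT 'Gizmo', 100
)
SELECT
    product_name,
    sales,
    DENSE_RANK() OVER (ORDER BY sales DESC) AS DenseSalesRank
FROM
    products
ORDER BY
    DenseSalesRank, product_name;

-- Result (no rank is skipped after the tie):
-- Widget     | 500 | 1
-- Doohickey  | 300 | 2
-- Gadget     | 300 | 2
-- Gizmo      | 100 | 3
```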
Key Differences:
| Function   | Ties                                                | Ranking Behavior                            |
|------------|-----------------------------------------------------|---------------------------------------------|
| ROW_NUMBER | Assigns a unique number to each row, even for ties  | No gaps; order among tied rows is arbitrary |
| RANK       | Assigns the same rank to tied rows                  | Gaps in ranking after ties                  |
| DENSE_RANK | Assigns the same rank to tied rows                  | No gaps in ranking                          |
Partitioning:
All three functions can be used with the PARTITION BY clause to partition the data and calculate
ranks within each partition separately.
SELECT
    department,
    employee_name,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank_num,
    DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank_num
FROM
    employees;
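Partitioned ranks are a common basis for the top-N queries mentioned earlier. The sketch below keeps the rows holding the two highest distinct salaries in each department. The `employees` rows are made-up sample data, and the `ranked` CTE name and the cutoff of 2 are assumptions for illustration. Because a window function cannot appear directly in a WHERE clause, the rank is computed in a CTE first and filtered in the outer query.

```sql
-- Hypothetical sample data; Ben and Cal tie on salary.
WITH employees AS (
    SELECT 'Engineering' AS department, 'Ana' AS employee_name, 120000 AS salary
    UNION ALL SELECT 'Engineering', 'Ben', 110000
    UNION ALL SELECT 'Engineering', 'Cal', 110000
    UNION ALL SELECT 'Engineering', 'Dee', 90000
    UNION ALL SELECT 'Sales', 'Eve', 80000
    UNION ALL SELECT 'Sales', 'Fay', 70000
    UNION ALL SELECT 'Sales', 'Gus', 60000
),
ranked AS (
    SELECT
        department,
        employee_name,
        salary,
        -- Ranks restart at 1 for each department
        DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank_num
    FROM
        employees
)
SELECT
    department,
    employee_name,
    salary,
    dense_rank_num
FROM
    ranked
WHERE
    dense_rank_num <= 2
ORDER BY
    department, dense_rank_num, employee_name;

-- Result:
-- Engineering | Ana | 120000 | 1
-- Engineering | Ben | 110000 | 2
-- Engineering | Cal | 110000 | 2
-- Sales       | Eve |  80000 | 1
-- Sales       | Fay |  70000 | 2
```

Using ROW_NUMBER() in place of DENSE_RANK() here would return exactly two rows per department, but which of the tied rows is kept would then be arbitrary unless a tie-breaker is added.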