Indexing in Oracle
In Oracle databases, an index is a crucial schema object designed to enhance the performance of
data retrieval operations. Imagine it like a book's index – it provides a quick way to locate specific
information without having to scan through the entire table.
Here's how indexes work and what you should consider:
2. Types of indexes
Oracle supports several types of indexes, each with specific strengths:
B-Tree Indexes: These are the default and most common type of index in Oracle, suitable for
most scenarios, particularly columns with high cardinality (many unique values) and OLTP
(Online Transaction Processing) environments.
Bitmap Indexes: These are efficient for columns with low cardinality (few unique values, like
gender or status flags) and often used in data warehousing environments with complex
queries and aggregations.
Unique Indexes: These ensure that all values in the indexed columns are unique, often used
to enforce primary key or unique constraints.
Composite Indexes: These indexes are created on multiple columns and are beneficial for
queries that filter on combinations of those columns.
Function-Based Indexes: These indexes store the results of a function or expression applied
to columns, useful for queries that frequently use such expressions.
Domain Indexes: These are specialized indexes tailored for complex data types (e.g., spatial,
text) and require using the Oracle Data Cartridge.
Reverse Key Indexes: These reverse the bytes of the index key to distribute inserts more
evenly across the index blocks, reducing contention in high-insert environments like Oracle
Real Application Clusters.
3. Benefits of indexing
Improved Query Performance: Indexes significantly speed up data retrieval by allowing
Oracle to quickly locate the required rows without performing full table scans.
Reduced I/O Operations: By storing a subset of the data and using pointers, indexes reduce
the amount of data that needs to be read from disk, leading to faster query execution.
Optimized Joins: Indexes on join columns can drastically improve the performance of queries
involving multiple tables by facilitating efficient matching of rows.
Enforced Uniqueness: Indexes can be used to enforce unique constraints on columns,
guaranteeing data integrity.
Faster Sorting: When queries require sorting results based on indexed columns, Oracle can
leverage the sorted index structure to avoid performing separate sort operations.
4. Drawbacks of indexing
Storage Overhead: Indexes consume additional disk space, and the size can be significant for
large tables and multiple indexes.
Reduced Write Performance: Any changes to the indexed data (inserts, updates, or deletes)
require Oracle to also update the corresponding indexes, which can slow down DML (Data
Manipulation Language) operations.
Increased Maintenance Overhead: Indexes need ongoing attention. Optimizer statistics must be
kept current so the optimizer can cost index access accurately, and an index may occasionally
need to be rebuilt or coalesced after heavy deletes leave it sparse.
Potential for Over-Indexing: Creating too many indexes can hurt performance by increasing
storage consumption and slowing down writes, while adding access paths that the optimizer
must evaluate but rarely uses.
1. B-tree indexes
Description: The default and most common type, B-tree indexes are well-suited for a wide
range of workloads, including high-cardinality columns (many distinct values) and OLTP
(Online Transaction Processing) systems. They store sorted key values and ROWIDs (pointers
to table rows) in a balanced tree structure, enabling efficient searching and range scans.
Example:
sql
CREATE INDEX idx_employees_lastname ON employees (last_name);
This creates a B-tree index on the last_name column of the employees table. It will speed up queries
that filter or sort by last name, like SELECT * FROM employees WHERE last_name = 'Smith'.
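One way to check whether the optimizer actually uses the index is to look at the execution plan. The sketch below assumes the employees table and the idx_employees_lastname index from the example above. The plan Oracle chooses depends on statistics and data volume, so an INDEX RANGE SCAN step is likely but not guaranteed.

```sql
-- Ask the optimizer for its plan without running the query
EXPLAIN PLAN FOR
SELECT *
FROM employees
WHERE last_name = 'Smith';

-- Display the plan that was just generated
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the plan shows TABLE ACCESS FULL instead, the optimizer has estimated that reading the whole table is cheaper, which can happen on small tables or when many rows match.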
2. Bitmap indexes
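Description: Ideal for low-cardinality columns (few distinct values, like gender or status) and
data warehousing environments with complex queries and aggregations. Instead of storing
individual ROWIDs, a bitmap index creates a bitmap (a binary map) for each distinct value,
where each bit represents a row in the table. Because a single bitmap entry covers many rows,
concurrent DML on the indexed column can cause heavy lock contention, which is why bitmap
indexes are a poor fit for OLTP tables.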
Example:
sql
CREATE BITMAP INDEX idx_customers_gender ON customers (gender);
This creates a bitmap index on the gender column of the customers table. It's efficient for queries
like SELECT COUNT(*) FROM customers WHERE gender = 'Male' or when used in combination with
other bitmap indexes in complex WHERE clauses.
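To illustrate combining bitmap indexes, here is a minimal sketch. It assumes the customers table also has a low-cardinality status column; that column, the 'ACTIVE' value, and the idx_customers_status name are hypothetical and not part of the example above.

```sql
-- Hypothetical second low-cardinality column on the same table
CREATE BITMAP INDEX idx_customers_status ON customers (status);

-- Oracle can AND the two bitmaps together before touching the table
SELECT COUNT(*)
FROM customers
WHERE gender = 'Male'
  AND status = 'ACTIVE';
```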
3. Unique indexes
Description: These indexes enforce uniqueness on one or more columns, preventing
duplicate values. They are often implicitly created when defining PRIMARY KEY or UNIQUE
constraints.
Example:
sql
CREATE UNIQUE INDEX idx_employees_employee_id ON employees (employee_id);
This ensures that each employee_id in the employees table is unique. An INSERT or UPDATE that
would create a duplicate fails with ORA-00001 (unique constraint violated).
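The implicit route mentioned in the description can be sketched as follows. The email column, the constraint name, and the inserted values are assumptions made for illustration, and the INSERT assumes that no other mandatory columns exist on the table.

```sql
-- Declaring the constraint lets Oracle create (or reuse) a unique index
ALTER TABLE employees
  ADD CONSTRAINT uq_employees_email UNIQUE (email);

-- If a row with this email already exists, this statement fails with ORA-00001
INSERT INTO employees (employee_id, email)
VALUES (9999, 'jsmith@example.com');
```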
4. Composite indexes
Description: Created on multiple columns, these indexes are effective for queries that filter
on combinations of those columns. The order of columns is crucial – queries that use the
leading (left-most) columns of the index will benefit most.
Example:
sql
CREATE INDEX idx_employees_dept_job ON employees (department_id, job_id);
This composite index on department_id and job_id will speed up queries like SELECT * FROM
employees WHERE department_id = 10 AND job_id = 'IT_PROG', or queries filtering only
on department_id.
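The effect of column order is easiest to see with a query that skips the leading column. This sketch reuses idx_employees_dept_job from the example above; what the optimizer does in the second case depends on statistics, so treat the comments as typical behavior rather than a guarantee.

```sql
-- Leading column present: a range scan on idx_employees_dept_job is possible
SELECT *
FROM employees
WHERE department_id = 10;

-- Leading column absent: a plain range scan on the index is not possible.
-- The optimizer may fall back to an index skip scan or a full table scan.
SELECT *
FROM employees
WHERE job_id = 'IT_PROG';
```

If queries on job_id alone are common, a separate index that leads with job_id is usually the more reliable option.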
5. Function-based indexes
Description: These indexes store the results of a function or expression applied to one or
more columns. They are useful for queries that frequently utilize such functions or
expressions in their WHERE clauses.
Example:
sql
CREATE INDEX idx_employees_upper_name ON employees (UPPER(first_name));
This index allows Oracle to use an index range scan even when querying with a function on the
column, like SELECT * FROM employees WHERE UPPER(first_name) = 'JOHN'.
6. Reverse key indexes
Description: A type of B-tree index where the bytes of the index key are physically reversed.
This technique helps distribute inserts evenly across index blocks, reducing contention,
especially in high-volume, sequentially inserted data scenarios like sequence-generated
primary keys.
Example:
sql
CREATE INDEX idx_orders_order_id_reverse ON orders (order_id) REVERSE;
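This index would help alleviate "hot spots" if many simultaneous insertions of
sequential order_id values were occurring. The trade-off is that adjacent key values are no
longer stored together, so the index cannot serve range predicates such as
order_id BETWEEN 100 AND 200; it remains useful for equality lookups.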
7. Domain indexes
Description: Specialized indexes built for complex data types (e.g., spatial data, text
documents) requiring application-specific indexing logic. They leverage the Oracle Data
Cartridge facility and are managed by the application logic.
Example:
sql
CREATE INDEX ResumeTextIndex ON Employees(resume) INDEXTYPE IS TextIndexType PARAMETERS
(':Language English :Ignore the a an');
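This creates a domain index on a resume (text) column, allowing efficient keyword or full-text
searches using the indexing logic defined by TextIndexType. TextIndexType is a user-defined
indextype that must already exist; it is not built in. The PARAMETERS string is passed to that
indextype's implementation and is interpreted by it.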
1. How indexes work
When you create an index on a column or set of columns, Oracle creates a sorted data
structure that stores the values from those columns along with pointers (ROWIDs) to the
corresponding rows in the table.
When a query needs to access data in indexed columns, Oracle can quickly locate the
required rows using the index rather than performing a full table scan. This significantly
reduces the amount of I/O operations and speeds up data retrieval.
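The ROWID pointers described above can be inspected directly, and the data dictionary shows which indexes exist on a table. The sketch below assumes an employees table with employee_id and last_name columns in the current schema; ROWID, USER_INDEXES, and USER_IND_COLUMNS are standard Oracle features.

```sql
-- ROWID is the physical row address that an index entry points to
SELECT ROWID, employee_id, last_name
FROM employees
WHERE last_name = 'Smith';

-- List the indexes on the table and the columns each one covers
SELECT i.index_name,
       i.index_type,
       i.uniqueness,
       c.column_name,
       c.column_position
FROM user_indexes i
JOIN user_ind_columns c
  ON c.index_name = i.index_name
WHERE i.table_name = 'EMPLOYEES'
ORDER BY i.index_name, c.column_position;
```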
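This creates an index-organized table (IOT) in which student_id is the primary key and the table
rows themselves are stored in the primary key's B-tree, in key order. Queries filtering or sorting
by student_id will typically be very fast because no separate table access is needed.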
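The statement this paragraph describes is not shown here. A minimal sketch of what such a definition might look like follows; the students table name and every column other than student_id are assumptions.

```sql
-- ORGANIZATION INDEX stores the rows inside the primary key index itself
CREATE TABLE students (
    student_id  NUMBER        PRIMARY KEY,
    first_name  VARCHAR2(50),
    last_name   VARCHAR2(50)
)
ORGANIZATION INDEX;
```

An index-organized table requires a primary key, since the key determines where each row is stored.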
Remember to consider the cardinality of the indexed columns, the types of queries being executed,
and the frequency of data modifications when choosing the most appropriate index type for your
The basic syntax for a CTE is as follows:
sql
WITH cte_name (column1, column2, ...) AS (
-- CTE Query (subquery)
SELECT column1, column2
FROM table_name
WHERE condition
)
-- Main query that uses the CTE
SELECT *
FROM cte_name;
CTEs enhance readability by breaking down complex queries, allow reusability within a single query,
and are essential for recursive queries involving hierarchical data. They can also act as temporary
views and aid in data transformation.
Oracle supports non-recursive and recursive CTEs. Non-recursive CTEs define temporary result sets
for simplifying queries or computing intermediate results. An example that calculates the average
salary per department and finds employees earning more appears in the Temporary CTE section below.
Recursive CTEs reference themselves to work with hierarchical data and consist of an anchor member
and a recursive member. In Oracle the recursive member must be combined with the anchor using
UNION ALL, and the CTE must declare a column alias list. A typical use is walking an employee
hierarchy from the top-level manager down through each level of reports.
CTEs are temporary and exist only for the query's duration. While they improve readability,
performance can vary, making execution plan checks important. Recursive CTEs can be more complex
to debug. A CTE can reference previously defined CTEs within the same WITH clause. CTEs are
valuable for writing clear, organized, and often more efficient SQL queries in Oracle.
2 Bob Johnson 1
3 3 Williams 1
4 David Brown 2
5 Eve Davis 2
6 Frank Miller 3
Recursive CTEs require a termination condition. The example provided naturally terminates when the
hierarchy is fully traversed. You can also add an explicit condition in the recursive
member's WHERE clause.
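As an illustration of both kinds of termination, here is a minimal sketch of a recursive CTE over an employee hierarchy. It assumes an employees table with employee_id, first_name, last_name, and a manager_id column that is NULL for the top-level manager; the lvl column and the depth limit of 10 are arbitrary choices made for the example.

```sql
WITH emp_hierarchy (employee_id, first_name, last_name, manager_id, lvl) AS (
    -- Anchor member: the top of the hierarchy (no manager)
    SELECT employee_id, first_name, last_name, manager_id, 1
    FROM employees
    WHERE manager_id IS NULL
    UNION ALL
    -- Recursive member: direct reports of the rows found so far
    SELECT e.employee_id, e.first_name, e.last_name, e.manager_id, h.lvl + 1
    FROM employees e
    JOIN emp_hierarchy h
      ON e.manager_id = h.employee_id
    WHERE h.lvl < 10   -- explicit depth limit (arbitrary value)
)
SELECT employee_id, first_name, last_name, manager_id, lvl
FROM emp_hierarchy
ORDER BY lvl, employee_id;
```

Without the depth limit the query still stops on its own once no employee reports to the rows found in the previous step, provided the data contains no cycles.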
Considerations
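Consider performance for deep hierarchies and large datasets, and think about indexing the columns
used to join each level to the next. Oracle restricts what the recursive member may contain: for
example, it cannot use aggregate functions, GROUP BY, or DISTINCT, and it cannot use an outer join
in which the CTE itself is the right-hand table. Aggregation can still be applied in the outer SELECT.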
By understanding the structure, functionality, and potential issues of recursive CTEs, you can
effectively use them for complex hierarchical data challenges in Oracle SQL.
Temporary CTE
1. How they work
You define one or more CTEs using the WITH keyword at the beginning of your SQL
statement.
Each CTE is given a name and a query that defines the result set for that name.
The CTE can then be referenced in the main query or in subsequent CTEs within the
same WITH clause, similar to how you would reference a regular table or view.
Once the main query completes its execution, the CTEs and their result sets are
automatically discarded and are no longer available.
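The point about later CTEs referencing earlier ones can be sketched as follows. The example assumes an employees table with department_id and salary columns; the CTE names dept_totals and company_avg are made up for illustration.

```sql
WITH dept_totals AS (
    -- First CTE: total salary per department
    SELECT department_id, SUM(salary) AS total_salary
    FROM employees
    GROUP BY department_id
),
company_avg AS (
    -- Second CTE: references the CTE defined just above
    SELECT AVG(total_salary) AS avg_total
    FROM dept_totals
)
-- Main query: departments whose payroll exceeds the average department payroll
SELECT d.department_id, d.total_salary
FROM dept_totals d
CROSS JOIN company_avg c
WHERE d.total_salary > c.avg_total;
```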
2. Benefits of using temporary CTEs
Improved Readability and Maintainability: CTEs allow you to break down complex queries
into smaller, more manageable, and understandable parts, making the code easier to read
and maintain.
Enhanced Reusability within a Single Query: If you need to perform the same subquery or
calculation multiple times within a single larger query, you can define it once as a CTE and
then reference it as needed, reducing code duplication.
Support for Recursive Queries: Recursive CTEs, which can refer to themselves within their
definition, are essential for working with hierarchical or tree-structured data.
Simplified Complex Operations: CTEs can simplify complex joins, aggregations, and data
transformations by providing a modular approach to building queries.
3. Example
Let's imagine you have a table named employees with columns
like employee_id, first_name, last_name, department_id, and salary. You want to find all employees
who earn more than the average salary for their respective departments.
Here's how you can achieve this using a CTE:
sql
WITH DepartmentAverageSalary AS (
SELECT
department_id,
AVG(salary) AS avg_salary
FROM
employees
GROUP BY
department_id
)
SELECT
e.employee_id,
e.first_name,
e.last_name,
e.salary,
das.avg_salary AS department_avg_salary
FROM
employees e
JOIN
DepartmentAverageSalary das
ON
e.department_id = das.department_id
WHERE
e.salary > das.avg_salary;
Explanation
This example uses a CTE named DepartmentAverageSalary to calculate the average salary per
department. The main query then joins the employees table with this CTE to filter for employees
whose salary exceeds their department's average.
4. Key considerations
CTEs are temporary and exist only for the duration of a single SQL statement. They are not stored
as schema objects, although Oracle may materialize a CTE's result internally while the statement
runs. While they can improve readability, their performance can vary, so check the execution plan;
complex or nested CTEs may also present debugging challenges. Using CTEs can help in writing more
organized and efficient Oracle SQL queries, especially for complex operations and hierarchical data.
Let's say you have an employees table with columns
like employee_id, first_name, last_name, department_id, and salary. You want to rank employees
within each department based on their salary in descending order.
sql
SELECT
employee_id,
first_name,
last_name,
department_id,
salary,
ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS department_rank
FROM
employees;
Syntax
sql
RANK() OVER (
[PARTITION BY partition_column1, partition_column2, ...]
ORDER BY order_by_column1 [ASC | DESC], order_by_column2 [ASC | DESC], ...
)
Clauses
OVER: Specifies the window or set of rows over which the function operates.
PARTITION BY: (Optional) Divides the result set into partitions (groups of rows).
The RANK() function then assigns separate ranks to each partition. If omitted, the function
treats the entire result set as a single partition.
ORDER BY: (Mandatory) Sorts the rows within each partition (or the entire result set
if PARTITION BY is omitted) before assigning ranks.
Example: Ranking employees by salary within departments
Consider an employees table with columns such
as employee_id, first_name, last_name, department_id, and salary. The goal is to rank employees
within each department based on their salary in descending order.
sql
SELECT
employee_id,
first_name,
last_name,
department_id,
salary,
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS department_rank
FROM
employees;
In this example:
The PARTITION BY department_id clause divides the employees into separate groups based
on their department.
The ORDER BY salary DESC clause sorts employees within each department in descending
order of salary.
The RANK() function assigns a rank to each employee within their respective department,
with the highest earner receiving rank 1.
Rows with equal values in the ORDER BY clause receive the same rank. For example, if two
employees have the same highest salary within a department, they will both be ranked 1.
The RANK() function creates gaps in the ranking sequence for subsequent rows when ties
occur. For instance, if two employees are tied for rank 1, the next rank assigned will be 3,
skipping rank 2.
Key points about RANK()
Tied values are handled by assigning them the same rank.
Gaps are introduced in the ranking sequence after tied ranks.
An ORDER BY clause is required to define the ranking order.
A PARTITION BY clause can be combined to rank data within specific groups.
It's useful for identifying top-N results within a group or overall.
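A top-N query per group can be sketched like this, reusing the employees table from the example above. Analytic functions cannot appear directly in a WHERE clause, so the rank is computed in an inline view and filtered outside it; the cutoff of 3 is arbitrary.

```sql
SELECT employee_id, first_name, last_name, department_id, salary, department_rank
FROM (
    -- Compute the rank first
    SELECT employee_id, first_name, last_name, department_id, salary,
           RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS department_rank
    FROM employees
)
-- Then filter on it in the outer query
WHERE department_rank <= 3;
```

Because RANK() gives tied rows the same rank, a department can return more than three rows when salaries tie at the cutoff.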
RANK() vs. DENSE_RANK() vs. ROW_NUMBER()
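Function | Handles ties | Creates gaps | Behavior | Example output (assuming a tie for 1st)
RANK() | Tied rows share a rank | Yes | The rank after a tie skips ahead by the number of tied rows | 1, 1, 3
DENSE_RANK() | Tied rows share a rank | No | The rank after a tie is the next consecutive integer | 1, 1, 2
ROW_NUMBER() | Tied rows get different numbers | No | Every row gets a unique sequential number; order among tied rows is arbitrary | 1, 2, 3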
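Running the three functions side by side over the same ordering makes the difference visible. This sketch assumes the employees table used in the earlier examples; the column aliases are arbitrary.

```sql
SELECT employee_id,
       salary,
       RANK()       OVER (ORDER BY salary DESC) AS rnk,
       DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rnk,
       ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num
FROM employees;
```

If the two highest salaries are equal, those rows get rnk 1 and dense_rnk 1, while row_num assigns them 1 and 2 in an order that is not guaranteed unless the ORDER BY includes a tie-breaking column.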
NTILE() Function:
The NTILE() function is an analytic function in Oracle used to divide a result set into a specified
number of approximately equal groups (buckets), assigning a bucket number to each row within
those groups.
Example 1: Basic NTILE() usage
Let's say you have a table of employees with their salaries, and you want to categorize them into
three salary groups (low, medium, and high) across the entire company, regardless of their
department.
Query:
sql
SELECT
employee_id,
first_name,
last_name,
salary,
NTILE(3) OVER (ORDER BY salary ASC) AS salary_group
FROM
employees
ORDER BY
salary_group,
salary;
Explanation:
NTILE(3): This divides the employees into 3 groups.
OVER (ORDER BY salary ASC): This sets the ordering for the grouping. Employees are ordered
by salary in ascending order. The NTILE() function then assigns bucket numbers based on this
order.
Result: The query returns each employee's information with a salary_group column (1, 2, or
3), indicating their salary group. Rows with lower salaries are in group 1, middle salaries in
group 2, and higher salaries in group 3. The NTILE() function aims to make the groups as even
in size as possible.
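To turn the bucket numbers into the low, medium, and high labels mentioned at the start of this example, the NTILE() result can be wrapped in an inline view and mapped with CASE. This is a sketch under the same assumptions as the query above; the label text is arbitrary.

```sql
SELECT employee_id,
       salary,
       CASE salary_group
           WHEN 1 THEN 'Low'
           WHEN 2 THEN 'Medium'
           ELSE 'High'
       END AS salary_band
FROM (
    -- Bucket 1 holds the lowest salaries because of the ascending order
    SELECT employee_id,
           salary,
           NTILE(3) OVER (ORDER BY salary ASC) AS salary_group
    FROM employees
);
```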
Example 2: NTILE() with PARTITION BY
Imagine performing the same salary grouping, but within each department separately.
Query:
sql
SELECT
employee_id,
first_name,
last_name,
department_id,
salary,
NTILE(3) OVER (PARTITION BY department_id ORDER BY salary DESC) AS department_salary_group
FROM
employees
ORDER BY
department_id,
department_salary_group,
salary DESC;
Explanation
NTILE(3): This divides the employees into 3 groups within each partition.
OVER (PARTITION BY department_id ORDER BY salary DESC): This uses PARTITION BY. It
divides employees into partitions based on their department_id. Within each
partition, ORDER BY salary DESC sorts employees by salary in descending order.
Then, NTILE() assigns bucket numbers (1, 2, or 3) to rows within each department
independently.
Result: Employees are grouped into three salary buckets per department, and bucket numbering
restarts at 1 in each department. The top earners in Department A are in group 1 for Department A,
and the top earners in Department B are in group 1 for Department B, independently.
Key points and considerations
Even Distribution: NTILE() distributes rows as evenly as possible among the buckets. If the row
count is not evenly divisible by the number of buckets, the extra rows go one each to the
lowest-numbered buckets. For example, with 10 rows and 4 buckets, buckets 1 and 2 have
3 rows each and buckets 3 and 4 have 2 rows each.
Mandatory ORDER BY: An ORDER BY clause within the OVER() is required to define the order
for assigning bucket numbers.
PARTITION BY for Grouping: The PARTITION BY clause creates separate groups for NTILE() to
operate on independently.
Data Segmentation: NTILE() is useful for segmenting data into quantiles (quartiles, deciles,
percentiles) for analysis and reporting.
Limitations: NTILE() doesn't support the windowing clause (e.g., ROWS BETWEEN or RANGE
BETWEEN), and the argument expression cannot contain subqueries or other analytic
functions.
NTILE() helps group data into custom buckets, making it easier to analyze distributions, identify
top/bottom performers, or implement custom ranking logic.
LEAD() function
The LEAD() function in Oracle is an analytic function that retrieves a value from a subsequent row in
the result set or in the current partition of the result set. It is part of the SQL window function
family and enables comparing values across rows without complex self-joins, which simplifies
analysis and improves readability. The LEAD() function is particularly useful for trend analysis and
for comparing current values with those in later rows.
Syntax
sql
LEAD (expression [, offset [, default_value]]) OVER (
[PARTITION BY partition_column1, partition_column2, ...]
ORDER BY order_by_column1 [ASC | DESC], order_by_column2 [ASC | DESC], ...
)
Clauses
expression: The column or expression from which to retrieve the value.
offset: (Optional) The number of rows forward from the current row to look ahead. The
default is 1.
default_value: (Optional) The value to return if the offset extends beyond the result set or
the partition boundary. If omitted, the default is NULL.
PARTITION BY: (Optional) Divides the result set into partitions (groups of rows).
The LEAD() function operates independently within each partition.
ORDER BY: (Mandatory) Sorts the rows within each partition (or the entire result set
if PARTITION BY is omitted). This defines the order for determining the next row.
Example: Comparing current and next month's sales
Imagine you have a table named monthly_sales with columns month_year and revenue. You want to
see the current month's revenue and the next month's revenue side-by-side to analyze month-over-
month growth.
sql
SELECT
month_year,
revenue,
LEAD(revenue, 1) OVER (ORDER BY month_year) AS next_month_revenue,
(LEAD(revenue, 1) OVER (ORDER BY month_year) - revenue) AS revenue_difference,
(LEAD(revenue, 1) OVER (ORDER BY month_year) - revenue) / revenue * 100 AS growth_percentage
FROM
monthly_sales
ORDER BY
month_year;
In this example:
LEAD(revenue, 1) OVER (ORDER BY month_year): This retrieves the revenue value from the
row that is one row ahead of the current row (i.e., the next month), based on
the month_year order.
revenue_difference: This calculates the difference between the next month's revenue and
the current month's revenue, highlighting the absolute change.
growth_percentage: This calculates the month-over-month growth percentage, providing a
relative measure of change.
Output analysis
For the last row in the result set, where there is no subsequent row according to the defined order,
the LEAD() function returns NULL for next_month_revenue, and
consequently, revenue_difference and growth_percentage will also be NULL. You can specify
a default_value to override this behavior, like using 0 instead of NULL when there is no subsequent
row.
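The default_value argument can be sketched as follows, using the same monthly_sales table. Computing LEAD() once in an inline view also avoids repeating the expression; the alias names match the example above.

```sql
SELECT month_year,
       revenue,
       next_month_revenue,
       next_month_revenue - revenue AS revenue_difference
FROM (
    -- Third argument: return 0 instead of NULL when there is no next row
    SELECT month_year,
           revenue,
           LEAD(revenue, 1, 0) OVER (ORDER BY month_year) AS next_month_revenue
    FROM monthly_sales
)
ORDER BY month_year;
```

With a default of 0, the last row shows a revenue_difference equal to the negative of its own revenue, which can look like a sharp drop. Leaving the default as NULL is often the safer choice for reporting.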
Applications of LEAD()
Trend Analysis: Identify upward or downward trends in sales, stock prices, or other time-
series data.
Comparing Consecutive Values: Calculate the difference between a current value and the
next value for change analysis.
Identifying Gaps: Detect missing values or interruptions in sequences of data, such as
production schedules or dates.
Forecasting and Budgeting: Line up each period with values already recorded for later periods,
such as planned or budgeted figures. LEAD() only reads rows that exist; it does not predict values.
By understanding and utilizing the LEAD() function, you can perform sophisticated data analysis and
gain deeper insights from your Oracle database, especially when dealing with sequential or time-
series data.
LAG function:
The LAG() function in Oracle is an analytic function used to access data from a preceding row within
the same result set or partition, without requiring a self-join. It's a key member of the window
functions family and provides access to previous rows based on a specified offset and order.
The LAG() function is useful in scenarios where you need to compare values between consecutive
rows, such as analyzing trends over time, calculating the difference between a current value and a
previous value, or detecting changes in status.
Syntax
sql
LAG (expression [, offset [, default_value]]) OVER (
[PARTITION BY partition_column1, partition_column2, ...]
ORDER BY order_by_column1 [ASC | DESC], order_by_column2 [ASC | DESC], ...
)
Clauses
expression: The column or expression from which to retrieve the value.
offset: (Optional) The number of rows to look back from the current row. The default value is
1.
default_value: (Optional) The value returned if the offset extends beyond the partition
boundary or the beginning of the result set. If omitted, the default is NULL.
PARTITION BY: (Optional) Divides the result set into partitions. The LAG() function operates
independently within each partition.
ORDER BY: (Mandatory) Sorts the rows within each partition or the entire result set, defining
the sequence in which previous rows are determined.
Example: Calculating previous day's revenue
Suppose you have a table named daily_sales with columns sales_date and revenue. You want to
calculate the difference between the current day's revenue and the previous day's revenue.
sql
SELECT
sales_date,
revenue,
LAG(revenue, 1) OVER (ORDER BY sales_date) AS previous_day_revenue,
revenue - LAG(revenue, 1) OVER (ORDER BY sales_date) AS revenue_change
FROM
daily_sales
ORDER BY
sales_date;
In this example:
LAG(revenue, 1) OVER (ORDER BY sales_date): This retrieves the revenue value from the row
that is one row before the current row, based on the sales_date order.
revenue - LAG(revenue, 1) OVER (ORDER BY sales_date): This calculates the difference
between the current day's revenue and the previous day's revenue, providing the daily
change.
Output analysis
For the first row in the result set (or the first row in a partition when PARTITION BY is used), there is
no previous row to retrieve the value from. In this case, the LAG() function will return NULL for
the previous_day_revenue column unless a default value is specified.
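The effect of the optional default_value argument can be seen in a small sketch. The sample rows below are made up for illustration and stand in for the daily_sales table; the choice of 0 and of the current revenue as defaults is an assumption, not a recommendation.

```sql
-- Hypothetical sample rows standing in for the daily_sales table.
WITH daily_sales AS (
  SELECT DATE '2024-01-01' AS sales_date, 100 AS revenue FROM dual UNION ALL
  SELECT DATE '2024-01-02', 130 FROM dual UNION ALL
  SELECT DATE '2024-01-03', 120 FROM dual
)
SELECT
  sales_date,
  revenue,
  -- Third argument: value returned when there is no previous row.
  LAG(revenue, 1, 0) OVER (ORDER BY sales_date) AS previous_day_revenue,
  -- Defaulting to the current revenue makes the first day's change 0 instead of NULL.
  revenue - LAG(revenue, 1, revenue) OVER (ORDER BY sales_date) AS revenue_change
FROM daily_sales
ORDER BY sales_date;
-- Expected rows: (100, 0, 0), (130, 100, 30), (120, 130, -10)
```

Whether a default is appropriate depends on the report: a default of 0 can be mistaken for a real value, so leaving NULL in place is often the safer choice.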
Applications of LAG()
Sales Trends: Track daily, monthly, or yearly sales performance and compare current figures
with previous periods.
Financial Analysis: Analyze stock prices, account balances, or other financial metrics over
time.
Identifying Gaps: Detect missing values or interruptions in sequences, such as production
schedules or dates.
Data Validation: Identify missing or duplicate rows in a sequence.
The LAG() function is a powerful tool for performing advanced analytics directly within SQL,
streamlining tasks like time-series analysis, trend identification, and data validation. By mastering its
usage, you can gain valuable insights from your Oracle database, particularly when dealing with
sequential or time-dependent data.
SUM() OVER() function in Oracle PL/SQL
The SUM() OVER() function in Oracle is an analytic function that calculates the sum of a set of values
within a specified window or partition of a result set. It allows you to compute running totals,
cumulative sums, or sums within specific groups without needing complex self-joins or nested
queries. It's a powerful tool for analyzing trends, calculating totals, and creating summary reports
directly within your SQL queries.
Syntax
sql
SUM (expression) OVER (
[PARTITION BY partition_column1, partition_column2, ...]
[ORDER BY order_by_column1 [ASC | DESC], order_by_column2 [ASC | DESC], ...]
[windowing_clause]
)
Clauses
expression: The numeric column or expression whose values are to be summed.
PARTITION BY: (Optional) Divides the result set into partitions (groups of rows).
The SUM() function operates independently within each partition, calculating a separate sum
for each group.
ORDER BY: (Optional, but required for cumulative sums) Sorts the rows within each
partition or the entire result set. It defines the order in which the sum is accumulated and
must be present whenever a windowing clause is used.
windowing_clause: (Optional) Defines the set of rows within the partition that
the SUM() function considers for each row. Common windowing clauses include:
o ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: Calculates a
cumulative sum from the beginning of the partition up to the current row. If ORDER BY
is specified without a windowing clause, the default is the similar RANGE BETWEEN
UNBOUNDED PRECEDING AND CURRENT ROW, which also includes any rows that tie with
the current row on the ORDER BY value.
o ROWS BETWEEN <N> PRECEDING AND CURRENT ROW: Sums values from N rows
before the current row up to the current row.
o ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING: Sums values from
the current row to the end of the partition.
o ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING: Sums all
values within the entire partition (equivalent to GROUP BY but returns the sum for
each row).
Example 1: Calculating cumulative sum
Let's say you have a table named daily_sales with columns sales_date and revenue. You want to
calculate the running total of revenue for each day.
sql
SELECT
sales_date,
revenue,
SUM(revenue) OVER (ORDER BY sales_date ROWS BETWEEN UNBOUNDED PRECEDING AND
CURRENT ROW) AS cumulative_revenue
FROM
daily_sales
ORDER BY
sales_date;
In this example:
SUM(revenue) OVER (ORDER BY sales_date ROWS BETWEEN UNBOUNDED PRECEDING AND
CURRENT ROW): This calculates the cumulative sum of revenue. The ORDER BY
sales_date clause ensures the sum is calculated based on the chronological order of sales
dates. ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW specifies that for each
row, the sum should include all preceding rows in the current partition, up to and including
the current row.
Example 2: Calculating sum within a department
Imagine you have an employees table with department_id and salary. You want to display each
employee's salary along with the total salary for their department.
sql
SELECT
employee_id,
first_name,
last_name,
department_id,
salary,
SUM(salary) OVER (PARTITION BY department_id) AS department_total_salary
FROM
employees
ORDER BY
department_id,
employee_id;
In this example:
SUM(salary) OVER (PARTITION BY department_id): This calculates the sum of salary within
each department. The PARTITION BY department_id clause ensures that the sum is
calculated separately for each department. Since no ORDER BY or windowing_clause is
specified, it defaults to summing all values within the partition.
Applications of SUM() OVER()
Running Totals: Track cumulative sales, expenses, or other metrics over time.
Moving Totals: Calculate rolling sums over a fixed number of rows for trend analysis.
Group Aggregations: Display aggregated sums alongside individual row data without
using GROUP BY.
Percentage of Total: Calculate the percentage that each row or a subset of rows contributes
to the total sum within a partition or the entire result set.
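As an illustration of the percentage-of-total use case, the sketch below divides each salary by a windowed sum. The three sample rows are invented and stand in for the employees table; the column aliases pct_of_department and pct_of_total are names chosen for this example.

```sql
-- Hypothetical sample rows standing in for the employees table.
WITH employees AS (
  SELECT 1 AS employee_id, 10 AS department_id, 4000 AS salary FROM dual UNION ALL
  SELECT 2, 10, 6000 FROM dual UNION ALL
  SELECT 3, 20, 5000 FROM dual
)
SELECT
  employee_id,
  department_id,
  salary,
  -- Share of the employee's own department total.
  ROUND(100 * salary / SUM(salary) OVER (PARTITION BY department_id), 1) AS pct_of_department,
  -- An empty OVER () makes the whole result set the window.
  ROUND(100 * salary / SUM(salary) OVER (), 1) AS pct_of_total
FROM employees
ORDER BY department_id, employee_id;
-- Expected rows: (1, 10, 4000, 40, 26.7), (2, 10, 6000, 60, 40), (3, 20, 5000, 100, 33.3)
```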
By utilizing the SUM() OVER() function, you can perform sophisticated aggregations and calculations
directly within your SQL queries, making your code more efficient, readable, and powerful for data
analysis and reporting.
AVG OVER Function
The AVG() OVER() function in Oracle is an analytic function that computes the average (arithmetic
mean) of a numeric expression within a defined window or partition of a result set. It's similar to the
aggregate AVG() function but instead of returning a single average for a group, it returns the average
for a window of rows, allowing you to retain all the original rows in the result set while still seeing
the average for a relevant subset of those rows. This makes it incredibly useful for tasks like
calculating moving averages, departmental averages alongside individual data, and trend analysis.
Syntax
sql
AVG ( [ DISTINCT | ALL ] expression ) OVER (
[PARTITION BY partition_column1, partition_column2, ...]
[ORDER BY order_by_column1 [ASC | DESC], order_by_column2 [ASC | DESC], ...]
[windowing_clause]
)
Clauses
expression: The numeric column or expression for which you want to calculate the average.
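DISTINCT: (Optional) If specified, AVG() calculates the average of only the unique values
of expression within the window. In Oracle, DISTINCT can be combined only with PARTITION BY;
ORDER BY and a windowing clause are not allowed with it.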
ALL: (Optional) If specified or omitted, AVG() calculates the average of all values (including
duplicates) within the window.
PARTITION BY: (Optional) Divides the result set into partitions or groups of rows.
The AVG() function operates independently within each partition, calculating a separate
average for each group.
ORDER BY: (Optional, but often used) Sorts the rows within each partition or the entire result
set. When used with a windowing clause, it defines the order in which the average is
calculated.
windowing_clause: (Optional) Defines the set of rows within the partition that
the AVG() function considers for each row. This clause lets you define a "sliding window" for
calculations:
o ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: Calculates the
cumulative average from the beginning of the partition up to the current row.
o ROWS BETWEEN <N> PRECEDING AND CURRENT ROW: Calculates a moving average
over a fixed number of N preceding rows plus the current row.
o ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING: Calculates the
average from the current row to the end of the partition.
o ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING:
Calculates the average of all values within the entire partition, similar to a GROUP
BY but includes the average on every row.
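Example: Calculating a three-row moving average
A moving average over the current row and the two rows before it could be sketched as follows. The sales table name and its sample rows are assumptions made for illustration; only the sale_date and sale_amount column names are taken from the text.

```sql
-- Hypothetical sample rows; the table name "sales" is an assumption.
WITH sales AS (
  SELECT DATE '2024-01-01' AS sale_date, 100 AS sale_amount FROM dual UNION ALL
  SELECT DATE '2024-01-02', 200 FROM dual UNION ALL
  SELECT DATE '2024-01-03', 300 FROM dual UNION ALL
  SELECT DATE '2024-01-04', 400 FROM dual
)
SELECT
  sale_date,
  sale_amount,
  -- Window = current row plus up to two preceding rows, in date order.
  AVG(sale_amount) OVER (
    ORDER BY sale_date
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
  ) AS moving_avg
FROM sales
ORDER BY sale_date;
-- Expected moving_avg values: 100, 150, 200, 300
```

The first two rows average fewer than three values because the window is cut off at the start of the result set.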
Explanation
AVG(sale_amount) OVER (ORDER BY sale_date ROWS BETWEEN 2 PRECEDING AND CURRENT
ROW): This calculates the moving average of sale_amount.
ORDER BY sale_date: Orders the sales data by date, which is crucial for a time-based moving
average.
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW: Defines the window for calculation,
including the current row and the two preceding rows.
Key points about AVG() OVER()
Analytic vs. Aggregate: AVG() OVER() is an analytic function, returning the average for each
row based on the window specification, whereas the aggregate AVG() (without OVER())
returns a single average per GROUP BY group.
Flexibility with Windows: The windowing_clause provides great flexibility to define the scope
of the average calculation.
NULL Values: The AVG() function ignores NULL values by default during its calculation.
No Implicit Grouping: Unlike GROUP BY, AVG() OVER() does not implicitly group rows,
allowing you to see both the detail and the aggregate in the same result set.
Performance: For large datasets, using analytic functions is often more efficient than self-
joins for similar calculations.
The AVG() OVER() function is a powerful tool for various data analysis tasks in Oracle SQL, offering a
flexible and efficient way to calculate averages over different scopes and windows.
PARTITION BY clause
The PARTITION BY clause is a crucial component of analytic functions (also known as window
functions) in Oracle SQL. It enables you to divide a result set into smaller, non-overlapping groups or
"partitions" based on the values in one or more specified columns. Analytic functions then operate
independently on each of these partitions, calculating values within the context of that specific
group, and the calculations restart for each new partition.
In simpler terms, you can think of PARTITION BY as grouping rows together before applying an
analytic function, but unlike GROUP BY, it doesn't collapse the original rows into a single summary
row per group. Instead, it retains all the original rows in the result set, adding a new column that
displays the calculated value (e.g., sum, average, rank) for the partition to which each row belongs.
Syntax
The PARTITION BY clause is used within the OVER() clause of an analytic function:
sql
analytic_function() OVER (PARTITION BY partition_column1, partition_column2, ... [ORDER BY
order_by_column [ASC | DESC], ...])
Clauses
analytic_function(): This is the function you want to apply to the partitions
(e.g., SUM(), AVG(), ROW_NUMBER(), RANK(), LEAD(), LAG()).
OVER(): This keyword indicates that the function is an analytic function and operates over a
window of rows.
PARTITION BY partition_column1, partition_column2, ...: This clause specifies the columns
that define the partitions. Rows with the same values in these columns will be part of the
same partition.
ORDER BY order_by_column [ASC | DESC], ...: (Optional) This clause specifies the order of
rows within each partition. This is especially important for functions that depend on the
order of rows, like ROW_NUMBER(), RANK(), LEAD(), and LAG().
Example: Calculating department-level statistics
Let's imagine you have a table named employees with
columns employee_id, first_name, last_name, department_id, and salary. You want to display each
employee's details along with their department's average salary and their individual rank based on
salary within their department.
sql
SELECT
employee_id,
first_name,
last_name,
department_id,
salary,
AVG(salary) OVER (PARTITION BY department_id) AS avg_department_salary,
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rank_in_dept
FROM
employees
ORDER BY
department_id,
salary DESC;
Explanation
AVG(salary) OVER (PARTITION BY department_id): This calculates the average salary for each
department. The PARTITION BY department_id clause ensures that the AVG() function
operates on each department separately. The average calculated for a department is
displayed on every row belonging to that department.
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC): This assigns a rank to
each employee within their respective department based on their salary in descending order.
The PARTITION BY department_id clause divides the employees into separate departments,
and the ORDER BY salary DESC clause ranks them within each department. The ranking
restarts from 1 for each new department.
Key benefits of PARTITION BY
Retaining Row Details: Unlike GROUP BY, PARTITION BY allows you to perform calculations on
groups of rows while still displaying all the individual rows of the original query.
Contextual Calculations: You can perform calculations relevant to specific subsets of your
data without the need for complex subqueries or self-joins.
Enhanced Reporting and Analysis: It greatly simplifies the generation of reports and analyses
that require comparing individual values to group-level aggregates or ranking within specific
categories.
Flexibility with Analytic Functions: It can be used with a wide array of analytic functions,
including aggregation, ranking, and value-based functions, to perform diverse calculations.
By using PARTITION BY with analytic functions, you can write more efficient, readable, and powerful
SQL queries to gain deeper insights from your Oracle database.
CASE in Oracle:
The CASE WHEN statement in Oracle PL/SQL, similar to IF-THEN-ELSIF statements, enables you to
implement conditional logic, executing different blocks of code based on conditions. It's often
favored for its readability, especially when dealing with numerous conditions or transforming
values.
Oracle supports two main forms of CASE statements in PL/SQL:
1. Simple CASE statement
Description: This form compares a single expression (the selector) to multiple potential
values. Once a match is found, the corresponding code block is executed, and
the CASE statement terminates.
Syntax:
sql
CASE selector
WHEN expression1 THEN
-- sequence of statements 1
WHEN expression2 THEN
-- sequence of statements 2
...
[ELSE
-- default sequence of statements]
END CASE;
Example: For an example of the simple CASE statement, please refer to GeeksforGeeks.
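A minimal sketch of a simple CASE statement in an anonymous PL/SQL block is shown here. The variable v_grade and the messages are hypothetical, and the output is visible only if server output is enabled in the client (for example with SET SERVEROUTPUT ON in SQL*Plus).

```sql
DECLARE
  v_grade CHAR(1) := 'B';  -- hypothetical selector value
BEGIN
  -- The selector is compared with each WHEN value in turn.
  CASE v_grade
    WHEN 'A' THEN
      DBMS_OUTPUT.PUT_LINE('Excellent');
    WHEN 'B' THEN
      DBMS_OUTPUT.PUT_LINE('Good');
    ELSE
      DBMS_OUTPUT.PUT_LINE('Unknown grade');
  END CASE;
END;
/
```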
2. Searched CASE statement
Description: This form evaluates multiple independent Boolean conditions (expressions) in
the WHEN clauses. The first WHEN clause whose condition evaluates to TRUE triggers the
execution of its associated code block, and the CASE statement
terminates.
Syntax:
sql
CASE
WHEN condition_1 THEN
-- sequence of statements 1
WHEN condition_2 THEN
-- sequence of statements 2
...
[ELSE
-- default sequence of statements]
END CASE;
Example: For an example of the searched CASE statement, please refer to GeeksforGeeks.
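A minimal sketch of a searched CASE statement follows. The variable v_salary and the band thresholds are hypothetical values chosen for illustration.

```sql
DECLARE
  v_salary NUMBER := 7500;  -- hypothetical input value
BEGIN
  -- Each WHEN carries its own Boolean condition; the first TRUE one wins.
  CASE
    WHEN v_salary < 3000 THEN
      DBMS_OUTPUT.PUT_LINE('Low');
    WHEN v_salary < 8000 THEN
      DBMS_OUTPUT.PUT_LINE('Medium');
    ELSE
      DBMS_OUTPUT.PUT_LINE('High');
  END CASE;
END;
/
```

Because the conditions are checked in order, 7500 falls through the first test and matches the second, so the second condition does not need to repeat the lower bound.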
Important notes
Evaluation Order: Conditions or WHEN expressions are evaluated in the order they are listed.
Once a match or a TRUE condition is found, subsequent clauses are not evaluated.
ELSE Clause: This clause is optional. If omitted in a CASE statement and no match is found,
a CASE_NOT_FOUND exception is raised. For CASE expressions, NULL is returned if no match
is found.
CASE Statement vs. CASE Expression: CASE statements control program flow within PL/SQL
blocks, while CASE expressions return a value.
IF-THEN-ELSIF Alternative: CASE statements can be more readable and maintainable than
nested IF-THEN-ELSIF structures.
Data Type Consistency: return_expr values in CASE expressions must have compatible
datatypes to avoid an ORA-00932 error.
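The difference between omitting and supplying ELSE in a CASE expression can be sketched in plain SQL. The two sample rows and the band labels are invented for illustration.

```sql
-- Hypothetical sample rows standing in for the employees table.
WITH employees AS (
  SELECT 1 AS employee_id, 2500 AS salary FROM dual UNION ALL
  SELECT 2, 9000 FROM dual
)
SELECT
  employee_id,
  salary,
  -- No ELSE: a row that matches no WHEN yields NULL.
  CASE
    WHEN salary < 3000 THEN 'Low'
    WHEN salary < 8000 THEN 'Medium'
  END AS band_without_else,
  -- With ELSE: every row gets a value.
  CASE
    WHEN salary < 3000 THEN 'Low'
    WHEN salary < 8000 THEN 'Medium'
    ELSE 'High'
  END AS band_with_else
FROM employees
ORDER BY employee_id;
-- Expected rows: (1, 2500, 'Low', 'Low'), (2, 9000, NULL, 'High')
```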
Using CASE WHEN statements in PL/SQL can improve the organization and readability of conditional
logic in your Oracle code.
IFNULL() function
While IFNULL() is a common function in some other SQL databases like MySQL, it's not a
native function in Oracle PL/SQL or SQL. The equivalent function in Oracle that provides the same
functionality is NVL(). Another common function, supported by the SQL standard, is COALESCE().
What IFNULL() does (and its Oracle equivalent NVL())
The purpose of IFNULL() (or Oracle's NVL()) is to check if a given expression is NULL. If the expression
is NULL, it returns an alternative value that you specify. Otherwise, it returns the original value of the
expression.
Oracle's NVL() function
Syntax:
sql
NVL(expression, replacement_value)
Example
Let's say you have a table named employees with columns employee_id, first_name, last_name,
and commission_pct. The commission_pct column might contain NULL values for employees who
don't receive a commission. You want to display the commission as 0 when it's NULL instead of
leaving it blank.
Query using NVL():
sql
SELECT
employee_id,
first_name,
last_name,
NVL(commission_pct, 0) AS commission
FROM
employees;
Explanation
In this query, NVL(commission_pct, 0) checks the value of commission_pct for each employee.
If commission_pct is not NULL (meaning the employee receives a commission), the
original commission_pct value is returned.
If commission_pct is NULL, the value 0 is returned instead.
This ensures that the commission column in your result set will always display a numeric value,
making it easier to perform calculations or display more meaningful results in reports.
Key points to remember
In Oracle, use NVL() or COALESCE() instead of IFNULL().
NVL() takes two arguments: the expression to check and the replacement value.
COALESCE() is more versatile as it can handle multiple arguments and returns the first non-
NULL value in a list.
The data type of the replacement_value should be compatible with the data type of
the expression to avoid errors.
NVL() and COALESCE() are powerful tools for handling NULL values in your Oracle SQL
queries, making query results more predictable and readable.
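The data type point can be illustrated with a short sketch. The sample rows are invented, and the label 'None' is an arbitrary choice for this example.

```sql
-- Hypothetical sample rows standing in for the employees table.
WITH employees AS (
  SELECT 1 AS employee_id, 0.15 AS commission_pct FROM dual UNION ALL
  SELECT 2, CAST(NULL AS NUMBER) FROM dual
)
SELECT
  employee_id,
  -- Numeric column with a numeric replacement: types already match.
  NVL(commission_pct, 0) AS commission_numeric,
  -- For a text replacement, convert the column to text first.
  NVL(TO_CHAR(commission_pct), 'None') AS commission_text
FROM employees
ORDER BY employee_id;
```

Writing NVL(commission_pct, 'None') instead would make Oracle try to convert 'None' to a number, which fails with a conversion error, so the explicit TO_CHAR() is the safer form when the replacement is text.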
COALESCE() function
The COALESCE() function in Oracle is a standard SQL function that returns the first non-
NULL expression in a list of expressions. It's an incredibly versatile function for handling NULL values
and is often preferred over NVL() when you need to check multiple expressions.
Syntax
sql
COALESCE(expression1, expression2, expression3, ..., expressionN)
How it works
The COALESCE() function evaluates the expressions from left to right. As soon as it encounters an
expression that is not NULL, it returns that expression's value and stops evaluating the rest of the
arguments. If all expressions in the list evaluate to NULL, then COALESCE() returns NULL.
Example 1: Providing a default value for a nullable column
Consider the employees table, where the commission_pct column may contain NULL values. There is
also a bonus column that could be NULL. The query shows a commission or bonus, or 0 if both
are NULL.
sql
SELECT
employee_id,
first_name,
last_name,
commission_pct,
bonus,
COALESCE(commission_pct, bonus, 0) AS final_incentive
FROM
employees;
Explanation
In this query, COALESCE(commission_pct, bonus, 0) checks the values in this order:
If commission_pct is not NULL, its value is returned as final_incentive.
If commission_pct is NULL, it then checks bonus. If bonus is not NULL, its value is returned.
If both commission_pct and bonus are NULL, it finally returns 0.
This ensures that the final_incentive column always contains a non-NULL value.
Example 2: Selecting the first available contact information
Imagine a customers table with contact columns: email_primary, email_secondary,
and phone_number. The goal is to display the first available contact method for each customer.
sql
SELECT
customer_id,
customer_name,
COALESCE(email_primary, email_secondary, phone_number, 'No contact info available') AS
preferred_contact
FROM
customers;
Explanation
COALESCE(email_primary, email_secondary, phone_number, 'No contact info
available') checks the contact columns in the specified order.
It returns the first non-NULL contact information found.
If all three contact columns are NULL, it defaults to the string 'No contact info available'.
Key points about COALESCE()
Standard SQL Function: COALESCE() is part of the SQL standard, which makes the code more
portable across different database systems.
Multiple Arguments: It can handle two or more arguments, which makes it more flexible
than NVL() for multiple null checks.
Returns First Non-NULL: It stops evaluating arguments as soon as it finds a non-NULL value.
Returns NULL if All Are NULL: If all arguments are NULL, the function returns NULL.
Data Type Compatibility: All expressions within COALESCE() should be of compatible data
types. Oracle implicitly converts data types if possible, but explicit casting might be necessary
in some cases to avoid errors or unexpected behavior.
Evaluation Order: The order of expressions is important, as it determines the precedence for
which non-NULL value is returned.
COALESCE() is a versatile and powerful function that significantly simplifies handling NULL values in
Oracle SQL, which leads to more concise and readable queries.
LIMIT clause in Oracle
The LIMIT clause, commonly used in databases like MySQL and PostgreSQL to restrict the number of
rows returned by a query, is not supported by Oracle Database. However, Oracle provides
several alternatives to achieve the same functionality:
1. ROWNUM pseudocolumn
Description: ROWNUM is an Oracle pseudocolumn that assigns a sequential number to each
row returned by a query, starting with 1. This pseudocolumn is assigned before any explicit
sorting (defined by ORDER BY) takes place in the query, potentially leading to unexpected
results if not used carefully.
Query Example:
sql
SELECT employee_id, first_name, last_name
FROM employees
WHERE ROWNUM <= 5;
Explanation: This query returns the first 5 rows as they are retrieved from
the employees table. The order may not be consistent if the table has no explicit order. To
ensure consistent results, use a subquery for ordering before applying the ROWNUM filter.
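2. Row limiting clause (OFFSET and FETCH)
Description: From Oracle Database 12c onward, the row limiting clause restricts the rows
returned after ORDER BY has been applied, so no subquery is needed.
Query Example:
sql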
SELECT employee_id, first_name, last_name
FROM employees
ORDER BY last_name
OFFSET 10 ROWS
FETCH NEXT 5 ROWS ONLY;
Explanation: This example skips the first 10 rows and returns the next 5.
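For a plain top-N query, the same clause can be used without OFFSET. The sketch below assumes Oracle Database 12c or later; the sample rows are invented and stand in for the employees table.

```sql
-- Hypothetical sample rows standing in for the employees table.
WITH employees AS (
  SELECT 1 AS employee_id, 'Adams' AS last_name, 5000 AS salary FROM dual UNION ALL
  SELECT 2, 'Baker', 7000 FROM dual UNION ALL
  SELECT 3, 'Clark', 7000 FROM dual UNION ALL
  SELECT 4, 'Davis', 6000 FROM dual
)
SELECT employee_id, last_name, salary
FROM employees
-- employee_id breaks ties so the result is repeatable.
ORDER BY salary DESC, employee_id
FETCH FIRST 2 ROWS ONLY;
-- Expected rows: (2, 'Baker', 7000), (3, 'Clark', 7000)
```

Replacing ONLY with WITH TIES would also return any further rows that tie with the last returned row on the ORDER BY columns.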
Key considerations
Performance: The cost of a top-N query often depends on whether an index can supply the
required order; review the execution plan.
Consistency: Always use ORDER BY, ideally ending in a unique column, for repeatable results.
Best Practices: Consider indexing the ordering columns, and use a subquery or CTE when
combining ROWNUM with ORDER BY.
Understanding these alternatives allows you to effectively limit query results in Oracle SQL.
Understanding the ROWNUM pseudocolumn in Oracle PL/SQL
In Oracle Database, ROWNUM is a pseudocolumn, not a function, that automatically assigns a
sequential number to each row returned by a query. The numbering starts from 1 for the first row, 2
for the second, and so on. However, it's crucial to understand how ROWNUM is assigned to use it
correctly and avoid unexpected results.
How ROWNUM is assigned and its limitations
ROWNUM is assigned to a row after it passes the WHERE clause predicate phase,
but before any ORDER BY, GROUP BY, or HAVING clauses are applied. This order of operations has
significant implications:
ROWNUM vs. ORDER BY: If you use ROWNUM and ORDER BY in the same query, the ORDER
BY will reorder the rows after ROWNUM values have been assigned. This means the row
with ROWNUM = 1 might not be the "first" row according to your desired sort order. To get a
properly ordered top-N result, you must use a subquery to order the data first, then apply
the ROWNUM filter on the outer query.
o Example (Incorrect): This might not return the employees with the highest salary.
sql
SELECT employee_id, salary FROM employees WHERE ROWNUM <= 5 ORDER BY salary DESC;
o Example (Correct): This will return the top 5 employees based on salary.
sql
SELECT employee_id, salary
FROM (SELECT employee_id, salary FROM employees ORDER BY salary DESC)
WHERE ROWNUM <= 5;
Conditions on ROWNUM: Conditions like ROWNUM > 1 or ROWNUM = 5 never return any
rows. This happens because ROWNUM is incremented only after a row passes
the WHERE clause. The first candidate row is tested as ROWNUM = 1, fails ROWNUM > 1, and is
discarded, so the next row is also evaluated as ROWNUM = 1, and so on.
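This behavior can be checked with a small sketch that counts rows both ways. The ten generated rows stand in for a real table, and the alias rn is a name chosen for this example.

```sql
-- Ten generated rows (n = 1..10) standing in for a real table.
WITH t AS (
  SELECT LEVEL AS n FROM dual CONNECT BY LEVEL <= 10
)
SELECT
  -- Filtering directly on ROWNUM > 1 can never accept a first row.
  (SELECT COUNT(*) FROM t WHERE ROWNUM > 1) AS direct_filter_rows,
  -- Materializing ROWNUM as a column first makes the filter behave as intended.
  (SELECT COUNT(*)
     FROM (SELECT n, ROWNUM AS rn FROM t)
    WHERE rn > 1) AS subquery_filter_rows
FROM dual;
-- Expected row: (0, 9)
```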
Common uses of ROWNUM
1. Top-N Queries: Retrieve the first N rows from a result set, often combined with a subquery to
ensure proper ordering.
sql
SELECT * FROM (SELECT employee_id, salary FROM employees ORDER BY salary DESC) WHERE
ROWNUM <= 10;
2. Pagination: Retrieve specific pages of data, typically using a combination of ROWNUM in a
subquery to assign row numbers and then filtering in the outer query.
sql
SELECT * FROM (SELECT a.*, ROWNUM rnum FROM (SELECT * FROM employees ORDER BY
employee_id) a) WHERE rnum BETWEEN 11 AND 20;
3. Assigning Unique Values (Less Common): ROWNUM can be used to assign unique numbers
to rows in an UPDATE statement, although ROW_NUMBER() is generally
preferred for this purpose.
sql
UPDATE my_table SET column1 = ROWNUM;
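ROWNUM compared with ROW_NUMBER():
Duplicates: ROWNUM values are always unique. ROW_NUMBER() values are also unique; rows that
tie on ORDER BY share a number only if RANK() or DENSE_RANK() is used instead.
Use in WHERE: ROWNUM can be used directly (e.g., WHERE ROWNUM <= N). ROW_NUMBER() requires
a subquery to filter on the generated row number.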
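A short sketch shows how the numbering functions treat ties, and how the generated number is filtered in an outer query. The sample rows and the aliases rn, rnk, and dense_rnk are assumptions made for illustration.

```sql
-- Hypothetical sample rows; employees 1 and 2 tie on salary.
WITH employees AS (
  SELECT 1 AS employee_id, 7000 AS salary FROM dual UNION ALL
  SELECT 2, 7000 FROM dual UNION ALL
  SELECT 3, 6000 FROM dual
)
SELECT employee_id, salary, rn, rnk, dense_rnk
FROM (
  SELECT
    employee_id,
    salary,
    -- Always unique; employee_id makes the tie-break repeatable.
    ROW_NUMBER() OVER (ORDER BY salary DESC, employee_id) AS rn,
    -- Tied rows share a rank; RANK() then skips, DENSE_RANK() does not.
    RANK()       OVER (ORDER BY salary DESC) AS rnk,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rnk
  FROM employees
)
-- Analytic results cannot be filtered in the same query block's WHERE clause.
WHERE rn <= 2
ORDER BY rn;
-- Expected rows: (1, 7000, 1, 1, 1), (2, 7000, 2, 1, 1)
```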
OVER() clause components
1. PARTITION BY (Optional):
1. Purpose: This clause divides the result set into independent groups or partitions. The
analytic function then operates on each partition separately, and the calculation
restarts for each new partition.
2. Example:
sql
SELECT employee_id, department_id, salary,
AVG(salary) OVER (PARTITION BY department_id) AS avg_dept_salary
FROM employees;
In this example, AVG(salary) calculates the average salary for each
distinct department_id independently. The average for one department is not affected by the
salaries in other departments.
2. ORDER BY order_by_clause (Optional but crucial):
1. Purpose: This clause specifies the logical order of rows within each partition (or the
entire result set if PARTITION BY is omitted). This ordering is critical for analytic
functions whose results depend on the sequence of rows, such as:
o Ranking functions (ROW_NUMBER(), RANK(), DENSE_RANK())
o Value-based functions (LAG(), LEAD())
o Aggregate functions used for running or cumulative calculations
(e.g., SUM() or AVG() with a windowing_clause).
2. Example:
sql
SELECT employee_id, department_id, salary,
ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rank_in_dept
FROM employees;
Here, ROW_NUMBER() assigns a unique sequential number based on salary in descending order within
each department. The numbering starts from 1 for each new department.
3. windowing_clause (Optional):
1. Purpose: This clause defines a "sliding window" of rows within the current partition
(or the entire result set) that the analytic function should consider for each row. It
further refines the set of rows that the function operates on. It is typically used with
aggregate functions to perform calculations like moving averages or running totals.
2. Common examples:
o ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: Includes all
rows from the start of the partition up to the current row. Used for
cumulative sums or averages.
o ROWS BETWEEN <N> PRECEDING AND CURRENT ROW: Includes the current
row and the N preceding rows. Used for moving averages.
o RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING:
Includes all rows in the entire partition. (Default if only PARTITION BY is
specified for aggregate functions).
3. Example:
sql
SELECT sales_date, revenue,
SUM(revenue) OVER (ORDER BY sales_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
AS three_day_moving_sum
FROM daily_sales;
In this case, SUM(revenue) calculates the sum of revenue for the current sales_date and the two
preceding sales_dates, creating a three-day moving sum.
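ROWS counts physical rows, while RANGE groups rows by their ORDER BY value, and the difference shows up when rows tie. The sketch below uses invented sample rows in which two rows share the same sales_date.

```sql
-- Hypothetical sample rows; two rows share the date 2024-01-02.
WITH daily_sales AS (
  SELECT DATE '2024-01-01' AS sales_date, 100 AS revenue FROM dual UNION ALL
  SELECT DATE '2024-01-02', 50 FROM dual UNION ALL
  SELECT DATE '2024-01-02', 70 FROM dual
)
SELECT
  sales_date,
  revenue,
  -- RANGE: rows with the same sales_date are peers and get the same total.
  SUM(revenue) OVER (
    ORDER BY sales_date
    RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS range_total,
  -- ROWS: each physical row extends the total by one row.
  SUM(revenue) OVER (
    ORDER BY sales_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS rows_total
FROM daily_sales
ORDER BY sales_date;
```

With this data, range_total is 100 for the first date and 220 for both rows of the second date. rows_total also ends at 220, but the intermediate value on the tied rows depends on the arbitrary order in which Oracle processes them, so adding a tie-breaking column to ORDER BY is advisable when ROWS is used.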
Key benefits of the OVER() clause
Enables Analytic Functions: It's the gateway to all analytic functions in Oracle SQL.
Contextual Calculations: Allows you to perform calculations on a subset of related rows
without losing the detail of individual rows.
Eliminates Self-Joins: Often replaces complex self-joins or correlated subqueries, leading to
more efficient and readable code.
Powerful for Reporting: Ideal for creating reports that require comparisons, rankings, running
totals, and other window-based calculations.
By mastering the OVER() clause and its components (PARTITION BY, ORDER BY, windowing_clause),
you unlock a vast array of powerful analytic capabilities in Oracle SQL, enabling you to derive richer
insights from your data.