Step 1 - CREATE TABLE and INSERT:
● This step involves creating a new table using a CREATE TABLE statement
and then inserting data into that table.
● The INSERT statement is inserting data from multiple sources (presumably
the result of various queries or joins) into the newly created table.
● The plan also contains a SCALAR AGGREGATE Operator, which computes a
single aggregate value over all of its input rows (no GROUP BY); here it
appears to count the rows being inserted.
Step 2 - UPDATE:
● This step involves an UPDATE operation on a table. The table is being
updated with data from various sources.
● Nested loop joins are used in the process. For each row from the outer
(source) input, the server looks up the matching rows in the inner table,
usually through an index, and updates the corresponding row in the
target table. Nested loop joins work best when the outer input is small
and the inner table has a useful index on the join column.
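As a rough illustration of the kind of statement that can produce such a plan, the sketch below updates tempdb..rds_tmp from vend_master using the Transact-SQL UPDATE ... FROM form. The table names come from the plan; the columns vend_name and vend_no (as the join key in vend_master) are assumptions, since the original statement is not shown.

```sql
-- Hypothetical join update; the column names are assumed for illustration.
update tempdb..rds_tmp
set    vend_name = v.vend_name
from   tempdb..rds_tmp t, vend_master v
where  t.vend_no = v.vend_no
go
```

With a small rds_tmp as the outer input and an index on the join column of vend_master, a nested loop join is a plausible choice for the optimizer.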
Step 3 - CREATE INDEX:
● In this step, a CREATE INDEX operation builds an index on the specified
table.
● Indexes are used to speed up data retrieval operations in SQL queries.
Step 4 - EXECUTE:
● This step executes a statement held in the statement cache, identified
by its SSQL_ID.
● Because the statement was already cached, its stored plan can be reused
instead of being compiled again.
Limiting Factors:
● There are references to limiting I/O cost (Limit io_cost_actual) and
limiting temporary database space (Limit tempdb_space). These limits
are applied for the specific login/role 'itt425600u'.
● These limits may be put in place to control the resource usage and
prevent queries from consuming excessive resources.
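To check which limits are actually in force, something like the following could be run by a user with the necessary privileges. Both procedures are standard ASE system procedures; the login name is the one shown in the plan output.

```sql
-- List the resource limits defined for this login.
exec sp_help_resource_limit 'itt425600u'
go
-- Show whether resource limits are enabled on the server.
exec sp_configure 'allow resource limits'
go
```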
Table and Index Information:
● The plan references specific tables and indexes. For example, it mentions
the table names (e.g., 'rds_tmp', 'vend_master') and index names (e.g.,
'vend_masterI1', 'part_masterI8').
Join Types:
● There are different types of joins mentioned, such as inner joins and left
semi-joins, which dictate how data from different tables is combined.
Aggregation:
● Aggregate operations (e.g., SUM or COUNT) are used to process data, as
seen in the SCALAR AGGREGATE Operator. Aggregation is typically used to
summarize or calculate values from multiple rows.
Index idx1:
● create index idx1 on tempdb..rds_tmp(sku_no,inv_type)
This index is created with two columns: sku_no and inv_type. It's essential to
understand how this index is used in the context of your query.
Index idx3:
● create index idx3 on tempdb..rds_tmp(vpl_no,vend_no)
This index includes the columns vpl_no and vend_no and might be used for queries
that filter or join based on these columns.
In Sybase, when a query is executed, the query optimizer decides whether to use an
index or perform a table scan based on factors like data distribution, cardinality, and
query complexity. Let's discuss some general considerations for the query execution
plan:
● Index idx1:
● The idx1 index on (sku_no, inv_type) can serve queries that filter or
join on sku_no, alone or together with inv_type. A predicate on
inv_type alone cannot position on the index, because inv_type is not
the leading column. When the index is chosen, showplan names it under
the SCAN operator for rds_tmp.
● Index idx3:
● The idx3 index with columns vpl_no and vend_no is likely used for
queries that filter or join data based on these columns. The query
optimizer may choose to use this index to improve the performance of
such operations.
● Update Operations:
● For the update operations that involve joining, like the second and third
updates where you join the tempdb..rds_tmp table with other tables, the
query optimizer might use the indexes on joining columns for efficient
matching.
● Aggregation:
● In the aggregation step to create tempdb..rds_tmp_body, the query
optimizer may choose to perform an aggregation operation without
utilizing indexes.
● Final SELECTs:
● The final SELECT statements do not involve any explicit indexes but are
likely to perform table scans or use indexes if appropriate for the WHERE
conditions and sorting (ORDER BY).
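Whether the optimizer actually picks idx1 or idx3 can be checked with showplan. The sketch below is only an illustration: the literal values are placeholders, and the column data types are assumed because the table definition is not shown.

```sql
set showplan on
go
-- Equality on both columns of idx1 (sku_no, inv_type).
select sku_no, inv_type
from   tempdb..rds_tmp
where  sku_no = 1001
  and  inv_type = 'A'
go
-- Filter on the leading column of idx3 (vpl_no, vend_no).
select vpl_no, vend_no
from   tempdb..rds_tmp
where  vpl_no = 42
go
set showplan off
go
```

If the plan shows a table scan instead, the table may simply be small enough that scanning it is cheaper than using the index.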
Statement 1 - Creation of a Table, Insert, and Indexing:
● It starts by creating a table.
● Then, it inserts data into the table using a nested loop join operation.
● The scan operation references a table named CIS..part_master using
the index part_masterI8.
● The total estimated I/O cost for this statement is 77138.
Statement 2 - Index Creation:
● This statement creates an index on the table #t_sku.
● The estimated I/O cost is 0.
Statement 3 - Control of Flow (GOTO):
● GOTO is a Transact-SQL control-of-flow statement that transfers execution
to a label elsewhere in the batch or procedure. It accesses no data.
● The estimated I/O cost is 0.
Statement 4 - Table Creation, Insert, and Nested Loop Join:
● Creates a table.
● Inserts data into the table using a nested loop join operation.
● The scan operation references a temporary table #t_sku and another table
CIS..inv_qty using an index inv_qtyI3.
● The total estimated I/O cost for this statement is 934.
Statement 5 - Execution of a Cached Query:
● Executes a previously cached statement with an ID.
● The estimated I/O cost is 0.
Statement 6 - Declaration:
● This is a DECLARE statement; the log does not show what is declared.
● The estimated I/O cost is 0.
Statement 7 - Insert, Hash Vector Aggregate:
● Inserts data into a table with an estimated I/O cost of 59251.
● This statement involves a hash vector aggregate operation and a nested
loop join.
Statement 8 - Recompilation Due to Tabmissing:
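● This indicates that the plan was recompiled because a table it references
was missing when the plan was first compiled, which usually means the
table (often a temporary table) was created earlier in the same batch.
The recompiled plan itself is not shown.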
Statement 9 - Select from Table:
● Executes a SELECT statement on a table named tempdb..rds_tmp.
● The estimated I/O cost is 29.
Statement 10 - Recompilation Due to Tabmissing:
● Similar to Statement 8, this indicates a recompilation due to a missing
table.
Statement 11 - Another Select from Table:
● Similar to Statement 9, this statement executes a SELECT statement, but
the table name is different (tempdb..rds_tmp_body).
● The estimated I/O cost is 27.
Statement 12 - Another Recompilation:
● Indicates another recompilation due to a missing table, but the plan is not
shown.
QUERY PLAN FOR STATEMENT 1 (at line 1): This part provides a header indicating
that it's describing the execution plan for Statement 1.
Optimized using Serial Mode: It informs you that the query plan was optimized for
execution in serial mode, indicating that parallel processing is not used for this
particular query.
Steps: The execution plan contains multiple steps (in this case, Steps 1 and 2).
● STEP 1:
■ The type of query is CREATE TABLE: This step represents a
CREATE TABLE operation, indicating that a new table is being
created.
● STEP 2:
■ The type of query is INSERT: This step represents an INSERT
operation, where data is being inserted into a table.
Operators: Each step may contain one or more operators. In this case, there are five
operators under the root operator.
● ROOT:EMIT Operator (VA = 5): This is the root operator responsible for
emitting the final result of the query.
● INSERT Operator (VA = 4): This operator represents the INSERT operation.
It is marked as having a "direct" update mode.
● NESTED LOOP JOIN Operator (VA = 3): This operator is a nested loop join,
used here as an inner join. For each row from its first (outer) child it
probes its second (inner) child for matching rows.
● SCAN Operator (VA = 0): This scan operator reads from an OR List, the
in-memory list of values the server builds from an IN list or OR clauses.
● RESTRICT Operator (VA = 2): This operator evaluates expressions on the
rows it receives from its child scan and passes on only the rows that
qualify. The plan also reports its cost and estimated row count.
● SCAN Operator (VA = 1): This scan operator retrieves data from the table
CIS..part_master. It is used to access data from the part_master table.
Indexes: The execution plan specifies the index used for accessing the data. In this
case, it mentions part_masterI8, which seems to be an index used for
optimizing data retrieval from the part_master table.
I/O Cost: The execution plan provides the estimated I/O cost for the statement,
which is an indicator of how resource-intensive the operation is expected to be.
In this case, it's 77138.
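A statement of roughly the following shape would be consistent with the plan described above, although the original SQL is not shown, so this is only a guess at its form. A SELECT INTO appears in showplan as a CREATE TABLE step followed by an INSERT step, and an IN list on an indexed column is a typical source of an OR List scan joined to the base table. The column sku_no in part_master and the literal values are assumptions.

```sql
set showplan on
set statistics io on
go
-- Hypothetical statement; column name and values are placeholders.
select p.sku_no
into   #t_sku
from   CIS..part_master p
where  p.sku_no in (1001, 1002, 1003)
go
set statistics io off
set showplan off
go
```

With statistics io on, the actual logical and physical reads can be compared with the estimated I/O cost reported in the plan.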
Partitioning Types in Sybase:
Range Partitioning: Range partitioning involves dividing the data based on a
specified range of values for a specific column. For example, you can partition a
sales table by date, where each partition contains data for a specific date range.
List Partitioning: List partitioning allows you to group rows based on specific values
in a designated column. For instance, you can partition a customer table based
on the customer's country.
Hash Partitioning: Hash partitioning uses a hash function to distribute data across
partitions. This method provides a uniform distribution of data and is useful for
ensuring load balancing.
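Round-Robin Partitioning: Round-robin partitioning assigns rows to partitions in
turn, without reference to any column value. In ASE the documented
partitioning methods are range, list, hash, and round-robin; there is no
composite method, although a range or hash key can span several columns.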
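As a minimal sketch of how hash and list partitioning are declared in ASE, the statements below use the PARTITION BY clause of CREATE TABLE. The table and column names are hypothetical, and semantic partitioning may need to be licensed and enabled on the server before these statements succeed.

```sql
-- Hash partitioning: rows are spread over three named partitions
-- according to a hash of cust_id.
create table orders_hash (
    order_id int not null,
    cust_id  int not null,
    amount   decimal(10, 2)
)
partition by hash (cust_id) (p1, p2, p3)
go

-- List partitioning: each partition holds an explicit set of key values.
create table customers_list (
    cust_id int     not null,
    country char(2) not null
)
partition by list (country) (
    p_emea values ('DE', 'FR', 'GB'),
    p_amer values ('US', 'CA')
)
go
```

With list partitioning, a row whose key value is not named in any partition has nowhere to go, so the value lists need to cover every value the application inserts.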
Advantages of Partitioning:
Performance: Partitioning can significantly improve query performance. When data
is divided into partitions, the database engine can quickly identify which
partition(s) contain the relevant data, reducing the amount of data that needs to
be scanned.
Manageability: Large tables can be challenging to manage, but partitioning makes it
easier. You can manage individual partitions separately, making maintenance
tasks such as backup and index rebuilding more manageable.
Data Archiving: Aging or infrequently accessed data can be moved to lower-cost
storage devices without impacting the rest of the data. This is useful for data
archiving and compliance purposes.
Scalability: Partitions can be placed on different segments, and therefore on
different storage devices. This spreads the I/O load and lets the server
process partitions in parallel.
How to Implement Partitioning in Sybase:
Declare the Partitioning in the Table Definition: ASE has no separate partition
function or partition scheme objects (those belong to Microsoft SQL Server). The
method, the partition key, and the boundary values are all given in the
PARTITION BY clause of CREATE TABLE.
Place Partitions on Segments: Each partition can optionally be assigned to a
segment with ON segment_name, which determines the database devices that
hold its data.
Alter Table: Use ALTER TABLE ... PARTITION BY to repartition an existing table.
-- Create a table range-partitioned by date, one segment per partition.
-- The segments seg_jan .. seg_other are assumed to exist already.
-- sale_date is DATE so that the inclusive upper bounds cover whole days.
CREATE TABLE sales (
sale_id INT,
sale_date DATE,
amount DECIMAL(10, 2)
)
PARTITION BY RANGE (sale_date) (
p_jan VALUES <= ('2023-01-31') ON seg_jan,
p_feb VALUES <= ('2023-02-28') ON seg_feb,
p_mar VALUES <= ('2023-03-31') ON seg_mar,
p_other VALUES <= (MAX) ON seg_other
)
In this example, the sales table is range-partitioned on the sale_date column. Each
VALUES <= bound is the inclusive upper limit of its partition, the first partition
also receives any earlier dates, and MAX catches all later dates.
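Once a table is partitioned on a date column, a query that restricts that column gives the optimizer the chance to skip partitions that cannot hold qualifying rows. The sketch below assumes the sales table from the example; sp_helpartition is the ASE system procedure that reports a table's partitions.

```sql
-- A predicate on the partition key allows partition elimination.
select sum(amount)
from   sales
where  sale_date >= '2023-02-01'
  and  sale_date <  '2023-03-01'
go
-- Report the partitions of the table.
exec sp_helpartition sales
go
```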
1. Page Size:
● In Sybase ASE, the logical page size is 2K, 4K, 8K, or 16K. It is chosen when
the server is built and applies to every database on that server, so it should
be selected with the expected row sizes and workload in mind.
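The page size of a running server can be read from a global variable, as in this short example:

```sql
-- Logical page size of this server, in bytes (for example 2048 or 16384).
select @@maxpagesize
go
```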
2. Record Storage:
● Data pages store rows or records of a table. Each row of data is stored as a
record on a data page.
● A single data page can hold multiple rows, depending on the size of each row
and the page size. In ASE a row cannot span data pages, so the maximum row
size is bounded by the page size; large text and image values are stored
off-row on their own pages.
3. Data Page Types:
● In Sybase, data pages come in different types, including:
● Data Pages: These pages store actual table data.
● Index Pages: These pages store index entries for fast data retrieval.
● Text/Image Pages: These pages are used for large text or binary objects
(LOBs) like BLOBs and CLOBs.
● Work Pages: Used for sorting and temporary storage during query
processing.
● Log Pages: Used for recording transaction log information.
4. Organization:
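● Data pages are organized into extents and allocation units. An extent is 8
contiguous pages and is the unit in which space is allocated to a table or
index; 32 extents form a 256-page allocation unit. Segments are named groups
of database devices within a database on which objects can be placed.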
● Data pages are managed using allocation and deallocation mechanisms, with
page allocation controlled by the database engine.
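Space usage and segment layout can be inspected with standard system procedures. The sketch below assumes the rds_tmp table from the earlier plan still exists in tempdb.

```sql
use tempdb
go
-- Space reserved and used by the table and its indexes.
exec sp_spaceused rds_tmp
go
-- Segments defined in the current database.
exec sp_helpsegment
go
```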
5. Disk I/O:
● When data needs to be read or written to/from the database, it is done in terms
of whole data pages. Disk I/O operations are optimized for page-sized units,
which is more efficient than reading/writing individual records.
● Efficient use of data pages is crucial to minimize I/O and improve database
performance.
6. Data Page Allocation:
● The database engine in Sybase manages data page allocation and release. When
you insert, update, or delete rows in a table, data pages are allocated and
deallocated dynamically.
● Data pages are allocated to tables, indexes, and other objects, depending on their
storage requirements.
7. Maintenance:
● Over time, data pages can become fragmented as rows are inserted, updated,
and deleted. Database administrators can perform maintenance tasks like
reorganizing data pages to optimize performance.
● It's essential to regularly rebuild indexes and defragment data pages to ensure
efficient data storage and retrieval.
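The maintenance tasks above map onto a few ASE commands. This is a sketch only: the table name sales is taken from the partitioning example, reorg reclaim_space applies to data-only-locked tables, and reorg rebuild may require database options such as select into/bulkcopy to be enabled.

```sql
-- Reclaim space left behind by deletes and shrinking updates.
reorg reclaim_space sales
go
-- Rewrite the table and rebuild its indexes; heavier and more intrusive.
reorg rebuild sales
go
-- Refresh optimizer statistics after large data changes.
update statistics sales
go
```

It is usually worth scheduling the heavier commands outside peak hours.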
8. Page Locking:
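● In Sybase, data pages can be locked to ensure data consistency during
concurrent access by multiple users. Locking prevents conflicting changes to
the same data, at the cost of some contention; ASE offers allpages,
datapages, and datarows locking schemes to balance the two.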
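In ASE the locking scheme is a per-table setting (allpages, datapages, or datarows). The example below is a sketch that assumes a table named sales and that row-level locking is the desired trade-off.

```sql
-- Switch the table to row-level locking.
alter table sales lock datarows
go
-- Show the locks currently held on the server.
exec sp_lock
go
```

Changing between allpages and a data-only scheme copies the table, so on a large table it can take a while.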
9. Data Caching:
● To enhance performance, relational database systems, including Sybase, keep
frequently accessed data pages in a memory (RAM) cache rather than
repeatedly reading them from disk.
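The effect of caching can be observed by comparing logical reads (pages found in cache) with physical reads (pages fetched from disk). The query below is only an example and assumes tempdb..rds_tmp exists.

```sql
-- List the configured data caches and their sizes.
exec sp_cacheconfig
go
-- Report logical and physical reads for the following statement.
set statistics io on
go
select count(*) from tempdb..rds_tmp
go
set statistics io off
go
```

Running the SELECT twice typically shows physical reads dropping on the second run, once the pages are in cache.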