19 Jan 2012

Explanation on DB2 Explain

When an SQL is executed against or bound to a DB2 database, DB2 Optimizer tool defines the access path used to access the data. This access path is defined according to tables’ statistics generated by DB2 Runstats tool.

The Explain command details the access path defined by DB2 and allows you to analyze how the data will be accessed and how you can improve the command’s performance.

A table called PLAN_TABLE which has your user id as its owner is required for working on DB2 Explain. You can create it by using the below SQL.

CREATE TABLE [userid].PLAN_TABLE LIKE DEFAULT.PLAN_TABLE;

Once the above mentioned table is ready, follow the below steps.

Step One:

Execute the Explain command on your general SQL statement which you want to analyze:

EXPLAIN PLAN SET QUERYNO = 1 FOR [your sql statement here]

This command will put the Explain information in the PLAN_TABLE.

Step Two:

Execute this SQL command to view the Explain information:

SELECT * FROM PLAN_TABLE WHERE QUERYNO = 1 ORDER BY TIMESTAMP, QUERYNO, QBLOCKNO, PLANNO, MIXOPSEQ WITH UR;

QUERYNO should be the same used in the explain command on Step 1.

Step Three:

Look at these fields for important information:

PLANNO – Number of steps necessary to process the query indicated in QBLOCKNO.
METHOD – Indicate joins method used for the step (PLANNO).
ACCESTYPE – Method used to access the table.
MATCHCOLS – Number of index key used for index scan (when ACCESTYPE is I, IN, M, MX).
ACCESSNAME – Name of the index used for index scan (when ACCESTYPE is I, IN, M, MX).
INDEXONLY – Indicates if the index alone is enough to carry out the step.
PREFETCH – Indicates if data pages can be read in advance by prefetch.
COLUMN_FN_EVAL – Indicates when aggregate functions are evaluated.

Step Four: Analyze the results using the following tips:

Is data accessed through an index?

ACCESSTYPE:

I – Index. This is the best access after the one-fetch index. It uses the index to retrieve rows. The number of index columns used for matching is represented in MATCHCOLS.
I1 – One-fetch index access. Is the best access possible as it requires retrieving only one row. However, it applies only to statement with a MAX or MIN function.
N – Index scan with IN keyword in the predicate. In the example: T(IC1, IC2, IC3, IC4). Command: Select * from T where IC1 = 1 AND IC2 (in 1,2,3) AND IC3 > 0 and IC4 = 1. MATCHCOLS will be 3 and ACCESSTYPE will be N. The IN-List scan will be performed as three matching index scan: (IC=1, IC2=1, IC3>0), (IC=1, IC2=2, IC3>0) and (IC=1, IC2=3, IC3>0). If parallelism is supported they will execute in parallel.
MX – Multiple index scan. More than one index is used to access a table. It is an efficient access path when no single index is efficient and a combination of index provides efficient access.
R – Table space scan. This is the worst type of access as the entire table will be searched to process the query.

MATCHCOLS: The number of index columns matched on an index scan.

If it is 0 all index keys and RIDs are read.
If one of the matching predicates is a range there will be no more matching columns. Example for the index on T(IC1, IC2, IC3, IC4) for the following command the IC3 predicate won’t be used: Select * from T where IC1=1 and IC2 > 1 and IC3 = 1. The position of the columns in the index is used to decide that IC3 won’t be used.

INDEXONLY : If the columns needed for a SQL statement can be found in the index DB2 will not access the table. INDEXONLY performance is very high.

PREFETCH: Prefetching determines in advance if a set of data pages is about to be used and then reads the entire set into a buffer with a single asynchronous I/O operation.

S – Sequential prefetch: data pages read in advance are accessed sequentially. Table space scan always uses sequential prefetch.
L – List prefetch: one or more indexes are used to select the RIDs list in advance.
D =- Dynamic prefetch: the pages to be accessed will be non sequential.
Blank – Prefetch not expected.

SORTs: They add an extra step to the accessed data.

METHOD=3 – These sorts are used for ORDER BY, GROUP BY, SELECT DISTINCT or UNION.
SORTC_UNIQUE, SORTC_ORDERBY, SORTC_GROUP_BY – Indicates an extra sort for an UNIQUE, ORDER BY and GROUP BY clause.

In order to create necessary tables, it’s better to run

CALL SYSPROC.SYSINSTALLOBJECTS(‘EXPLAIN’,’C’,NULL,CURRENT SCHEMA)