DB2 explain output |
The original question I tried to answer was related to REOPT and a query like this:
select id, s
from betw
where id between ? and ?
Is it possible to force an application to always reoptimize a statement, so that depending on the given range either an index scan or table scan is used. DB2 has the concept of multi-phase statement compilation, optimization, and execution. The individual steps are briefly explained in the introduction to this article on improving query performance by using REOPT bind option. Using that option it is possible to have (complex) statements re-optimized before every execution, e.g., to address different values in input variables.
The mentioned statement concentrator is a means for reducing query compilation and optimization effort by replacing literals in similar statements by parameter markers. It can have lots of benefits, but also could cause suboptimal access plans if values are not distributed evenly.
For my tests I used a JDBC-based SQL GUI to send statements like "explain plan for select id, s from betw where id between 20 and 30" to DB2, changing the upper bound to much higher values. Without the statement concentrator turned on this resulted in access plans with either an index scan or table scan. Here is such an index-based access plan:
Access Plan:
-----------
Total Cost: 20.3338
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
11.996
FETCH
( 2)
20.3338
3
/---+----\
11.996 98304
IXSCAN TABLE: HLOESER
( 3) BETW
13.5606 Q1
2
|
98304
INDEX: HLOESER
BETWIX
Q1
With the statement concentrator turned on, all statements variations are transformed into a statement with variables. In the explain output the statements that have been rewritten by an active statement concentrator can be identified because of an extra section labeled "effective statement":
Effective Statement:
-------------------
select
id,
s
from
betw
where
id between :L0 and :L1
After getting this far I wanted to apply an optimization profile to cause the specific statement to be always reoptimized. Optimization profiles provide many ways for performance tuning on a, at least sometimes, low-level basis. They are less intrusive than database hints (see Oracle-style hints in DB2 :), can be turned on and off on different levels, and provide all kinds of options. The first step to use profiles is to prepare the DB2 tools catalog for them:
db2 "call sysinstallobjects('opt_profiles', 'c', '', '')"
An optimization profile is created by writing an XML document according to the XML schema found here. I created the following profile:
<?xml version="1.0" encoding="UTF-8"?>
<OPTPROFILE>
<STMTMATCH EXACT='FALSE'/>
<STMTPROFILE ID="REOPT betw">
<STMTKEY>
<![CDATA[select id,s from betw where id between :L0 and :L1 ]]>
</STMTKEY>
<OPTGUIDELINES>
<REOPT VALUE="ALWAYS" />
</OPTGUIDELINES>
</STMTPROFILE>
</OPTPROFILE>
The optimization profile consists of a single so-called statement profile which is labeled "REOPT betw". It is for our SQL statement and I decided to use the version as seen in the effective statement. As a guideline to the optimizer the "REOPT ALWAYS" is passed. What I included in the profile is the switch to turn on inexact statement matching. It causes DB2 to consider more variations of the SQL statement for optimization by our rule. Once the optimization profile is complete, it can be inserted into the catalog table. For that I created a file "betw.csv" with a profile name and reference to the XML file:
"HLOESER","PROFILE_BETW","betw.xml"
That file is then imported into DB2:
db2 "import from betw.csv of del modified by lobsinfile insert_update into systools.opt_profile"
Whenever profiles are changed, it is a good idea to flush the cache:
db2 flush optimization profile cache
Usage of an optimization profile can be enabled in different ways, e.g., in an application by setting a registry variable like this:
set current optimization profile="HLOESER"."PROFILE_BETW"
Once this is done there is feedback in the DB2 explain output on whether profiles are in use and if yes, which one. This could be in the form of diagnostic information pointing to errors as shown below or in a section "profile information" that gives details about the name of the profile and the statement identifier:
A problem is reported
Extended Diagnostic Information:
--------------------------------
Diagnostic Identifier: 1
Diagnostic Details: EXP0005W The optimization profile could not be
processed or found in the OPT_PROFILE table in the
SYSTOOLS schema.
Diagnostic Identifier: 2
Diagnostic Details: EXP0001W An unexpected error occurred processing
or applying an optimization profile or embedded
optimization guideline. Line number "0", character
number "0".
Success, the profile is in use
Profile Information:
--------------------
OPT_PROF: (Optimization Profile Name)
HLOESER.PROFILE_BETW
STMTPROF: (Statement Profile Name)
REOPT betw
So far, so good. An optimization profile has been created and depending on the which statement is executed by DB2, the explain output shows that my statement level guideline has been considered. To test whether the guideline is really applied, I executed several statements like "explain plan for select id, s from betw where id between 20 and x". X was in the range of 21 to 20000. With the statement concentrator on the explain output showed that all statements ended up in the following optimized statement.
SELECT
Q3.ID AS "ID",
Q3.S AS "S"
FROM
(SELECT
Q1.$C0
FROM
(VALUES
0) AS Q1
WHERE
(:L0 <= :L1 SELECTIVITY 1.000000)
) AS Q2,
HLOESER.BETW AS Q3
WHERE
(Q3.ID <= :L1) AND
(:L0 <= Q3.ID)
All statements had the same complex access plan:
Access Plan:
-----------
Total Cost: 20.469
Query Degree: 0
Rows
Rows Actual
RETURN
( 1)
Cost
I/O
|
167.205
19
NLJOIN
( 2)
20.469
NA
/------+-------\
1 167.205
NA 19
TBSCAN FETCH
( 3) ( 4)
0.000186812 20.4688
NA NA
| /---+----\
0 167.205 98304
NA 19 NA
TABFNC: SYSIBM IXSCAN TABLE: HLOESER
GENROW ( 5) BETW
Q1 13.6489 Q3
NA
|
98304
NA
INDEX: HLOESER
BETWIX
Q3
Did my guideline work? I wasn't sure so this called for deeper look at the optimization. Read on here for the second part of the journey.