Today I wanted to try out using DB2 optimization profiles for a statement impacted by the statement concentrator. In part 1 I gave the background, showed how I created an optimization profile and that a simple explain statement didn't return what I expected. In this final part I am going to look at DB2 section actuals to hopefully proof that my optimization guideline works as I had hoped.
Because all my "explain plan for select ..." statements resulted in the same access plan, I decided to use session actuals to look at how statements are really executed within DB2. The actuals are kind of a live log of the real statement execution costs and the applied access plan. The first step towards session actuals is to have a workload and an activity event monitor and to switch the monitor on:
db2 "create workload betw applname('python') collect activity data with details,section"
db2 "create event monitor betwmon for activities write to table"
db2 "set event monitor betwmon state 1"
The above statements create a workload which collects section data. The application name (APPLNAME) is "python" because I use a Python script (see below) for parts of the testing.
Script:
import ibm_db
conn = ibm_db.connect("hltest","hloeser","secretpw")
ibm_db.exec_immediate(conn, 'set current optimization profile="HLOESER"."PROFILE_BETW"')
ibm_db.exec_immediate(conn, 'select id, s from betw where id between 2 and 20')
Once the script has been executed, the session actuals can be turned into explain information and then that information formatted. To identify which data entry to transform a query like the following one can be used. The table name consists of the generic prefix "ACTIVITYSTMT" and the monitor name:
select appl_id,uow_id,activity_id,stmt_text from ACTIVITYSTMT_BETWMON
The output then serves as input to the "explain_from_activity" procedure:
CALL EXPLAIN_FROM_ACTIVITY('*LOCAL.hloeser.150227075721',1,2, 'BETWMON',NULL,?,?,?,?,?)
The procedure takes the logged execution statistics for the section and turns it into regular explain data. The explain data can be stored in an user copy of the DB2 explain tables. The tool db2exfmt is then used to format the explain data to typical DB2 explain output. To identify what data to format the procedure above returns the needed information, something like this:
Value of output parameters
--------------------------
Parameter Name : EXPLAIN_SCHEMA
Parameter Value : HLEXP
Parameter Name : EXPLAIN_REQUESTER
Parameter Value : HLOESER
Parameter Name : EXPLAIN_TIME
Parameter Value : 2015-02-27-11.55.21.008920
Parameter Name : SOURCE_NAME
Parameter Value : SYSSH200
Parameter Name : SOURCE_SCHEMA
Parameter Value : NULLID
Parameter Name : SOURCE_VERSION
Parameter Value :
Return Status = 0
Calling db2exfmt with that information as input:
db2exfmt -d hltest -w 2015-02-27-11.55.21.008920 -n SYSSH200 -s NULLID -e HLEXP
Long story short, with all the setup I managed to get different access plans and costs for the query with the BETWEEN predicate, depending on the range of values. It was either an index or a table scan. db2exfmt also showed a "Explain from section", the use of my statement-level optimization guideline as shown earlier, and that the statement concentrator still was on (see "effective statement").
Here the two access plans as result from the REOPT ALWAYS in the optimization profile:
Access Plan:
-----------
Total Cost: 590.416
Query Degree: 0
Rows
Rows Actual
RETURN
( 1)
Cost
I/O
|
98302
4850
TBSCAN
( 2)
590.416
NA
|
98304
NA
TABLE: HLOESER
BETW
Q1
Access Plan:
-----------
Total Cost: 20.3399
Query Degree: 0
Rows
Rows Actual
RETURN
( 1)
Cost
I/O
|
18.9981
19
FETCH
( 2)
20.3399
NA
/---+----\
18.9981 98304
19 NA
IXSCAN TABLE: HLOESER
( 3) BETW
13.5646 Q1
NA
|
98304
NA
INDEX: HLOESER
BETWIX
Q1
As a last test I removed the line from my script which sets the current optimization profile. With that all statements turn back into the generic and complex access plan:
Access Plan:
-----------
Total Cost: 20.469
Query Degree: 0
Rows
Rows Actual
RETURN
( 1)
Cost
I/O
|
167.205
19
NLJOIN
( 2)
20.469
NA
/------+-------\
1 167.205
NA 19
TBSCAN FETCH
( 3) ( 4)
0.000186812 20.4688
NA NA
| /---+----\
0 167.205 98304
NA 19 NA
TABFNC: SYSIBM IXSCAN TABLE: HLOESER
GENROW ( 5) BETW
Q1 13.6489 Q3
NA
|
98304
NA
INDEX: HLOESER
BETWIX
Q3
So what is the conclusion? Applying the statement concentrator turns similar statements into a generic one to cut down on query compilation and optimization costs. This is useful in many cases. If queries use predicates like the BETWEEN it can result in suboptimal plans. In such a case a DB2 optimization profile might be useful. But how to proof it? It turns out that session actuals are very useful to dig deeper into what is happening within the database engine. With few steps it is possible to look at how a statement is really executed. And sometimes a database engine does exactly as the user is hoping for... ;-)
Update on 02-Jul-2015: Some additional information can be found in part 3 of this journey.