How to Avoid Dramatic Access Path Changes in DB2 for z/OS Dynamic SQL (or at Least Fix Them Quickly)

By Kurt Struyf, Lone Star Consulting

This article is based on a recent tuning engagement at a customer site and describes the countermeasures that were enacted to restore system performance. Those countermeasures include a daily process to establish an access path baseline and sample jobs leading up to a “BIND QUERY”.

The Problem: Access Path Shift

After a simple RUNSTATS, some boundaries were crossed (e.g. a table grew over a certain number of rows), changing the access path of two dynamic SQL statements. These statements became significantly more expensive even though each statement by itself still ran relatively fast. Most importantly, both statements were executed very often. Despite their short duration, their high priority workload class and new (more expensive) access path caused them to pretty much eat up all the CPU on the customer’s machine.

Dynamic SQL statements can remain in the dynamic statement cache for a long period of time. When the statement is refreshed, its access path could change dramatically. This article shows one way of limiting the impact of such an access path shift.

It’s important to understand which types of problematic queries are best suited for this approach. This process does not intend to protect the system from a dynamic statement that runs once every so often and consumes “too much” CPU (the Resource Limit Facility and other tools can handle that already). The ideal queries for this project are those that suddenly consume 200%, 300%, or even 1,000% more CPU time but still run relatively fast. The overall consumption of this type of query becomes problematic because it runs frequently and now uses more CPU time per execution.

Daily Process

Step 1: Once a day (after the morning CPU peak), we run an EXPLAIN of the entire dynamic statement cache. This fills up the DSN_STATEMENT_CACHE_TABLE, it however doesn’t fill up the PLAN_TABLE that gives an overview of the access strategy, chosen by DB2.

EXPLAIN STMTCACHE ALL;

The EXPLAIN statement stores its output in the DSN_STATEMENT_CACHE_TABLE.

SELECT * FROM "KURT"."DSN_STATEMENT_CACHE_TABLE";

The actual output contains many more columns than what is shown below.

-------------------------------------------------------------------------------------------------------------------------
STMT_ID STMT_TOKEN COLLID             PROGRAM_NAME INV_DROPALT INV_REVOKE INV_LRU INV_RUNSTATS CACHED_TS                
------- ---------- ------------------ ------------ ----------- ---------- ------- ------------ --------------------------
     30 ?          DSNDYNAMICSQLCACHE DSNREXX      N           N          N       N            2014-11-16-11.01.03.029640
     78 ?          DSNDYNAMICSQLCACHE SYSLH200     N           N          N       N            2014-11-17-09.46.00.905866
     18 ?          DSNDYNAMICSQLCACHE DSNREXX      N           N          N       N            2014-11-16-11.01.02.947494
     72 ?          DSNDYNAMICSQLCACHE SYSLH200     N           N          N       N            2014-11-17-09.46.00.431261
     13 ?          DSNDYNAMICSQLCACHE DSNREXX      N           N          N       N            2014-11-16-09.27.25.959934
     88 ?          DSNDYNAMICSQLCACHE ADBMAIN      N           N          N       N            2014-11-17-09.56.37.979615
     42 ?          DSNDYNAMICSQLCACHE DSNREXX      N           N          N       N            2014-11-16-14.07.59.505195
     23 ?          DSNDYNAMICSQLCACHE DSNREXX      N           N          N       N            2014-11-16-11.01.02.977857
     44 ?          DSNDYNAMICSQLCACHE EDBAB017     N           N          N       N            2014-11-16-16.02.27.109151
     32 ?          DSNDYNAMICSQLCACHE DSNREXX      N           N          N       N            2014-11-16-11.01.03.064246
     76 ?          DSNDYNAMICSQLCACHE SYSLH200     N           N          N       N            2014-11-17-09.46.00.880061
      4 ?          DSNDYNAMICSQLCACHE SYSLH200     N           N          N       N            2014-11-15-22.24.01.421931
     62 ?          DSNDYNAMICSQLCACHE SYSLH200     N           N          N       N            2014-11-17-09.23.54.458424
     20 ?          DSNDYNAMICSQLCACHE DSNREXX      N           N          N       N            2014-11-16-11.01.02.970022
     56 ?          DSNDYNAMICSQLCACHE SYSLH200     N           N          N       N            2014-11-17-09.18.59.240636
     66 ?          DSNDYNAMICSQLCACHE SYSLH200     N           N          N       N            2014-11-17-09.35.39.778570
     48 ?          DSNDYNAMICSQLCACHE SYSLH200     N           N          N       N            2014-11-17-05.39.03.051802
     37 ?          DSNDYNAMICSQLCACHE DSNREXX      N           N          N       N            2014-11-16-11.01.03.102593
      6 ?          DSNDYNAMICSQLCACHE SYSLH200     N           N          N       N            2014-11-15-23.15.06.448847

Step 2: Right after Step 1, generate individual EXPLAIN statements for all the STMT_IDs in the DSN_STATEMENT_CACHE_TABLE that were cached on that day. Each EXPLAIN statement will write out the access path of a specific query to the PLAN_TABLE.

The generated EXPLAIN statement looks like this:

EXPLAIN STMTCACHE STMTID 76;

A REXX will generate the individual EXPLAIN statements based on the following SELECT:

SELECT 'EXPLAIN STMTCACHE STMTID ' || CHAR(STMT_ID) || ' ;'
FROM "KURT"."DSN_STATEMENT_CACHE_TABLE"
WHERE DATE(CACHED_TS) = CURRENT DATE;

Use a DSNTIAD with PARMS(RC0), as some of the cached statements might no longer exist in the dynamic statement cache by the time they are explicitly explained.

//SELDB2T EXEC PGM=IKJEFT01,DYNAMNBR=25,REGION=6144K      
//STEPLIB  DD  DSN=DB2.SDSNLOAD.DB2T,DISP=SHR             
//SYSEXEC  DD  DISP=SHR,DSN=COMP.KST.SYSTEM.REXX         
//SYSTSPRT DD  SYSOUT=*                                   
//SELDB2T.DD010 DD   DSN=COMP.KST.WORK.EXPLAIN.DB2T,     
//             DISP=(,CATLG,CATLG), UNIT=SYSDA, SPACE=(CYL,(10,10),RLSE),
//             RECFM=FB,LRECL=2000,BLKSIZE=32000          
//SYSTSIN DD *                                            
  EXPLSEL DB2T KURT                                    
/*                                                        
//*    EXPLAIN STATEMENT CACHE PER STMTID                 
//EXPLAINP EXEC PGM=IKJEFT01,DYNAMNBR=20                  
//SYSTSPRT DD  SYSOUT=*                                   
//SYSPRINT DD  SYSOUT=*                                   
//SYSUDUMP DD  SYSOUT=*                                   
//SYSTSIN  DD  *                                          
  DSN SYSTEM(DB2T)                                        
  RUN PROGRAM(DSNTIAD)  PLAN(DSNTIAD) PARM('RC0') -      
       LIB('DB2.RUNLIB.LOAD.DB2T')                        
  END                                                     
//SYSIN    DD  DISP=SHR,DSN=COMP.KST.WORK.EXPLAIN.DB2T

This daily process will establish a baseline for future comparison, making it an especially useful reference point for frequently executed SQL statements.

When a Performance Problem Arises

Step 1: Using the SQL_TEXT returned by DB2, identify the statement causing the problem. For demonstration purposes, suppose the problematic SQL statement is:

SELECT FOREIGNKEY FROM SYSIBM.SYSCOLUMNS
WHERE NAME = 'CREATESTMT'
AND TBNAME = 'SYSROUTINES_SRC' AND TBCREATOR = 'SYSIBM';

Determine if this statement was previously known in the dynamic statement cache and, by extension, in DSN_STATEMENT_CACHE_TABLE. If it shows up in DSN_STATEMENT_CACHE_TABLE, compare its current access path to the access path you captured back when the statement still ran efficiently. If their CPU costs differ significantly (an access path shift), continue to Step 2.

SELECT *
FROM "KURT"."DSN_STATEMENT_CACHE_TABLE"
WHERE STMT_TEXT LIKE 'SELECT FOREIGNKEY FROM SYSIBM%';

Be careful, this statement could return lots of rows; more WHERE clause predicates might be needed to narrow the scope.

Step 2: Determine if OPTHINTS=YES in the active DSNZPARM. If not, set it to YES and assemble and activate the DSNZPARM.

Step 3: To influence the optimizer to use the a different access path for the problematic query, copy the desired access path into DSN_USERQUERY_TABLE via an INSERT statement. Pay attention that all involved tables reference the appropriate schema. Since multiple access path copies might exist in DSN_STATEMENT_CACHE_TABLE for the same STMT_TEXT, make sure you identify the correct STMT_ID as well as the EXPLAIN timestamp of the access path you want to copy.

INSERT INTO "KURT"."DSN_USERQUERY_TABLE" (
QUERYNO,
QUERY_TEXT,
HINT_SCOPE
)
SELECT
STMT_ID,
STMT_TEXT,
0           --> hint_scope 0 indicates a system-wide hint
FROM "KURT"."DSN_STATEMENT_CACHE_TABLE"
WHERE STMT_ID = 50     --> i.e. stmt_id 73 is today's bad access path, stmt_id 50 has the prior/"good" path
AND EXPLAIN_TS = '2014-11-17-10.00.49.710000' --> Explain timestamp of the good/wanted access path
;

The INSERT statement shown above specifies a system-wide optimization hint that will be applied to every invocation of the query, regardless of how it is executed (static, dynamic, Application A or Application B), as long as the statement text is the same.

Step 4: Issue a BIND QUERY command. You must omit the LOOKUP option or specify LOOKUP(NO). BIND QUERY creates optimization hints for every row in DSN_USERQUERY_TABLE (and its related tables), storing the final objects in the system catalog (SYSIBM.SYSQUERY and SYSIBM.SYSQUERYPLAN). Because BIND QUERY only works with the entire DSN_USERQUERY_TABLE and not a specific query, we'll want to revisit this table later for cleanup. 

BIND QUERY LOOKUP(NO)

Expected output :

BIND QUERY LOOKUP(NO)                             
DSNT280I  DB2T BIND QUERY FOR QUERYNO = 50 SUCCESSFUL
DSNT290I  DB2T BIND QUERY COMMAND COMPLETED

Step5 (optional) : Verify the row(s) where inserted into SYSIBM.SYSQUERY and SYSIBM.SYSQUERYPLAN.

First, SYSIBM.SYSQUERY:

--------------------------------------------------------------------------
QUERYID QUERY_HASH       SCHEMA QUERY_SEC_HASH   QUERY_HASH_VERSION SOURCE
------- ---------------- ------ ---------------- ------------------ ------
    21 .._.Â.#`....Á#.â        ..îç......ÈÁ¦%Â.                  1      0

Note that the QUERYID in SYSIBM.SYSQUERY will be different than what you inserted into QUERYNO, as QUERYID is as an identity column that DB2 generates by default. The QUERYID will, however, match the hint’s associated rows in SYSIBM.SYSQUERYPLAN (shown below).

---------------------------------------------------------------------------------------------------------
QUERYID COPYID PLAN_VALID IBMREQD QBLOCKNO PLANNO METHOD CREATOR TNAME         TABNO ACCESSTYPE MATCHCOLS
------- ------ ---------- ------- -------- ------ ------ ------- ------------ ------ ---------- ---------
   21      0 Y          N              1      1      0 SYSIBM  SYSDATATYPES      1 R                  0

Step 6 (optional): Run the original SQL statement (through DSNTEP2 or a similar utility) to verify that the optimization hint was successfully applied. If it was, you should get SQLCODE +394 as a warning :

***INPUT STATEMENT:
   SELECT FOREIGNKEY FROM SYSIBM . SYSCOLUMNS
   WHERE NAME = 'CREATESTMT'
   AND TBNAME = 'SYSROUTINES_SRC' AND TBCREATOR = 'SYSIBM'
SQLWARNING ON SELECT    COMMAND, PREPARE   FUNCTION
 RESULT OF SQL STATEMENT:
 DSNT404I SQLCODE = 394, WARNING:  USER SPECIFIED OPTIMIZATION HINTS USED DURING ACCESS PATH SELECTION

Step 7: To avoid accidentally reusing the same optimization hint during next BIND QUERY statement, remove all rows from the DSN_USERYQUERY_TABLE. This won’t affect your OPTHINT, which has already been deployed to the system catalog tables (SYSIBM.SYSQUERY and SYSIBM.SYSQUERYPLAN).

DELETE FROM "KURT"."DSN_USERQUERY_TABLE";

Deleting a specific QUERYNO is possible as well.

Step 8 (optional): When the performance problem is solved and the optimization hint is no longer needed, remove the OPTHINT.

FREE QUERY QUERYID(xx)

Replace the xx with the hint’s QUERYID from SYSIBM.SYSQUERY.

Be very careful! Performing a FREE QUERY without a QUERYID clause will free all OPTHINTS at once. This can have a major impact on performance and should be avoided.

Step 9 (optional): Applications that use the statements that are being “helped” by an OPTHINT have to be able to deal with positive SQLCODEs (e.g. SQLCODE +394 issued during the PREPARE) as a warning that an optimization hint is being used.

If error handling is done simply by testing for SQLCODE <> 0, these applications will now fail. Rewriting them to deal with these specific warnings is the ideal solution, but if that is not feasible, consider using DSNZPARM SUPPRESS_HINT_SQLCODE_DYN.

You can suppress SQLCODEs +394 and +395 for dynamic SQL statements by enabling the SUPPRESS_HINT_SQLCODE_DYN subsystem parameter. Be aware this is a system wide parameter and will immediately take effect for all statements.

Conclusion

The process/way of working described in this article is by no means tested under all conditions and should be applied with caution. It demonstrates a strategy that did the trick for a particular customer in this scenario.You will need to run your own testing during implementation and devise your own validation criteria.

As always, preparation is half the battle, and understanding the current access paths of your most frequently run dynamic queries is a helpful starting point for any SQL performance tuning endeavor.

Recent Stories
Some of my favorite tunings tips and tricks for DB2 by Joe Geller

SQL Tuning for Developers – Part 1

The Key To Mastering Db2 Locking On Row-Store Tables