Predicate Re-write on DB2 11 for z/OS

I’ve spent the last 17 years teaching SQL Performance & Tuning on DB2 for z/OS.  I’ve been trying to get SQL developers to think about the predicates they write from DB2’s perspective instead of what makes sense to humans.  The predicate:
      DATE(TS_COL) = ‘2016-11-01’

This makes perfect sense to us humans, but has been historically difficult for DB2 to interpret.  On DB2 for z/OS we have referred to this as a “stage 2 non-indexable predicate.”  In practical terms this means that DB2 will need to run the DATE function for every row in order to determine if the condition is true. The use of an index to match the column is impossible.  A better way to write predicates like this one would be a direct comparison of the column to values:
     TS_COL BETWEEN ‘2016-11-01-00.00.00.000000’ AND ‘2016-11-01-24.00.00.000000’

This method is not as intuitive, but turns out to be much easier for DB2 optimization.

Subtle Impacts

Consider a table (T1) that includes an index on C1, C2, C3 and a query like this:

SELECT …
FROM T1
WHERE C1 = ?
     AND C2 = ?
     AND DATE(C3) = ‘2016-11-01’

In the past, DB2 would be able to use the index and match 2 columns.  If the third predicate had been coded differently a third column could be matched.  Since an index is used and multiple columns matched, performance is probably not terrible, but could be better.

DB2 11 Re-Write

DB2 11 for z/OS will automatically re-write the predicate above prior to optimization.  For statements bound or prepared on DB2 11, the query above may match 3 columns in the index without any coding changes.  This change will also allow DB2 to consider other indexes that include C3 that may not have been desirable or possible before.

Other re-writes include:

YEAR(DATE_COL) = 2016
becomes            
DATE_COL BETWEEN ‘2016-01-01’ AND ‘2016-12-31’

SUBSTR(CHAR_COL,1,1) = ‘S’        
becomes            
CHAR_COL BETWEEN ‘S’ AND ‘S’ *   

value BETWEEN C1 AND C2          
becomes            
C1 <= value AND C2 >= value

Access Path Stability on DB2 11

The re-writes described above are a great reason to rebind applications once you migrate to DB2 11.  There are many opportunities to improve access paths without any coding changes.  Beginning with DB2 10 you could use APREUSE on the REBIND to encourage DB2 to reuse the access paths from the previous BIND or REBIND.  We discuss this feature more here.  If you make use of APREUSE(ERROR) and DB2 re-writes your predicate, you are very likely to fail.  A new option, APREUSE(WARN) will not treat an increase in the number of matched columns for an index as a difference.  This will allow you to achieve a certain amount of stability while still allowing some of the positive changes described above.

It will be good, however, to REBIND applications without any stability options at some point after migrating to DB2 11.

Conclusion

While it is always best to code the “right” way, DB2 11 will bail us out of some common mistakes that hurt performance.  For those more familiar with DB2 for LUW, these rewrites and others have been part of that product for quite some time.

David Simpson
Themis Education
dsimpson@themisinc.com

 

*The SUBSTR function will be transformed to a BETWEEN only if the second argument is a “1”.  DB2 shows the transformation as shown, however, there is actually a low values and high values appended to the endpoints.

1 Like
Recent Stories
Some of my favorite tunings tips and tricks for DB2 by Joe Geller

SQL Tuning for Developers – Part 1

The Key To Mastering Db2 Locking On Row-Store Tables