Potential SQL Performance Improvements in DB2 11 for z/OS
By Daniel L Luksetich, IDUG DB2 11 White Paper Editorial Committee
As with any new release of DB2 for z/OS there comes a host of SQL performance improvements. Some of these you get straight away, some with a bind or next prepare and some require a change in the SQL syntax. There are several key performance changes in the area of query transformation, which will enable several types of predicate constructs that are stage 2 non-indexable predicates in previous versions to stage 1 indexable in DB2 11. This includes the potential index matching for predicates containing YEAR, DATE, and SUBSTR functions (beginning with column 1) against table columns. There is also improved index matching for some predicates containing “OR IS NULL” and some compound predicates containing IN and OR. One very exciting enhance is the ability for Db2 to push predicates into materialized view and table expressions, which has been a limiting factor in many application and database designs. DB2 11 extends the usage of hash join, a long time staple on DB2 for LUW, and has had very limited usage in prior releases of DB2 for z/OS.
Improved Unique/DISTINCT and GROUP BY processing comes for free in DB2 11 for z/OS
One feature that the IDUG team has been investigating is improvements to Unique/DISTINCT and GROUP BY processing. This is an enhancement that will be realized the next time a statement is compiled (prepare or bind) after migrating to DB2 11, and the improvements can be significant, but it’s not something that is easily detectable by the average developer or DBA.
Certain SQL statements will be impacted, such as:
- SELECT DISTINCT
- GROUP BY
- Single MAX or MIN aggregate function
- GROUP BY with a single MAX or MIN function
- Certain non-correlated subqueries (not transformed)
The way this is going to work for DISTINCT is that prior to DB2 11 the duplicate values from non-unique indexes (if a non-unique index was used and provides order) were read by the index manager and passed to sort where sort did not actually sort, but instead eliminated the duplicate values. Now, with DB2 11 those duplicate values will be removed by the index manager (again, if a non-unique index that provides order is used) and fewer values passed on in the query processing. Similarly for GROUP BY prior to DB2 11, it was the DB2 runtime component that removed the duplicate entries if the index was scanned in order and sort was not required. In DB2 11, index manager will skip over the duplicates and avoid passing them to later stages of query processing. These cases can change the cost analysis within the optimizer, which can influence things such as index selection and table access sequence. So, access paths can change as a result of this enhancement.
How do you know when this is happening? Well, it’s not so obvious since for most situations it will not show up in a PLAN_TABLE (it may for non-correlated subqueries, but I haven’t found a good example yet in my testing). The evidence is actually in the explain table called DSN_DETCOST_TABLE. In versions prior to DB2 11 the sort was not exposed in the PLAN_TABLE, but for DISTINCT, there was a row in the DSN_DETCOST_TABLE representing the cost of sort removing the duplications. For GROUP BY with sort avoided, there wasn’t a row in any explain table, since the sort was avoided. With DB2 11 that extra cost (and row in DSN_DETCOST_TABLE) is removed from the EXPLAIN output. In its place will be a value set in a new column of DSN_DETCOST_TABLE called IXSCAN_SKIP_DUPS with a value of “Y” indicating that this new processing will occur.
A simple test of the following query against the DB2 sample table EMP proves the change if the non-unique index on the WORKDEPT column is used for the access path:
SELECT DISTINCT WORKDEPT from EMP;
In the case of the query EXPLAINed under DB2 10 the DSN_DETCOST_TABLE explain table contains two rows, one row representing the cost associated with passing all the qualifying rows from the index into the sort process at run time. Notice the number of rows processed is 42, which is the number of entries in the index.
SCCOLS SCROWS SCRECSZ SCPAGES
1 4.200000E+01 4 3.307086E-01
Under DB2 11 there is only one row in DSN_DETCOST_TABLE reflecting the number of rows processed as 1 and indicating index skipping has occurred. All “SC” columns are set to zero as no runtime sort processing will be invoked.
As mentioned before this change in cost will have an effect on access path selection, and of course it will result in a reduction in CPU consumption for impacted types of SQL constructs.
For more information on this subject, please look out for the full IDUG DB2 11 Technical White Paper which will be published later in 2013.