DB2 11.1 Performance – A Query Optimization Perspective
By now, most DB2 for Linux, UNIX and Windows users have probably heard about the release of DB2 11.1 that became generally available in June of this year. If not, you can read some good blogs about it on the IDUG website, especially these ones:
DB2 for LUW V11.1 Coming Soon to a Server Near You
Scaling DB2 BLU Horizontally with DB2 for LUW V11.1
DB2 for LUW V11.1 BLU Performance
In this article I’d like to provide more information on some performance optimizations in DB2 11.1 that will benefit queries accessing both row and column-organized (BLU) tables. These performance optimizations help queries that contain correlated sub-selects that can be very expensive to execute. Let’s look at a specific example:
UPDATE CUST C SET LOGIN =
(SELECT C_LOGIN FROM CUST_STAGING CS
WHERE C.CUST_ID = CS.CUST_ID)
SELECT 1 FROM CUST_STAGING CS2
WHERE C.CUST_ID = CS2.CUST_ID);
This UPDATE statement includes a common pattern where a staging table contains changes to be applied to a target table. The SET clause contains a scalar subquery that locates the new value of the column based on a matching column (CUST_ID in this example). Since multiple rows are being updated, an EXISTS subquery predicate is included to ensure that only rows in the target table that have matching keys in the source table are updated. Prior to DB2 11.1, the optimizer would choose to execute this statement by first joining the CUST and CUST_STAGING tables to determine the CUST_IDs of the CUST rows that should be updated. The plan would then probe the CUST_STAGING table for each CUST_ID to retrieve the source LOGIN value. If more than one row for a given CUST_ID value existed in CUST_STAGING, an SQLCODE -811 would be returned, indicating that the scalar subquery produced more than one row. The first step in query DB2 query optimization is to rewrite the query to an internal form which looks like this:
You can see the full SQL text for the internal, or transformed, version of the SQL by checking the “optimized” SQL shown in the explain facility.
The next step in query optimization is generate the access plan, which specifies the runtime operations that are to be used to execute the query. The access plan corresponding to this optimized SQL is:
The access plan above is produced by the db2exfmt command line tool. It contains the same information as the Data Server Manager (DSM) Visual Explain, except in a plain text form. The number above each operator name is an estimate of the number of rows processed by the operator. The CTQ, or Column-organized Table Queue, operator transfers data between column and row-organized processing. Operators below the CTQ process column-organized data using one set of DB2 sub-agents. The CTQ converts the columnar data to row format and passes it to a different set of DB2 sub-agents. DB2 sub-agents can be working on both sides of the CTQ at the same time in parallel, allowing multiple processors cores to be used to improve query performance. In this particular example, 2 CTQs are necessary because of the correlated sub-select. The EXISTS subquery is implemented as a hash join (HSJOIN) below CTQ (7) and the scalar-sub-select is implemented as a table scan (TBSCAN) below CTQ (11). CTQ (11) is executed for every row returned by CTQ (7), so about 12 million times. As you can imagine, this is very expensive. Another performance issue is that SORT (5) is done using row-organized data. DB2 11.1 can now sort data in columnar format, so this is another missed performance opportunity.
The correlated sub-select can also be expensive to execute when row-organized tables are accessed. The access plan for this same example using row-organized tables shows a similar pattern. Because of the correlated access, TBSCAN (9) on the inner of NLJOIN (7) is executed 12 million times. An index could be created on CUSTOMER_STAGING_R.CUST_ID to improve performance, but this will still be an expensive operation.
DB2 11.1 uses a different query transformation for this same UPDATE statement and the optimized SQL looks like this:
There are a couple of important differences. One is that the correlated sub-select has been replaced by a join, so now all 3 table accesses are joined in the same sub-select. The second is that a GROUP BY operation with a COUNT has been added in order to count the number of duplicate rows returned from the CUST table. If each probe of the scalar sub-select returns at most 1 row, no duplicate CUST rows should be returned. If some duplicate rows are returned, then an SQLCODE -811 should be raised. The count for each CUST row id is checked in a CASE statement in the next sub-select which uses a RAISE_ERROR function to return the -811 error if the count is greater than 1.
Let’s look at the access plan for this new version of the optimized SQL:
The first thing to notice is that there is only 1 CTQ and all major operators occur below it, which means that all processing is done on columnar data. The second observation is that the expensive correlation is gone. Additionally, the SORT operator used in the previous version of the plan is replaced with a UNIQUE operator in order to remove duplicates returned by the joins. DB2 BLU processing implements duplicate removal using an efficient hashing operation so sorting isn’t necessary. Finally, an extra GRPBY operator is added to count the number of duplicate rows returned from the CUSTOMER table in order to determine if an SQLCODE -811 should be raised. The additional overhead to execute this GRPBY is more than offset by the savings from avoiding the expensive correlated scans of CUSTOMER_STAGING.
The DB2 11.1 access plan for the row-organized version of this example shows a similar improvement. The expensive correlation is removed and an extra GRPBY operator is introduced to catch the possibility of more than one row being returned by the scalar sub-select.
This decorrelation optimization can be used in other types of SQL statements too. For example:
SELECT CUST_ID, LAST_NAME,
(SELECT CITY FROM CUST_ADDR CA WHERE
C.ADDR_ID = CA.ADDR_ID)
FROM CUST C;
This SQL statement contains a correlated scalar subquery in the SELECT list. Prior to DB2 11.1, this also resulted in an expensive correlated access plan. Again, let’s compare the old and new access plans:
Prior to DB2 11.1, all the rows from the outer table (CUSTOMER) needed to be converted from column to row format and then CTQ (5) needed to be executed 12 million times to compute the scalar subquery for each outer row. The DB2 11.1 access plan is much more efficient:
Notice how there is only 1 CTQ and it is above the access to both tables. This is because the scalar subquery has been converted to a join. As in the previous example, a GRPBY operator is introduced to detect duplicate rows produced by the scalar subquery in order to log an SQLCODE -811.
It is also worth noting that the expensive correlation could have been avoided to begin with had a primary or unique key constraint been defined on the columns used to locate the matching rows in the scalar subquery. Specifically, this would’ve been the CUSTOMER_STAGING.CUST_ID column in the first example and the CUSTOMER_ADDRESS.ADDR_ID column in the second example. Although indexes can’t be created on DB2 column-organized tables, it is possible to create enforced primary and unique key constraints which will result in a unique index being implicitly created. However, it isn’t always possible to add these types of constraints because of application or schema design limitations, which is why these query transformations were introduced.
These optimizations are just a few of the performance improvements available in DB2 11.1. Give it a try and see for yourself!