My favorite performance items in DB2 LUW v9.7 - #4: statement concentrator

The problem

When processing more complex SQL statements, DB2 can often exploit literal values in the statement to produce a more efficient access plan.  But for simple (and particularly repetitive) SQL, the scope for this kind of optimization is minimal, and the use of literals can add both CPU and memory overhead.  Consider this fictional series of statements:

SELECT c1 FROM t WHERE c2 = 1
SELECT c1 FROM t WHERE c2 = 2
SELECT c1 FROM t WHERE c2 = 3
:

Even though they differ only in the literal being compared, DB2 will compile each of them separately, and store each of them separately in the dynamic SQL cache.   Imagine how much CPU and memory could be consumed by 100 or 1000 or 10000 similar statements!   This kind of situation often comes up when SQL statements are built on the fly, by concatenating SQL keywords and literal values (see the sketch below).

Note that I use SELECTs in the above example, but probably more common are bulk INSERTs, with literals used in the VALUES clause.  The same issue of compilation & memory overhead exists here too:

INSERT INTO t (c1, c2) VALUES (1,1)
INSERT INTO t (c1, c2) VALUES (1,2)
:
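
The on-the-fly concatenation mentioned above often looks something like the following.  This is only a hypothetical sketch (the procedure name is made up, and the table t matches the examples above); the point is that each loop iteration produces a brand-new statement text, so DB2 compiles and caches every one of them separately:

-- Hypothetical sketch: every iteration builds a distinct statement text,
-- so each INSERT is compiled and cached on its own
CREATE PROCEDURE insert_with_literals ()
LANGUAGE SQL
BEGIN
  DECLARE i INTEGER DEFAULT 1;
  WHILE i <= 1000 DO
    EXECUTE IMMEDIATE 'INSERT INTO t (c1, c2) VALUES (1, ' || RTRIM(CHAR(i)) || ')';
    SET i = i + 1;
  END WHILE;
END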

The preferred way to handle this is to prepare a single statement using a parameter marker, and execute it multiple times, passing in the required sequence of values:

SELECT c1 FROM t WHERE c2 = ?
INSERT INTO t (c1, c2) VALUES (?,?)

This results in only a single SQL compilation, and a single statement in the dynamic SQL cache, for each statement with parameter markers - huge savings!   However, if you've got an application using literals inefficiently, it might not be easy, or even possible, to change it to use parameter markers.
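
For comparison, here is roughly what the prepare-once / execute-many pattern looks like in SQL PL - again just a sketch with made-up names, and the same idea applies equally in CLI/ODBC, JDBC, embedded SQL, and so on:

-- Hypothetical sketch: one PREPARE, many EXECUTEs; only a single
-- entry ends up in the dynamic SQL cache for this INSERT
CREATE PROCEDURE insert_with_markers ()
LANGUAGE SQL
BEGIN
  DECLARE i INTEGER DEFAULT 1;
  DECLARE v_c1 INTEGER DEFAULT 1;
  DECLARE v_sql VARCHAR(100) DEFAULT 'INSERT INTO t (c1, c2) VALUES (?, ?)';
  PREPARE s1 FROM v_sql;
  WHILE i <= 1000 DO
    EXECUTE s1 USING v_c1, i;
    SET i = i + 1;
  END WHILE;
END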

Enter the v9.7 statement concentrator

The statement concentrator is enabled at the database level by setting the configuration parameter STMT_CONC to LITERALS.  It can also be explicitly enabled or disabled at the connection and statement levels for CLI/ODBC applications.
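
For example, enabling it database-wide from the command line looks like this (assuming a database named MYDB):

-- Turn the statement concentrator on for the whole database...
UPDATE DB CFG FOR mydb USING STMT_CONC LITERALS

-- ...and back off again
UPDATE DB CFG FOR mydb USING STMT_CONC OFF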

When enabled, the statement concentrator essentially looks for statements which differ only in their literal values.  When it finds them, it uses a modified statement text containing a host variable (similar in behavior to a parameter marker) in place of each literal, so that all of the matching literal-laden statements can transparently share a single compiled statement.   So our series of literal-heavy SQL above would collapse to one version of each statement in the dynamic SQL cache.  In fact, we can see the effect of this by looking at the STMT_TEXT column of MON_GET_PKG_CACHE_STMT, which would show something like the following for our test statements:

SELECT c1 FROM t WHERE c2 = :L0
INSERT INTO t (c1, c2) VALUES (:L0, :L1)
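
For reference, a query along these lines will show the cached statement texts and how busy each one is - 'D' asks for dynamic SQL, and -2 means all database members:

-- Browse the dynamic statements in the package cache, busiest first
SELECT NUM_EXECUTIONS, STMT_TEXT
  FROM TABLE(MON_GET_PKG_CACHE_STMT('D', NULL, NULL, -2)) AS stmts
  ORDER BY NUM_EXECUTIONS DESC
  FETCH FIRST 20 ROWS ONLY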

Interestingly, if we use the older-style GET SNAPSHOT FOR DYNAMIC SQL instead of MON_GET_PKG_CACHE_STMT, we'll see entries for both the original versions of the statements with the literals, and the modified versions produced by statement concentrator.

Rules of thumb for using statement concentrator

How much benefit can we expect from statement concentrator?   Like most performance questions, the answer is 'it depends'.  If your system is weighed down by literal-heavy 'almost duplicate' simple statements, the benefit can easily be 10-20% or more.  The best way to find out is to look at the statements in MON_GET_PKG_CACHE_STMT - the more 'almost duplicates' there are, the more statement concentrator is likely to help.
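
One quick (and admittedly rough) way to spot them is to group the cache entries by a prefix of their statement text; literal-only variants of the same statement will generally share the same leading characters:

-- Rough heuristic: many entries sharing a leading prefix usually means
-- the same statement repeated with different literals
SELECT CAST(SUBSTR(STMT_TEXT, 1, 60) AS VARCHAR(60)) AS stmt_prefix,
       COUNT(*) AS num_variants
  FROM TABLE(MON_GET_PKG_CACHE_STMT('D', NULL, NULL, -2)) AS stmts
  GROUP BY CAST(SUBSTR(STMT_TEXT, 1, 60) AS VARCHAR(60))
  ORDER BY num_variants DESC
  FETCH FIRST 20 ROWS ONLY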

Statement concentrator currently defaults to OFF.  Should you turn it on, even without some indication it will help?  Possibly, but I wouldn't really recommend it.  Statement concentrator processing may be less expensive than the SQL compilation it looks to replace, but the process of generating the replacement statements by substituting host variables for literals - essentially 'looking for work' - is not free.  If, in your environment, you don't have many of these 'almost duplicate' statements, statement concentrator will invest work in preparing to substitute statements that aren't there.

Is the performance with statement concentrator 'fixing' poor SQL as good as if the SQL used parameter markers in the first place?  Probably not quite.  If the application uses parameter markers in the first place, DB2 can avoid having to generate replacement statements and search for matches.  Likewise, a clever application can prepare the SQL statement (with parameter markers) once and then execute it many times without preparing again - whereas statement concentrator relies on each statement variant being PREPAREd (i.e., compiled) as part of the substitution process.

So, for top performance of repeated, simple statements, the use of parameter markers is still recommended.  But with v9.7 statement concentrator enabled, you now have the option of letting DB2 transparently find & replace similar statements, providing a very easy way to cut overhead and improve performance.
