After 30+ years, there are few things people can do or say during my day-to-day DBA activities that upset me. However, there is still one thing that can get me to walk out of a meeting: the meeting in which people sit around and debate. “I think it will work like this.” “That design will never perform properly.” “You’ll have to denormalize the initial design.” Overthinking and trying to predict Db2 performance is a slippery slope, so rather than “thinking” about Db2 performance, why not have the database tell you how it will perform? The key to this is the Db2 accounting trace and the accounting reports built from it. Accounting reports are useful for monitoring production performance, for predicting when REORGs should be performed, and as part of benchmarking database design choices. In short, accounting reports are an essential tool in all-around database development and performance monitoring!
Db2 has an extensive built-in tracing facility that is extremely important when analyzing the impact of various settings, as well as overall performance. There are several types of Db2 traces.
- Statistics trace, which collects performance metrics at a subsystem level.
- Accounting trace, which collects thread level performance metrics and is really an essential key to understanding and improving application performance.
- Performance trace, which is not set by default and can be used to track specific threads or sets of threads at an extremely detailed level.
- Audit trace, which collects information about Db2 security controls and can be used to ensure that data access is allowed only for authorized purposes.
- Monitor trace, which enables attached monitor programs to access Db2 trace data through calls to the instrumentation facility interface (IFI).
Traces can be set at Db2 startup via the system installation parameters, or they can be started via Db2 commands. Typically, audit, monitor, and performance traces are started on demand and for specialized purposes, while statistics and accounting traces are set at Db2 startup. The most common settings are statistics classes 1, 3, 4, 5, and 6 and accounting classes 1, 2, 3, 7, and 8.
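If the traces were not set in the installation parameters, they can be started with operator commands. A sketch matching the common classes above (DEST(SMF) is shown for illustration; the destination and command prefix vary by site):

```
-START TRACE(STAT) CLASS(1,3,4,5,6) DEST(SMF)
-START TRACE(ACCTG) CLASS(1,2,3,7,8) DEST(SMF)
```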
All of these traces are important and worth understanding, but the focus of this article is the accounting trace. You can find descriptions of the trace records in prefix.SDSNIVPD(DSNWMSGS).
Db2 Accounting Trace
The accounting trace collects thread-level performance metrics and is essential to understanding and improving application performance. The accounting trace records produced are typically directed to system management facility (SMF) datasets, and hopefully your friendly system programmer is externalizing these SMF records for Db2 subsystems into separate datasets for analysis. Accounting trace data is typically externalized upon the termination of each individual thread, so you get one record per thread. This means that for large-scale batch jobs, CICS transactions that use protected threads, distributed threads with the ACCUMACC subsystem parameter (which determines whether Db2 accounting data is accumulated by user for DDF and RRSAF threads) set to something other than NO, or high performance DBATs, multiple transactions can be bundled together into the same accounting trace record. By default, Db2 starts only accounting trace class 1, which in my opinion is not nearly enough to accurately diagnose and track application performance. My recommended accounting trace classes are 1, 2, 3, 7, and 8.
- Class 1 = Total elapsed time and CPU used by a thread while connected to Db2 at the plan level.
- Class 2 = Total elapsed time and CPU used by a thread within Db2 at the plan level. This is a subset of class 1. The elapsed time is broken into suspension time and CPU time.
- Class 3 = Total time waiting for resources in Db2 at the plan level. This is a subset of the class 2 time and is equal to the class 2 suspension time.
- Class 7 and 8 = These are, respectively, the class 2 and class 3 times broken out by the packages used during thread execution.
Reporting on Db2 Accounting Data
Unless you have some Db2 SMF reporting software, creating meaningful accounting reports can be a challenge. IBM does provide a Db2 SMF report utility, DSN1SMFP, which can be invoked via a standard batch z/OS job to read uncompressed Db2 SMF data. A Google search on DSN1SMFP will lead to plenty of information if no commercial SMF reporting software is available at your site. Most Db2 for z/OS customers have a commercial SMF reporting product in house, which makes reporting much easier.
One very important thing to consider regarding Db2 accounting trace data is that it is of little use if you cannot identify the applications you want to monitor and tune. Reporting software allows filtering and reporting on a variety of fields, the most important being primary AUTHID, correlation ID, and plan name. Using these fields to identify different applications allows filtering on a specific application so that it can be monitored over time. It is important to establish baseline performance metrics for the applications you are interested in monitoring and tuning. If the application is remote, the client workstation values set by default or by the application can also be used for report filtering.
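Once your reporting tool can export accounting data filtered on those fields, establishing baselines is straightforward. A minimal sketch, assuming parsed records exported as simple dicts (the field names here are illustrative stand-ins, not actual SMF record labels):

```python
# Sketch: grouping parsed accounting records to build per-application
# baselines keyed on primary AUTHID and plan name. The record layout is
# an assumption standing in for whatever your SMF reporting tool exports.
from collections import defaultdict

def baseline_by_plan(records):
    """Average class 2 elapsed and CPU time per (authid, plan name)."""
    totals = defaultdict(lambda: {"elapsed": 0.0, "cpu": 0.0, "n": 0})
    for rec in records:
        key = (rec["authid"], rec["plan"])
        totals[key]["elapsed"] += rec["cl2_elapsed"]
        totals[key]["cpu"] += rec["cl2_cpu"]
        totals[key]["n"] += 1
    return {key: {"avg_elapsed": t["elapsed"] / t["n"],
                  "avg_cpu": t["cpu"] / t["n"]}
            for key, t in totals.items()}

records = [
    {"authid": "PAYROLL", "plan": "PAYPLAN", "cl2_elapsed": 2.0, "cl2_cpu": 0.5},
    {"authid": "PAYROLL", "plan": "PAYPLAN", "cl2_elapsed": 4.0, "cl2_cpu": 1.5},
]
print(baseline_by_plan(records))
```

Comparing these per-application averages week over week is what turns raw trace data into a baseline you can alert on.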
The standard accounting long report contains an incredible amount of information, which can be overwhelming. However, concentrating on a few important numbers, especially over time, will tell you a lot about the health of an application and the database it is accessing. Here are some fields found on an accounting long report, in order of relative importance in response to common concerns about elapsed time and CPU consumption. These priorities may vary depending upon your site's specific concerns regarding elapsed time, CPU, logging, or concurrency (locking).
- Class 1 and 2 elapsed and CPU time. Ever get the call that says, “Hey my application is not responding, what’s wrong with Db2?” Well, get your hands on an accounting report! The first thing you should look at is the class 1 and class 2 elapsed and CPU times. Is there a system resource being constrained? Is the application spending the majority of class 1 time doing class 2 things? If yes, it’s a Db2 issue and if no, then it’s not a Db2 issue. What portion of the class 2 time is elapsed time versus CPU time? If class 2 CPU is a minor portion of the elapsed time, then there may be class 3 suspensions that need to be investigated. If the majority is CPU time, then you need to look at the quantity of SQL issued and perhaps even divide the number of statements into the CPU time to get an average. If the average is relatively high, you likely have an opportunity to tune some statements, but if the CPU time per statement is very low, you may have a quantity of SQL issue.
- Class 3 synchronous I/O suspension time. If time spent here is excessive, you need to look at the clustering of the data in your tables and the input to your transactions. Buffer tuning can help, as can additional indexes. There are many opportunities here for SQL and database tuning. The first thing to do is identify the objects and SQL statements involved. If separate buffer pools are in use for the application, examining the individual buffer information in the accounting report can help narrow it down to certain table spaces or indexes. If synchronous log write I/O time is a relatively high contributor to total I/O time, it may indicate a problem with updates and the free space allocated in table spaces. Row relocations are potential contributors to synchronous log I/O, as well as to synchronous write I/O in general.
- Class 3 lock and latch suspension time. This indicates the number of locking and latching events that incurred a suspension, as well as the time spent suspended. A little time spent here is quite normal, but if locking represents a majority of the elapsed time, you have a concurrency issue that needs to be addressed. Look for additional information such as deadlocks and timeouts, and check the console log for messages related to locking conflicts. Sometimes excessive latch time can be an indication of limited CPU resources, so don't be fooled by this. Double check against unaccounted-for time (class 2 elapsed time minus class 2 CPU time minus class 2 suspension time); if there is significant unaccounted-for time alongside excessive latch wait time, the system was likely CPU constrained during the thread execution.
- Class 3 global contention time. Any significant values here indicate a concurrency problem with a resource shared across members of a data sharing group. Look for things such as identity columns and sequence objects (increasing the cache will help) or frequently updated tables shared across applications (system affinities or alternate lock sizes or strategies may be required). Large values here can also indicate undersized coupling facility resources.
- Time spent in routines (nested class 1 and class 2 times). If the application uses stored procedures, user-defined functions, or triggers, the time spent in these routines is broken out from the remaining Db2 and application time. This is important if you are using these routines for processing, because with an accounting report you can attach a cost to that processing. Also, if one or more routines are having a performance issue, these numbers will indicate as much, and you can then direct your attention to the package-level information if it exists for the routines (it always does for triggers and stored procedures).
- Other class 3 I/O wait time. While asynchronous I/Os are more expensive than synchronous I/Os, applications typically do not wait for them because asynchronous I/Os are independent of the application process. If there is excessive time spent here, there may be a DASD performance issue, or buffer pool separation (separating sequentially accessed objects from pools containing synchronously accessed objects) and size increases may be required.
- SQL activity and commits. These are good counts to gauge how much work an application is doing and how that work changes over time. I typically divide elapsed and CPU time by SQL and commits to get a “per unit” measurement that can direct me to specific statement tuning, REORG opportunities, or adjustments to the size of the units of work. CPU trending upward over time for a relatively stable number of statements and commits is a good indication that a REORG is needed somewhere.
- Deadlocks and timeouts. These counters should be zero, but if they’re not you can go to the console log for the period of time the thread was active to look for messages that will detail the conflicts.
- Buffer pool statistics. The buffers are one of the most important resources in a Db2 subsystem, and buffer pool usage statistics are delivered in an accounting trace at the thread level. It is also possible to gather these statistics at the package level if the appropriate trace class and IFCIDs are set (although there may be a higher trace cost). The information presented here is extremely valuable for seeing how an individual application is using (and impacting) the buffers. These statistics can be used together with the statistics trace buffer information and the output of a DISPLAY BUFFERPOOL command to get a really concise picture of buffer activity! You can also use the buffer activity to correlate total I/O activity with a certain buffer pool to more closely identify which database objects are impacting performance.
- Package level statistics. When accounting trace classes 7 and 8 are active, various elapsed, CPU, SQL, and suspension times are gathered at the package level. This is extremely valuable information for static embedded applications in that you can identify which programs should get the most tuning attention (the programs that use the most resources). These package level statistics also report on triggers, stored procedures, and non-inline and external UDFs used by the application.
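Several of the checks above are simple arithmetic on the report's class times. A minimal sketch of two of them, the CPU-per-statement average and the unaccounted-for time formula (field names are illustrative, times in seconds):

```python
# Sketch of two arithmetic checks on accounting report class times.
# Field names are illustrative, not actual Db2 report labels.

def cpu_per_statement(cl2_cpu, sql_stmts):
    """Average class 2 CPU per SQL statement. A high average suggests
    tuning individual statements; a very low average with high total CPU
    suggests a quantity-of-SQL issue."""
    return cl2_cpu / sql_stmts if sql_stmts else 0.0

def unaccounted_time(cl2_elapsed, cl2_cpu, cl3_suspension):
    """Class 2 elapsed minus class 2 CPU minus class 2 (class 3)
    suspension time. A large gap alongside high latch wait time often
    means the system was CPU constrained while the thread ran."""
    return cl2_elapsed - cl2_cpu - cl3_suspension

print(cpu_per_statement(12.0, 2400))     # 0.005 s of CPU per statement
print(unaccounted_time(10.0, 2.0, 3.0))  # 5.0 s not accounted for
```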
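The “per unit” REORG signal from the SQL activity bullet can also be mechanized across successive reports. A sketch, assuming one summarized report per week for the same application; the 20% growth threshold is an assumption for illustration, not a Db2-documented figure:

```python
# Sketch: flag a REORG candidate when CPU per SQL statement trends
# upward while the statement count stays roughly stable. Field names
# and the threshold are illustrative assumptions.

def cpu_per_stmt(report):
    return report["cl2_cpu"] / report["sql_stmts"]

def reorg_candidate(reports, growth_threshold=0.20):
    """True when per-statement CPU grew more than the threshold between
    the first and last report in the series."""
    first, last = cpu_per_stmt(reports[0]), cpu_per_stmt(reports[-1])
    return (last - first) / first > growth_threshold

weekly = [
    {"cl2_cpu": 10.0, "sql_stmts": 100_000},
    {"cl2_cpu": 11.0, "sql_stmts": 101_000},
    {"cl2_cpu": 13.5, "sql_stmts": 100_500},
]
print(reorg_candidate(weekly))  # True: ~34% CPU growth per statement
```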
Testing Application and Database Ideas
Now, back to those meetings where everyone is guessing how Db2 will perform based upon certain design decisions. Well, it doesn't have to be a guess. It is quite simple, and not at all time consuming, for a DBA, perhaps teamed with a developer, to set up and monitor some tests to prove out those ideas. The accounting report plays an extremely valuable role in monitoring these tests. Contrary to popular belief, it takes only a modest effort to set up some tables and populate them with test data. There are a variety of testing tools at your disposal: REXX, SPUFI, DSNTEP2, db2batch, and IBM Data Studio, to name a few. Run a test, produce an accounting report, change something, repeat. No more guessing, as Db2 then tells you the best design choice for your application and database. I have run these tests dozens of times, if not more, during my career. This link can help get you started: Simple DBA Tools for the Db2 DBA
Spend a little time and effort learning how to produce Db2 accounting reports with whatever tool you may have available. Then set some metric baselines for the most important or most resource intensive applications you have, and start monitoring! Use Db2 accounting reports during design and testing to build a performance appropriate design from the start. It will save considerable firefighting resources upon deployment.