The Basics of SQL Performance and Performance Tuning

There are two major challenges when it comes to SQL tuning. The first and most important is the quantity of SQL statements issued, and the second is the relative performance of each SQL statement. Ever hear the saying that the most efficient SQL statement is the one that is never executed? Well, it’s true. The biggest challenge in today’s fast paced application development environment is keeping the number of SQL statements executed per transaction to a minimum. In many cases, the development utilizes a framework and service-oriented architecture. Many times database tables are mapped one to one with objects in the data services application layer. Repeated calls to a specific service can likewise result in repeated calls to populate an object from a database table. A greedy design where service calls are made repeatedly within a unit of work can result in an extreme degradation in performance. Many times the database is to blame, but database monitors show extreme quantities of reasonably performing SQL statements. There is very little system or database tuning that can be done in these situations. This is why when designing a new database application a focus of the design needs to be a more conscious approach to design for performance.

A more performance-conscious approach should take two forms; the first focuses on calling the database only when necessary, and the second involves combining multiple calls into a single call. The first is conceptually easy; if a data object is already populated and there is no need to refresh it then the database call should be avoided. This may require some additional tests within the application code, but the results can be very effective. The second may be a bit more dramatic, and it is best to do this only when performance is more of a concern versus speed of delivery and the flexibility of the code. I’ve had great success in designing applications where the focus on multi-table SQL statements was concentrated only on the high-performance component of the application, while normal development was in place for the majority of the application logic. For these high performance components you simply need to look at which tables are being accessed during a unit of work, which columns from the tables are required (if at all), and the relationships between the tables. If application SQL complexity is too much for the application developers, or if there is no interest in placing the complexity within the application code, then stored procedures, triggers, and user-defined functions provide a nice alternative for bundling multiple SQL calls into one call. It’s almost impossible to tune a simple statement against a single table with one predicate against a primary key column, but with multiple table access more tuning opportunities exist, and in general the more complex SQL statement will almost always outperform many simple statements providing the same functionality. This is especially true across a network. For example, I have tested programmatic joins versus the equivalent SQL joins for two tables and realized a 30% performance gain for the SQL join.

More complex SQL statements offer more opportunities for the tuning of these individual statements, but before you set off looking into complex SQL tuning you need to step back and understand the overall performance of your application. This is where active monitoring of your production DB2 system is important. Hopefully you have a tool that can analyze the appropriate DB2 trace output to produce meaningful performance reports. There are also DB2 components, such as the dynamic statement cache, that can be utilized to produce performance metrics. Whatever the case may be, you should be looking at the performance of applications at the application level, user level, and then statement level. It is important that if you are going to be tuning SQL you should tune the one that consumes the most resources, or the one that exceeds a specified SLA, first. Regular monitoring should be a part of application and SQL performance tuning, especially when changes are being implemented. The ability to show a real world result is crucial to getting management support for tuning.

When tuning SQL statements I typically utilize two techniques to determine the potential savings to be realized from a database or SQL change; the EXPLAIN facility and benchmarking. EXPLAIN will take as input a SQL statement and externalize the predicted access path within the database engine. There are several EXPLAIN tables that are populated when a statement is EXPLAINed, so it is best to utilize a tool to analyze the access path. You can, if you are adventurous, write your own query against the EXPLAIN tables, and there are various examples available out on the internet. In addition to providing access path information EXPLAIN also provides statement cost information. This is internal cost information that DB2 uses to determine the access path, and it is also information that you can use to compare the predicted cost of two different SQL statements that are logically equivalent. While comparing the statement cost in EXPLAIN is a good start, it is not always the definitive guide to better SQL performance. So, I generally use cost calculations as one part of the predicted savings of a SQL tuning change. I also look at the access path and perform benchmark testing. The benchmark testing is usually performed using REXX or SPUFI on the mainframe, or db2batch or IBM Data Studio on Linux, UNIX, and Windows. If a test database is properly configured and database statistics updated to match production statistics, then you can get a pretty good estimate of the potential performance savings by running a benchmark test and capturing the performance using a monitoring tool or the statement cache. I always back any proposed SQL performance change with both EXPLAIN and benchmark test results.

Where to Look for SQL Performance Basics

If you need to know the basics of coding efficient SQL some of the best places to look are in the DB2 manuals themselves. Chapter 29 of the DB2 11 for z/OS Managing Performance manual is an excellent place to start. For those of you using DB2 for LUW take a look at section 1, Chapter 1 and Chapter 2 of the DB2 10.1 for Linux, UNIX, and Windows Troubleshooting and Tuning Database Performance. Chapter 3 of that last manual also contains a section very similar to Chapter 29 of the z/OS manual.

Summary

While this has been a high level review of SQL performance tuning, I hope it serves as an inspiration to look for specific solutions to your performance issues. Don’t be afraid to test things and perform some benchmarks.

 

Recent Stories
An Introduction to IBM Data Studio

The Basics of SQL Performance and Performance Tuning

Index Decluttering Opportunities in DB2 for Linux, UNIX, and Windows