Understanding DB2 for LUW Buffer Pool I/O
You will get virtually no argument from anybody that buffer pool management is important to good DB2 and application performance. Large buffer pools are a well known wisdom. There are many ideas on buffer pool management, including the number of buffer pools, and the allocation of objects by the type of I/O, by the volume of activity and by the type of object.
The purpose of this short article is to describe the types of I/O performed by DB2 and how some types of I/O interact with each other. The use of a block based area is described.
Buffer Pool Background
In DB2, there are 2 basic types of I/O: logical and physical. Logical I/O are driven by requests for data by applications, primarily by SQL. If the data needed resides in the buffer pool, it can be returned to the application without physical I/O being required.
When physical I/O is needed, additional work is needed: I/O requests are made to devices, non-SSD disks rotate, and eventually data is returned to the application. An I/O causes delays waiting for it to complete and I/O requests consume CPU resources.
There are two major types of I/O performed by DB2. The first type is random I/O, sometimes called synchronous I/O, where a given data request uses an index to find the appropriate page and retrieve the data required. Random I/O is efficient and is even better when no physical I/O is needed due to the data being in the buffer pool. It indicates that indexes are being used, as DB2 is just reading the data needed to satisfy the SQL. A larger buffer pool means more data can be stored and less physical I/O is needed. More data is kept in memory.
The second type of I/O is sequential I/O, also called asynchronous I/O. As its name implies, sequential I/O requires that all or part of a table space or index be read sequentially to find the data required by an application. This usually means a lot of I/O and processing by DB2 to find the result requested, and this normally means no index use and table or index scans. DB2 uses various forms of prefetch to read more pages in a single I/O and reduce I/O wait time.
Due to the large number of pages read in many prefetch operations, sequential I/O can bring a large number of pages into the buffer pool. This can lower the possibility of finding a page in the buffer for subsequent requests. In comparison to random I/O, sequential I/O brings pages in bulk into the buffer pool. The pages from sequential operations are rarely used again by subsequent operations.
Block based area
My message is simple: sequential I/O can have a negative impact on the residency of randomly accessed pages in the buffer pool. The solution is also simple: define a block based area for each buffer pool. The block based area is typically about 1 percent of the entire buffer pool size. This provides low cost assurance that sequential operations will not lower the residency of randomly accessed pages.
Buffer Pool Illustrations
The following diagrams are meant to show buffer pool activity.
Legend for all diagrams
The boxes are meant to represent buffer pool pages.
Randomly accessed pages are efficiently managed by DB2 using a least recently used (LRU) to identify old pages that are best suited for replacement by new operations. It works very well.
Random I/O and the Effect of Sequential I/O
When sequential activity is introduced to the above buffer pool, many pages can be overlaid by these operations. Even if the pages are old, at their next use, they would need to be read again from disk.
Random I/O and the Effect of Sequential I/O with a Block Based Area
The block based area isolates sequential I/O from random I/O. This allows more randomly accessed pages to remain resident, thereby avoiding physical I/O for more pages.
The block based area will be overwritten frequently by future sequential operations. This should not be a problem as sequential access pages are rarely accessed frequently. The block based area can and should be made quite small – about 1% of the total buffer pool size. I have customers who successfully use about 10000 pages for a block area where the buffer pool size is 2.2 million pages.
Comparing LUW block I/O to z/OS
Random and sequential I/O is performed in the same area. There is a user controllable threshold on the amount of a buffer pool that can be used for sequential operations. Vendors have built an industry by offering products to help customers to size buffer pools and place objects according to their access characteristics. By comparison, there are no products for buffer pool tuning on LUW.
Comparing LUW block I/O to Informix
Informix understands about the impact of sequential I/O on buffer pools. Sequential I/O is automatically sent to a light scan buffer. As such, Informix automatically provides the same benefit as using a block based area in DB2 LUW.
Love and Hate Lists
DB2 uses an LRU algorithm to determine which pages are best suited to be replaced by upcoming I/O operations. There are love and hate lists of pages for those that should be either kept or candidates for replacement by new pages. DB2 is also quite efficient at sequential I/O into non-contiguous pages. While the lists help identify which pages should be replaced first, the real issue with sequential I/O is the large number of pages brought in by many sequential operations. A block based area isolates these pages.
DB2 performance, including buffer pool performance, will be an important topic for the remainder of DB2’s life. Some topics are positive; they make things go faster. Bigger buffer pools fall into this category.
Other performance topics are about removing negative effects. Block based buffers fall into this category, and are important for ensuring overall buffer pool performance.
About the Author
Martin Hubel is an independent consultant and has worked extensively with DB2 since 1985. He has been using DB2 on Linux, Unix, and Windows since 1993 and has participated in all beta test programs for these platforms. He is an IBM Champion for IM, and IBM Gold Consultant, and a member of the IDUG Volunteer Hall of Fame. Martin develops and teaches DB2 courses for all DB2 platforms.