Scaling DB2 BLU Horizontally with DB2 11.1
By: Joe Geller & Ian Bjorhovde
The DB2 BLU Engine, which introduced column-organized tables, was released with DB2 10.5 and greatly enhanced in 10.5 Fixpack 4. Although analytical queries showed tremendous speedups compared to row-organized tables, there were still many improvements and enhancements waiting to be implemented. Chief among these was support for BLU in a DPF (Database Partitioning Facility) environment. Many data warehouses use DPF to support very large volumes of data by putting the power of massively parallel processing (MPP) to work with parallel processing of queries. With DPF, tables are split, with the data rows distributed across multiple database partitions residing on one or more physical servers. Each of these database partitions processes a portion of a query in parallel, with the final result set combined at the end and returned to the client.
DB2 11.1 introduces BLU MPP – support for BLU tables within a DPF environment. This greatly increases the scalability of a BLU system. If you have been paying attention to IBM’s development of dashDB, you may be aware that this cloud-based offering has had the BLU MPP code available for a number of months in its Enterprise MPP.4 offering. DB2 11.1 for Linux, UNIX and Windows brings this work to on-premise databases.
Scaling Out
In DB2 10.5, the recommendation was that BLU could support up to approximately 10TB of raw data, by scaling vertically – increasing the amount of memory and CPU in the server. With DB2 11.1, the recommendation for a single-node system has increased to approximately 20TB of data, due to optimizations in the columnar engine. These are practical limitations based on the number of CPUs and the amount of memory you can put in a single physical machine. While it is possible to use BLU on a single machine with hundreds of CPU cores and terabytes of RAM, this becomes cost prohibitive very quickly.
With the database partitioning feature, you can scale horizontally across multiple machines, putting many more processors to work in parallel and providing significantly more memory. BLU works best with large amounts of memory for bufferpools and sortheap, so for a database of a given size, even a small cluster can have much larger total bufferpool space, which will lead to a reduction in I/O.
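As a rough illustration only (a minimal sketch with a hypothetical database name and memory percentage, not a tuning recommendation), BLU environments are typically created with the ANALYTICS workload setting so that DB2 sizes bufferpools and sort memory automatically from the memory available on each machine:

    db2set DB2_WORKLOAD=ANALYTICS
    db2 "CREATE DATABASE BLUDB AUTOCONFIGURE USING MEM_PERCENT 80 APPLY DB AND DBM"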
BLU MPP scales nearly linearly, whether in query performance (throughput) or in database size. For example, a 100-partition BLU system could handle a 2 PB database with roughly the same performance as a 20 TB BLU database on a single database partition.
MPP – a misnomer?
DPF environments run the gamut from 4 partitions to a maximum of 999 partitions. 100 or more partitions certainly qualifies as massively parallel, but a database with “only” 4 – 20 partitions hardly seems massive. However, you do still get the benefit of scaling and parallel processing (inter-partition parallelism, in addition to BLU’s intra-partition parallelism).
Why DPF (MPP) with BLU?
Many data warehouse and analytics shops (as well as some OLTP shops) use DPF for its performance, scalability and manageability. It is a shared-nothing environment, so each partition has its own CPU, memory and disk dedicated to its portion of the database. Individual queries against very large tables become feasible because of the parallel processing. Analytical processing is also the strength of DB2 BLU, with its columnar table organization, SIMD parallelism, actionable compression and synopsis tables.
DPF shops will benefit greatly from gaining the ability to use BLU. In environments with very large databases, the sheer volume of data has prevented many of them from adopting BLU, and migrating from a multi-partition DPF database to a single-partition BLU database posed additional challenges. With DB2 11.1 they will be able to get past these restrictions, and the combined strengths of DPF and BLU will provide tremendous performance against very large databases.
How BLU Integrates with DPF
The biggest gains in performance in BLU occur when DB2 is able to perform processing inside of the columnar BLU engine. In order to maintain the performance of the BLU engine when adding support for DPF, it was imperative that DB2 be able to keep data inside the columnar engine, even when sending data between database partitions.
“Actionable compression” is the term IBM uses for the compression that is applied to each column – it uses frequency-based compression (Huffman coding) and other techniques to achieve very high compression ratios. It is “actionable” because predicates can be applied to the column without uncompressing the data. With multi-partition tables, DB2 does not generate unique compression dictionaries for each partition – each table has a single dictionary that is replicated across all database partitions. This ensures that DB2 does not have to uncompress the table’s data when sending data between partitions.
When a row is inserted into the table, DB2 performs the following steps:
- Apply hashing function to partitioning key columns
- Send row to correct database partition
- Decompose row into columns
- Compress individual column values in insert buffers
- Flush insert buffers to bufferpool for each column
- Write Log Records
- Update synopsis tables
Steps one and two occur on the coordinator partition, and the remainder of the steps happen on the local database partition where the data is stored.
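To make this concrete, the following is a minimal sketch of the DDL for a column-organized table in a DPF database (the schema, table and column names are hypothetical). The DISTRIBUTE BY HASH clause defines the partitioning key used in step one, and ORGANIZE BY COLUMN makes it a BLU table:

    CREATE TABLE dw.sales_fact (
        sale_id    BIGINT        NOT NULL,
        store_id   INTEGER       NOT NULL,
        sale_date  DATE          NOT NULL,
        amount     DECIMAL(12,2)
    )
    DISTRIBUTE BY HASH (store_id)   -- rows are hashed on store_id to pick a database partition
    ORGANIZE BY COLUMN;             -- column-organized (BLU) table

An ordinary INSERT against such a table goes through the steps above transparently: the coordinator hashes store_id to choose the target partition, and that partition decomposes, compresses and logs the column values.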
The other technologies that BLU introduced – data skipping, SIMD, core-friendly parallelism and in-memory caching – are still leveraged on each database partition, in parallel.
Other Advances
For hash partitioning to work, you must specify a partitioning key for each table. DB2 uses the column(s) in the partitioning key to determine the database partition. Choosing a proper partitioning key is quite important because it has significant performance implications. A poor choice can lead to data skew (where certain database partitions have significantly more or less data than other partitions), which reduces performance. With DPF, a query only runs as quickly as the slowest database partition.
DB2 11.1 adds the ability to use a random value for partitioning a table, instead of relying on one or more columns. This ensures an even distribution of data across all database partitions, maximizing parallel processing. However, keep in mind that a random partitioning key will preclude DB2 from using collocated joins, so if you frequently join multiple fact tables or large, partitioned dimension tables, a random distribution key may not be the best choice.
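For comparison (again with hypothetical names), the only change needed to use random distribution is the distribution clause itself:

    CREATE TABLE dw.clickstream_fact (
        event_id  BIGINT        NOT NULL,
        event_ts  TIMESTAMP     NOT NULL,
        url       VARCHAR(2048)
    )
    DISTRIBUTE BY RANDOM   -- new in DB2 11.1: even spread of rows, but no collocated joins
    ORGANIZE BY COLUMN;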
Conclusion
Enabling BLU to run in partitioned database environments will open the door for customers with massive databases who weren’t previously able to consider BLU, and will also enable customers already using BLU to expand their capacity well beyond the current single-partition limitations.