Columnar Data Engine (CDE) Storage New Feature

Posted By: Ron Liu Technical Content,

WRITTEN BY RON LIU


Since Db2 11.5.4 which itself laid a new foundation for how columnar inserts were handled by the columnar data engine (CDE), many new features for columnar tables to minimize space usage and to improve insert, update and LOAD performance were added during 11.5 mod packs. The following table describes the features and when they first became available. Because the traditional Db2 form factor supports fallback, the table shows when a feature needs to be explicitly enabled if a feature changes the on-disk table structure and is not compatible with fallback. Db2 warehouse does not support fallback so new features are always enabled by default. 

For Db2 customers, if you are on 11.5.7 or later and have no plan to fallback prior to 11.5.7, it is recommended that you set the following registry variables together for overall best performance and compression.

 

  1. db2set DB2_COL_INSERT_GROUPS=YES
  2. db2set DB2_COL_STRING_COMPRESSION="UNENCODED_STRING:YES"
  3. db2set DB2_COL_SYNOPSIS_SETTINGS="DEFER_FIRST_SYNOPSIS_TUPLE:YES"

 

More information regarding these registry variables can be found here:
Column-organized table variables


Feature

Description

Release

Db2

Db2 Warehouse

Page-based String Compression Type 1

Improves compression for high cardinality string columns with repeating portions of strings that are not encoded by existing compression algorithms

11.5.4

Need Registry Variable to Enable

Registry Variable (2)

Enabled by Default

Page-Based String Compression Type 2

Improved compression when strings (within a page) contain 16 or less unique characters, works well for hex, numeric items like phone numbers, dates, dollar values

11.5.4

Need Registry Variable to Enable

Registry Variable (2)

Enabled by Default

Deferred Synopsis Tuple Creation for Small Base Tables

Reduces synopsis table storage consumption for small base tables

11.5.4

Need Registry Variable to Enable

Registry Variable (3)

Enabled by Default

Reorg Table Recompress Enhancement

Improves performance of Reorg Table Recompress and applies page-based string compression during Reorg Table Recompress

11.5.5

Enabled by Default

Enabled by Default

Trickle Feed Insert Enhancement

Speeds up trickle feed insertion, reduces the memory footprint and size of small tables.

11.5.6

Not available

Enabled by Default

11.5.7

Need Registry Variable to Enable

Registry Variable (1)

Enabled by Default

LOAD Utility Enhancement

Improves overall LOAD processing, and also if the previously mentioned string compression algorithms are enabled allows LOAD to use them

11.5.8

Enabled by Default

Enabled by Default

More details of these features and enhancements can be found in the Db2 release notes:

Db2 V11.5.4

https://www.ibm.com/docs/en/db2/11.5?topic=1154-compression-storage-enhancements

Db2 V11.5.5

https://www.ibm.com/docs/en/db2/11.5?topic=1155-compression-storage-enhancements

Db2 V11.5.7

https://www.ibm.com/docs/en/db2/11.5?topic=1157-compression-storage-enhancements

Db2 V11.5.8

What's new in Compression and storage enhancements for Db2 11.5.8

Best Practices for CDE compression and storage can be found with the URLs below.

Best Practices to Compress Db2 Column-Organized Tables:

IIAS: https://ibm.biz/BdPKP2

Other platforms: https://ibm.biz/BdPKPz

Best Practices for Storing Data in Column-organized Table:
https://community.ibm.com/community/user/hybriddatamanagement/viewdocument/best-practice-for-storing-data-in-c?CommunityKey=71ceaea3-db2c-451d-87d1-51f254454c6a&tab=librarydocuments


Ron Liu is a long time IBMer and a developer in the IBM Db2 Columnar Data Engine (CDE) Data Service team. Prior to his current role, he was a developer for IBM Federation Server, performance engineer for IBM Information Server, performance engineer and Java developer for the IBM data integration products in cloud. In his free time, he loves to go hiking in the nature and do wildlife photography.