Introducing Db2 Click to Containerize

Posted By: George Baklarz Technical Content,

WRITTEN BY GEORGE BAKLARZ AND PHIL DOWNEY


Many customers are looking at modernizing their databases by moving them into a containerized environment such as OpenShift, Kubernetes, or Cloud Pak for Data. What usually makes a DBA nervous is the complexity involved in moving a database into this environment. You need to:

  • Upgrade your database to Version 11.5.x, since that is the release level of the Db2U container
  • Learn how to work in the containerization environment
  • Rebuild the database with export and import commands

Getting to the end goal of containerizing your database takes a lot of time and effort! To streamline this process, a new utility called Db2 Click to Containerize (also known as Db2 Shift) was developed to help customers move their databases quickly and easily from an on-premises Db2 instance running on Linux to any of the popular containerization environments.

Some benefits of Db2 Shift are:

  • Automated, fast, and secure movement of Linux databases to Hybrid Cloud
  • Massive reduction in time to containerize database workloads
  • The ability to move a database without the need to unload, export, decrypt, or back up the database
  • Automatic upgrades from Db2 Version 10.5 and 11.1 to the latest version (11.5.7) of Db2
  • Shifting of all database settings and objects, including external functions located in the Db2 library path
  • Row, Columnar, and Encrypted databases can be moved
  • OLTP, SMP, and MPP databases can be moved (excluding pureScale installations at this time)
  • Easy setup of HADR servers for staged migration

The Db2 Shift program allows a customer to shift their current databases to one of four platforms:

  • OpenShift cluster
  • Kubernetes cluster
  • Cloud Pak for Data
  • Another Db2 instance on premises, in the cloud, or in a virtual environment

In addition to directly shifting the database from one location to another, the Db2 Shift program also provides the ability to clone a database for future deployment. This feature is useful for environments where the target server is air-gapped, or unavailable for direct connection from the source server.

Finally, the Db2 Shift program has two modes of operation. For expert users, the Db2 Shift command can be issued with the appropriate options and run directly from a command line or a script. For those users who require more help, the program can also be run in an interactive mode, with detailed instructions and help for the various shift scenarios.

The following figure illustrates the rich deployment options you get with the Db2 Shift utility.


Database Requirements

A Db2 database can be moved if the following conditions are met:

  • The database resides on a Linux server (X64 or Power Linux LE)
  • The database was created with Automatic storage
  • The database is an OLTP, SMP, or MPP system
  • Row or Column mode storage, including encrypted databases
  • Mirror, Archive, and Overflow Logs use Disk only
  • User Defined Functions/Procedures are located in the sqllib directory
  • Db2 Version 10.5, 11.1, or 11.5 servers can be moved and upgraded at the same time

The following features are not currently supported:

  • pureScale Feature (not yet available in the Db2U container)
  • Text Extender

There are a few configuration settings which cannot be shifted:

  • Only databases created with automatic storage are supported
  • External procedures that are not in the standard Db2 library path are not shifted - these will need to be manually recreated and catalogued
  • The LOGARCHMETH1/2 setting only supports DISK as a target in Db2U
  • The database encryption keys will be moved to the new location, but if the target already has encrypted databases, then you will need to manually migrate the encryption key to the target location
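The requirements and restrictions above can be read as a simple checklist. The following is a minimal sketch of such a checklist in Python; it is not how Db2 Shift performs its own checks. The `SourceDatabase` class, its field names, the platform labels, and the `find_blockers` function are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical description of a source database. None of these names
# come from Db2 Shift; they only mirror the conditions listed above.
@dataclass
class SourceDatabase:
    db2_version: str                  # e.g. "10.5", "11.1", "11.5"
    platform: str                     # illustrative labels, see below
    automatic_storage: bool
    uses_purescale: bool = False
    uses_text_extender: bool = False
    log_archive_methods: List[str] = field(default_factory=list)

SUPPORTED_VERSIONS = {"10.5", "11.1", "11.5"}
SUPPORTED_PLATFORMS = {"linux-x64", "linux-ppc64le"}  # assumed labels

def find_blockers(db: SourceDatabase) -> List[str]:
    """Return a list of reasons why the database could not be shifted."""
    blockers = []
    if db.db2_version not in SUPPORTED_VERSIONS:
        blockers.append(f"Db2 version {db.db2_version} is not 10.5, 11.1, or 11.5")
    if db.platform not in SUPPORTED_PLATFORMS:
        blockers.append(f"platform {db.platform} is not Linux x64 or Power Linux LE")
    if not db.automatic_storage:
        blockers.append("database was not created with automatic storage")
    if db.uses_purescale:
        blockers.append("pureScale is not supported")
    if db.uses_text_extender:
        blockers.append("Text Extender is not supported")
    for method in db.log_archive_methods:
        # Only DISK targets are supported for log archiving in Db2U.
        if not method.upper().startswith("DISK"):
            blockers.append(f"log archive method {method} is not DISK")
    return blockers

candidate = SourceDatabase("11.1", "linux-x64", True,
                           log_archive_methods=["DISK:/db2/archive"])
print(find_blockers(candidate))  # prints []
```

An empty list means none of the listed conditions is violated; it says nothing about external procedures or encryption keys, which still need the manual steps described above.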

Program Overview

The Db2 Shift program is a single executable Linux program (about 8 MB in size!) that can be installed in any directory. The program itself is self-contained and does not require any additional libraries to run. The program can be removed by simply deleting the file.

The following Linux environments have been tested as source systems:

  • Linux x64: CentOS (6, 7, 8), CentOS Stream, Red Hat (6, 7, 8), Ubuntu 18.04 and 20.04, SUSE 11.4
  • Power Linux LE: RHEL 8 (source and target)

The program provides options to perform the following operations:

  • Shift a Db2 database to OpenShift, Kubernetes or CP4D
  • Shift a Db2 database to another Db2 instance
  • Create a Cloned copy of the Db2 database for later deployment
  • Deploy a clone into an OpenShift, Kubernetes or CP4D container
  • Deploy a clone into another Db2 instance
  • Initialize HADR between Source and Target POD
  • Initialize HADR between Source and Target Instance
  • Initialize DMC and LDAP Authentication for CP4D
  • Copy Cloned databases to a POD

The shifting of a database from one environment to another requires connectivity between the servers. The process by which Db2 Shift moves data requires a connection to either an OpenShift/Kubernetes/CP4D cluster, a password-less ssh connection, or a local connection.

The Db2 Shift operation must take place under the userid of the INSTANCE owner of the database being shifted. The instance owner must also have password-less ssh connectivity to the TARGET system if a Db2-to-Db2 instance shift is being performed.
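Before attempting a Db2-to-Db2 instance shift, it can be useful to confirm that the instance owner can reach the target over ssh without being prompted for a password. The sketch below is one way to test this from Python using the standard OpenSSH `BatchMode` option, which makes ssh fail rather than prompt. It is independent of Db2 Shift, and the user and host names are placeholders.

```python
import subprocess
from typing import List

def build_ssh_check_command(user: str, host: str) -> List[str]:
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # never prompt; fail if a password is needed
        "-o", "ConnectTimeout=5",   # give up quickly on unreachable hosts
        f"{user}@{host}",
        "true",                     # trivial remote command
    ]

def passwordless_ssh_works(user: str, host: str) -> bool:
    """Return True if key-based ssh login to the target succeeds."""
    try:
        result = subprocess.run(
            build_ssh_check_command(user, host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=15,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0

# Placeholder names; substitute your own instance owner and target host.
print(" ".join(build_ssh_check_command("db2inst1", "target.example.com")))
```

Run `passwordless_ssh_works(...)` as the instance owner on the source server, since that is the userid under which the shift takes place.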

To access the TARGET pod in a cluster, the user must have authenticated to OpenShift or Kubernetes and have access to the POD that Db2 is running on.


Db2 Shift User Interface

Db2 Shift can be run either as a traditional command line utility, or as an application with a menu system. The Db2 Shift menu system provides an easy-to-use interface for generating the appropriate shift commands and includes extensive help on the various parameters that need to be supplied.

The Db2 Shift UI is a character-based terminal display, similar to VT100 or 3270 terminals. Using a character-based display format eliminates the need for a graphical display environment (such as GDM) and significantly reduces the memory requirements for the program. When using a terminal-based UI, you navigate the screen with keyboard entries rather than mouse movements.

Main Screen

The main Db2 Shift screen provides access to all the scenarios mentioned in the previous section.

 

The Db2 Shift Help provides a general overview of the Db2 Shift program and details on every Db2 Shift scenario. The Keyboard help provides a quick guide on how the keys work, and the Syntax details provide information on every setting that Db2 Shift uses.

Help Panels

Help screens are available throughout the program to help guide you through the shift scenarios. An example of a help screen is shown below.

 

Requesting help on a field displays detailed information about that field on top of the existing panel.

 

Shift to Db2U on OpenShift or Kubernetes

The following screen is used to shift a Db2 database to a Db2U pod that is running on OpenShift, Kubernetes, or Cloud Pak for Data.

When the user has entered all the required information, pressing the Review key will display the command line version of Db2 Shift that you could use instead.

Summary Screen

Once the user hits the execute key on the summary screen, the Db2 Shift utility will begin the process of shifting your database.

Db2 Shift Execution

 

Various messages will be displayed during the execution of the Db2 Shift command with progress bars indicating the current step in the process. When the execution completes, the UI will be displayed with a success indicator and the contents of the log file.

 

The log file is displayed below the run status. You can scroll through this list to determine what steps failed during the Db2 Shift execution.

A successful run will also display the log file.

Database Analysis

Shifting a database from an instance to a POD requires the Db2 Shift program to check that the source database meets certain criteria including:

  • The source database must be 10.5, 11.1, or 11.5
  • Only databases created with automatic storage are supported
  • External procedures must be in the standard Db2 library path - any that are not will need to be manually recreated and catalogued
  • The LOGARCHMETH1/2 setting only supports DISK as a target in Db2U
  • The database encryption keys will be moved to the new location, but if the target already has encrypted databases, then you will need to manually migrate the encryption key to the target location

This checking is done when the Db2 Shift program begins execution. Even though the database may meet these requirements, the source database environment may have certain settings which need to be present at the target location. By default, all database settings are moved to the target location. However, none of the instance settings are moved during the shift step unless you explicitly name them.

When the Analyze function is selected, the Db2 Shift program will gather information from the source and target databases and present a report containing the settings that are different between the environments.

 

This example shows many of the errors that can be reported by the Analysis step. Items in red will stop a shift from occurring, while items in yellow are features that might cause an issue when the database is started in the target location. Details of a setting are available by pressing ^F while the cursor is on the line of the configuration parameter.
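The core idea of the Analyze report, comparing settings between source and target and listing only those that differ, can be illustrated with a short Python sketch. This is not the Db2 Shift implementation. The `diff_settings` function is hypothetical, the configuration values are made up for the example, and the sketch does not attempt to reproduce the red/yellow severity classification.

```python
from typing import Dict, Optional, Tuple

def diff_settings(
    source: Dict[str, str], target: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {name: (source_value, target_value)} for settings that differ.

    A value of None means the setting is absent on that side.
    """
    differences = {}
    for name in sorted(set(source) | set(target)):
        src, tgt = source.get(name), target.get(name)
        if src != tgt:
            differences[name] = (src, tgt)
    return differences

# Illustrative values only; these are not taken from a real system.
source_cfg = {
    "LOGARCHMETH1": "DISK:/db2/archive",
    "LOGFILSIZ": "4096",
    "SELF_TUNING_MEM": "ON",
}
target_cfg = {
    "LOGARCHMETH1": "OFF",
    "LOGFILSIZ": "4096",
}

for name, (src, tgt) in diff_settings(source_cfg, target_cfg).items():
    print(f"{name}: source={src} target={tgt}")
```

Settings that match on both sides, such as `LOGFILSIZ` here, are left out of the report so that only the differences need review.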

 

Some fields will have additional help available through a web link.

 

If you have access to a mouse, you will be able to click on the link in the field help display and have a web page display with more details on the parameter.


Performance

The Db2 Shift utility uses multiple threads to parallelize the movement of the data to the target location. The following are some general observations from early testing.

The number of threads used by the Db2 Shift utility can be set by the user. Using a higher number of threads is useful in situations where the Db2 database has multiple paths defined for a storage group. If the database was created with a single storage path, the threads are competing for I/O on the same device.

Increasing the number of threads will not necessarily result in more throughput. Testing has shown that 4 threads is a good starting point for parallelism, with small incremental benefits as you increase the number. You need to balance the CPU usage by the Shift utility and the impact on workloads running on the server.

The following test system was used to measure the throughput of the Db2 Shift Utility.

Test Scenario

The Db2 Shift command was run from a standard EC2 instance running Db2 on Linux, shifting the database to an EKS cluster with Db2U installed. The database size was approximately 50 GB with more than 260 million rows of transactional data. The elapsed times of the shift are plotted against the number of threads used for the copying process.

The best elapsed time was 237 seconds (131s copy time only). Increasing the number of threads did not make a difference to the total elapsed time.

Examining the CPU usage, the average utilization of the threads was almost identical between 4 and 8 threads.

Overlaying the 4-thread CPU utilization and network throughput demonstrates that the CPUs are busy transmitting as much data onto the network as possible.

 

The results showed that the maximum throughput was capped by the network limit of 5 Gb/s. The network throughput was:

  • 1 thread: 1240 Mb/s
  • 4 threads: 4890 Mb/s

A CPU thread has a limit on how much data it can push onto the network. By running a test with a single thread, you can determine how many threads you can effectively use during a Shift run. Dividing the network capacity by the single-thread throughput gives the optimal number of threads to use.

Example:

  • Throughput of one thread is approximately 1.2 Gb/s
  • Network limit is 5 Gb/s
  • 5/1.2 is about 4.2, so approximately 4 threads
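The rule of thumb above can be written as a small Python helper. This is a sketch, not part of Db2 Shift: the function name `estimate_threads` is made up, and rounding down to a whole number of threads (with a minimum of one) is an assumption of this example.

```python
def estimate_threads(network_limit_gbps: float, single_thread_gbps: float) -> int:
    """Estimate how many copy threads can saturate the network.

    Divides the network limit by the measured single-thread throughput
    and rounds down, never returning fewer than one thread.
    """
    if network_limit_gbps <= 0 or single_thread_gbps <= 0:
        raise ValueError("throughput values must be positive")
    return max(1, int(network_limit_gbps // single_thread_gbps))

# Figures from the example: 5 Gb/s network, about 1.2 Gb/s per thread.
print(estimate_threads(5.0, 1.2))   # prints 4

# Using the measured single-thread rate of 1240 Mb/s gives the same answer.
print(estimate_threads(5.0, 1.24))  # prints 4
```

Both throughput values must be in the same unit (here gigabits per second) for the division to be meaningful.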

This result can also be used to determine your total copy time:

  • Database Size (GB) / (Network Limit (Gb/s) / 8) = elapsed time (s)
  • 50 GB / (5 Gb/s / 8) = 50 GB / (0.625 GB/s) = 80s with ideal conditions
  • Tests results were 50GB/(4.89Gbs/8) = 128s ideal (131 observed)
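The copy-time formula converts the network limit from gigabits to gigabytes per second (dividing by 8) and then divides the database size by that rate. The Python sketch below works through the ideal case and also derives an hourly transfer rate from the observed 131 second copy time. The function names are hypothetical, and a terabyte is taken as 1000 GB.

```python
def ideal_copy_seconds(database_gb: float, network_gbps: float) -> float:
    """Ideal copy time in seconds for a database of the given size.

    network_gbps is in gigabits per second; dividing by 8 converts it
    to gigabytes per second.
    """
    gigabytes_per_second = network_gbps / 8
    return database_gb / gigabytes_per_second

def observed_rate_tb_per_hour(database_gb: float, copy_seconds: float) -> float:
    """Transfer rate in TB/hour, assuming 1 TB = 1000 GB."""
    return database_gb / copy_seconds * 3600 / 1000

# 50 GB database on a 5 Gb/s network.
print(ideal_copy_seconds(50, 5))                     # prints 80.0

# 50 GB copied in the observed 131 seconds.
print(round(observed_rate_tb_per_hour(50, 131), 2))  # prints 1.37
```

The ideal figure ignores protocol overhead and contention on a shared network, so observed copy times will be longer.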

The result was that after 237 seconds, a 50 GB database was moved from a traditional Db2 instance into a Db2U pod running on EKS (Kubernetes). The amount of effort required to use the database after moving to the new location on EKS: zero. That’s why using Db2 Shift will make it easier for you to modernize your existing Db2 databases onto containerized environments like OpenShift, Kubernetes, and Cloud Pak for Data.


Summary

The Db2 Shift utility provides the ability to quickly and easily shift your Db2 Linux database into a containerized environment with minimal effort. The elapsed time to move a database depends on the available network bandwidth, but early tests suggest a transfer rate of approximately 1.37 TB/hour on a shared 5 Gb/s network.


Authors

George Baklarz, B. Math, M. Sc., Ph.D. Eng., has spent many years at IBM working on various aspects of database technology. George has written 14 books on Db2 and other database technologies. George is currently part of the Assets and Architecture Team and is one of the inventors of the Click to Containerize technology.

Phil Downey, CompSci and Psychology, is a Principal Architect in the Assets and Architecture team. Phil Downey has over 30 years’ experience in the IT industry and 25 years’ experience working with enterprise database systems across many different industries in several different technical and product management roles. He is experienced in designing and deploying enterprise data architectures and migrating existing applications to them. He is one of the inventors of the Click to Containerize technology.