DB2 LUW 11.1 : Upgrade and Recovery Enhancements
(Melanie Stopfer, IBM)
DB2 for LUW V11.1 introduced a number of important upgrade and recovery enhancements. This article describes each of these enhancements and then provides recipes for tasks which take advantage of these new features. The major enhancements are –
- It is possible to upgrade to DB2 11.1 directly from 9.7 instead of having to go through another version (10.1 or 10.5).
- Upgrade is now a recoverable operation, avoiding the need for an offline backup to be taken during the upgrade procedure. Only applicable to databases at DB2 V10.5 Fix Pack 7 or later and in a single partition or pureScale configuration.
- A system which uses HADR can now be upgraded without needing to reinitialize the standby database after performing an upgrade on the primary database. Also applicable only to databases at DB2 V10.5 Fix Pack 7 or later, but only in a single partition configuration (not pureScale).
Two new log record types were introduced at DB2 V10.5 Fix Pack 7 level in support of these enhancements –
- Database Migration Begin Log Record: written to mark the start of a database upgrade.
- Database Migration End Log Record: written to mark successful completion of a database upgrade.
A side effect of these changes is that procedures for management of active log records for recoverable database need to be different for DB2 V11.1 than they were for previous releases. Prior to V11.1, for recoverable databases, the UPGRADE DATABASE command renamed all log files in the active log path with the extension .MIG. After databases were upgraded successfully, you normally deleted all S*.MIG files manually since they were no longer under standard log management controls. DB2 V11.1 no longer renames log files from previous DB2 versions from S*.LOG to S*.MIG. This means that these log files will be still be acted upon by standard log management activities, including being removed if the database is dropped. In the event of having to reverse an upgrade, care must be taken to manually preserve any files in the active log directories before dropping the database. This will be pointed out in the recipes which follow. The general principles are as follows –
- For logs from V10.1 and V10.5 databases, any unarchived files prior to upgrade will be archived from uplevel and will not require any action.
- For logs from V9.7 database, any unarchived files will not be archived, and will be left in the old log directories. Users will have to back up, archive, and manage the files manually.
The db2cklog tool can assist in manual management of log files, to confirm what logs are from previous DB2 versions –
db2cklog CHECK 80 to 94 ARCHLOGPATH
The default path is the current directory. If a recovery log file is not from the current version, then a DBT warning message is returned.
Recipe for Recovering Through an Upgrade
For recoverable databases, as part of the upgrade procedure to DB2 V11.1, you must devise a recovery plan in case issues arise during the upgrade or post upgrade. The database upgrade is a recoverable operation if you have a backup image and log files from a pre-DB2 V11.1 release and log files from DB2 V11.1. Post upgrade you can restore the backup taken at the pre-DB2 V11.1 release and roll forward to a point in time before any failure.
It is important to remember the following two basic concepts –
- A online backup image from a pre-V11.1 database cannot be restored on a V11.1 system
- A DB2 V11.1 database cannot roll forward through log files created on a pre-V11.1 database.
After an upgrade to DB2 V11.1 has been completed, it is highly recommended that a new online database backup is created as soon as possible so that you do not have to rely on the through-release recovery procedure. This recovery should only be used for failures that occur in the (hopefully small) window between taking the pre-upgrade online backup and completing the post-upgrade online backup, and then only for major failures that can only be recovered by restoring from a recent backup e.g. storage failures or data corruptions.
Since an online backup file must be restored on the version of DB2 at which it was taken, you will either need to keep the software at this version installed or reinstall it if it has been removed. During the recovery process, you will re-migrate the DB2 instance and database to V11.1, so you also will need the V11.1 software to be present. Once the database comes out of roll forward pending state, then all post upgrade tasks should be re-examined to make sure that the database is ready to accept new workload, e.g. packages may need to be rebound. If the database configuration parameter LOGINDEXBUILD was not on during the initial upgrade, then during roll forward catalog indexes are marked bad. The first connect recreates these indexes which might create a delay before data can be accessed.
The recipe could also be used in a cold standby system to recover a database after upgrade and before the new full database online backup completes. The cold standby system should remain at V10.5 Fix Pack 7 or later until the new full database online backup completes on main system. To use the recipe for this scenario, start at Step 9.
This recipe relies on having a valid pre-DB2 V11.1 backup image and all valid log files for all databases.
Because the process involves instance level operations, if you have multiple databases in an instance, they will all have to be recovered and rolled forward if you are carrying out this recovery in situ. COMPLETE A STEP FOR ALL DATABASES IN THE INSTANCE BEFORE MOVING ON TO THE NEXT STEP.
- (instance owner) It is IMPORTANT to protect log files! This is necessary as some or all of log files in active or mirror log paths may not be archived and dropping the database (Step 3) deletes those log files. To get current value of log paths, use “db2 GET DB CFG FOR | grep -i log”. Copy all logs from these paths to another location. Retain this copy until the recovery procedure has completed and a new online full database backup including log files has been completed and verified.
- (instance owner) Record database paths for all databases to be restored. This information is available using “db2 LIST DB DIRECTORY FOR ”.
- (instance owner) Drop database(s) that are to be recovered. If multiple databases exist in the instance and you are performing the recovery in situ, you will need to drop and recover all databases. Use the command “db2 DROP DB ”.
- (instance owner) Stop the V11.1 instance (db2stop command).
- (root on Linux or Windows) Drop the DB2 V11.1 instance using . (Administrator on Windows) Uninstall your DB2 copy through Control Panel > Programs.
- (root on Linux or UNIX) : recreate the instance at pre-V11.1 level using . (Administrator on Windows) Reinstall your DB2 software at pre-V11.1 level using db2setup.
- (instance owner) If required restore instance (DBM CFG) parameter values by using “db2 UPDATE DBM CFG USING ”.
- (instance owner) Start the instance using db2start.
- (instance owner) Restore the database(s) from the pre-DB2 V11.1 backups into the recorded database paths from Step 2 using “db2 RESTORE DB ”
- (instance owner) Roll forward database(s) to a point in time before the failure. Ensure the log archive location is available. Then supply the saved log files (from Step 1), copying the log files back into the active log paths or the overflow log path supplying the OVERFLOW LOG PATH parameter as part of the roll forward. Do not use “AND STOP” on the roll forward. A typical command will be “db2 ROLLFORWARD DB ”. If the roll forward returns an SQL2463N or SQL2464N, it means that the end of release has been reached i.e. the point where the UPGRADE DATABASE command was issued to convert the database to V11.1 format.
- (instance owner) Stop the DB2 instance using db2stop.
- (instance owner) Uncatalog the database(s) using “db2 UNCATALOG DB ”. Make sure before doing this that you have noted the DBPATH, as you will need this later.
- (root on Linux or UNIX) upgrade the instance to DB2 V11.1 using “”. (Administrator on Windows) Upgrade instance to DB2 V11.1 using “db2iupgrade ”.
- (instance owner) Start the DB2 instance using “db2start”.
- (instance owner) Re-catalog database(s) with recorded database paths and authentication (from Step 1) using “db2 CATALOG DB ”.
- (instance owner) For database(s) that need recovery, continue the roll forward until complete using “db2 ROLLFORWARD DB ”.
- Once database(s) come out of roll forward pending state, carry out all usual post upgrade tasks. A high priority should be to obtain a post-upgrade full offline backup to provide a suitable recovery point at V11.1 level.
Upgrading an HADR Environment
For DB2 single-partition users upgrading from DB2 V10.5 Fix Pack 7 or later, HADR environments can be upgraded without the need to re-initialize the standby database after performing an upgrade on the primary database. Re-initialization of the standby is still an option if the user wants, but is no longer the recommended option.
For DB2 pureScale environments upgrading from DB2 V10.5 (all Fix Packs) it is still required to stop HADR before upgrading and re-initialize the standby post-upgrade. This is also the case for single partition databases where either the primary or standby is at a level less than DB2 V10.5 Fix Pack 7.
The following recipes can be applied to HADR databases in Enterprise Server Edition environments that use either single or multiple standby configurations. The HADR upgrade procedure requires that one system/instance be identified to contain all primary databases, and a second system/instance be identified to contain all standby databases. Where necessary, issue graceful takeover to achieve this layout of primary and standby databases before commencing the upgrade.
The HADR primary and standby maintain their database roles during the upgrade procedure. The procedure depends on the DB2 V10.5 Fix Pack 7 or later, the primary database having shipped all log data to the DB2 V10.5 Fix Pack 7 or later standby databases, and the standby databases having replayed all log data received. The DB2 V11.1 standby database cannot replay log data from earlier DB2 versions. The procedure enforces this restriction through db2ckupgrade that is run during db2iupgrade or any db2ckupgrade invocation that establishes an exclusive connection to the database to do proper database consistency checking. Note that db2ckupgrade is no longer just a simple “check” function, but also causes log shipping actions in preparation for upgrade to take place.
Recipe for Upgrading HADR Databases with One Standby
- Review the DB2 software levels using “db2level”, ensuring both the primary and standby database instances are at the same DB2 V10.5 Fix Pack, which must be greater than Fix Pack 7.
- If there is more than one database in the instance, perform role switch if necessary using “db2 TAKEOVER HADR ON DB ” so that all primaries are on the same system during upgrade.
- Ensure you are familiar with steps involved in upgrading a DB2 instance and DB2 databases.
- Ensure both the primary database’s log shipping functionality and the standby database's log replaying functionality is working properly. Use “db2pd -db ” on each database on both instances (primary and standby) to check everything is fine.
- If using the HADR ROS (Read on Standby) feature, ensure that DB CFG parameter LOGINDEXBUILD=ON so that index recreation done during upgrade is sent to standby for replay. This allows read connections to resume post upgrade on standby.
- In case of failures during the HADR upgrade procedure, ensure you are familiar with the Knowledge Center topic "Dealing with failures while upgrading DB2 servers in HADR environments (10.5 Fix Pack 7 or later)".
- All DB2 upgrades, hardware upgrades, and software upgrades should be implemented in test environment before being applied to production system.
- PRIMARY: Monitor HADR databases to reduce the chance of upgrade failures. Ensure primary log shipping and standby log replay are operating and are not lagging using “db2pd -db or “SELECT … FROM MON_GET_HADR”.
- STANDBY: Verify that DB CFG parameter HADR_REPLAY_DELAY=0. This ensures that the standby's log replay position can catch up to the primary's log position in a reasonable amount of time.
- PRIMARY: Ensure log shipping is completed on primary databases and data is transferred to standbys by deactivating the primary databases using “db2 DEACTIVATE DATABASE
- PRIMARY: Ensure applications do not connect so no new log data is generated and standby's log replay position eventually matches primary's log shipping position by stopping the primary instance using “db2stop”.
- PRIMARY: Upgrade the primary DB2 instance using “db2iupgrade ”. The db2iupgrade command calls db2ckupgrade to verify that primary databases are ready for the upgrade to take place. DO NOT BYPASS db2ckupgrade with the “-b” option! The db2iupgrade does not run if db2ckupgrade reports errors. Check the log file if any errors are encountered. For each HADR primary database, db2ckupgrade verifies that there is a valid standby database which can be connected to. Once connection is established, log shipping begins and ships any pending log data, if necessary. The db2ckupgrade then verifies that log shipping position on primary matches log replay position on standby.
- STANDBY: Ensure log shipping is completed on each primary database and data is transferred to standby by deactivating all databases using “db2 DEACTIVATE DATABASE ”.
- STANDBY: Stop the standby instance using “db2stop”.
- STANDBY: Upgrade the standby instance using “db2iupgrade
- STANDBY: Start the standby instance using “db2start” and then upgrade each database using “db2 UPGRADE DATABASE ”. This upgrades metadata objects and, if log validation succeeded in earlier steps, standby starts up and waits for a connection from primary. Replay functionality begins in the background and waits for upgrade log data to be received from the primary. Standby is considered "upgrade in progress". The “UPGRADE DB” commands returns SQL1103W. No user connections allowed while in “upgrade in process” state (any connection attempts will receive SQL1776N RC=9). Progress on the standby can be monitored using “db2pd -db ” in conjunction with DB2 diagnostics log.
- PRIMARY: Upgrade each primary database using “db2 UPGRADE DATABASE ”. Once the primary database's metadata has been upgraded, the primary database looks to connect to the standby within an HADR timeout window. Once a connection has been established, the database upgrade begins and log data is sent to the standby for replay.
- PRIMARY: Upgrade has now been completed. Start using upgraded DB2 11.1 primary databases by activating them using “db2 ACTIVATE DATABASE ”.
- Perform recommended post-upgrade tasks and verify the upgrade has been successful. It is recommended to perform a full online backup of all databases as soon as possible.
Upgrading HADR Servers with Multiple Standbys
With multiple standby databases, you have the added flexibility of upgrading all standby databases together to DB2 V11.1 or leaving some auxiliary standby databases at DB2 V10.5 Fix Pack 7 or later until the primary and principal standby have completed the upgrade procedure in case of upgrade complications. Below we present recipes for each of these methods.
Method 1 : Upgrading all HADR Standbys Together
- Follow Steps 1 through 11 as per the single standby recipe above. However, for all actions related to the standby perform these on all standbys.
- STANDBYS: Ensure that all standby databases are active using “db2pd -db ”.
- PRIMARY: Upgrade the primary DB2 instance using “db2iupgrade ”. The db2iupgrade calls db2ckupgrade to verify each primary database is ready for upgrade. The db2iupgrade does not run if db2ckupgrade reports errors. Check log file if errors are encountered. For HADR primary databases from 10.5 Fix Pack 7 or later, db2ckupgrade verifies that a valid standby database can be connected to for each primary database. Once an HADR connection between a primary and standby database has been established, log shipping begins and ships any pending log data, if necessary. After this has been completed, the log positions on the primary and standby databases are validated. If log positions do not match for some database, then db2iupgrade/db2ckupgrade fails. If a failure occurs, ensure log shipping/replay functionality is still healthy for that database. If it is, then increase DB CFG parameter HADR_TIMEOUT for that database to give the log validation checking more time for the log positions to match. If any standby is unable to catch up to the primary within a reasonable time, remove that standby from the DB CFG parameter HADR_TARGET_LIST. Any standbys which have been removed will need reinitialized post upgrade using a backup image. For any given database, if there are issues getting log positions to match for principal and auxiliary standbys, then HADR has to be stopped on that database which can then be upgraded and HADR reinitialized from a backup.
- STANDBYS: Ensure log shipping is completed on primary databases and data is transferred to standbys. Deactivate all databases using “db2 DEACTIVATE DATABASE ”. Then stop all standby instances using “db2stop”.
- STANDBYS: Upgrade all standby instances using “db2iupgrade ”. Upgrades can be run across all standby instances in parallel. The db2iupgrade command calls db2ckupgrade to verify that all standby databases are ready for upgrade.
- PRINCIPAL STANDBY: It is important that database upgrades are done STARTING WITH THE PRINCIPAL STANDBY. So start the principal standby instance using “db2start” and then upgrade each of the databases using “db2 UPGRADE DATABASE ”. This upgrades the metadata objects, and if log validation succeeded in earlier steps, the principal standby starts up and waits for an HADR connection from the primary. Replay functionality begins in the background and waits for upgrade log data to be received from primary. At this point, the principal standby is considered "upgrade in progress" and the UPGRADE DATABASE command returns “SQL1103W The UPGRADE DATABASE command was completed successfully”. No user connections are allowed while in the "upgrade in progress" state (any connection attempts will receive SQL1776N RC=9). Progress on the standby can be monitored using “db2pd -db ” in conjunction with the DB2 diagnostics log.
- AUXILLARY STANDBYS: Repeat Step 6 for each auxiliary standby whenever convenient. However, it is KEY that all work the databases on the principal standby have been completed as per Step 6 before any auxiliary standby issues an UPGRADE DATABASE command.
- PRIMARY: For each primary database issue the “db2 UPGRADE DATABASE ” command. This upgrades a primary database’s metadata. If log validation succeeded in earlier steps, the primary database will attempt to establish an HADR connection with the principal standby. The upgrade will not proceed unless a connection can be established to a valid principal standby database. If it can, the normal upgrade processing will take place. All log data will be shipped to the principal standby database for replay. At this point, once primary is also able to establish an HADR connection with auxiliary standby databases, log data will also be shipped to these auxiliary standby databases for replay.
- PRIMARY: When the upgrade completes, activate the DB2 V11.1 primary database using “db2 ACTIVATE DATABASE ”.
- STANDBYS: When standby databases have replayed all upgrade log data, standbys are no longer considered in "upgrade in progress state". They will stay activated and normal operations will be resumed. Standby databases enabled for ROS (Read On Standby) will once again accept connections.
- Perform recommended post-upgrade tasks and verify the upgrade has been successful. It is particularly important to take a post-upgrade full online backup as soon as possible after the upgrade has been confirmed as successful.
Method 2 : Upgrading HADR Standbys Separately
- Follow Steps 1 through 11 as per the single standby recipe above.
- PRINCIPAL STANDBY: Ensure all standby databases are active using “db2pd -db ”.
- PRIMARY: Upgrade the DB2 instance using “db2iupgrade
- STANDBYS: To ensure log shipping is completed on primary databases and data is transferred to standbys, deactivate all databases using “db2 DEACTIVATE DATABASE ”. Then stop the instance using “db2stop”. This can be done across all standby instances in parallel.
- PRINCIPAL STANDBY: Upgrade the principal standby instance using “db2iupgrade ”. The db2iupgrade command calls db2ckupgrade to verify that all standby databases are ready for the upgrade.
- PRINCIPAL STANDBY: Start the principal standby instance using “db2start”. Then for each database in the principal standby instance issue “db2 UPGRADE DATABASE ”. This upgrades the metadata objects and, if log validation succeeded in earlier steps, the principal standby starts up and waits for an HADR connection from the primary. Replay functionality begins in the background and waits for upgrade log data to be received from primary. The principal standby is considered "upgrade in progress". The UPGRADE DATABASE command returns “SQL1103W The UPGRADE DATABASE command was completed successfully”. No user connections are allowed while in the "upgrade in progress" state (any connection attempts will receive SQL1776N RC=9). Progress on the standby can be monitored using “db2pd -db ” in conjunction with DB2 diagnostics log.
- PRIMARY: For each database on the primary instance, issue “db2 UPGRADE DATABASE ”. This upgrades the primary database’s metadata. If log validation succeeded in earlier steps, the primary database attempts to establish an HADR connection with the principal standby. Upgrade will not proceed unless this connection is established successfully. Normal upgrade processing takes place with all log data being shipped to the principal standby database for replay.
- PRIMARY: When upgrade on primary database has completed, activate the DB2 V11.1 database using “db2 UPGRADE DATABASE ”.
- PRINCIPAL STANDBY: When the principal standby has replayed all upgrade log data, it is no longer considered in “upgrade in progress” state. It remains activated and normal operations resume. Any standby database enabled for ROS (Read On Standby) will now allow connections once again.
- PRIMARY and PRINCIPAL STANDBY: Confirm that everything is functioning as expected.
- AUXILLARY STANDBYS: Repeat Steps 5, 6, 9 and 10 for each auxiliary standby to upgrade to DB2 V11.1.
- AUXILLARY STANDBYS: Verify auxiliary standbys are healthy.
- Perform recommended post-upgrade tasks.
Conclusion
The enhancements to the upgrade process for DB2 V11.1 have potential to greatly reduce the outage required during a typical upgrade. They also eliminate the exposure as a single point of failure for HADR systems which previously was experienced while standby databases were re-established. It is hoped that the recipes described in this article will help DB2 users make the most of this new functionality as they perform their upgrades to DB2 V11.1.