One of the most painful experiences in the work life of a database administrator (DBA) is probably the accidental drop of a database object on a production subsystem. Most IT shops have procedures to prevent this occurrence, yet it still happens. When it does happen, it is crucial to recover the object into its original state without any loss of data as soon as possible. Every minute during which your applications cannot access data harms your business. These situations are very stressful for a DBA and the recovery process is exposed to many human errors.
This article describes a specific situation in an IBM® DB2® for z/OS® (DB2) environment, where a production table was dropped by a DB2 SYSADM. First, we discuss the implications of dropping a table. Then we cover the steps are necessary for recovery. Finally, we discuss tools you can use to simplify this task.
Implications of dropping a table
Let’s take a close look at what happens when the DROP TABLE MYTBL statement is issued. Besides applications that start failing and losing all data stored in the dropped table, DB2 may also drop more objects. DB2 holds information about all objects in the DB2 catalog. When you issue a drop statement, DB2 starts deleting rows from the catalog that belongs to a dropped table and to any other object that depends on the deleted table. The table below summarizes affected objects:
Dropped table recovery
In general, one of the biggest problems when performing a DB2 recovery is that a DBA does not do it frequently. If a recovery was needed every day, it would indicate that there is a much more serious problem in the environment. Beyond that, dropped object recovery is nearly impossible using only the base utilities provided with DB2. We will now deal with what information is required, how to get it, what steps are necessary to recover from a dropped table situation and where the base utilities are insufficient.
It is highly recommended that before starting a dropped table recovery, create a recovery worksheet. It can be a simple table, such as below, in which you will gradually fill in useful information:
It could also be more complex and structured document. Whichever worksheet type you choose, it is a good starting point as you may repeat some steps more times or you may provide a report with this information afterwards. A worksheet helps keep important information available for whatever reason. It also helps keep you focused during a challenging and stressful experience.
The first required piece of information is the original DDL, with all ALTERs if possible. If you do not have the DDL and you cannot find it, a log analysis tool would be a great help in this situation as you would be able to generate the lost DDL from the log. Using just the base tools, you have very limited options.
You can use the most recent image copy of the DB2 catalog, load it into shadow catalog tables and identify the DDL from there. Not so bad if you have a simple table, but it would be very difficult and error-prone when the object structure is more complex. However still better than running DSN1LOGP and trying to reconstruct the DDL from its output.
Once you have the DDL you can re-create the table and any indexes on the table.
The newly created object can have the same IDs as the old one, or not. It depends on what IDs are available at the time the object is re-created. You can use OBID in a CREATE TABLE statement if it does not identify an existing or previously used OBID from the database. Remark the object IDs for later use. For the new object, you can get this information from the catalog using the simple queries below:
- DBID, OBID for table:
- SELECT NAME DBID, OBID FROM SYSIBM.SYSTABLES WHERE NAME='MYDB.MYTBL' AND CREATOR='USERID';
- DBID, PSID for table space :
- SELECT NAME, DBID, PSID FROM SYSIBM.SYSTABLESPACE WHERE NAME='MYDB.MYTS' AND CREATOR='USERID';
To obtain IDs of the old object, locate the last full image copy. This might not be easy as all information from SYSCOPY and SYSLGRNX was also deleted. You can run the DSN1PRNT job with FORMAT and NODATA parameters and get the IDs from its output. Consider the following example for the universal tablespace:
- DSN1PRNT control cards:
//PRINTIT EXEC PGM=DSN1PRNT,
//STEPLIB DD DSN=DB2.SDSNLOAD,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSUT1 DD DSN=SSID.MYDB.MYTS.D2016209.T115036,
- Header page from DSN1PRNT output where HPGOBID field contains DBID and PSID, and HPGROID contains OBID
PGCOMB='00'X PGLOGRBA='119598D5EF0A'X PGNUM='00000000'X PGFLAGS='38'X
HPGOBID='19D30002'X HPGHPREF='000000B4'X HPGCATRL='00'X HPGREL='D6'X HPGZLD='00'X
HPGCATV='00'X HPGTORBA='000000000000'X HPGTSTMP='20160315083037558950'X
HPGSSNM='D10A' HPGFOID='0001'X HPGPGSZ='1000'X HPGSGSZ='0004'X HPGPARTN='0000'X
HPGZ3PNO='000000'X HPGZNUMP='00'X HPGTBLC='0001'X HPGROID='0005'X
With all IDs and a full image copy you have all information needed to set-up a DSN1COPY job with the OBIDXLAT and RESET options to recover the table from the full image copy. You will need to stop the table space beforehand, to allow DSN1COPY to access the DB2 dataset. Below is an example of DSN1COPY job card statements where the SYSXLAT DD contains pairs of old and new object ids from the illustration above (6611 as DBID, 2 as PSID and 5 as OBID).
//RECOVER EXEC PGM=DSN1COPY,
//STEPLIB DD DSN=DB2.SDSNLOAD,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSUT1 DD DSN=SSID.MYDB.MYTS.D2016209.T115036,DISP=SHR
//SYSUT2 DD DSN= SSID.MYDB.MYTS.I0001.A001,DISP=SHR
//SYSXLAT DD *
Use DSN1COPY with the OBIDXLAT and RESET options to apply any incremental image copies. Alternatively to recovery with DSN1COPY, you can UNLOAD from the image copy and LOAD into a new table. Note that this cannot be done with the base utilities provided with DB2. It is recommended creating a new image copy when you successfully finish recovery from the old image copy(s).
After this step, your object is recovered to a point in time of the last image copy. You are still missing all changes that have been done from the time of the last image copy until the table was dropped. A log analysis tool that can redo the changes from the DB2 log to recover also those missing changes can be a great help. Once you are done also with this task you can start the table space again. If you have not made a new full image copy yet I would strongly recommend doing so.
Once your SELECT statement to quickly check the table is fine, you are out of the woods, but not finished. There are still couple steps that may be required - rebuild the indexes, run a runstats, rebind invalidated packages, or re-creating the objects that are dependent on the recovered table.
Let’s summarize the steps required to perform dropped table recovery:
- Locate the original DDL of the dropped object and re-create it
- Gather the IDs of newly created objects from the DB2 catalog
- Find the most recent full image copy of the object
- Use the DSN1PRNT job and last full image copy to get old object IDs
- Recover the object from the last image copy with OBID translation
- Regenerate and apply all changes applied since the last image copy was taken until the object was dropped
- Rebuild the indexes
- REBIND packages
- Recreate other affected objects
These steps may require slight modifications depending on your scenario. Table space types, whether the dropped table contained an LOB column or an XML column, and other variables.
It is easy to see how time-consuming and error prone the manual recovery of an accidentally dropped object can be. It often takes hours to collect the necessary information and set-up all the jobs. It is also near to impossible to recover from such a situation using the base utilities provided with DB2. Even with the help of more advanced utilities, a manual approach is complicated, even more so when you need to recover a database with many objects.
There are also many physical factors that affects a speed of a recovery process. The speed of the discs where your image copies and log data sets reside, how many archived logs will be needed to reconstruct data changes and many more. But there is one that enormously affects the speed of a log based recovery. It is the Data Capture option. If your table is defined with the DATA CAPTURE NONE, which is the default, DB2 does not register the whole row image in the log for update statements, only the first changed byte of the updated row to the last one. When compression is in place the log record might get bigger. This is fine for reconstructing inserts and deletes, but a different story when it comes to updates. A complete before and after image of the row is required in order to reconstruct an appropriate SQL statement of any data change of the row from the log. In general you have three options to get the complete information for an update statement. Use purely log and locate the log record which represents an insert of that row and combine it with all subsequent updates. The point in the log which contains the insert record might be very far from current so it might take long time to reach it. But rather it is likely that the log dataset is deleted already. Or you can read the current data from the active page set, read the log from the current point in time backwards and apply necessary changes. This could be beneficial if the point in time of interest is not very far from the current. The further you have to go back the longer it will take. Or as a third option you can get the data from the last full image copy, read the log forward and apply necessary changes. But there is more time required to merge the information from the log records and from the image copy. Any of these techniques requires additional time, which might be quite significant depending on the situation, to complete the full row image which is necessary for the recovery process. Hence a very good trade-off is using Data Capture Changes. This option impose an increased size of log records for update statements. However numerous benchmarks have been performed over time and in most cases the extra logging is minimal as well as CPU overhead.
Don’t worry, there is a solution!
There are several vendor tools on the market that automate dropped object recovery to some degree. Let’s take a look how CA Log Analyzer can help you recover a dropped object in a few easy steps. It reduces the time needed for data collection and jobs preparation from hours to minutes.
You need only three things to set-up dropped object recovery with CA Log Analyzer™ for DB2 for z/OS:
- Object Type
- Object Name
- Approximate time range when the object was dropped
That’s all, once you know what was dropped and when, fill this information on the Dropped Object Recovery main menu and CA Log Analyzer will generate and submit a job that:
- Generates UNDO DDL and runs generated SQL to recreate the object
- Determines DBID, PSID and OBID of the old and new objects
- Generates the recovery step and recovers the object from last image copy
- Generates REDO DML and runs generated SQL to apply data changes since the last image copy
- Rebuild indexes
- Recovers STATS from the catalog
- or generates a step to run RUNSTATS if desired
- Generates REBIND step to rebind all invalidated packages.
In short, CA Log Analyzer automates the steps mentioned above required to recover an accidentally dropped object to a point in time when the object was dropped. If you are a geek and want to have a full control over the process, this option is also available.
Watch this short video to see the recovery of a dropped database using CA Log Analyzer in action:
Keep in mind that no tool can save you if you are not well prepared for accidental situations and have prepared possible recovery scenarios, but if your backup and recovery procedures are correctly set and you have a tool such as CA Log Analyzer, you reduce the time of outage and loss of critical data to a minimum. As demonstrated in this article, a DBA using CA Log Analyzer can reduce the time needed to setup recovery jobs to recover a dropped object from hours to minutes.