Log File Size Change and Database Recovery

I came across a tweet recently asking about how to change log file size in an HADR environment.  I made a quick response on Twitter but I thought I would write a little more detail here in a blog post.

In general, a change to the database cfg parm LOGFILSIZ requires the database to be bounced (DEACTIVATE/ACTIVATE) cleanly, meaning no inflight/open transactions (e.g. no indoubt transactions or async undo through DB2_ONLINERECOVERY).

In knowing that when there is any form of database recovery, such as crash recovery, database rollforward or HADR standby replay special handling can be taken.

LOGFILSIZ is one of those database cfg parameters that is stored at the database cfg level and also stored internally in the GLFH found under the database directory.  The GLFH value is used for all of database activation and throughout any database recovery operation.  When you update the database cfg value it is pending until the next clean activation where we then update the GLFH value to match the database cfg value.

I outline the handling of each of the database recovery operations below:

Crash Recovery

Since the database crashed, we know the database was already active so either the LOGFILSIZ change already took place or is pending.  So all of crash recovery will deal with a constant log file size based on the GLFH value and there will be no change during the redo or undo phases of crash recovery.  Once crash recovery's undo phase completes, we normally keep the database up.  If there is a pending LOGFILSIZ change that has not been picked up yet then before completing crash recovery we ask the question: "Are there any inflight/open transactions?".  If not, we then pickup any pending logging database cfg parm changes, including LOGFILSIZ and update the GLFH values.  If there are any inflight/open transactions then the pending database cfg parm stays pending until the next clean activation of the database.

Database Rollforward

I call database rollforward, blind replay.  That is because we start in the past and move forward blindly not exactly knowing what we will see next.  Therefore database rollforward is dictated by the log file size it sees in each log file it needs to replay log data from.  Because we know that a log file size change can only happen on a clean activation, we know that no inflight/open transactions exist at the time we want to replay the first log record in a new log file size.  When this is detected, we move up the recovery replay position to this first log record and update the GLFH value.  We repeat this for as many times we see a log file with a different log file size value.  At the end of database rollforward the database is deactivated.  On the next activation, if there are no indoubt transactions and the LOGFILSIZ database cfg value differs from the value in the GLFH then we update the GLFH value to match the database cfg value.  If there are indoubt transactions then we start crash recovery and the LOGFILSIZ change stays pending.

HADR Standby Replay

In an HADR environment, the standby is like a mirror of the primary trying to stay in sync.  But, Db2 does not replay database cfg updates done on the primary.  The user is responsible for issuing UPDATE DB CFG commands on both sides to ensure they match in case of any takeover.  Yes, I know people hate that and there is an open RFE for this but that is the current process as of 11.5.  When the update is made on the primary and after a DEACTIVATE and clean activation the new log file size will take affect on the primary.  At that time, the log files with the new size will be shipped over to the standby.  The standby will receive these log files and because it knows that a clean activation was done it will see the new size and the standby will start pre-allocating new empty files at the new size.  Then when standby replay sees the first log record in a log file with the new size, just like database rollforward we will update the recovery replay position to this first log record and update the GLFH value.  We repeat this for as many times we see a log file with a different log file size value.  Now a takeover happens on the standby.  Since this role change from standby to primary keeps the database activated the GLFH value prevails for the time being.  But, once this new primary deactivates and re-activates clean the database cfg value will be compared to the GLFH value and any differences will cause the GLFH value to be updated.  So the moral of the story is when issuing UPDATE DB CFG LOGFILSIZ on the primary, it's best to try to do the same thing on your standby at the same time to prepare for takeover.  But, while in replay mode there is no affect to the replay.

Recent Stories
Db2 Advanced Copy Services (ACS)) for snapshot backup/restore with IBM Spectrum Protect Snapshot

New link to the HADR wiki

Log File Size Change and Database Recovery