db2 compression not quite compatible with replication, especially on capture side?

Rui Chen

db2 compression not quite compatible with replication, especially on capture side?

Hi DB2 experts, I have some questions about combining compression and replication. We are evaluating the risk of turning on DB2 compression on the capture side.

From the KC article (https://www.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.admin.trb.doc/doc/c0053598.html), it sounds like there is a risk that the db2ReadLog API cannot decompress a log record when the table-level dictionary is lost. Is there any best practice we could follow to avoid getting caught in such a situation? The answer could very well be "don't use compression on the capture side"...

 

We don't use LOAD or REORG, and we understand there are options to keep the old dictionary, but we haven't found definitive answers in the KC about how the old dictionary is kept across some admin operations (e.g., ADMIN_MOVE_TABLE, adding a column to a compressed table). We could test things out ourselves (SYSPROC.ADMIN_GET_TAB_DICTIONARY_INFO looks like a good option; see the sketch below), but we're wondering if there are already some references we could use.
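In case it helps, a minimal sketch of the check we have in mind. Schema and table names are placeholders, and I'm assuming the table function takes the schema and table name as its parameters; the exact signature and column list should be verified against the KC:

    -- Inspect the compression dictionary state for one table,
    -- e.g. before and after an admin operation
    SELECT TABSCHEMA, TABNAME, HISTORICAL_DICTIONARY
    FROM TABLE(SYSPROC.ADMIN_GET_TAB_DICTIONARY_INFO('MYSCHEMA', 'MYTABLE')) AS T;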

 

Also, we plan to use adaptive compression, which updates the page-level dictionaries automatically. From our understanding, the page-level dictionary shouldn't matter, since it won't affect the contents of the transaction log. Is that impression correct, or should we actually worry about page-level dictionaries being lost as well? (The DDL we have in mind is sketched below.)
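For context, a sketch of the DDL, with placeholder names; COMPRESS YES ADAPTIVE is the 10.5 syntax for adaptive compression, and COMPRESS YES STATIC for classic row compression with only the table-level dictionary:

    -- Adaptive compression: table-level dictionary plus page-level dictionaries
    ALTER TABLE MYSCHEMA.MYTABLE COMPRESS YES ADAPTIVE;

    -- Classic row compression: table-level dictionary only
    -- ALTER TABLE MYSCHEMA.MYTABLE COMPRESS YES STATIC;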

 

Last but not least, we use LUW 10.5 FP8.

 

Thanks for your help.

 


Wayne Campbell

db2 compression not quite compatible with replication, especially on capture side?
(in response to Rui Chen)
We replicate compressed tables all the time, and we also use IBM’s log analysis tool on these compressed tables. I thought the rows were uncompressed in the log, but you would need someone smarter than me to verify that.


Wayne Campbell
DB2 DBA
Administrative Office of the Courts
(360) 705-5268

Rui Chen

RE: db2 compression not quite compatible with replication, especially on capture side?
(in response to Wayne Campbell)

Thanks for your reply, Wayne. I'm assuming the log record format changes because, according to the KC article, it sounds like the dictionary is needed to decode the record.

The more concerning part is that the table-level dictionary sounds like a single point of failure for the db2ReadLog API. If we lose the storage device hosting the dictionary, we pretty much need to reload the whole table at the target, which is not desirable....

Wayne Campbell

db2 compression not quite compatible with replication, especially on capture side?
(in response to Rui Chen)
As I understand it, the compression dictionary is in the first pages of the tablespace. It is also in the image copy of the tablespace, so the information can be recovered from there.
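For reference, a sketch of what taking such a copy might look like on LUW; database name, tablespace name, and path are placeholders:

    -- An online tablespace backup includes the table data and the
    -- compression dictionary stored with it
    BACKUP DATABASE MYDB TABLESPACE (TS_DATA) ONLINE TO /backup/path;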

Wayne Campbell
DB2 DBA
Administrative Office of the Courts
(360) 705-5268

Jim Tonchick

db2 compression not quite compatible with replication, especially on capture side?
(in response to Rui Chen)
What does your CDC software vendor say in their documentation about capturing data from compressed tables?

A good CDC product should be able to handle compressed tables as long as they use DB2-supported compression, not a third-party compression tool. The CDC product should not be dictating how your data is configured.

Jim Tonchick



Rui Chen

RE: db2 compression not quite compatible with replication, especially on capture side?
(in response to Jim Tonchick)

I agree with your point, Jim, but in this case it sounds less relevant which CDC vendor software is used (we actually use Q Replication 10.2.1 for LUW), since CDC still calls the db2ReadLog API.

As mentioned earlier, the part that makes us wary of introducing compression on the capture side is how to cope with the loss of the table-level dictionary, which sounds like a (non-recoverable?!) single point of failure for any replication solution that uses the db2ReadLog API.

Also curious: is it possible to recover the table-level dictionary by rolling forward through the logs, to deal with a storage device failure (meaning we wouldn't trigger a table-level dictionary rebuild)?
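To make the scenario concrete, the recovery sequence we have in mind would look roughly like this; database name, tablespace name, path, and timestamp are placeholders:

    -- Restore the tablespace holding the table (and its dictionary)
    -- from the most recent backup image ...
    RESTORE DATABASE MYDB TABLESPACE (TS_DATA) ONLINE FROM /backup/path TAKEN AT 20180524120000;

    -- ... then roll forward through the logs to bring it back to current
    ROLLFORWARD DATABASE MYDB TO END OF LOGS AND COMPLETE TABLESPACE (TS_DATA) ONLINE;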

Rui Chen

RE: db2 compression not quite compatible with replication, especially on capture side?
(in response to Wayne Campbell)

Thanks Wayne, that's a very helpful detail!

So it sounds like there's an opportunity to recover the dictionary from a previous backup image, assuming the dictionary is unchanged since that backup was taken.

It actually would be nice if the log record were unchanged regardless of the compression config (which could very well be true, as I haven't found any definitive answer on it....).


Jeff Goss

RE: db2 compression not quite compatible with replication, especially on capture side?
(in response to Rui Chen)

I believe that in the case you are considering, you would have to recover the source database before continuing with the read log API. We store old dictionaries for these cases, so the API can continue to work after the source table has a new dictionary built.

Page compression isn't a factor here, as the log records for these tables are written using only table-level compression, even though the actual record on disk could be compressed with both the table-level and page-level dictionaries. Those page-level dictionaries can change at any time, so the API can't depend on finding them.
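As an aside, one way to see which mode a given table uses is the catalog; I believe ROWCOMPMODE in SYSCAT.TABLES reports 'A' for adaptive and 'S' for static in 10.5, but that is worth double-checking against the KC:

    -- List compressed tables and whether they use adaptive or static compression
    SELECT TABSCHEMA, TABNAME, ROWCOMPMODE
    FROM SYSCAT.TABLES
    WHERE ROWCOMPMODE IN ('A', 'S');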

Rui Chen

RE: db2 compression not quite compatible with replication, especially on capture side?
(in response to Jeff Goss)

Hi Jeff, I really appreciate your input!

So it sounds like the old dictionary can be restored, instead of rebuilt, by a regular DB2 restore, which is great! That saves the user from having to reload the target table on the replica in case of dictionary loss caused by a storage device failure. However, I'm still not sure about a couple of things, and hope you can share some insights :)

  1. Can we recover the "current dictionary" by rolling forward through the logs, given that the dictionary was rebuilt after the most recent backup? Our concern is that if the dictionary is rebuilt from sampling, the "current dictionary" may no longer be reproducible, and there would then be three dictionaries involved in the transaction log: the "old dictionary" in the backup file, the "lost rebuilt dictionary" (this would be the "current dictionary"), and the "recovered rebuilt dictionary". If I read the KC correctly, DB2 keeps only a single historical dictionary, which would prevent us from restarting capture from an earlier point in time that needs the "old dictionary".

  2. In general, what's the best practice for controlling dictionary rebuilds, if there's ever a valid reason to do one? Or should we avoid rebuilds altogether, especially if page-level compression already provides a sufficient compression ratio?
     
  3. Kind of related to the first question in this post: if we ever need to rebuild a dictionary without downtime, how do we know we still have the historical dictionary we need? We understand there's a HISTORICAL_DICTIONARY column in SYSPROC.ADMIN_GET_TAB_DICTIONARY_INFO, which is a boolean value, and that only one historical dictionary is kept (if I read the KC correctly...). Most likely we are interested in ADMIN_MOVE_TABLE (we don't use LOAD or REORG); assuming we have to run it multiple times in a short period of time (because of some failure, or, e.g., an operator error), how do we ensure the historical dictionary we need is always available? (A sketch of the check we have in mind follows below.)

The context is that we may need to restart capture from an earlier point in time, across a dictionary rebuild.
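For illustration, a sketch of the check around the move; names are placeholders, the ADMIN_GET_TAB_DICTIONARY_INFO signature is assumed to take schema and table name and should be verified against the KC, and the ADMIN_MOVE_TABLE call uses empty strings to take the defaults:

    -- Before the move: is there already a historical dictionary that the
    -- move would overwrite (if only one is kept, as the KC seems to say)?
    SELECT HISTORICAL_DICTIONARY
    FROM TABLE(SYSPROC.ADMIN_GET_TAB_DICTIONARY_INFO('MYSCHEMA', 'MYTABLE')) AS T;

    -- The online move itself; 'MOVE' runs all phases in one call
    CALL SYSPROC.ADMIN_MOVE_TABLE('MYSCHEMA', 'MYTABLE', '', '', '', '', '', '', '', '', 'MOVE');

    -- After the move: re-check, and make sure capture no longer needs log
    -- records older than what the remaining historical dictionary can decode
    SELECT HISTORICAL_DICTIONARY
    FROM TABLE(SYSPROC.ADMIN_GET_TAB_DICTIONARY_INFO('MYSCHEMA', 'MYTABLE')) AS T;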

Again, we are on LUW 10.5 FP8.

Thanks.
