Changing limitkey on a Partition (encore w/ problems)

Nick CIANCI

Changing limitkey on a Partition (encore w/ problems)
Hi,
Just thought I'd update the list with some issues we experienced
recently to do with adding in partitions, that we had not seen previously
or encountered in any of our test environments.

Our tale of woe starts with some 480M records across 26 Partitions for a
total of 56 compressed Gb of Table. The partitioning (clustering) index
accounts for 8Gb and is also the logical insert sequence so has high
natural clustering. There are also 6 x NPI's (DPSIs are not really an
option here) that occupy a total of ~170Gb, and their cluster ratio is
"bad". With the H/K window we have available several of the indices are
un-reorg-able.

Without going into the where's and why's we don't change stats (to keep the
Access Path's stable), but do run PDA (Plat DB Analyser) to collect stats
for H/K purposes. Unfortunately we had also been getting audit errors on
PDA, so our stats prior to the problem on these indices were a bit ...
old-ish!

In all of our re-partitioning the keys are structured so that it fills
partitions along an ascending timestamp, and in all cases the final
partition was still empty. so our strategy was to :
(1) ALTER TABLE xxx ALTER PARTITION nnn ENDING AT
('2010-03-01-00.00.00.000000') ; -- from '9999-12-31-24.00.00.000000'
(2) ALTER TABLE xxx ADD PARTITION ENDING AT
('2010-05-01-00.00.00.000000') ;
... repeat ad-nauseum
ALTER TABLE xxx ADD PARTITION ENDING AT (maxvalue) ;
(3) as all altered Partitions are empty the ReOrg should be very fast

Anyway on the night, of the 15 objects we were repartitioning, 14 went
through their REORG SHRLEVEL-NONE quickly and cleanly (including some with
NPIs) but the final one (described above) kept running ... and running ...
and running. In the end we had to call it quits. I won't go in to all the
gory details but essentially we created a clone structure, loaded the data
up, copied the Stats, renamed the tables, re-did the RI, and rebound all
affected packages. For the object we had the trouble with the unload /
reload phase went through blindingly quick, the Index build however didn't.
Some of the indices took between 10-50 minutes to apply 0 Key-RID pairs to
the NPI, however the 3 "bad" indices had shown no sign of completion in the
6 hours before we cancelled the job. We had OmegaMon object analysis
turned on and we saw a lot of sync I/O to the indices, and very little (but
not zero) async I/O.

It was our understanding that during the Unload-Reload phases, DB2 records
the Key-Rid pairs of the partitions it's organised and then updates the
NPI's with that. So empty partitions mean 0 Key-RID pairs, and should mean
things go fast ... yes? Well No! ... not necessarily! What I believe is
happening (although none of the manuals I've read gives you any clues here)
is that the Index is being read and checked against Key-RID list for
updates (instead of the other way around). Further due to the duration I
would have to surmise that it's a leaf page scan of some type. After
discussions with the application developers and being able to run RunStats
against the old object for the 3 "bad" indices we believe there is a high
incidence of index page splitting; in fact when I checked I was surprised
to see FAROFFPOS numbers of around 90% for these indices.

So the moral to this story is ... check your NPI's if you are
re-partitioning, and ReOrg the NPI's before hand.

BTW if anybody has good information on how the utilities work (the
futilities manual just doesn't cut the mustard anymore), particularly how
Key-RID pairs are updated in NPIs during a ReOrg, I'd be very interested to
get a copy..


Regards,
  Nick CIANCI


_______________________________________________________________________
From: IDUG DB2-L [mailto:[login to unmask email] On Behalf Of Mackey, Glenn
Sent: Thursday, November 05, 2009 11:41 AM
To: [login to unmask email]
Subject: [DB2-L] Changing limitkey on a Partition


Hi,
On a DB2 V8.1 z/OS environment.
I have a 700million row table with 24 partitions and the last partition is
getting full.
I plan to add some more partitions, but also decrease the 24th partition to
a lower value from its current high value.
I have tested all this and I know all the steps, except if something goes
wrong.


I plan to ….
1. Alter last partition and lower the limitkey. (It will be higher than the
highest value)
2. Table will be in REORP restricted state for the last partition.
3. Run the REORG with SHRLEVEL(NONE) on the last partition. SHRLEVEL(NONE)
is the only allowable option, cannot use SHRLEVEL(CHANGE)


If, for whatever reason the reorg fails and I have to back out, it looks
like I have one option and that is to reload the whole 700M rows from the
whole tablepace, not just the last partition.


I simulated a failure and my options were. (I took a backup of the last
partition and unloaded the last partition before starting the process.)


1. I cannot reload the last partition. Tried that. Get DSNU212I error. See
below
2. When run a recover on the last partition, it runs successfully still
leaves the partition in REORP status (and check pending) - back where I
started.


3. It seems the only option is to reload the whole tablespace. - 700M
rows. I wanted to avoid doing that.


So, if my reorg fails (cannot use SHRLEVEL CHANGE, I have to use SHRLEVEL
NONE), it looks like I need to reload the whole table - that means I have
to unload it beforehand also.


Any ideas on better ways for a recovery plan.


I am confident that all will go well, but I am preparing if for whatever
reason things do not go well and I have to recover the situation. If I do
not unload the 700m rows first, then I cannot recover in the worst
scenario.



Thank you
Glenn Mackey
GuideOne Insurance, DBA
Mail stop AB1
Page: 515.241.1627
Phone: 515.267.5767
------------------------------------------------------------


DSNU212I csect-name - REORG PENDING ON TABLE SPACE database.index-name
PROHIBITS PROCESSING
Explanation: An attempt was made to execute a utility against a table
space that is in a REORG PENDING state.
Severity: 8 (error)
System Action: Utility processing is not initiated.
User Response: Execute REORG with SHRLEVEL NONE against the partitions in
REORG PENDING state to rebalance the partitions and remove the REORG
PENDING state.

Robert Catterall

Re: Changing limitkey on a Partition (encore w/ problems)
(in response to Nick CIANCI)
Nick, a couple of things:

1) You mentioned looking at FAROFFPOS numbers in the SYSINDEXPART catalog
table for some of your NPIs. FAROFFPOS (actually, FAROFFPOSF) is an
indicator of tablespace organization, not index organization -- and it's a
valid indicator of tablespace organization only if the FAROFFPOSF value is
for the table's clustering index.

For indexes, the best indicator of organization is LEAFDIST, also in
SYSINDEXPART. It's an indicator of how many index leaf pages are between
logically successive leaf pages that should be right next to each other
(i.e., pages that would be adjacent to each other if the index were
perfectly organized). Index page splits cause an index's LEAFDIST value to
increase (if you update your catalog stats), especially if the index
definition has a non-zero value for FREEPAGE (because with no empty pages
within an index, entries shifted out of a split page have to be placed in a
page that's physically at the end of the index). As a general rule of thumb,
some folks look to reorganize an index if the associated LEAFDIST value
exceeds 200.

2) You also indicated that some of your NPIs are "un-reorg-able" due to (I
assume) the short duration of your maintenance window. Short-duration
windows are the norm these days, and at some sites they deal with that by
doing online REORGs of indexes (or tablespaces, for that matter) with a
specification of MAXRO DEFER (and the default of LONGLOG CONTINUE). That
will keep REORG from going into the final phases of the job: last iteration
of logged changes (for SHRLEVEL CHANGE), which drains write access to the
object being REORGed; and data set switch to what had been the shadow data
sets -- that drains read and write access). So, that online REORG job might
be submitted some hours before the start of the maintenance window, and --
assuming that update activity against the target object isn't super-high --
it'll get very close to completion without disrupting application access to
the object. Once the maintenance window has begun (and the target object is
no longer being accessed by application processes), an -ALTER UTILITY
command is issued to change MAXRO for the online REORG to some large number
(thousands of seconds, for example) so DB2 knows for sure that it can
initiate the final stages of the online REORG process. DB2 then completes
the job in a fairly short time (likely within the maintenance window),
because all it has to do is the last iteration of log processing to get the
original and shadow data sets in sync, and and then the data set switch.

Would such a process enable you to reorganize those NPIs that are considered
un-reorg-able at present?

*[Note:* in a DB2 for z/OS V8 environment, even this process (start with
MAXRO DEFER and then do -ALTER UTILITY to switch to MAXRO nnnn, with "nnnn"
being 9999 or some other large number) may not do the trick for a
partition-level online REORG of a subset of a partitioned tablespace's
partitions, if that tablespace has one or more NPIs -- this because the
BUILD2 phase of the utility might not complete within the maintenance window
if the partitions are very large. DB2 for z/OS V9 eliminates BUILD2 in that
situation (because NPIs get reorganized in their entirety in that case).
Some DB2 for z/OS V8 sites with enough spare disk space avoid BUILD2 by
reorganizing partitioned tablespaces in their entirety.]

Robert


On Fri, Jan 8, 2010 at 1:25 AM, Nick Cianci <[login to unmask email]> wrote:

> Hi,
> Just thought I'd update the list with some issues we experienced
> recently to do with adding in partitions, that we had not seen previously
> or encountered in any of our test environments.
>
> Our tale of woe starts with some 480M records across 26 Partitions for a
> total of 56 compressed Gb of Table. The partitioning (clustering) index
> accounts for 8Gb and is also the logical insert sequence so has high
> natural clustering. There are also 6 x NPI's (DPSIs are not really an
> option here) that occupy a total of ~170Gb, and their cluster ratio is
> "bad". With the H/K window we have available several of the indices are
> un-reorg-able.
>
> Without going into the where's and why's we don't change stats (to keep the
> Access Path's stable), but do run PDA (Plat DB Analyser) to collect stats
> for H/K purposes. Unfortunately we had also been getting audit errors on
> PDA, so our stats prior to the problem on these indices were a bit ...
> old-ish!
>
> In all of our re-partitioning the keys are structured so that it fills
> partitions along an ascending timestamp, and in all cases the final
> partition was still empty. so our strategy was to :
> (1) ALTER TABLE xxx ALTER PARTITION nnn ENDING AT
> ('2010-03-01-00.00.00.000000') ; -- from '9999-12-31-24.00.00.000000'
> (2) ALTER TABLE xxx ADD PARTITION ENDING AT
> ('2010-05-01-00.00.00.000000') ;
> ... repeat ad-nauseum
> ALTER TABLE xxx ADD PARTITION ENDING AT (maxvalue) ;
> (3) as all altered Partitions are empty the ReOrg should be very fast
>
> Anyway on the night, of the 15 objects we were repartitioning, 14 went
> through their REORG SHRLEVEL-NONE quickly and cleanly (including some with
> NPIs) but the final one (described above) kept running ... and running ...
> and running. In the end we had to call it quits. I won't go in to all the
> gory details but essentially we created a clone structure, loaded the data
> up, copied the Stats, renamed the tables, re-did the RI, and rebound all
> affected packages. For the object we had the trouble with the unload /
> reload phase went through blindingly quick, the Index build however didn't.
> Some of the indices took between 10-50 minutes to apply 0 Key-RID pairs to
> the NPI, however the 3 "bad" indices had shown no sign of completion in the
> 6 hours before we cancelled the job. We had OmegaMon object analysis
> turned on and we saw a lot of sync I/O to the indices, and very little (but
> not zero) async I/O.
>
> It was our understanding that during the Unload-Reload phases, DB2 records
> the Key-Rid pairs of the partitions it's organised and then updates the
> NPI's with that. So empty partitions mean 0 Key-RID pairs, and should mean
> things go fast ... yes? Well No! ... not necessarily! What I believe is
> happening (although none of the manuals I've read gives you any clues here)
> is that the Index is being read and checked against Key-RID list for
> updates (instead of the other way around). Further due to the duration I
> would have to surmise that it's a leaf page scan of some type. After
> discussions with the application developers and being able to run RunStats
> against the old object for the 3 "bad" indices we believe there is a high
> incidence of index page splitting; in fact when I checked I was surprised
> to see FAROFFPOS numbers of around 90% for these indices.
>
> So the moral to this story is ... check your NPI's if you are
> re-partitioning, and ReOrg the NPI's before hand.
>
> BTW if anybody has good information on how the utilities work (the
> futilities manual just doesn't cut the mustard anymore), particularly how
> Key-RID pairs are updated in NPIs during a ReOrg, I'd be very interested to
> get a copy..
>
>
> Regards,
> Nick CIANCI
>
>
> _______________________________________________________________________
> From: IDUG DB2-L [mailto:[login to unmask email] On Behalf Of Mackey, Glenn
> Sent: Thursday, November 05, 2009 11:41 AM
> To: [login to unmask email]
> Subject: [DB2-L] Changing limitkey on a Partition
>
>
> Hi,
> On a DB2 V8.1 z/OS environment.
> I have a 700million row table with 24 partitions and the last partition is
> getting full.
> I plan to add some more partitions, but also decrease the 24th partition to
> a lower value from its current high value.
> I have tested all this and I know all the steps, except if something goes
> wrong.
>
>
> I plan to ….
> 1. Alter last partition and lower the limitkey. (It will be higher than the
> highest value)
> 2. Table will be in REORP restricted state for the last partition.
> 3. Run the REORG with SHRLEVEL(NONE) on the last partition. SHRLEVEL(NONE)
> is the only allowable option, cannot use SHRLEVEL(CHANGE)
>
>
> If, for whatever reason the reorg fails and I have to back out, it looks
> like I have one option and that is to reload the whole 700M rows from the
> whole tablepace, not just the last partition.
>
>
> I simulated a failure and my options were. (I took a backup of the last
> partition and unloaded the last partition before starting the process.)
>
>
> 1. I cannot reload the last partition. Tried that. Get DSNU212I error. See
> below
> 2. When run a recover on the last partition, it runs successfully still
> leaves the partition in REORP status (and check pending) - back where I
> started.
>
>
> 3. It seems the only option is to reload the whole tablespace. - 700M
> rows. I wanted to avoid doing that.
>
>
> So, if my reorg fails (cannot use SHRLEVEL CHANGE, I have to use SHRLEVEL
> NONE), it looks like I need to reload the whole table - that means I have
> to unload it beforehand also.
>
>
> Any ideas on better ways for a recovery plan.
>
>
> I am confident that all will go well, but I am preparing if for whatever
> reason things do not go well and I have to recover the situation. If I do
> not unload the 700m rows first, then I cannot recover in the worst
> scenario.
>
>
>
> Thank you
> Glenn Mackey
> GuideOne Insurance, DBA
> Mail stop AB1
> Page: 515.241.1627
> Phone: 515.267.5767
> ------------------------------------------------------------
>
>
> DSNU212I csect-name - REORG PENDING ON TABLE SPACE database.index-name
> PROHIBITS PROCESSING
> Explanation: An attempt was made to execute a utility against a table
> space that is in a REORG PENDING state.
> Severity: 8 (error)
> System Action: Utility processing is not initiated.
> User Response: Execute REORG with SHRLEVEL NONE against the partitions in
> REORG PENDING state to rebalance the partitions and remove the REORG
> PENDING state.




--
Robert Catterall
Catterall Consulting
www.catterallconsulting.com

_____________________________________________________________________

* IDUG North America * Tampa, Florida, * May 10-14 2010 * http://IDUG.ORG/NA *
_____________________________________________________________________

http://www.IDUG.org membership is now free.
Do you have people in your office who are not an IDUG member?
Show them how to access the information and help train the next generation of DB2 Users!
_____________________________________________________________________

If you need to change settings, http://www.idug.org/cgi-bin/wa?A0=DB2-L is the home of IDUG's DB2-L

[login to unmask email]

Re: Changing limitkey on a Partition (encore w/ problems)
(in response to Robert Catterall)
Nick,

I'm also curious on what is DB2 doing during updating NPIs when reorg an
"empty" part.
I would suggest to run a DSN1LOGP agaisnt that TS/IX object and reorg
utility, maybe
a short time period is enough, to get some clues.
What I could guess now, is some pseudo_delete_entries clearance? I have no
idea about
that internal update NPI process details, hope some IBMer could shed a
light.


William Huang
ICBC Data Center (Shanghai)





Nick Cianci <[login to unmask email]>
·¢¼þÈË: IDUG DB2-L <[login to unmask email]>
2010-01-08 14:25
Çë´ð¸´ ¸ø
IDUG DB2-L <[login to unmask email]>


ÊÕ¼þÈË
[login to unmask email]
³­ËÍ

Ö÷Ìâ
Re: [DB2-L] Changing limitkey on a Partition (encore w/ problems)






Hi,
Just thought I'd update the list with some issues we experienced
recently to do with adding in partitions, that we had not seen previously
or encountered in any of our test environments.

Our tale of woe starts with some 480M records across 26 Partitions for a
total of 56 compressed Gb of Table. The partitioning (clustering) index
accounts for 8Gb and is also the logical insert sequence so has high
natural clustering. There are also 6 x NPI's (DPSIs are not really an
option here) that occupy a total of ~170Gb, and their cluster ratio is
"bad". With the H/K window we have available several of the indices are
un-reorg-able.

Without going into the where's and why's we don't change stats (to keep
the
Access Path's stable), but do run PDA (Plat DB Analyser) to collect stats
for H/K purposes. Unfortunately we had also been getting audit errors on
PDA, so our stats prior to the problem on these indices were a bit ...
old-ish!

In all of our re-partitioning the keys are structured so that it fills
partitions along an ascending timestamp, and in all cases the final
partition was still empty. so our strategy was to :
(1) ALTER TABLE xxx ALTER PARTITION nnn ENDING AT
('2010-03-01-00.00.00.000000') ; -- from '9999-12-31-24.00.00.000000'
(2) ALTER TABLE xxx ADD PARTITION ENDING AT
('2010-05-01-00.00.00.000000') ;
... repeat ad-nauseum
ALTER TABLE xxx ADD PARTITION ENDING AT (maxvalue) ;
(3) as all altered Partitions are empty the ReOrg should be very fast

Anyway on the night, of the 15 objects we were repartitioning, 14 went
through their REORG SHRLEVEL-NONE quickly and cleanly (including some with
NPIs) but the final one (described above) kept running ... and running ...
and running. In the end we had to call it quits. I won't go in to all
the
gory details but essentially we created a clone structure, loaded the data
up, copied the Stats, renamed the tables, re-did the RI, and rebound all
affected packages. For the object we had the trouble with the unload /
reload phase went through blindingly quick, the Index build however
didn't.
Some of the indices took between 10-50 minutes to apply 0 Key-RID pairs to
the NPI, however the 3 "bad" indices had shown no sign of completion in
the
6 hours before we cancelled the job. We had OmegaMon object analysis
turned on and we saw a lot of sync I/O to the indices, and very little
(but
not zero) async I/O.

It was our understanding that during the Unload-Reload phases, DB2 records
the Key-Rid pairs of the partitions it's organised and then updates the
NPI's with that. So empty partitions mean 0 Key-RID pairs, and should
mean
things go fast ... yes? Well No! ... not necessarily! What I believe is
happening (although none of the manuals I've read gives you any clues
here)
is that the Index is being read and checked against Key-RID list for
updates (instead of the other way around). Further due to the duration I
would have to surmise that it's a leaf page scan of some type. After
discussions with the application developers and being able to run RunStats
against the old object for the 3 "bad" indices we believe there is a high
incidence of index page splitting; in fact when I checked I was surprised
to see FAROFFPOS numbers of around 90% for these indices.

So the moral to this story is ... check your NPI's if you are
re-partitioning, and ReOrg the NPI's before hand.

BTW if anybody has good information on how the utilities work (the
futilities manual just doesn't cut the mustard anymore), particularly how
Key-RID pairs are updated in NPIs during a ReOrg, I'd be very interested
to
get a copy..


Regards,
Nick CIANCI


_______________________________________________________________________
From: IDUG DB2-L [mailto:[login to unmask email] On Behalf Of Mackey, Glenn
Sent: Thursday, November 05, 2009 11:41 AM
To: [login to unmask email]
Subject: [DB2-L] Changing limitkey on a Partition


Hi,
On a DB2 V8.1 z/OS environment.
I have a 700million row table with 24 partitions and the last partition
is
getting full.
I plan to add some more partitions, but also decrease the 24th partition
to
a lower value from its current high value.
I have tested all this and I know all the steps, except if something goes
wrong.


I plan to ¡­.
1. Alter last partition and lower the limitkey. (It will be higher than
the
highest value)
2. Table will be in REORP restricted state for the last partition.
3. Run the REORG with SHRLEVEL(NONE) on the last partition. SHRLEVEL(NONE)
is the only allowable option, cannot use SHRLEVEL(CHANGE)


If, for whatever reason the reorg fails and I have to back out, it looks
like I have one option and that is to reload the whole 700M rows from the
whole tablepace, not just the last partition.


I simulated a failure and my options were. (I took a backup of the last
partition and unloaded the last partition before starting the process.)


1. I cannot reload the last partition. Tried that. Get DSNU212I error. See
below
2. When run a recover on the last partition, it runs successfully still
leaves the partition in REORP status (and check pending) - back where I
started.


3. It seems the only option is to reload the whole tablespace. - 700M
rows. I wanted to avoid doing that.


So, if my reorg fails (cannot use SHRLEVEL CHANGE, I have to use SHRLEVEL
NONE), it looks like I need to reload the whole table - that means I have
to unload it beforehand also.


Any ideas on better ways for a recovery plan.


I am confident that all will go well, but I am preparing if for whatever
reason things do not go well and I have to recover the situation. If I do
not unload the 700m rows first, then I cannot recover in the worst
scenario.



Thank you
Glenn Mackey
GuideOne Insurance, DBA
Mail stop AB1
Page: 515.241.1627
Phone: 515.267.5767
------------------------------------------------------------


DSNU212I csect-name - REORG PENDING ON TABLE SPACE database.index-name
PROHIBITS PROCESSING
Explanation: An attempt was made to execute a utility against a table
space that is in a REORG PENDING state.
Severity: 8 (error)
System Action: Utility processing is not initiated.
User Response: Execute REORG with SHRLEVEL NONE against the partitions in
REORG PENDING state to rebalance the partitions and remove the REORG
PENDING state.

Nick CIANCI

Re: Changing limitkey on a Partition (encore w/ problems)
(in response to hhuang@DCCSH.ICBC.COM.CN)
Hi Robert,
thanks for your comments. BTW I still refer to your red-leather
bound (v3 I think it was) DB2 SQL performance manual from time to time (a
good read)!

Sorry you are right in my rush to put this out I got NEAR/FAROFFPOSF
mixed-up with LEAFNEAR/FAR, which was what I'd meant (... mia culpa ... mia
culpa!)

IXNAME LEAFDIST FAROFFPOSF NEAROFFPOSF PSEUDO_DEL_
LEAFNEAR LEAFFAR
X???02 376,013,778 +0.431586745E+09 +0.14847017E+08 897,738
375,590 9,416,166
X???04 1,248,325 +0.196795751E+09 +0.48310866E+08 107
40,840 52,885
X???06 477,073,116 +0.444317495E+09 +0.95139530E+07 4
372,748 10,598,105
X???07 10,887,526 +0.152976384E+09 +0.50046641E+08 42
15,568 167,384
X???08 523,471 +0.403860153E+09 +0.31472571E+08 1,528
447,966 6,839,964
X???09 75,187 +0.188297173E+09 +0.67205692E+08 4,432
503,119 3,637,959

FREEPAGE 0 for all indices + PCTFREE for index 8+9 was 10% the others were
1% FullKeyCard(F) for 7 is 115M & 9 is 390M the others are all ~480M rows.

Fro the Repartitioning SHR-NONE Part ReOrg of the NPI indices 4 & 7
completed inside of 20 minutes, 9 took ~50 minutes; it was index 2,6 & 8
that showed no sign of completion inside 6 hours.

We were able to and did ReOrG some of the indices (.. and you can pretty
much guess which they were). As for DEFERING the switch yes we have done
that in the past with some of our other indices, why we didn't with these
... I can't say. As mentioned our PDA stats for these objects wasn't
working very well, and I suspect that coupled with the administrivia to be
part of the issue. I can't say we experienced a lot of performance related
issues on the old indices, so I suspect we were getting good SQL filtering
coming through those indices without any prefetch requirements.

With respect to ReOrg the table, whilst we probably have the DASD to mirror
the object (sometimes), we don't have the SORTWORK pool to do so, which is
where our other problems lie.

In the new structure that is currently active I've set up permanent weekly
ReOrgs for all the NPIs with PCTFREE of 5% for index 4+7+9 and 10% for
index 2+6+8. Each of the biger index takes about 30-40 minutes, and fits
into our window now. I thought about FREEPAGE for a while, with a smaller
PCTFREE, but if the insert sequence is as random as it seems I belive the
higher PCTFREE to be the better option (so long as we don;t start splitting
pages). I might run a RUNSTATS UPDATE NONE REPORT YES to check on it once
in a while though.

William,
thanks I just saw your e-mail, but don't believe it's Pseudo_delete
related as Stats show index 6 had low Pseudo_Del RIDS but it still took
"forever".


Regards,
  Nick CIANCI




Robert Catterall
<[login to unmask email]
L.COM> To
Sent by: IDUG [login to unmask email]
DB2-L cc
<[login to unmask email]
ORG> Subject
Re: [DB2-L] Changing limitkey on a
Partition (encore w/ problems)
12/01/10 02:59 AM


Please respond to
IDUG DB2-L
<[login to unmask email]
2-L.ORG>






Nick, a couple of things:

1) You mentioned looking at FAROFFPOS numbers in the SYSINDEXPART catalog
table for some of your NPIs. FAROFFPOS (actually, FAROFFPOSF) is an
indicator of tablespace organization, not index organization -- and it's a
valid indicator of tablespace organization only if the FAROFFPOSF value is
for the table's clustering index.

For indexes, the best indicator of organization is LEAFDIST, also in
SYSINDEXPART. It's an indicator of how many index leaf pages are between
logically successive leaf pages that should be right next to each other
(i.e., pages that would be adjacent to each other if the index were
perfectly organized). Index page splits cause an index's LEAFDIST value to
increase (if you update your catalog stats), especially if the index
definition has a non-zero value for FREEPAGE (because with no empty pages
within an index, entries shifted out of a split page have to be placed in a
page that's physically at the end of the index). As a general rule of
thumb, some folks look to reorganize an index if the associated LEAFDIST
value exceeds 200.

2) You also indicated that some of your NPIs are "un-reorg-able" due to (I
assume) the short duration of your maintenance window. Short-duration
windows are the norm these days, and at some sites they deal with that by
doing online REORGs of indexes (or tablespaces, for that matter) with a
specification of MAXRO DEFER (and the default of LONGLOG CONTINUE). That
will keep REORG from going into the final phases of the job: last iteration
of logged changes (for SHRLEVEL CHANGE), which drains write access to the
object being REORGed; and data set switch to what had been the shadow data
sets -- that drains read and write access). So, that online REORG job might
be submitted some hours before the start of the maintenance window, and --
assuming that update activity against the target object isn't super-high --
it'll get very close to completion without disrupting application access to
the object. Once the maintenance window has begun (and the target object is
no longer being accessed by application processes), an -ALTER UTILITY
command is issued to change MAXRO for the online REORG to some large number
(thousands of seconds, for example) so DB2 knows for sure that it can
initiate the final stages of the online REORG process. DB2 then completes
the job in a fairly short time (likely within the maintenance window),
because all it has to do is the last iteration of log processing to get the
original and shadow data sets in sync, and and then the data set switch.

Would such a process enable you to reorganize those NPIs that are
considered un-reorg-able at present?

[Note: in a DB2 for z/OS V8 environment, even this process (start with
MAXRO DEFER and then do -ALTER UTILITY to switch to MAXRO nnnn, with "nnnn"
being 9999 or some other large number) may not do the trick for a
partition-level online REORG of a subset of a partitioned tablespace's
partitions, if that tablespace has one or more NPIs -- this because the
BUILD2 phase of the utility might not complete within the maintenance
window if the partitions are very large. DB2 for z/OS V9 eliminates BUILD2
in that situation (because NPIs get reorganized in their entirety in that
case). Some DB2 for z/OS V8 sites with enough spare disk space avoid BUILD2
by reorganizing partitioned tablespaces in their entirety.]

Robert

On Fri, Jan 8, 2010 at 1:25 AM, Nick Cianci <[login to unmask email]> wrote:
Hi,     Just thought I'd update the list with some issues we experienced
recently to do with adding in partitions, that we had not seen previously
or encountered in any of our test environments.

Our tale of woe starts with some 480M records across 26 Partitions for a
total of 56 compressed Gb of Table.  The partitioning (clustering) index
accounts for 8Gb and is also the logical insert sequence so has high
natural clustering.  There are also 6 x NPI's (DPSIs are not really an
option here) that occupy a total of ~170Gb, and their cluster ratio is
"bad".  With the H/K window we have available several of the indices are
un-reorg-able.

Without going into the where's and why's we don't change stats (to keep
the Access Path's stable), but do run PDA (Plat DB Analyser) to collect
stats for H/K purposes.  Unfortunately we had also been getting audit
errors on PDA, so our stats prior to the problem on these indices were a
bit ... old-ish!

In all of our re-partitioning the keys are structured so that it fills
partitions along an ascending timestamp, and in all cases the final
partition was still empty. so our strategy was to :
(1) ALTER TABLE xxx ALTER PARTITION nnn ENDING AT
('2010-03-01-00.00.00.000000') ;     --
from  '9999-12-31-24.00.00.000000'
(2) ALTER TABLE xxx ADD PARTITION   ENDING AT
('2010-05-01-00.00.00.000000') ;     ... repeat ad-nauseum     ALTER
TABLE xxx ADD PARTITION   ENDING AT (maxvalue) ;
(3) as all altered Partitions are empty the ReOrg should be very fast

Anyway on the night, of the 15 objects we were repartitioning, 14 went
through their REORG SHRLEVEL-NONE quickly and cleanly (including some
with NPIs) but the final one (described above) kept running ... and
running ... and running.  In the end we had to call it quits.  I won't go
in to all the gory details but essentially we created a clone structure,
loaded the data up, copied the Stats, renamed the tables, re-did the RI,
and rebound all affected packages.  For the object we had the trouble
with the unload / reload phase went through blindingly quick, the Index
build however didn't. Some of the indices took between 10-50 minutes to
apply 0 Key-RID pairs to the NPI, however the 3 "bad" indices had shown
no sign of completion in the 6 hours before we cancelled the job.  We had
OmegaMon object analysis turned on and we saw a lot of sync I/O to the
indices, and very little (but not zero) async I/O.

It was our understanding that during the Unload-Reload phases, DB2
records the Key-Rid pairs of the partitions it's organised and then
updates the NPI's with that.  So empty partitions mean 0 Key-RID pairs,
and should mean things go fast ... yes?  Well No! ... not
necessarily!  What I believe is happening (although none of the manuals
I've read gives you any clues here) is that the Index is being read and
checked against Key-RID list for updates (instead of the other way
around).  Further due to the duration I would have to surmise that it's a
leaf page scan of some type.  After discussions with the application
developers and being able to run RunStats against the old object for the
3 "bad" indices we believe there is a high incidence of index page
splitting; in fact when I checked I was surprised to see FAROFFPOS
numbers of around 90% for these indices.

So the moral to this story is ... check your NPI's if you are
re-partitioning, and ReOrg the NPI's before hand.

BTW if anybody has good information on how the utilities work  (the
futilities manual just doesn't cut the mustard anymore), particularly how
Key-RID pairs are updated in NPIs during a ReOrg, I'd be very interested
to get a copy..


Regards,       Nick
CIANCI _______________________________________________________________________

From: IDUG DB2-L [mailto:[login to unmask email] On Behalf Of Mackey, Glenn
Sent: Thursday, November 05, 2009 11:41 AM
To: [login to unmask email]
Subject: [DB2-L] Changing limitkey on a Partition


Hi,
On a DB2 V8.1 z/OS environment. I have a 700million row table with 24
partitions and the last partition is getting full.
I plan to add some more partitions, but also decrease the 24th partition
to a lower value from its current high value.
I have tested all this and I know all the steps, except if something goes
wrong.

I plan to ….
1. Alter last partition and lower the limitkey. (It will be higher than
the highest value)
2. Table will be in REORP restricted state for the last partition.
3. Run the REORG with SHRLEVEL(NONE) on the last
partition.  SHRLEVEL(NONE) is the only allowable option, cannot use
SHRLEVEL(CHANGE)

If, for whatever reason the reorg fails and I have to back out, it looks
like I have one option and that is to reload the whole 700M rows from the
whole tablepace, not just the last partition.

I simulated a failure and my options were. (I took a backup of the last
partition and unloaded the last partition before starting the process.)

1. I cannot reload the last partition. Tried that. Get DSNU212I error.
See below
2. When run a recover on the last partition, it runs successfully still
leaves the partition in REORP status (and check pending) - back where I
started.
3. It seems the only option is to reload the whole tablespace.  - 700M
rows. I wanted to avoid doing that.

So, if my reorg fails (cannot use SHRLEVEL CHANGE, I have to use SHRLEVEL
NONE),  it looks like I need to reload the whole table - that means I
have to unload it beforehand also.

Any ideas on better ways for a recovery plan.

I am confident that all will go well, but I am preparing if for whatever
reason things do not go well and I have to recover the situation. If I do
not unload the 700m rows first, then I cannot recover in the worst
scenario.

Thank you
Glenn Mackey
GuideOne Insurance, DBA
Mail stop AB1
Page: 515.241.1627
Phone: 515.267.5767
------------------------------------------------------------

DSNU212I csect-name - REORG PENDING ON TABLE SPACE database.index-name
PROHIBITS PROCESSING
Explanation:  An attempt was made to execute a utility against a table
space that is in a REORG PENDING state.
Severity:  8 (error)
System Action:  Utility processing is not initiated.
User Response:  Execute REORG with SHRLEVEL NONE against the partitions
in REORG PENDING state to rebalance the partitions and remove the REORG
PENDING state.

--
Robert Catterall
Catterall Consulting
www.catterallconsulting.com




(Embedded image moved to file: pic12035.jpg)IDUG - The Worldwide DB2 User
Community!


The IDUG DB2-L Listserv is only part of your membership in IDUG. If you are
not already an IDUG member, please register here.