Hi,
We recently ran a reorg of a big table and due to past experience of rebalancing a select few partitions of the same table, and reorgs of other tables, we expected this reorg to run for about 10-12 hours. But we were wrong, it took about 1,5 hours. This is a luxury "problem" and to learn for the future we are trying to figure out why it went so quick, my theory right now is: "Abundance of available workspace allowed db2 to make maximal use of parallelism". It would be great if someone could cast some light on if the theory is correct, and maybe more importantly, how does Db2 know when it can grab a certain amount of space (disk/work) to increase performance/parallelism but not starve the environment for other processes that needs the same resources.
Before running the reorg we looked at table statistics (which later turned out to be old) and saw that the table seemed to use about 2,2T, and so we estimated that we would need at least slightly more than double that in workspace for the reorg. We got our space team to temporary grant us 4,5T extra workspace and after the reorg they could see that we used about 3,8T, which is not so off our estimations since the table actually used 1,6T of disk space and 3,8/1,6 = 2,375.
Knowledgecenter "Improving REORG TABLESPACE performance" of cause doesn't mention this at all.
Same goes for "Parallel index building for REORG TABLESPACE"
Some background might be needed, so
The table was
UTS PBR (now UTS PBR2)
DSSIZE 16 G (the reorg materialized DSSIZE 32G)
NUMPARTS 240
SEGSIZE 64
PAGENUM ABSOLUTE (the reorg materialized PAGENUM RELATIVE)
SpaceUsed: about 1,6T
Two indexes, one partitioned and one not partitioned.
The reorg itself was as below with jobcard PGM=DSNUTILB,REGION=0M
REORG TABLESPACE DB.TS
LOG NO
COPYDDN (TAPEXXXX)
PUNCHDDN (PUNCHXXX)
DISCARDDN (DISCXXXX)
UNLDDN (RECXXXXX)
SHRLEVEL CHANGE
DRAIN_WAIT 3
RETRY 10
MAXRO 3
DRAIN ALL
DELAY 180
RETRY_DELAY 30
DRAIN_ALLPARTS NO
FASTSWITCH YES
TIMEOUT TERM
STATISTICS TABLE(ALL) SAMPLE 25 INDEX(ALL)
KEYCARD
FREQVAL NUMCOLS 1 COUNT 10
HISTORY ALL
REPORT NO
UPDATE ALL
SORTDEVT WORKXXXX
Some output from the job:
MAXIMUM UTILITY PARALLELISM IS 489 BASED ON NUMBER OF PARTITIONS AND INDEXES AND STATISTICS
UTILITY PERFORMS DYNAMIC ALLOCATION OF SORT DISK SPACE
NUMBER OF OPTIMAL SORT TASKS = 243, NUMBER OF ACTIVE SORT TASKS = 6
DATA RECORDS WILL BE UNLOADED VIA TABLE SPACE SCAN FROM TABLESPACE
PARTITIONS WILL BE UNLOADED/RELOADED IN PARALLEL, NUMBER OF TASKS = 3
INDEXES WILL BE BUILT IN PARALLEL, NUMBER OF TASKS = 9
NUMBER OF TASKS CONSTRAINED BY ZPARM PARAMDEG_UTIL TO 15
NUMBER OF TASKS CONSTRAINED BY CPUS TO 153
Output from another reorg without the extra 4,5T workspace
NUMBER OF TASKS CONSTRAINED BY CPUS TO 23
NUMBER OF TASKS CONSTRAINED BY ZPARM PARAMDEG_UTIL TO 14
Some observations from the output is that the timestamps in each reorg phase indicate that all (240) table partitions are handled at the same time in parallel. Also, I don't know if it's a coincidence but 240*2+9 will make the 489 specified as "MAXIMUM UTILITY PARALLELISM". 240*2 could be explained by pairing subtasks for the partitioned index "Parallel index building reduces the elapsed time for a REORG TABLESPACE job by sorting the index keys and rebuilding multiple indexes in parallel, rather than sequentially. Optimally, a pair of subtasks processes each index; one subtask sorts extracted keys, whereas the other subtask builds the index." https://www.ibm.com/docs/en/db2-for-zos/12?topic=tablespace-parallel-index-building-reorg
Regards
Johan Sundborg
------------------------------
Johan Sundborg, Swedbank AB
------------------------------