DB2 - L

 View Only
  • 1.  [zOS] Reorg performance

    Posted Nov 03, 2022 09:07 AM

    Hi,

    We recently ran a reorg of a big table and due to past experience of rebalancing a select few partitions of the same table, and reorgs of other tables, we expected this reorg to run for about 10-12 hours. But we were wrong, it took about 1,5 hours. This is a luxury "problem" and to learn for the future we are trying to figure out why it went so quick, my theory right now is: "Abundance of available workspace allowed db2 to make maximal use of parallelism". It would be great if someone could cast some light on if the theory is correct, and maybe more importantly, how does Db2 know when it can grab a certain amount of space (disk/work) to increase performance/parallelism but not starve the environment for other processes that needs the same resources.

    Before running the reorg we looked at table statistics (which later turned out to be old) and saw that the table seemed to use about 2,2T, and so we estimated that we would need at least slightly more than double that in workspace for the reorg. We got our space team to temporary grant us 4,5T extra workspace and after the reorg they could see that we used about 3,8T, which is not so off our estimations since the table actually used 1,6T of disk space and 3,8/1,6 = 2,375.

    Knowledgecenter "Improving REORG TABLESPACE performance" of cause doesn't mention this at all.
    Same goes for "Parallel index building for REORG TABLESPACE"

    Some background might be needed, so
    The table was
    UTS PBR (now UTS PBR2)
    DSSIZE 16 G   (the reorg materialized DSSIZE 32G)
    NUMPARTS 240
    SEGSIZE 64
    PAGENUM ABSOLUTE (the reorg materialized PAGENUM RELATIVE)
    SpaceUsed: about 1,6T
    Two indexes, one partitioned and one not partitioned.

    The reorg itself was as below with jobcard PGM=DSNUTILB,REGION=0M
    REORG TABLESPACE DB.TS
    LOG NO
    COPYDDN (TAPEXXXX)
    PUNCHDDN (PUNCHXXX)
    DISCARDDN (DISCXXXX)
    UNLDDN (RECXXXXX)
    SHRLEVEL CHANGE
    DRAIN_WAIT 3
    RETRY 10
    MAXRO 3
    DRAIN ALL
    DELAY 180
    RETRY_DELAY 30
    DRAIN_ALLPARTS NO
    FASTSWITCH YES
    TIMEOUT TERM
    STATISTICS TABLE(ALL) SAMPLE 25 INDEX(ALL)
    KEYCARD
    FREQVAL NUMCOLS 1 COUNT 10
    HISTORY ALL
    REPORT NO
    UPDATE ALL
    SORTDEVT WORKXXXX

    Some output from the job:
    MAXIMUM UTILITY PARALLELISM IS 489 BASED ON NUMBER OF PARTITIONS AND INDEXES AND STATISTICS

    UTILITY PERFORMS DYNAMIC ALLOCATION OF SORT DISK SPACE
    NUMBER OF OPTIMAL SORT TASKS = 243, NUMBER OF ACTIVE SORT TASKS = 6
    DATA RECORDS WILL BE UNLOADED VIA TABLE SPACE SCAN FROM TABLESPACE

    PARTITIONS WILL BE UNLOADED/RELOADED IN PARALLEL, NUMBER OF TASKS = 3
    INDEXES WILL BE BUILT IN PARALLEL, NUMBER OF TASKS = 9
    NUMBER OF TASKS CONSTRAINED BY ZPARM PARAMDEG_UTIL TO 15
    NUMBER OF TASKS CONSTRAINED BY CPUS TO 153

    Output from another reorg without the extra 4,5T workspace
    NUMBER OF TASKS CONSTRAINED BY CPUS TO 23               
    NUMBER OF TASKS CONSTRAINED BY ZPARM PARAMDEG_UTIL TO 14  

    Some observations from the output is that the timestamps in each reorg phase indicate that all (240) table partitions are handled at the same time in parallel. Also, I don't know if it's a coincidence but 240*2+9 will make the 489 specified as "MAXIMUM UTILITY PARALLELISM". 240*2 could be explained by pairing subtasks for the partitioned index "Parallel index building reduces the elapsed time for a REORG TABLESPACE job by sorting the index keys and rebuilding multiple indexes in parallel, rather than sequentially. Optimally, a pair of subtasks processes each index; one subtask sorts extracted keys, whereas the other subtask builds the index." https://www.ibm.com/docs/en/db2-for-zos/12?topic=tablespace-parallel-index-building-reorg

    Regards
    Johan Sundborg



    ------------------------------
    Johan Sundborg, Swedbank AB
    ------------------------------