suggestions for cutting down I/O delay time

Sankar Sathyamoorthy

suggestions for cutting down I/O delay time

Hello,

I am working on a data correction for prod and testing in QA and I am running 265 update queries in a single batch job. Job is running for 23 hours, I think the delay is mostly coming from I/O delay. All the queries are updating ROLE_TYP_ID to 107 from 3. I am looking for some advice to cut down the I/O delay and improve job run time. 

Table is clustered with primary key seq IPRDC01 index.

(PRDC_ID,

      CHR_TYP_ID,

             CHR_VL_TYP_ID,

                  ROLE_TYP_ID)

 

Query                                                                                   cardinality for columns
UPDATE Db2.prdc_chr_all B                                                         374095320
SET B.ROLE_TYP_ID = 107                                                                       16
WHERE B.CHR_TYP_ID = 171242496                                                  1840
AND B.ROLE_TYP_ID = 3;                                                                          16

NACTIVEF for tablespace is 2262326

265 queries updating 28887153 rows.

Query is using Index scan with 2 matching cols. 

Access path:

--+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+-----
QNO PNO SQ M TABLE_NAME       A CS INDEX       IO UJOG UJOGC P CE TYPE MJC MS SU
--+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+-----
01        01 00 0     TPRO_PRDC_CH I   02 IPRDCR03 Y ----            ---- S UPDATE ------    1143 33598
01        01 00 0     TPRO_PRDC_CH R  00                   N ----            ---- S SELECT ------ 2829468 83219648

 

IPRDC03 has two columns: CHR_TYP_ID and ROLE_TYP_ID. 

Some stats from Detector

Buffer Manager Activity
GETPAGE -> 520772417 GETPFAIL -> 0
SYNCREAD -> 31910902 SPFETCH -> 0
LPFETCH -> 4002 DYNPFETCH -> 6506573
PFPAGES -> 192030737 PAGEUPDT -> 145251757
IMWRITE -> 50 REOPT -> 0

Rows Affected
FETCHED -> 0 INSERTED -> 0
UPDATED -> 28887153 DELETED -> 0

 

Wait Times And Counts
IO_WTIME -> 10:37:03.581                            IOWCNT -> 31910948
LOCK_WTIME -> 00:00.047339                     LOCKWCNT -> 26
ORIO_WTIME -> 02:33:09.890                      ORIOWCNT -> 2786318
OWIO_WTIME -> 01:26.746449                     OWIOWCNT -> 29132
SERV_WTIME -> 00:00.000000                     SERVWCNT -> 0
ARCH_WTIME -> 00:00.000000                    ARCHWCNT -> 0
LATCH_WTIME -> 00:00.000000                   LATCHWCNT -> 0
PLATCH_WTIME -> 00:00.000000                PLATCHWCNT -> 0
DRAIN_WTIME -> 00:00.000000                   DRAINWCNT -> 0
CLAIM_WTIME -> 00:00.000000                   CLAIMWCNT -> 0
ARCRD_WTIME -> 00:00.000000                 ARCRDWCNT -> 0
SMSG_WTIME -> 00:00.000000                   SMSGWCNT -> 0
GLOCK_WTIME -> 00:00.000000                 GLOCKWCNT -> 0
LOG_WTIME -> 00:00.212711                       LOGWCNT -> 3
LOB_WTIME -> 00:00.000000                        LOBWCNT -> 0
OCS_WTIME -> 00:00.000000                      OCSWCNT -> 0
SLS_WTIME -> 00:00.000000                       SLSWCNT -> 0
DSS_WTIME -> 00:01.502307                       DSSWCNT -> 1
OTS_WTIME -> 00:04.882355                       OTSWCNT -> 156
PLOCK_WTIME -> 00:00.000000                  PLOCKWCNT -> 0
LLOCK_WTIME -> 00:00.000000                  LLOCKWCNT -> 0
AT_WTIME -> 00:00.000000                         AT_WCNT -> 0

I tried to put all the values in single in query but it resulted in Tablespace scan which I think is due to the query touching most of the pages in the table. 

 

Walter Janißen

AW: suggestions for cutting down I/O delay time
(in response to Sankar Sathyamoorthy)
Hi

I am pretty sure that a tablespace scan is better, i.e. doing all the updates in one statement as you tried to do. When I understand your data right then there were 145 million page updates, but only about 29 million rows updated. So one page was updated about 5 times. With a tablespace scan you will avoid that. Also the access path picked up index 3, but not the clustering index, which also means that a page was read several times.

Kind regards
Walter Janißen [standard_IBM+Champ+7+Yr+Analytics]

ITERGO Informationstechnologie GmbH
Anwendungsentwicklung
Technische Anwendungsarchitektur
Victoriaplatz 2
D-40198 Düsseldorf
[login to unmask email]<mailto:[login to unmask email]>

ITERGO Informationstechnologie GmbH
Vorsitzender des Aufsichtsrats: Christian Diedrich
Geschäftsführung: Dr. Bettina Anders (Vorsitzende),
Lothar Engelke, Ina Kirchhof, Dr. Michael Regauer
Sitz: Düsseldorf, Handelsregister: Amtsgericht Düsseldorf HRB 37996

Von: Sankar Sathyamoorthy [mailto:[login to unmask email]
Gesendet: Dienstag, 12. September 2017 05:45
An: [login to unmask email]
Betreff: [DB2-L] - suggestions for cutting down I/O delay time


Hello,

I am working on a data correction for prod and testing in QA and I am running 265 update queries in a single batch job. Job is running for 23 hours, I think the delay is mostly coming from I/O delay. All the queries are updating ROLE_TYP_ID to 107 from 3. I am looking for some advice to cut down the I/O delay and improve job run time.

Table is clustered with primary key seq IPRDC01 index.

(PRDC_ID,

CHR_TYP_ID,

CHR_VL_TYP_ID,

ROLE_TYP_ID)



Query cardinality for columns
UPDATE Db2.prdc_chr_all B 374095320
SET B.ROLE_TYP_ID = 107 16
WHERE B.CHR_TYP_ID = 171242496 1840
AND B.ROLE_TYP_ID = 3; 16

NACTIVEF for tablespace is 2262326

265 queries updating 28887153 rows.

Query is using Index scan with 2 matching cols.

Access path:

--+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+-----
QNO PNO SQ M TABLE_NAME A CS INDEX IO UJOG UJOGC P CE TYPE MJC MS SU
--+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+-----
01 01 00 0 TPRO_PRDC_CH I 02 IPRDCR03 Y ---- ---- S UPDATE ------ 1143 33598
01 01 00 0 TPRO_PRDC_CH R 00 N ---- ---- S SELECT ------ 2829468 83219648



IPRDC03 has two columns: CHR_TYP_ID and ROLE_TYP_ID.

Some stats from Detector

Buffer Manager Activity
GETPAGE -> 520772417 GETPFAIL -> 0
SYNCREAD -> 31910902 SPFETCH -> 0
LPFETCH -> 4002 DYNPFETCH -> 6506573
PFPAGES -> 192030737 PAGEUPDT -> 145251757
IMWRITE -> 50 REOPT -> 0

Rows Affected
FETCHED -> 0 INSERTED -> 0
UPDATED -> 28887153 DELETED -> 0



Wait Times And Counts
IO_WTIME -> 10:37:03.581 IOWCNT -> 31910948
LOCK_WTIME -> 00:00.047339 LOCKWCNT -> 26
ORIO_WTIME -> 02:33:09.890 ORIOWCNT -> 2786318
OWIO_WTIME -> 01:26.746449 OWIOWCNT -> 29132
SERV_WTIME -> 00:00.000000 SERVWCNT -> 0
ARCH_WTIME -> 00:00.000000 ARCHWCNT -> 0
LATCH_WTIME -> 00:00.000000 LATCHWCNT -> 0
PLATCH_WTIME -> 00:00.000000 PLATCHWCNT -> 0
DRAIN_WTIME -> 00:00.000000 DRAINWCNT -> 0
CLAIM_WTIME -> 00:00.000000 CLAIMWCNT -> 0
ARCRD_WTIME -> 00:00.000000 ARCRDWCNT -> 0
SMSG_WTIME -> 00:00.000000 SMSGWCNT -> 0
GLOCK_WTIME -> 00:00.000000 GLOCKWCNT -> 0
LOG_WTIME -> 00:00.212711 LOGWCNT -> 3
LOB_WTIME -> 00:00.000000 LOBWCNT -> 0
OCS_WTIME -> 00:00.000000 OCSWCNT -> 0
SLS_WTIME -> 00:00.000000 SLSWCNT -> 0
DSS_WTIME -> 00:01.502307 DSSWCNT -> 1
OTS_WTIME -> 00:04.882355 OTSWCNT -> 156
PLOCK_WTIME -> 00:00.000000 PLOCKWCNT -> 0
LLOCK_WTIME -> 00:00.000000 LLOCKWCNT -> 0
AT_WTIME -> 00:00.000000 AT_WCNT -> 0

I tried to put all the values in single in query but it resulted in Tablespace scan which I think is due to the query touching most of the pages in the table.



-----End Original Message-----
Attachments

  • image001.png (2.6k)

Alan Gredell

RE: suggestions for cutting down I/O delay time
(in response to Sankar Sathyamoorthy)

The index doesn't support your WHERE clause. Though you have those columns in your existing index, they are not the first columns in that index hence the use of the index requires scanning every single page to determine whether the row pointed to by the index qualifies.  Build an index (even if you only keep it for this process) on the columns in your where clause and run RUNSTATS on the tablespace.  Rebind (if static) or rerun (if dynamic) and you should see a big improvement.  From .5 billion GETPAGES, you should see a drop to a few hundred thousand or perhaps 1 million (as the columns are not very high in cardinality).

Walter Jani&#223;en

AW: suggestions for cutting down I/O delay time
(in response to Alan Gredell)
Hi Alan

According to the explain-output the UPDATE has 1 matching column, because index IPRDCR03 is used, which, I guess, is not well clustered. Therefore I suggested coding all the updates as one update, which leads to a tablespace scan, which in this case is better than any index access.

Kind regards
Walter Janißen [standard_IBM+Champ+7+Yr+Analytics]

ITERGO Informationstechnologie GmbH
Anwendungsentwicklung
Technische Anwendungsarchitektur
Victoriaplatz 2
D-40198 Düsseldorf
[login to unmask email]<mailto:[login to unmask email]>

ITERGO Informationstechnologie GmbH
Vorsitzender des Aufsichtsrats: Christian Diedrich
Geschäftsführung: Dr. Bettina Anders (Vorsitzende),
Lothar Engelke, Ina Kirchhof, Dr. Michael Regauer
Sitz: Düsseldorf, Handelsregister: Amtsgericht Düsseldorf HRB 37996

Von: Alan Gredell [mailto:[login to unmask email]
Gesendet: Dienstag, 12. September 2017 15:21
An: [login to unmask email]
Betreff: [DB2-L] - RE: suggestions for cutting down I/O delay time


The index doesn't support your WHERE clause. Though you have those columns in your existing index, they are not the first columns in that index hence the use of the index requires scanning every single page to determine whether the row pointed to by the index qualifies. Build an index (even if you only keep it for this process) on the columns in your where clause and run RUNSTATS on the tablespace. Rebind (if static) or rerun (if dynamic) and you should see a big improvement. From .5 billion GETPAGES, you should see a drop to a few hundred thousand or perhaps 1 million (as the columns are not very high in cardinality).

-----End Original Message-----
Attachments

  • image001.png (2.6k)

Sankar Sathyamoorthy

RE: suggestions for cutting down I/O delay time
(in response to Alan Gredell)

Thanks Walter, we cancelled it as soon as we saw Table space scan. I will try again and capture the timing to see if there is any improvement with that. 

Alan, 

Currently It uses index (IPRDCR03) with two matching cols 

CREATE INDEX "DB2 ".IPRDCR03
ON "DB2 ".TPRO_PRDC_CHR
(ROLE_TYP_ID ASC
,CHR_TYP_ID ASC
)
NOT CLUSTER

Sankar Sathyamoorthy

RE: suggestions for cutting down I/O delay time
(in response to Sankar Sathyamoorthy)

Thank you Walter.. It worked like a charm! Run time is down to 2 hours from 23 hours. 

Thanks much again!

Phil Grainger

suggestions for cutting down I/O delay time
(in response to Alan Gredell)
Interesting comment in the OPs opening question


“I tried to put all the values in single in query but it resulted in Tablespace scan which I think is due to the query touching most of the pages in the table. ”

That is Db2 telling you that a tabespace scan IS the best access path for your query!

Given the advantages of prefetch and potentially touching each page only ONCE, tablespace scans become a better choice than index access far sooner than people imagine

It’s far too blunt an instrument to decide that a “tablespace scan is bad, I need to use an index” – you need to take into account what you’re doing with the data

If you can find a way to process this “out of hours”, then think about locking the table before you start – remove the overhead of all the page locks

If all else fails, unload the data, correct it and then put it back – it’s hard to see how that would take 24 hours

________________________________

Phil Grainger

Enablement Manager

[login to unmask email]

Direct



+44 (0)118 921 8000

Mobile



+44(0)7808 643 479


E2, Eskdale Road
Winnersh
Berkshire
RG41 5TS


[http://media.cms.bmc.com/images/corp_signature_bmclogo_2014.jpg] http://www.bmc.com

[cid:[login to unmask email]






From: Alan Gredell [mailto:[login to unmask email]
Sent: 12 September 2017 14:21
To: [login to unmask email]
Subject: [DB2-L] - RE: suggestions for cutting down I/O delay time


The index doesn't support your WHERE clause. Though you have those columns in your existing index, they are not the first columns in that index hence the use of the index requires scanning every single page to determine whether the row pointed to by the index qualifies. Build an index (even if you only keep it for this process) on the columns in your where clause and run RUNSTATS on the tablespace. Rebind (if static) or rerun (if dynamic) and you should see a big improvement. From .5 billion GETPAGES, you should see a drop to a few hundred thousand or perhaps 1 million (as the columns are not very high in cardinality).

-----End Original Message-----
BMC Software Limited Registered Office: Building E2, Eskdale Road, Winnersh, Wokingham, Berkshire, United Kingdom, RG41 5TS Registered in England No. 1927903 The content of this email is confidential. If you are not the addressee, you may not distribute, copy or disclose any part of it. If you receive this message in error, please delete this from your system and notify the sender immediately.
Attachments

  • image001.jpg (8k)
  • image002.png (5.9k)

Sankar Sathyamoorthy

RE: suggestions for cutting down I/O delay time
(in response to Phil Grainger)

I agree with your comments Phil, it was a good learning for us. We underestimated the power of Tablespace scan and sequential prefetch but Walter was right in pointing that out. I was amazed by the tremendous improvement in runtime. 

Thanks,

Sankar

Phil Grainger

suggestions for cutting down I/O delay time
(in response to Sankar Sathyamoorthy)
That’s good to know - would you care to share the numbers?

It is a common mistake (I used to be guilty too) to ALWAYS regard a tablespace scan as something to be “fixed”

In reality, the answer to the question “Is a tablespace scan a bad thing?” is “It depends” ☺

________________________________

Phil Grainger

Enablement Manager

[login to unmask email]

Direct



+44 (0)118 921 8000

Mobile



+44(0)7808 643 479


E2, Eskdale Road
Winnersh
Berkshire
RG41 5TS


[http://media.cms.bmc.com/images/corp_signature_bmclogo_2014.jpg] http://www.bmc.com

[cid:[login to unmask email]






From: Sankar Sathyamoorthy [mailto:[login to unmask email]
Sent: 14 September 2017 14:47
To: [login to unmask email]
Subject: [DB2-L] - RE: suggestions for cutting down I/O delay time


I agree with your comments Phil, it was a good learning for us. We underestimated the power of Tablespace scan and sequential prefetch but Walter was right in pointing that out. I was amazed by the tremendous improvement in runtime.

Thanks,

Sankar

-----End Original Message-----
BMC Software Limited Registered Office: Building E2, Eskdale Road, Winnersh, Wokingham, Berkshire, United Kingdom, RG41 5TS Registered in England No. 1927903 The content of this email is confidential. If you are not the addressee, you may not distribute, copy or disclose any part of it. If you receive this message in error, please delete this from your system and notify the sender immediately.
Attachments

  • image001.jpg (8k)
  • image002.png (5.9k)