Inefficient access path

Athithan Rajadurai

Inefficient access path

Query
SELECT
COLUMN_5,
COLUMN_6
FROM DB2TEST.TEST_TABLE 
WHERE COLUMN_1 IN ( ? , ? , ? , ? , ? )
AND COLUMN_2 IN ( 'A' , 'B' , 'C' , 'D' , 'E')
AND COLUMN_3 = ?
AND COLUMN_4 = ?

ORDER BY COLUMN_5, COLUMN_6

 

Access path

PLANNO METHOD CREATOR TNAME TABNO ACCESSTYPE MATCHCOLS ACCESSCREATOR ACCESSNAME INDEXONLY SORTN_UNIQ SORTN_JOIN SORTN_ORDERBY SORTN_GROUPBY SORTC_UNIQ SORTC_JOIN SORTC_ORDERBY SORTC_GROUPBY

1 0 DB2TEST TEST_TABLE 1 N 1 DB2TEST INDEX01 Y N N N N N N N N

2 3                                          0    0                                   N N N N N N N Y N

INDEX Definition

CREATE UNIQUE INDEX DB2TEST.INDEX01
ON DB2TEST.TEST_TABLE
(COLUMN_1 ASC
,COLUMN_2 ASC
,COLUMN_3 ASC
,COLUMN_4 ASC
,COLUMN_5 ASC
,COLUMN_6 ASC
,COLUMN_7 ASC
,COLUMN_8 ASC
)

  1. Why MATCHCOLS is just 1 instead of 4.
  2. Why Db2 requires a seperate ORDER BY 'SORTC_ORDERBY = Y' to process the output when the columns are already present in INDEX .

Could someone please explain the issue and how to make it efficient. Thanks in advance.




Edited By:
Athithan Rajadurai[Organization Members] @ Jul 12, 2018 - 05:28 PM (Asia/Calcutta)
Athithan Rajadurai[Organization Members] @ Jul 12, 2018 - 05:32 PM (Asia/Calcutta)
Athithan Rajadurai[Organization Members] @ Jul 12, 2018 - 05:36 PM (Asia/Calcutta)

Bill Gallagher

Inefficient access path
(in response to Athithan Rajadurai)
1. MATCHCOLS is only 1 because of 4 because your predicate on COLUMN_1 (the first column in your index) is an “IN” rather than an “=”. DB2 cannot doing column matching beyond the first IN predicate applied to the leading columns in your index.
2. The SORT is needed because you are not sorting by leading columns, or by columns that were preceded by matching columns in your index. Your ORDER BY is on columns 5 and 6, while you are only matching on column 1.

I believe two options that could help this particular query run more efficiently would be:


1. An index defined on (C3, C4, C5, C6, C1, C2). This would result in an index only query, MATCHCOLS = 2, and would definitely eliminate the sort.
2. An index defined on (C3, C4, C1, C5, C6, C2). This would also result in an index only query, MATCHCOLS =3, and may (though I’m not 100% positive) eliminate the sort.

Option 2 might alternately be defined as (C3, C4, C2, C5, C6, C1) if C2 has a higher cardinality/better selectivity than C1.

And either option could be defined as (C4, C3, . . .) if C4 has a higher cardinality/better selectivity than C3.

Which of those two options would be best would likely depend on the specific cardinalities and selectivity of the actual data in the columns in question.

Bill Gallagher | Senior Systems Engineer, DBA | Data Administration

From: Athithan Rajadurai [mailto:[login to unmask email]
Sent: Thursday, July 12, 2018 7:55 AM
To: [login to unmask email]
Subject: [DB2-L] - Inefficient access path


Query
SELECT
COLUMN_5,
COLUMN_6
FROM DB2TST5.CST_PAST_PRES_FUTR
WHERE COLUMN_1 IN ( ? , ? , ? , ? , ? )
AND COLUMN_2 IN ( 'A' , 'B' , 'C' , 'D' , 'E')
AND COLUMN_3 = ?
AND COLUMN_4 = ?

ORDER BY COLUMN_5, COLUMN_6



Access path

PLANNOMETHODCREATORTNAMETABNOACCESSTYPEMATCHCOLSACCESSCREATORACCESSNAMEINDEXONLYSORTN_UNIQSORTN_JOINSORTN_ORDERBYSORTN_GROUPBYSORTC_UNIQSORTC_JOINSORTC_ORDERBYSORTC_GROUPBY10DB2TESTTEST_TABLE1N 1DB2TESTINDEX01YNNNNNNNN230 0NNNNNNNYN

INDEX Definition

TABLE NAME INDEX NAME INDEXED COLUMN COLSEQ ORD CLS UNQ TEST_TABLE
INDEX01
COLUMN_1 1 A Y P COLUMN_2 2 A Y P COLUMN_3 3 A Y P COLUMN_4 4 A Y P COLUMN_5 5 A Y P COLUMN_6 6 A Y P

1. Why MATCHCOLS is just 1.
2. Why Db2 requires a seperate ORDER BY 'SORTC_ORDERBY = Y' to process the output when the columns are already present in INDEX .

My thought is, as all the columns are present in INDEX. It should be INDEXONLY=Y

Could someone please explain the issue and how to make it efficient. Thanks in advance.



-----End Original Message-----
________________________________
This message (including any attachments) may contain confidential, proprietary, privileged and/or private information. The information is intended to be for the use of the individual or entity designated above. If you are not the intended recipient of this message, please notify the sender immediately, and delete the message and any attachments. Any disclosure, reproduction, distribution or other use of this message or any attachments by an individual or entity other than the intended recipient is prohibited.

TRVDiscDefault::1201

Ruediger Kurtz

AW: Inefficient access path
(in response to Athithan Rajadurai)
Athithan,

I find it extremely hard to decipher what you’ve sent us, but one thing’s pretty clear to me:

The sort’s required because column_5 and column_6 are not the leading columns in your index.
The sort might become superfluous if you have MC=1 and want to have the next two columns in the index (mark the “might”).

But there are 3 more columns in between column_1 and the ones you want your result set ordered by.

COL_1: A B C D E
COL_2: A A A A E
COL_3: H H H H H
COL_4: L L L L L
COL_5: 9 8 1 2 3
COL_6: 1 2 3 4 5

Suppose all rows qualify (I can’t tell for sure since I don’t know your search criteria), you’d end up with.

THE MC=0 is basically because of your IN-LIST, which is an OR-condition.

9 8
8 2
1 3
2 4
3 5

And finally: In the good old days the golden rule was: You want your result set in a specific order – use ORDER BY, and I suppose nothing’s changed since.

Hth

Ruediger





Rüdiger Kurtz
Abteilung Informatik – Betrieb

HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon: 09561 96-44148
Telefax: 09561 96-44104
E-Mail: [login to unmask email]
Internet: www.huk.de
________________________________
HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg
Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021
Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg
Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.
Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Jörg Rheinländer (stv.), Sarah Rössler, Daniel Thomas.
________________________________
Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen.
Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben,
informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht.
Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet.

This information may contain confidential and/or privileged information.
If you are not the intended recipient (or have received this information in error) please notify the
sender immediately and destroy this information.
Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden.
________________________________
Von: Athithan Rajadurai [mailto:[login to unmask email]
Gesendet: Donnerstag, 12. Juli 2018 13:55
An: [login to unmask email]
Betreff: [DB2-L] - Inefficient access path


Query
SELECT
COLUMN_5,
COLUMN_6
FROM DB2TST5.CST_PAST_PRES_FUTR
WHERE COLUMN_1 IN ( ? , ? , ? , ? , ? )
AND COLUMN_2 IN ( 'A' , 'B' , 'C' , 'D' , 'E')
AND COLUMN_3 = ?
AND COLUMN_4 = ?

ORDER BY COLUMN_5, COLUMN_6



Access path

PLANNOMETHODCREATORTNAMETABNOACCESSTYPEMATCHCOLSACCESSCREATORACCESSNAMEINDEXONLYSORTN_UNIQSORTN_JOINSORTN_ORDERBYSORTN_GROUPBYSORTC_UNIQSORTC_JOINSORTC_ORDERBYSORTC_GROUPBY10DB2TESTTEST_TABLE1N 1DB2TESTINDEX01YNNNNNNNN230 0NNNNNNNYN

INDEX Definition

TABLE NAME INDEX NAME INDEXED COLUMN COLSEQ ORD CLS UNQ TEST_TABLE
INDEX01
COLUMN_1 1 A Y P COLUMN_2 2 A Y P COLUMN_3 3 A Y P COLUMN_4 4 A Y P COLUMN_5 5 A Y P COLUMN_6 6 A Y P

1. Why MATCHCOLS is just 1.
2. Why Db2 requires a seperate ORDER BY 'SORTC_ORDERBY = Y' to process the output when the columns are already present in INDEX .

My thought is, as all the columns are present in INDEX. It should be INDEXONLY=Y

Could someone please explain the issue and how to make it efficient. Thanks in advance.



-----End Original Message-----

Ruediger Kurtz

WG: Inefficient access path
(in response to Ruediger Kurtz)
Ups, that got scrambled up a bit ….

THE MC=0 is basically because of your IN-LIST, which is an OR-condition.

The sort’s required because column_5 and column_6 are not the leading columns in your index.
The sort might become superfluous if you have MC=1 and want to have the next two columns in the index (mark the “might”).

But there are 3 more columns in between column_1 and the ones you want your result set ordered by.

COL_1: A B C D E
COL_2: A A A A E
COL_3: H H H H H
COL_4: L L L L L
COL_5: 9 8 1 2 3
COL_6: 1 2 3 4 5

Suppose all rows qualify (I can’t tell for sure since I don’t know your search criteria), you’d end up with.

9 8
8 2
1 3
2 4
3 5

And finally: In the good old days the golden rule was: You want your result set in a specific order – use ORDER BY, and I suppose nothing’s changed since.

Hth

Ruediger





Rüdiger Kurtz
Abteilung Informatik – Betrieb

HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon:

09561 96-44148

Telefax:

09561 96-44104

E-Mail:

[login to unmask email]<mailto:[login to unmask email]>

Internet:

www.huk.de http://www.huk.de

________________________________
HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg
Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021
Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg
Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.
Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Jörg Rheinländer (stv.), Sarah Rössler, Daniel Thomas.
________________________________
Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen.
Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben,
informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht.
Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet.

This information may contain confidential and/or privileged information.
If you are not the intended recipient (or have received this information in error) please notify the
sender immediately and destroy this information.
Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden.
________________________________

Rüdiger Kurtz
Abteilung Informatik – Betrieb

HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon: 09561 96-44148
Telefax: 09561 96-44104
E-Mail: [login to unmask email]
Internet: www.huk.de
________________________________
HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg
Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021
Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg
Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.
Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Jörg Rheinländer (stv.), Sarah Rössler, Daniel Thomas.
________________________________
Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen.
Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben,
informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht.
Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet.

This information may contain confidential and/or privileged information.
If you are not the intended recipient (or have received this information in error) please notify the
sender immediately and destroy this information.
Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden.
________________________________
Von: Athithan Rajadurai [mailto:[login to unmask email]
Gesendet: Donnerstag, 12. Juli 2018 13:55
An: [login to unmask email]<mailto:[login to unmask email]>
Betreff: [DB2-L] - Inefficient access path


Query
SELECT
COLUMN_5,
COLUMN_6
FROM DB2TST5.CST_PAST_PRES_FUTR
WHERE COLUMN_1 IN ( ? , ? , ? , ? , ? )
AND COLUMN_2 IN ( 'A' , 'B' , 'C' , 'D' , 'E')
AND COLUMN_3 = ?
AND COLUMN_4 = ?

ORDER BY COLUMN_5, COLUMN_6



Access path

PLANNOMETHODCREATORTNAMETABNOACCESSTYPEMATCHCOLSACCESSCREATORACCESSNAMEINDEXONLYSORTN_UNIQSORTN_JOINSORTN_ORDERBYSORTN_GROUPBYSORTC_UNIQSORTC_JOINSORTC_ORDERBYSORTC_GROUPBY10DB2TESTTEST_TABLE1N 1DB2TESTINDEX01YNNNNNNNN230 0NNNNNNNYN

INDEX Definition

TABLE NAME INDEX NAME INDEXED COLUMN COLSEQ ORD CLS UNQ TEST_TABLE
INDEX01
COLUMN_1 1 A Y P COLUMN_2 2 A Y P COLUMN_3 3 A Y P COLUMN_4 4 A Y P COLUMN_5 5 A Y P COLUMN_6 6 A Y P

1. Why MATCHCOLS is just 1.
2. Why Db2 requires a seperate ORDER BY 'SORTC_ORDERBY = Y' to process the output when the columns are already present in INDEX .

My thought is, as all the columns are present in INDEX. It should be INDEXONLY=Y

Could someone please explain the issue and how to make it efficient. Thanks in advance.



-----End Original Message-----

-----End Original Message-----

Walter Jani&#223;en

AW: Inefficient access path
(in response to Bill Gallagher)
Just to clarify: You can get matching for two IN-predicate due to the access path IN. So the two IN-predicates are joined and the result is then joined to the main-table.

Kind regards
Walter Janißen [standard_IBM+Champ+7+Yr+Analytics]

ITERGO Informationstechnologie GmbH
Anwendungsentwicklung
Technische Anwendungsarchitektur
Victoriaplatz 2
D-40198 Düsseldorf
[login to unmask email]<mailto:[login to unmask email]>

ITERGO Informationstechnologie GmbH
Vorsitzender des Aufsichtsrats: Christian Diedrich
Geschäftsführung: Dr. Bettina Anders (Vorsitzende),
Lothar Engelke, Ina Kirchhof, Dr. Michael Regauer
Sitz: Düsseldorf, Handelsregister: Amtsgericht Düsseldorf HRB 37996

Von: Gallagher,Bill R <[login to unmask email]>
Gesendet: Donnerstag, 12. Juli 2018 14:25
An: [login to unmask email]
Betreff: [DB2-L] - RE: Inefficient access path

1) MATCHCOLS is only 1 because of 4 because your predicate on COLUMN_1 (the first column in your index) is an “IN” rather than an “=”. DB2 cannot doing column matching beyond the first IN predicate applied to the leading columns in your index.
2) The SORT is needed because you are not sorting by leading columns, or by columns that were preceded by matching columns in your index. Your ORDER BY is on columns 5 and 6, while you are only matching on column 1.

I believe two options that could help this particular query run more efficiently would be:

1) An index defined on (C3, C4, C5, C6, C1, C2). This would result in an index only query, MATCHCOLS = 2, and would definitely eliminate the sort.
2) An index defined on (C3, C4, C1, C5, C6, C2). This would also result in an index only query, MATCHCOLS =3, and may (though I’m not 100% positive) eliminate the sort.

Option 2 might alternately be defined as (C3, C4, C2, C5, C6, C1) if C2 has a higher cardinality/better selectivity than C1.

And either option could be defined as (C4, C3, . . .) if C4 has a higher cardinality/better selectivity than C3.

Which of those two options would be best would likely depend on the specific cardinalities and selectivity of the actual data in the columns in question.

Bill Gallagher | Senior Systems Engineer, DBA | Data Administration

From: Athithan Rajadurai [mailto:[login to unmask email]
Sent: Thursday, July 12, 2018 7:55 AM
To: [login to unmask email]<mailto:[login to unmask email]>
Subject: [DB2-L] - Inefficient access path


Query
SELECT
COLUMN_5,
COLUMN_6
FROM DB2TST5.CST_PAST_PRES_FUTR
WHERE COLUMN_1 IN ( ? , ? , ? , ? , ? )
AND COLUMN_2 IN ( 'A' , 'B' , 'C' , 'D' , 'E')
AND COLUMN_3 = ?
AND COLUMN_4 = ?

ORDER BY COLUMN_5, COLUMN_6



Access path

PLANNOMETHODCREATORTNAMETABNOACCESSTYPEMATCHCOLSACCESSCREATORACCESSNAMEINDEXONLYSORTN_UNIQSORTN_JOINSORTN_ORDERBYSORTN_GROUPBYSORTC_UNIQSORTC_JOINSORTC_ORDERBYSORTC_GROUPBY10DB2TESTTEST_TABLE1N 1DB2TESTINDEX01YNNNNNNNN230 0NNNNNNNYN

INDEX Definition

TABLE NAME INDEX NAME INDEXED COLUMN COLSEQ ORD CLS UNQ TEST_TABLE
INDEX01
COLUMN_1 1 A Y P COLUMN_2 2 A Y P COLUMN_3 3 A Y P COLUMN_4 4 A Y P COLUMN_5 5 A Y P COLUMN_6 6 A Y P
1. Why MATCHCOLS is just 1.
2. Why Db2 requires a seperate ORDER BY 'SORTC_ORDERBY = Y' to process the output when the columns are already present in INDEX .

My thought is, as all the columns are present in INDEX. It should be INDEXONLY=Y

Could someone please explain the issue and how to make it efficient. Thanks in advance.



-----End Original Message-----
________________________________
This message (including any attachments) may contain confidential, proprietary, privileged and/or private information. The information is intended to be for the use of the individual or entity designated above. If you are not the intended recipient of this message, please notify the sender immediately, and delete the message and any attachments. Any disclosure, reproduction, distribution or other use of this message or any attachments by an individual or entity other than the intended recipient is prohibited.

TRVDiscDefault::1201
-----End Original Message-----
Attachments

  • image001.png (2.6k)

Athithan Rajadurai

RE: Inefficient access path
(in response to Bill Gallagher)

Thanks a lot Bill for the detailed explanation.

Below is the original query that had slow response time. 

Original query

-----------------

SELECT
COLUMN_5,
COLUMN_6
FROM TEMP_TABLE
WHERE COLUMN_3 = ?
AND COLUMN_4 = ?
AND
(
(
COLUMN_2 = 'A'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'B'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'C'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'D'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'E'
AND
COLUMN_1 = ?
)
)
ORDER BY COLUMN_5, COLUMN_6

Original access path

PLANNO   METHOD   CREATOR   TNAME   TABNO   ACCESSTYPE   MATCHCOLS   ACCESSCREATOR     ACCESSNAME     INDEXONLY     SORTN_UNIQ     SORTN_JOIN     SORTN_ORDERBY     SORTN_GROUPBY     SORTC_UNIQ     SORTC_JOIN     SORTC_ORDERBY     SORTC_GROUPBY     

1     0     DB2TEST     TEST_TABLE     1     I     2     DB2TEST     INDEX02     N     N     N     N     N     N     N     N     N

INDEX Definition

CREATE UNIQUE INDEX DB2PROD.INDEX02
ON DB2PROD.TEMP_TABLE
(COLUMN_4 ASC           /*COLCARD-4,703
,COLUMN_3 ASC           /*COLCARD-2
,COLUMN_5 ASC           /*COLCARD-5
,COLUMN_6 ASC           /*COLCARD-2,674
,COLUMN_1 ASC           /*COLCARD-548,207
,COLUMN_2 ASC           /*COLCARD-5
,COLUMN_7 ASC           /*COLCARD-4
,COLUMN_8 ASC           /*COLCARD-348
)

Query cost

QBLOCKNO     PLANNO    COMPCOST
1                       1                 4.7072096

I modified the query to pick another INDEX with FIRST KEY CARDINALITY having higher number.

Modified query

SELECT
COLUMN_5,
COLUMN_6
FROM DB2TST5.CST_PAST_PRES_FUTR
WHERE COLUMN_1 IN ( ? , ? , ? , ? , ? )
AND COLUMN_2 IN ( 'A' , 'B' , 'C' , 'D' , 'E')
AND COLUMN_3 = ?
AND COLUMN_4 = ?

ORDER BY COLUMN_5, COLUMN_6

 

Modified Access path
PLANNO     METHOD     CREATOR     TNAME     TABNO     ACCESSTYPE     MATCHCOLS     ACCESSCREATOR     ACCESSNAME     INDEXONLY     SORTN_UNIQ     SORTN_JOIN     SORTN_ORDERBY     SORTN_GROUPBY     SORTC_UNIQ     SORTC_JOIN     SORTC_ORDERBY     SORTC_GROUPBY     

1     0     DB2TEST     TEST_TABLE     1     N     1     DB2TEST     INDEX01     Y     N     N     N     N     N     N     N    N 

2     3                                                      0            0                                              N     N     N     N     N     N     N     Y     N

INDEX Definition

CREATE UNIQUE INDEX DB2TEST.INDEX01
ON DB2TEST.TEST_TABLE
(COLUMN_1 ASC           /*COLCARD-548,207
,COLUMN_2 ASC           /*COLCARD-5
,COLUMN_3 ASC           /*COLCARD-2
,COLUMN_4 ASC           /*COLCARD-4,703
,COLUMN_5 ASC           /*COLCARD-5
,COLUMN_6 ASC           /*COLCARD-2,674
,COLUMN_7 ASC           /*COLCARD-4
,COLUMN_8 ASC           /*COLCARD-348
)

Query cost

QBLOCKNO     PLANNO    COMPCOST
1                       1                 1.0670595

1                       2                 2.493307

 

Is there any efficient SQL rewrite to better handle this situation. 

 

Edited By:
Athithan Rajadurai[Organization Members] @ Jul 12, 2018 - 06:43 PM (Asia/Calcutta)

Walter Jani&#223;en

AW: Inefficient access path
(in response to Athithan Rajadurai)
Hi

I don’t understand what you are trying to improve. One matching columns for the modified query is sufficient or are there many more rows in your table than 548,207? It makes no sense to get matching on COLUMN2, because all the 5 values for this column are specified. And you can’t avoid the sort, if you don’t ORDER BY COLUMN1, COLUMN2 and then COLUMN5 and 6. You could change your index definition to increase the number of matching, if COLUMN3 and 4 are the next columns after COLUMN1.

Kind regards
Walter Janißen [standard_IBM+Champ+7+Yr+Analytics]

ITERGO Informationstechnologie GmbH
Anwendungsentwicklung
Technische Anwendungsarchitektur
Victoriaplatz 2
D-40198 Düsseldorf
[login to unmask email]<mailto:[login to unmask email]>

ITERGO Informationstechnologie GmbH
Vorsitzender des Aufsichtsrats: Christian Diedrich
Geschäftsführung: Dr. Bettina Anders (Vorsitzende),
Lothar Engelke, Ina Kirchhof, Dr. Michael Regauer
Sitz: Düsseldorf, Handelsregister: Amtsgericht Düsseldorf HRB 37996

Von: Athithan Rajadurai <[login to unmask email]>
Gesendet: Donnerstag, 12. Juli 2018 15:13
An: [login to unmask email]
Betreff: [DB2-L] - RE: Inefficient access path


Thanks a lot Bill for the detailed explanation.

Below is the original query that had slow response time.

Original query

-----------------

SELECT
COLUMN_5,
COLUMN_6
FROM TEMP_TABLE
WHERE COLUMN_3 = ?
AND COLUMN_4 = ?
AND
(
(
COLUMN_2 = 'A'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'B'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'C'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'D'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'E'
AND
COLUMN_1 = ?
)
)
ORDER BY COLUMN_5, COLUMN_6

Original access path

PLANNO METHOD CREATOR TNAME TABNO ACCESSTYPE MATCHCOLS ACCESSCREATOR ACCESSNAME INDEXONLY SORTN_UNIQ SORTN_JOIN SORTN_ORDERBY SORTN_GROUPBY SORTC_UNIQ SORTC_JOIN SORTC_ORDERBY SORTC_GROUPBY

1 0 DB2TEST TEST_TABLE 1 I 2 DB2TEST INDEX02 N N N N N N N N N

INDEX Definition

CREATE UNIQUE INDEX DB2PROD.INDEX02
ON DB2PROD.TEMP_TABLE
(COLUMN_4 ASC /*COLCARD-4,703
,COLUMN_3 ASC /*COLCARD-2
,COLUMN_5 ASC /*COLCARD-5
,COLUMN_6 ASC /*COLCARD-2,674
,COLUMN_1 ASC /*COLCARD-548,207
,COLUMN_2 ASC /*COLCARD-5
,COLUMN_7 ASC COLCARD-4
,COLUMN_8 ASC COLCARD-348
)

Query cost

QBLOCKNO PLANNO COMPCOST
1 1 4.7072096

I modified the query to pick another INDEX with FIRST KEY CARDINALITY having higher number.

Modified query

SELECT
COLUMN_5,
COLUMN_6
FROM DB2TST5.CST_PAST_PRES_FUTR
WHERE COLUMN_1 IN ( ? , ? , ? , ? , ? )
AND COLUMN_2 IN ( 'A' , 'B' , 'C' , 'D' , 'E')
AND COLUMN_3 = ?
AND COLUMN_4 = ?

ORDER BY COLUMN_5, COLUMN_6



Modified Access path
PLANNO METHOD CREATOR TNAME TABNO ACCESSTYPE MATCHCOLS ACCESSCREATOR ACCESSNAME INDEXONLY SORTN_UNIQ SORTN_JOIN SORTN_ORDERBY SORTN_GROUPBY SORTC_UNIQ SORTC_JOIN SORTC_ORDERBY SORTC_GROUPBY

1 0 DB2TEST TEST_TABLE 1 N 1 DB2TEST INDEX01 Y N N N N N N N N

2 3 0 0 N N N N N N N Y N

INDEX Definition

CREATE UNIQUE INDEX DB2TEST.INDEX01
ON DB2TEST.TEST_TABLE
(COLUMN_1 ASC /*COLCARD-548,207
,COLUMN_2 ASC /*COLCARD-5
,COLUMN_3 ASC /*COLCARD-2
,COLUMN_4 ASC /*COLCARD-4,703
,COLUMN_5 ASC /*COLCARD-5
,COLUMN_6 ASC /*COLCARD-2,674
,COLUMN_7 ASC /*COLCARD-4
,COLUMN_8 ASC /*COLCARD-348
)

Query cost

QBLOCKNO PLANNO COMPCOST
1 1 1.0670595

1 2 2.493307



Is there any efficient SQL rewrite to better handle this situation.



-----End Original Message-----
Attachments

  • image001.png (2.6k)

Bill Gallagher

Inefficient access path
(in response to Athithan Rajadurai)
I don’t believe your original query and your modified query are functionally equivalent, if I’m understanding them correctly.

Bill Gallagher | Senior Systems Engineer, DBA | Data Administration

From: Athithan Rajadurai [mailto:[login to unmask email]
Sent: Thursday, July 12, 2018 9:13 AM
To: [login to unmask email]
Subject: [DB2-L] - RE: Inefficient access path


Thanks a lot Bill for the detailed explanation.

Below is the original query that had slow response time.

Original query

-----------------

SELECT
COLUMN_5,
COLUMN_6
FROM TEMP_TABLE
WHERE COLUMN_3 = ?
AND COLUMN_4 = ?
AND
(
(
COLUMN_2 = 'A'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'B'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'C'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'D'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'E'
AND
COLUMN_1 = ?
)
)
ORDER BY COLUMN_5, COLUMN_6

Original access path

PLANNO METHOD CREATOR TNAME TABNO ACCESSTYPE MATCHCOLS ACCESSCREATOR ACCESSNAME INDEXONLY SORTN_UNIQ SORTN_JOIN SORTN_ORDERBY SORTN_GROUPBY SORTC_UNIQ SORTC_JOIN SORTC_ORDERBY SORTC_GROUPBY

1 0 DB2TEST TEST_TABLE 1 I 2 DB2TEST INDEX02 N N N N N N N N N

INDEX Definition

CREATE UNIQUE INDEX DB2PROD.INDEX02
ON DB2PROD.TEMP_TABLE
(COLUMN_4 ASC /*COLCARD-4,703
,COLUMN_3 ASC /*COLCARD-2
,COLUMN_5 ASC /*COLCARD-5
,COLUMN_6 ASC /*COLCARD-2,674
,COLUMN_1 ASC /*COLCARD-548,207
,COLUMN_2 ASC /*COLCARD-5
,COLUMN_7 ASC COLCARD-4
,COLUMN_8 ASC COLCARD-348
)

Query cost

QBLOCKNO PLANNO COMPCOST
1 1 4.7072096

I modified the query to pick another INDEX with FIRST KEY CARDINALITY having higher number.

Modified query

SELECT
COLUMN_5,
COLUMN_6
FROM DB2TST5.CST_PAST_PRES_FUTR
WHERE COLUMN_1 IN ( ? , ? , ? , ? , ? )
AND COLUMN_2 IN ( 'A' , 'B' , 'C' , 'D' , 'E')
AND COLUMN_3 = ?
AND COLUMN_4 = ?

ORDER BY COLUMN_5, COLUMN_6



Modified Access path
PLANNO METHOD CREATOR TNAME TABNO ACCESSTYPE MATCHCOLS ACCESSCREATOR ACCESSNAME INDEXONLY SORTN_UNIQ SORTN_JOIN SORTN_ORDERBY SORTN_GROUPBY SORTC_UNIQ SORTC_JOIN SORTC_ORDERBY SORTC_GROUPBY

1 0 DB2TEST TEST_TABLE 1 N 1 DB2TEST INDEX01 Y N N N N N N N N

2 3 0 0 N N N N N N N Y N

INDEX Definition

CREATE UNIQUE INDEX DB2TEST.INDEX01
ON DB2TEST.TEST_TABLE
(COLUMN_1 ASC /*COLCARD-548,207
,COLUMN_2 ASC /*COLCARD-5
,COLUMN_3 ASC /*COLCARD-2
,COLUMN_4 ASC /*COLCARD-4,703
,COLUMN_5 ASC /*COLCARD-5
,COLUMN_6 ASC /*COLCARD-2,674
,COLUMN_7 ASC /*COLCARD-4
,COLUMN_8 ASC /*COLCARD-348
)

Query cost

QBLOCKNO PLANNO COMPCOST
1 1 1.0670595

1 2 2.493307



Is there any efficient SQL rewrite to better handle this situation.



-----End Original Message-----
________________________________
This message (including any attachments) may contain confidential, proprietary, privileged and/or private information. The information is intended to be for the use of the individual or entity designated above. If you are not the intended recipient of this message, please notify the sender immediately, and delete the message and any attachments. Any disclosure, reproduction, distribution or other use of this message or any attachments by an individual or entity other than the intended recipient is prohibited.

TRVDiscDefault::1201

Athithan Rajadurai

RE: Inefficient access path
(in response to Bill Gallagher)

Yes Bill i accept both the queries are not functionally correct.

 

But i can think of a way to handle the functionality during SQL FETCH. But i worry the SQL fetch cost will increase by doing so. 

 

Is there a way to improve the original query.

Athithan Rajadurai

RE: AW: Inefficient access path
(in response to Walter Janißen)

Hi Walter Janißen,

 

Thanks for your reply.

 

My table has 2.5M records.

Of all COLUMN_1 has the highest cardinality-548,207.

The OPEN Cost of the query is higher. So i need to improve the performance of the query. 

Walter Jani&#223;en

AW: AW: Inefficient access path
(in response to Athithan Rajadurai)
Hi

The OPEN-cost can be reduced near to zero, if you can avoid the sort. So your original query avoids the sort, but has a worse filtering than your second query. You can’t get both. Either better filtering and a sort or worse filtering and no sort.

Kind regards
Walter Janißen [standard_IBM+Champ+7+Yr+Analytics]

ITERGO Informationstechnologie GmbH
Anwendungsentwicklung
Technische Anwendungsarchitektur
Victoriaplatz 2
D-40198 Düsseldorf
[login to unmask email]<mailto:[login to unmask email]>

ITERGO Informationstechnologie GmbH
Vorsitzender des Aufsichtsrats: Christian Diedrich
Geschäftsführung: Dr. Bettina Anders (Vorsitzende),
Lothar Engelke, Ina Kirchhof, Dr. Michael Regauer
Sitz: Düsseldorf, Handelsregister: Amtsgericht Düsseldorf HRB 37996

Von: Athithan Rajadurai <[login to unmask email]>
Gesendet: Donnerstag, 12. Juli 2018 15:40
An: [login to unmask email]
Betreff: [DB2-L] - RE: AW: Inefficient access path


Hi Walter Janißen,



Thanks for your reply.



My table has 2.5M records.

Of all COLUMN_1 has the highest cardinality-548,207.

The OPEN Cost of the query is higher. So i need to improve the performance of the query.

-----End Original Message-----
Attachments

  • image001.png (2.6k)

Michael Hannan

RE: Inefficient access path
(in response to Athithan Rajadurai)

Athithan.

To give best tuning advice, we ideally need to know the average number of rows in the answer set, and how much CPU time it takes per row returned. Better still the Getpages consumed as well if you have it. Static SQL I assume. DB2 Version?

Is there a way to make this query better?

The answer is yes.

Would be great if DB2 Optimizer would choose an access path using 5 NR (Range List Steps), however I think DB2 does not choose this due to not including COLUMN_1 and COLUMN_2 in the ORDER BY.

Index should then be by COLUMN_4, COLUMN_3, COLUMN_1, COLUMN_2, COLUMN_5, COLUMN_6. Is this index Unique or does your query return duplicates? Do you want duplicates? Capture Stats since KEYCARD is important to show the index skew.

COLUMN_7 and COLUMN_8 are omitted (not referenced).

So try changing to ORDER BY to COLUMN_1, COLUMN_2, COLUMN_5, COLUMN_6

to see if DB2 is willing to make a 5 step NR access path.

If that works, then can modify the query to sort into your original sequence, like this:

WITH CTE1 AS (

SELECT .....

WHERE your original predicates
ORDER BY COLUMN_1, COLUMN_2, COLUMN_5, COLUMN_6

FETCH FIRST 1000000 ROWS ONLY
)
SELECT * FROM CTE1
ORDER BY COLUMN_5, COLUMN_6
;

I might have to test this on a MF to make sure the concept works.

If the above idea fails, then:

You put in the IN clause predicates for COLUMN_1 and COLUMN_2 but do not take the other more complex structured predicates out. Then DB2 can get at least MC=4. Unfortunately will make 25 probes instead of desired 5, unless DB2 sees that MC=3 with 5 probes does strong enough filtering.

Publish the KEYCARDs for your index as this will show us if index specified is best or to modify it slightly. e.g. if COLUMN_2 does little extra filtering in pos 4 of index I might move it to the end.

DB2 Sort does not bother me in the modern day unless it gets really large and you will only fetch a small portion of the answer set.


 In Reply to Athithan Rajadurai:

Thanks a lot Bill for the detailed explanation.

Below is the original query that had slow response time. 

Original query

-----------------

SELECT
COLUMN_5,
COLUMN_6
FROM TEMP_TABLE
WHERE COLUMN_3 = ?
AND COLUMN_4 = ?
AND
(
(
COLUMN_2 = 'A'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'B'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'C'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'D'
AND
COLUMN_1 = ?
)
OR
(
COLUMN_2 = 'E'
AND
COLUMN_1 = ?
)
)
ORDER BY COLUMN_5, COLUMN_6

Original access path

PLANNO   METHOD   CREATOR   TNAME   TABNO   ACCESSTYPE   MATCHCOLS   ACCESSCREATOR     ACCESSNAME     INDEXONLY     SORTN_UNIQ     SORTN_JOIN     SORTN_ORDERBY     SORTN_GROUPBY     SORTC_UNIQ     SORTC_JOIN     SORTC_ORDERBY     SORTC_GROUPBY     

1     0     DB2TEST     TEST_TABLE     1     I     2     DB2TEST     INDEX02     N     N     N     N     N     N     N     N     N

INDEX Definition

CREATE UNIQUE INDEX DB2PROD.INDEX02
ON DB2PROD.TEMP_TABLE
(COLUMN_4 ASC           /*COLCARD-4,703
,COLUMN_3 ASC           /*COLCARD-2
,COLUMN_5 ASC           /*COLCARD-5
,COLUMN_6 ASC           /*COLCARD-2,674
,COLUMN_1 ASC           /*COLCARD-548,207
,COLUMN_2 ASC           /*COLCARD-5
,COLUMN_7 ASC           /*COLCARD-4
,COLUMN_8 ASC           /*COLCARD-348
)

Query cost

QBLOCKNO     PLANNO    COMPCOST
1                       1                 4.7072096

I modified the query to pick another INDEX with FIRST KEY CARDINALITY having higher number.

Modified query

SELECT
COLUMN_5,
COLUMN_6
FROM DB2TST5.CST_PAST_PRES_FUTR
WHERE COLUMN_1 IN ( ? , ? , ? , ? , ? )
AND COLUMN_2 IN ( 'A' , 'B' , 'C' , 'D' , 'E')
AND COLUMN_3 = ?
AND COLUMN_4 = ?

ORDER BY COLUMN_5, COLUMN_6

 

Modified Access path
PLANNO     METHOD     CREATOR     TNAME     TABNO     ACCESSTYPE     MATCHCOLS     ACCESSCREATOR     ACCESSNAME     INDEXONLY     SORTN_UNIQ     SORTN_JOIN     SORTN_ORDERBY     SORTN_GROUPBY     SORTC_UNIQ     SORTC_JOIN     SORTC_ORDERBY     SORTC_GROUPBY     

1     0     DB2TEST     TEST_TABLE     1     N     1     DB2TEST     INDEX01     Y     N     N     N     N     N     N     N    N 

2     3                                                      0            0                                              N     N     N     N     N     N     N     Y     N

INDEX Definition

CREATE UNIQUE INDEX DB2TEST.INDEX01
ON DB2TEST.TEST_TABLE
(COLUMN_1 ASC           /*COLCARD-548,207
,COLUMN_2 ASC           /*COLCARD-5
,COLUMN_3 ASC           /*COLCARD-2
,COLUMN_4 ASC           /*COLCARD-4,703
,COLUMN_5 ASC           /*COLCARD-5
,COLUMN_6 ASC           /*COLCARD-2,674
,COLUMN_7 ASC           /*COLCARD-4
,COLUMN_8 ASC           /*COLCARD-348
)

Query cost

QBLOCKNO     PLANNO    COMPCOST
1                       1                 1.0670595

1                       2                 2.493307

 

Is there any efficient SQL rewrite to better handle this situation. 

 

 

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

Edited By:
Michael Hannan[Organization Members] @ Jul 13, 2018 - 05:30 AM (Europe/Berlin)
Michael Hannan[Organization Members] @ Jul 13, 2018 - 05:36 AM (Europe/Berlin)

Michael Hannan

RE: Inefficient access path
(in response to Michael Hannan)

Athithan,

I found that adding OPTTIMIZE clause to your query should be sufficient to get an NR access path given you use the right index I specified in previous mail. Alternatively an OPTHINT might be able to force it. I am not sure I have tried OPTHINT with NR access paths.

I rarely worry about DB2 Estimated Cost numbers too much. it is only a rough guide. If we get an access path then Optimizer has estimated it as best, but it does not always know best. It can highlight really expensive parts of a complex query (using DETCOST). For simple SQLs, Optimizer may estimate cost well, but does not really always understand the skew of the data.

Here is my testing on DB2 Catalog table SYSTABAUTH to demonstrate the technique. Not that in this case I have only 1 equals matched leading column, but concept is the same. Index Used is DSNATX02. This was on DB2 11 but even V10 has NR access paths to deal with complex OR conditions when the index is perfect for it.

  explain all set queryno = 100 for                                
select updatecols, alterauth
from sysibm.SYSTABAUTH
where GRANTEE = ?
and (Tcreator = 'A'
and TTNAME = ?
or Tcreator = 'B'
and TTNAME = ?
or Tcreator = 'C'
and TTNAME = ?
or Tcreator = 'D'
and TTNAME = ?
or Tcreator = 'E'
and TTNAME = ?
)
order by updatecols, alterauth
optimize for 10 rows
;
---------+---------+---------+---------+---------+---------+-------
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 0
---------+---------+---------+---------+---------+---------+-------
explain all set queryno = 200 for
with cte1 as (
select updatecols, alterauth
from sysibm.SYSTABAUTH
where GRANTEE = ?
and (Tcreator = 'A'
and TTNAME = ?
or Tcreator = 'B'
and TTNAME = ?
or Tcreator = 'C'
and TTNAME = ?
or Tcreator = 'D'
and TTNAME = ?
or Tcreator = 'E'
and TTNAME = ? )
order by TCREATOR, TTNAME, updatecols, alterauth
fetch first 1000000 rows only
)
select updatecols, alterauth
from cte1
order by updatecols, alterauth
fetch first 100 rows only
with ur;
---------+---------+---------+---------+---------+---------+-------
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 0
---------+---------+---------+---------+---------+---------+-------
select smallint(queryno) QNO
,qblockno BL#
,planno
,mixopseq MIXOP
,ACCESSTYPE AT
,method me
,indexONLY IXO
,matchcols MC
,substr(tname, 1, 18) TBL
from plan_table
order by queryno, qblockno, planno, mixopseq
with ur;
---------+---------+---------+---------+---------+---------+-------
QNO BL# PLANNO MIXOP AT ME IXO MC TBL
---------+---------+---------+---------+---------+---------+-------
100 1 1 1 NR 0 Y 3 SYSTABAUTH
100 1 1 2 NR 0 Y 3 SYSTABAUTH
100 1 1 3 NR 0 Y 3 SYSTABAUTH
100 1 1 4 NR 0 Y 3 SYSTABAUTH
100 1 1 5 NR 0 Y 3 SYSTABAUTH
100 1 2 0 3 N 0
200 1 1 0 R 0 N 0 CTE1
200 1 2 0 3 N 0
200 2 1 1 NR 0 Y 3 SYSTABAUTH
200 2 1 2 NR 0 Y 3 SYSTABAUTH
200 2 1 3 NR 0 Y 3 SYSTABAUTH
200 2 1 4 NR 0 Y 3 SYSTABAUTH
200 2 1 5 NR 0 Y 3 SYSTABAUTH
200 2 2 0 3 N 0
DSNE610I NUMBER OF ROWS DISPLAYED IS 14
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 100

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

Edited By:
Michael Hannan[Organization Members] @ Jul 13, 2018 - 06:43 AM (Europe/Berlin)
Michael Hannan[Organization Members] @ Jul 13, 2018 - 06:47 AM (Europe/Berlin)
Michael Hannan[Organization Members] @ Jul 13, 2018 - 06:48 AM (Europe/Berlin)
Michael Hannan[Organization Members] @ Jul 13, 2018 - 06:51 AM (Europe/Berlin)
Michael Hannan[Organization Members] @ Jul 13, 2018 - 06:56 AM (Europe/Berlin)
Michael Hannan[Organization Members] @ Jul 13, 2018 - 06:58 AM (Europe/Berlin)

Michael Hannan

RE: Inefficient access path
(in response to Michael Hannan)

OPTIMIZE clause may make no practical difference at all where DB2 Sort not avoided.

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

Terry Purcell

RE: Inefficient access path
(in response to Athithan Rajadurai)

Since we are having to help by guessing......

As has been highlighted - changing the ORs to INs is not equivalent. Sort avoidance plus matching on all columns is also not possible.

If the optimizer is not maximizing all available matching columns (which it theoretically can do for ACCESSTYPE='NR' - but there are rules as to when it will do so - such as if there aren't already matching predicates available)......

You can code the query to force MATCHCOLS=4 by moving COLUMN_3 & COLUMN_4 predicates to each OR such as (assuming those 4 columns are the leading columns of one of the indexes - which I believe they are from your examples):

SELECT
COLUMN_5,
COLUMN_6
FROM TEMP_TABLE
WHERE ((
COLUMN_2 = 'A'
AND COLUMN_1 = ?

AND COLUMN_3 = ?
AND COLUMN_4 = ?
)
OR
(
COLUMN_2 = 'B'
AND COLUMN_1 = ?

AND COLUMN_3 = ?
AND COLUMN_4 = ?
)
OR
(
COLUMN_2 = 'C'
AND COLUMN_1 = ?

AND COLUMN_3 = ?
AND COLUMN_4 = ?
)
OR
(
COLUMN_2 = 'D'
AND COLUMN_1 = ?

AND COLUMN_3 = ?
AND COLUMN_4 = ?
)
OR
(
COLUMN_2 = 'E'
AND COLUMN_1 = ?

AND COLUMN_3 = ?
AND COLUMN_4 = ?
)
)
ORDER BY COLUMN_5, COLUMN_6

Regards

Terry Purcell