Query Explain question

Raquel Rodriguez

Query Explain question
Ok, I think I am missing something fundamental here.
This is my query:

SELECT * FROM

RABAMC.PHI H ,
MADO.ORGF S
WHERE H.YAMP_ID = 'EM'
AND H.OS_MSL_ID = S.MSL_ID
AND H.PHI_ID = '1'
AND H.TIS_ID = '01'
AND H.YNT_DT BETWEEN '2004-11-01' AND '2004-11-30'
AND S.GON_CD = '0000000004'

PHI has 2.8M rows, ORGF has only 222 rows. The indexes
on these tables are:

PHI: Two indexes:

PHIX1 on following columns:
C1
C2
C3
YNT_DT
TIS_ID
PHI_ID

The COLCARD values for YNT_DT, TIS_ID and PHI_ID are
1888, 99 and 9.

PHIX2 on following columns:
OS_MSL_ID (This has a cardinality of 210; Total no. of
rows in this table is 2.8M).

ORGF: Two indexes

ORGFX1 on following columns:
GON_CD (This has a cardinality of 20; Total no. of
rows in this table is 222).

ORGFX2 on following columns:
MSL_ID (This is a unique index).

Explain shows that the query goes for a non-matching
index scan on PHI in the first step using index PHIX1
and then it goes for a 1 column matching index scan on
ORGFX2. The non-matching index scan on PHIX1 (This is
a 2.8M rows table obviously takes a long time).

I would have thought that the query should have gone
for a matching index scan on ORGFX1 in the first step
and again a matching index scan on PHIX2 in the second
step. This should have been much much faster.


TIA
Raquel.





__________________________________
Do you Yahoo!?
Yahoo! Mail - You care about security. So do we.
http://promotions.yahoo.com/new_mail

---------------------------------------------------------------------------------
Welcome to the IDUG DB2-L list. To unsubscribe, go to the archives and home page at http://www.idugdb2-l.org/archives/db2-l.html. From that page select "Join or Leave the list". The IDUG DB2-L FAQ is at http://www.idugdb2-l.org. The IDUG List Admins can be reached at [login to unmask email] Find out the latest on IDUG conferences at http://conferences.idug.org/index.cfm

Sushant Dash

Re: Query Explain question
(in response to Raquel Rodriguez)
Hi Raquel, here are quick ways.

You are expecting the table S to be first table and which is one generally expect. How ever there are ways you can have better access paths. Please implement the same and get back.

To have the table S as first table please follow this

1. Try coding an ORDER BY with S.GON_CD. This would make your table S as the outer table an you have the less rows qualifies initially. This is only possible if the column GON_CD is highly clustered. I m guessing the same for you. In case the high cluster is not there with GON_CD then
2. Try to rewrite the query with NTE keeping the table S in a NTE and join the NTE column MSL_ID with the PHI table and apply the predicates as usual. This is also another way you would get the S table as the outer table.
3. This approach is okay but should not be followed always as the same quesry may work better today and may not tomorrow. Check any columns of table S apart from Gon_CD. Say you have a column called X for the table S. Then add an extra predicate to the existing query like S.X=S.X. This way optimizer is given a wrong hint and you may get the table S as outer table hence less rows.

Pleas try this tricks and see whether you are getting the required time or not. See you on Monday.

With thanks
Dash
-----Original Message-----
From: DB2 Data Base Discussion List [mailto:[login to unmask email] On Behalf Of Raquel Rodriguez
Sent: Friday, December 03, 2004 9:27 PM
To: [login to unmask email]
Subject: Query Explain question
Ok, I think I am missing something fundamental here.
This is my query:

SELECT * FROM

RABAMC.PHI H ,
MADO.ORGF S
WHERE H.YAMP_ID = 'EM'
AND H.OS_MSL_ID = S.MSL_ID
AND H.PHI_ID = '1'
AND H.TIS_ID = '01'
AND H.YNT_DT BETWEEN '2004-11-01' AND '2004-11-30'
AND S.GON_CD = '0000000004'

PHI has 2.8M rows, ORGF has only 222 rows. The indexes
on these tables are:

PHI: Two indexes:

PHIX1 on following columns:
C1
C2
C3
YNT_DT
TIS_ID
PHI_ID

The COLCARD values for YNT_DT, TIS_ID and PHI_ID are
1888, 99 and 9.

PHIX2 on following columns:
OS_MSL_ID (This has a cardinality of 210; Total no. of
rows in this table is 2.8M).

ORGF: Two indexes

ORGFX1 on following columns:
GON_CD (This has a cardinality of 20; Total no. of
rows in this table is 222).

ORGFX2 on following columns:
MSL_ID (This is a unique index).

Explain shows that the query goes for a non-matching
index scan on PHI in the first step using index PHIX1
and then it goes for a 1 column matching index scan on
ORGFX2. The non-matching index scan on PHIX1 (This is
a 2.8M rows table obviously takes a long time).

I would have thought that the query should have gone
for a matching index scan on ORGFX1 in the first step
and again a matching index scan on PHIX2 in the second
step. This should have been much much faster.




------------------------------------------------------------------------------
This message is intended only for the personal and confidential use of the designated recipient(s) named above. If you are not the intended recipient of this message you are hereby notified that any review, dissemination, distribution or copying of this message is strictly prohibited. This communication is for information purposes only and should not be regarded as an offer to sell or as a solicitation of an offer to buy any financial product, an official confirmation of any transaction, or as an official statement of Lehman Brothers. Email transmission cannot be guaranteed to be secure or error-free. Therefore, we do not represent that this information is complete or accurate and it should not be relied upon as such. All information is subject to change without notice.

---------------------------------------------------------------------------------
Welcome to the IDUG DB2-L list. To unsubscribe, go to the archives and home page at http://www.idugdb2-l.org/archives/db2-l.html. From that page select "Join or Leave the list". The IDUG DB2-L FAQ is at http://www.idugdb2-l.org. The IDUG List Admins can be reached at [login to unmask email] Find out the latest on IDUG conferences at http://conferences.idug.org/index.cfm

James Campbell

Re: Query Explain question
(in response to Sushant Dash)
Is PHI clustered on PHIX1? If so, I suspect that the killer (as far
as DB2 is concerned) is the thought of doing 280k random reads on PHI
- which is what an access PHIX2 would require. Starting on PHIX1
means that these can be satisfied by doing reads on sequential pages.
In fact, DB2 might almost have done a tablespace scan on PHI.

Maybe you don't need to "SELECT *"?

James Campbell

On 3 Dec 2004 at 18:26, Raquel Rodriguez wrote:

> Ok, I think I am missing something fundamental here.
> This is my query:
>
> SELECT * FROM
>
> RABAMC.PHI H ,
> MADO.ORGF S
> WHERE H.YAMP_ID = 'EM'
> AND H.OS_MSL_ID = S.MSL_ID
> AND H.PHI_ID = '1'
> AND H.TIS_ID = '01'
> AND H.YNT_DT BETWEEN '2004-11-01' AND '2004-11-30'
> AND S.GON_CD = '0000000004'
>
> PHI has 2.8M rows, ORGF has only 222 rows. The indexes
> on these tables are:
>
> PHI: Two indexes:
>
> PHIX1 on following columns:
> C1
> C2
> C3
> YNT_DT
> TIS_ID
> PHI_ID
>
> The COLCARD values for YNT_DT, TIS_ID and PHI_ID are
> 1888, 99 and 9.
>
> PHIX2 on following columns:
> OS_MSL_ID (This has a cardinality of 210; Total no. of
> rows in this table is 2.8M).
>
> ORGF: Two indexes
>
> ORGFX1 on following columns:
> GON_CD (This has a cardinality of 20; Total no. of
> rows in this table is 222).
>
> ORGFX2 on following columns:
> MSL_ID (This is a unique index).
>
> Explain shows that the query goes for a non-matching
> index scan on PHI in the first step using index PHIX1
> and then it goes for a 1 column matching index scan on
> ORGFX2. The non-matching index scan on PHIX1 (This is
> a 2.8M rows table obviously takes a long time).
>
> I would have thought that the query should have gone
> for a matching index scan on ORGFX1 in the first step
> and again a matching index scan on PHIX2 in the second
> step. This should have been much much faster.
>
>
> TIA
> Raquel.
>
>
>
>
>
> __________________________________
> Do you Yahoo!?
> Yahoo! Mail - You care about security. So do we.
> http://promotions.yahoo.com/new_mail
>
> ---------------------------------------------------------------------------------
> Welcome to the IDUG DB2-L list. To unsubscribe, go to the archives and home page at http://www.idugdb2-l.org/archives/db2-l.html. From that page select "Join or Leave the list". The IDUG DB2-L FAQ is at http://www.idugdb2-l.org. The IDUG List Admins can be reached at [login to unmask email] Find out the latest on IDUG conferences at http://conferences.idug.org/index.cfm

---------------------------------------------------------------------------------
Welcome to the IDUG DB2-L list. To unsubscribe, go to the archives and home page at http://www.idugdb2-l.org/archives/db2-l.html. From that page select "Join or Leave the list". The IDUG DB2-L FAQ is at http://www.idugdb2-l.org. The IDUG List Admins can be reached at [login to unmask email] Find out the latest on IDUG conferences at http://conferences.idug.org/index.cfm

Terry Purcell

Re: Query Explain question
(in response to James Campbell)
Raquel,

Yes, you are missing the (possible) 2 fundamental problems that the
optimizer has with this query. Both of these are with table PHI.

1st problem:
Given the available statistics provided (in combination with an assumption
I made about H.YAMP_ID), then I would expect that the optimizer considers
filtering on PHI to result in a smaller result than ORGF (even though PHI
starts out at 2.8 million rows). If this is not true, then it is likely
that one of the predicates is non-uniformly distributed (skewed), or that
some of the columns are correlated. These statistics are not available
based on the indexes provided, but would be available if you have collected
these using DSTATS (or RUNSTATS in V8).

2nd problem:
There is no index on PHI that supports the combined filtering of local and
join predicates for this query, making PHI a poor choice as the inner table
of the join.

You can potentially resolve both problems (to a satisfactory degree) with
the right index that combines the highest cardinality local predicate
columns with the join column, but leading with equals (for example -
TIS_ID, OS_MSL_ID, YNT_DT). You may or may not be able to improve
performance of the query resolving the statistics issues only.

I DO NOT RECOMMEND the use of tricks. Regardless of whether they do resolve
the performance of a single query, they do not resolve the fundamental
statistics and indexing issues related to the table itself. Fix the table,
and you potentially resolve majority of the queries accessing the table.

Please note that I am unable to provide you an EXACT answer because of the
minimal information provided.

Regards
Terry Purcell
On Fri, 3 Dec 2004 18:26:41 -0800, Raquel Rodriguez
<[login to unmask email]> wrote:

>Ok, I think I am missing something fundamental here.
>This is my query:
>
>SELECT * FROM
>
>RABAMC.PHI H ,
>MADO.ORGF S
>WHERE H.YAMP_ID = 'EM'
> AND H.OS_MSL_ID = S.MSL_ID
> AND H.PHI_ID = '1'
> AND H.TIS_ID = '01'
> AND H.YNT_DT BETWEEN '2004-11-01' AND '2004-11-30'
> AND S.GON_CD = '0000000004'
>
>PHI has 2.8M rows, ORGF has only 222 rows. The indexes
>on these tables are:
>
>PHI: Two indexes:
>
>PHIX1 on following columns:
>C1
>C2
>C3
>YNT_DT
>TIS_ID
>PHI_ID
>
>The COLCARD values for YNT_DT, TIS_ID and PHI_ID are
>1888, 99 and 9.
>
>PHIX2 on following columns:
>OS_MSL_ID (This has a cardinality of 210; Total no. of
>rows in this table is 2.8M).
>
>ORGF: Two indexes
>
>ORGFX1 on following columns:
>GON_CD (This has a cardinality of 20; Total no. of
>rows in this table is 222).
>
>ORGFX2 on following columns:
>MSL_ID (This is a unique index).
>
>Explain shows that the query goes for a non-matching
>index scan on PHI in the first step using index PHIX1
>and then it goes for a 1 column matching index scan on
>ORGFX2. The non-matching index scan on PHIX1 (This is
>a 2.8M rows table obviously takes a long time).
>
>I would have thought that the query should have gone
>for a matching index scan on ORGFX1 in the first step
>and again a matching index scan on PHIX2 in the second
>step. This should have been much much faster.
>
>
>TIA
>Raquel.
>
>
>
>
>
>__________________________________
>Do you Yahoo!?
>Yahoo! Mail - You care about security. So do we.
>http://promotions.yahoo.com/new_mail
>
>---------------------------------------------------------------------------
------
>Welcome to the IDUG DB2-L list. To unsubscribe, go to the archives and
home page at http://www.idugdb2-l.org/archives/db2-l.html. From that page
select "Join or Leave the list". The IDUG DB2-L FAQ is at
http://www.idugdb2-l.org. The IDUG List Admins can be reached at DB2-L-
[login to unmask email] Find out the latest on IDUG conferences at
http://conferences.idug.org/index.cfm

---------------------------------------------------------------------------------
Welcome to the IDUG DB2-L list. To unsubscribe, go to the archives and home page at http://www.idugdb2-l.org/archives/db2-l.html. From that page select "Join or Leave the list". The IDUG DB2-L FAQ is at http://www.idugdb2-l.org. The IDUG List Admins can be reached at [login to unmask email] Find out the latest on IDUG conferences at http://conferences.idug.org/index.cfm