Access path and index choice

Larry Kintisch

Access path and index choice
Adam,

Something Terry said made me review the RUNSTATS options and something
you did not reply to us on my earlier response still is a mystery.

In RUNSTATS use INDEX(.... KEYCARD FREQVAL NUMCOLS 6 COUNT 10) or some
other COUNT number. This will help determine [as Terry says, if the
optimizer is coded to check] whether there is correlation amongst the
columns of your predicates [as would be in MAKE = 'FORD' and MODEL =
'TAURUS']. Similarly for the other index try a (....KEYCARD FREQVAL
NUMCOLS 3 COUNT 10).

You did not reply as to whether you were using literal values [as you
showed us] or host variables. With host variables, the "histogram" data
[FREQVAL] won't be used for static binds but will be used for BIND
...REOPT(VARS) [note in V8 it becomes REOPT(NONE) or REPOT(ONCE) or
REOPT(ALWAYS) ].

As a DB2 user in Germany (correct?) you may want to take IBM's 5 day
class CF96 "Application Performance and Tuning" for more details on index
design and optimizer issues.

An alternative in Europe is Tapio Lahdenmaki's 3-day class "Better
Database Performance with Well-Designed Indexes" [see his website
www.Tapio1.com --that's Tapio-one--] which comes to Germany every so often.

In the USA I teach that class privately as "Cost-Saving Database Index
Design" as a two day offering or longer [see www.DBIndexDesign.com ]. It
is also offered publicly when you folks out there suggest to me there is an
interest in your city.

Larry Kintisch 845-353-0885 www.DBIndexDesign.com

At 02:15 PM 12/31/2005 -0600, you wrote:
>It is difficult to provide a definitive answer with such limited detail.
>Although with what information is available, and given how V7 makes use of
>KEYCARD statistics for these type of predicates (V8 makes more extensive
>use of KEYCARD), both indexes will cost out to be very close.
>
>However I do not wish to speculate which V7 zparms or PTFs may alter the
>index choice here without more detailed analysis.
>
>A potential workaround is to reorder the columns of the 2nd index so that
>REFER becomes a matching rather than screening predicate (or replacing
>SUFNIF with another high cardinality column from the WHERE clause). Then
>the effect of choosing this index may not be negative.
>
> NIF ASC,
> REFER ASC,
> SUFNIF ASC)
>
>Regards
>Terry Purcell
>
>On Thu, 29 Dec 2005 03:46:33 -0600, Adam Baldwin <[login to unmask email]>
>wrote:
>
>>Fellow listers: Can anyone throw some light on this.
>>
>>We have a query:
>>
>>SELECT *
>>FROM MYTAB
>>WHERE ( CODISER = '11'
>>AND OFIAPE = '111'
>>AND NUMECTA = '111111'
>>AND DIGICTA = '1'
>>AND ENTIDAD = 1
>>AND NIF ='X11111111'
>>AND SUFNIF IN ( '100',' ')
>>AND REFER IN ( '000030013912',' '))
>>
>>against a table defined as follows:
>> CREATE TABLE MYTAB
>> (CODISER CHAR(2) FOR SBCS DATA NOT NULL WITH DEFAULT
>>,
>> OFIAPE CHAR(3) FOR SBCS DATA NOT NULL WITH DEFAULT
>>,
>> NUMECTA CHAR(6) FOR SBCS DATA NOT NULL WITH DEFAULT
>>,
>> DIGICTA CHAR(1) FOR SBCS DATA NOT NULL WITH DEFAULT
>>,
>> ENTIDAD DECIMAL(4, 0) NOT NULL WITH DEFAULT ,
>> NIF CHAR(9) FOR SBCS DATA NOT NULL WITH DEFAULT
>>,
>> SUFNIF CHAR(3) FOR SBCS DATA NOT NULL WITH DEFAULT
>>,
>> REFER CHAR(12) FOR SBCS DATA NOT NULL
>> WITH DEFAULT ,
>> IMPORTE DECIMAL(13, 2) NOT NULL WITH DEFAULT ,
>> FEINIDES DATE NOT NULL WITH DEFAULT ,
>> FEFINDES DATE NOT NULL WITH DEFAULT ,
>> FEALTA CHAR(6) FOR SBCS DATA NOT NULL WITH DEFAULT
>>,
>> FEULMOD CHAR(6) FOR SBCS DATA NOT NULL WITH DEFAULT
>>,
>> HORULMOD CHAR(6) FOR SBCS DATA NOT NULL WITH DEFAULT
>>,
>> OFIULMOD CHAR(3) FOR SBCS DATA NOT NULL WITH DEFAULT
>>,
>> NUMTER CHAR(4) FOR SBCS DATA NOT NULL WITH DEFAULT
>>,
>> USUARIO CHAR(4) FOR SBCS DATA NOT NULL WITH DEFAULT
>>,
>> A2000E_MNDORIG CHAR(1) FOR SBCS DATA NOT NULL WITH DEFAULT
>>,
>> A2000E_EUR_IMP DECIMAL(13, 0) NOT NULL WITH DEFAULT ,
>> A2000N_FEULMOD DATE WITH DEFAULT NULL ,
>> A2000N_FEALTA DATE WITH DEFAULT NULL ,
>> A2000D_IMPORTE DECIMAL(13, 0) NOT NULL WITH DEFAULT ,
>> A2000_DUMMY VARCHAR(2000) FOR SBCS DATA NOT NULL
>> WITH DEFAULT )
>>
>>There are two indexes:
>>
>>One, the clustering index (unique) on:
>>
>> (CODISER ASC,
>> OFIAPE ASC,
>> NUMECTA ASC,
>> DIGICTA ASC,
>> ENTIDAD ASC,
>> NIF ASC,
>> SUFNIF ASC,
>> REFER ASC)
>>
>>and the other, non unique, on:
>>
>> NIF ASC,
>> SUFNIF ASC,
>> REFER ASC)
>>
>>The query uses the second index with a non mathcing scan on 2 cols. Stats
>>are up to date.
>>
>>Why is the optimizer going for the non clustering index, ignoring 6
>>matchcols on the clustering index?
>>
>
>---------------------------------------------------------------------------
------
>Welcome to the IDUG DB2-L list. To unsubscribe, go to the archives and
home page at http://www.idugdb2-l.org/archives/db2-l.html. From that page
select "Join or Leave the list". The IDUG DB2-L FAQ is at
http://www.idugdb2-l.org. The IDUG List Admins can be reached at
[login to unmask email] Find out the latest on IDUG conferences at
http://conferences.idug.org/index.cfm
>
>

---------------------------------------------------------------------------------
Welcome to the IDUG DB2-L list. To unsubscribe, go to the archives and home page at http://www.idugdb2-l.org/archives/db2-l.html. From that page select "Join or Leave the list". The IDUG DB2-L FAQ is at http://www.idugdb2-l.org. The IDUG List Admins can be reached at [login to unmask email] Find out the latest on IDUG conferences at http://conferences.idug.org/index.cfm

Adam Baldwin

Re: Access path and index choice
(in response to Larry Kintisch)
Larry:

Re the literals/hostvars - the access path chosen by the optimizer is the
same in both scenarios. We've tracked down some maintenance to apply which
should resolve the issue.

Re my location - I work for IBM in Spain.

Fellow Listers - Thanks for all of the replies and suggestions.

Regards, and Happy New Year.

---------------------------------------------------------------------------------
Welcome to the IDUG DB2-L list. To unsubscribe, go to the archives and home page at http://www.idugdb2-l.org/archives/db2-l.html. From that page select "Join or Leave the list". The IDUG DB2-L FAQ is at http://www.idugdb2-l.org. The IDUG List Admins can be reached at [login to unmask email] Find out the latest on IDUG conferences at http://conferences.idug.org/index.cfm

Patrick Bossman

Re: Access path and index choice
(in response to Adam Baldwin)
Hello,
I want to make sure a few things are clear based on Larry's response.

1) Multi-column cardinalities collected with KEYCARD (and/or COLGROUP) helps
the optimizer with uniform correlation. The MAKE and MODEL example is
generally used as a correlation example, because only Toyota makes 4-Runners
trucks.

2) Frequencies are similiar to histograms, and they are very useful for data
skew situations. (Eg. A lot more people live in Los Angeles, CA than Fargo,
ND. Frequencies would indicate that the value "Los Angeles, CA" has more
occurences than "Fargo, ND" if frequencies were collected on CITY, STATE).

3) If you run RUNSTATS INDEX (IX1 FREQVAL NUMCOLS 6 COUNT 10 )
3a) To get the full value of multi-column frequencies, you also have
to collect KEYCARD.
3b) Understand that RUNSTATS is collecting frequencies on just the first
6 leading concatenated values. It does not also collect frequencies
on the first column, first 2 columns, first 3 columns, etc. To get
the first column, first 2 columns, etc, they'd also have to be
explicitly coded.

RUNSTATS INDEX (IX1 FREQVAL NUMCOLS 1 COUNT 10
,FREQVAL NUMCOLS 2 COUNT 10
, ...
,FREQVAL NUMCOLS 6 COUNT 10)

With issue #3 we have seen customers change RUNSTATS to FREQVAL NUMCOLS 6
COUNT 10 permanently. While this does drive frequencies on the first 6
concatenated columns, it also STOPS the default collection of the top 10
frequencies on the first column.

Felíz año nuevo,
Pat Bossman

---------------------------------------------------------------------------------
Welcome to the IDUG DB2-L list. To unsubscribe, go to the archives and home page at http://www.idugdb2-l.org/archives/db2-l.html. From that page select "Join or Leave the list". The IDUG DB2-L FAQ is at http://www.idugdb2-l.org. The IDUG List Admins can be reached at [login to unmask email] Find out the latest on IDUG conferences at http://conferences.idug.org/index.cfm

Larry Kintisch

Re: Access path and index choice
(in response to Patrick Bossman)
Adam,
A few more points and suggestions after Pat's excellent comments.
My partner, Tapio Lahdenmaki, has written to me:

I agree the partial index key cardinality of the good index is important.
Without that, the optimizer may interpolate between FIRSTKEYCARD and
FULLKEYCARD -- or simply use FULLKEYCARD as an upper bound. However, the
partial index key cardinality of the bad index is important, too. The two
IN predicates complicate estimating the filter factor. I wonder if the
optimizer simply multiplies the filter factors (1/CARD1 x 2/CARD2 x
2/CARD3) without any upper bound. Then, the FF estimate can be way too low
and the bad index looks good. It is not easy to be an optimizer.

Based on that, here are two other suggestions. First repeat the RUNSTATS
INDEX for the "bad" index (as Pat suggested for 1, 2 and 3 "NUMCOLS").
Second, if that doesn't work, let's try to improve the number of matching
columns on the "good" index by re-writing the query to eliminate the first
if the IN predicates, as below. I hope this helps. Larry Kintisch
www.DBIndexDesign.com

SELECT *
FROM MYTAB
WHERE ( CODISER = '11'
AND OFIAPE = '111'
AND NUMECTA = '111111'
AND DIGICTA = '1'
AND ENTIDAD = 1
AND NIF ='X11111111'
AND SUFNIF ='100'
AND REFER IN ( '000030013912',' '))
UNION ALL
SELECT *
FROM MYTAB
WHERE ( CODISER = '11'
AND OFIAPE = '111'
AND NUMECTA = '111111'
AND DIGICTA = '1'
AND ENTIDAD = 1
AND NIF ='X11111111'
AND SUFNIF =' '
AND REFER IN ( '000030013912',' '))


At 09:04 AM 1/4/2006 -0600, you wrote:
>Hello,
>I want to make sure a few things are clear based on Larry's response.
>
>1) Multi-column cardinalities collected with KEYCARD (and/or COLGROUP) helps
>the optimizer with uniform correlation. The MAKE and MODEL example is
>generally used as a correlation example, because only Toyota makes 4-Runners
>trucks.
>
>2) Frequencies are similiar to histograms, and they are very useful for data
>skew situations. (Eg. A lot more people live in Los Angeles, CA than Fargo,
>ND. Frequencies would indicate that the value "Los Angeles, CA" has more
>occurences than "Fargo, ND" if frequencies were collected on CITY, STATE).
>
>3) If you run RUNSTATS INDEX (IX1 FREQVAL NUMCOLS 6 COUNT 10 )
> 3a) To get the full value of multi-column frequencies, you also have
> to collect KEYCARD.
> 3b) Understand that RUNSTATS is collecting frequencies on just the first
> 6 leading concatenated values. It does not also collect frequencies
> on the first column, first 2 columns, first 3 columns, etc. To get
> the first column, first 2 columns, etc, they'd also have to be
> explicitly coded.
>
> RUNSTATS INDEX (IX1 FREQVAL NUMCOLS 1 COUNT 10
> ,FREQVAL NUMCOLS 2 COUNT 10
> , ...
> ,FREQVAL NUMCOLS 6 COUNT 10)
>
>With issue #3 we have seen customers change RUNSTATS to FREQVAL NUMCOLS 6
>COUNT 10 permanently. While this does drive frequencies on the first 6
>concatenated columns, it also STOPS the default collection of the top 10
>frequencies on the first column.
>
>Felíz año nuevo,
>Pat Bossman
>
>---------------------------------------------------------------------------
------
>Welcome to the IDUG DB2-L list. To unsubscribe, go to the archives and
home page at http://www.idugdb2-l.org/archives/db2-l.html. From that page
select "Join or Leave the list". The IDUG DB2-L FAQ is at
http://www.idugdb2-l.org. The IDUG List Admins can be reached at
[login to unmask email] Find out the latest on IDUG conferences at
http://conferences.idug.org/index.cfm
>
>

---------------------------------------------------------------------------------
Welcome to the IDUG DB2-L list. To unsubscribe, go to the archives and home page at http://www.idugdb2-l.org/archives/db2-l.html. From that page select "Join or Leave the list". The IDUG DB2-L FAQ is at http://www.idugdb2-l.org. The IDUG List Admins can be reached at [login to unmask email] Find out the latest on IDUG conferences at http://conferences.idug.org/index.cfm