Secondary Index

James Campbell

Secondary Index
Jane

Without details of your actual statistics, this is partially guesswork -
and partially not.

When you did the read using "part_key > 0", db2 would read an
index entry, decide which data page contained the target row, issue
a read on that page, wait around for the read to complete, extract
the data and move onto the next index entry. This time, however
the target page was already in a buffer (at least, being the
clustering index, one would hope so), so there was no wait for the
read to complete. DB2 would read on until the data row was on
another page, wait for that page to be read, and continue - only
waiting when a new page was required. After doing this for a
number of times (the number depends on the size of the buffer
pool), DB2 would decide that, rather than waiting for the SQL to
want a data page, it would pre-read blocks of data pages (known
as "sequential prefetch detection"). After this, your process would
not have to wait for the data pages to be read - because the data
pages would already be in a buffer.

Same thing happens in your second case - except, of course that
an index is the not involved. DB2 is just reading through the
tablespace ("tablespace scan").

In your third case, you are not reading data pages sequentially -
the index is providing data rows in a random sequence - so DB2 is
reading each data page to provide one row. (Depending on
various criteria, DB2 might be pre-reading the pages in ("list pre-
fetch"). But (and this is the cause of the performance degredation),
by the time DB2 decides it needs to access another row on a page,
it is likely that it has flushed the data page and re-used the buffer
space for another row. Hence, each data page might be read
several times.


Why, then, use an index? Try:

SELECT * ... WHERE IND BETWEEN 0 AND 10
and
SELECT * ... WHERE IND > 1000000
(or any other values that exclude 99.9% of rows)

How many data pages would be read if there was no index? How
many if there was an index?

BTW, you did run RUNSTATS on your new index before these
tests, didn't you? It is possible that with suitable statistics, DB2
would recognise the situation I described and done a table space
scan. (But, as I said, I don't know your situation.)

An "Estimator" is just that. It is not real performance.

James Campbell

On 31 Dec 2001 at 11:57, jane mike wrote:

> Hi
>
> I have doubt on secondary index. When its needed...? Does the db2 really
> look the secondary index..? if so, when ?
>
> Actually, I checked the CPU time and I/O time using DB2 Estimator, I got the
> following result. Only one index that too Partition inedx.
>
> 1. created Partion unique index(PART_KEY).
> Executed the sql SELECT * from table where part_key > 0 it takes all rows
> and resulted CPU Time 00:08:39.712 and IO time 00:01.29.995.
>
> 2. Executed sql SELECT * FROM IND > 0 , resulted as
> CPU time : 00:07.38.728 and IO time : 00.01.25.536
>
> both columns are Decimal part_key is dec(3,0) ind key is dec(9,0) but the
> second query runs faster than first one.
>
> So i decided to create seconday Non-partitioned index on ind key. The
> cluster ratio is 50%
> now the result for the 2 query is
> CPU time : 00:17:26.947 I/O time : 08:42:47.485
>
> now, it degrades CPU time for the second query
>
> Please clarify without secondary index, my query runs faster, after creation
> of secondary index its very slow. All values are calculated in the DB2
> Estimator.
>
> can any one suggest regarding secondary index..??
>
> thanks
> Jane
>
>
> _________________________________________________________________
> MSN Photos is the easiest way to share and print your photos:
> http://photos.msn.com/support/worldwide.aspx
>
>
>