When Indexes Are Only Partially Effective

Posted By: David Simpson Technical Content,

Case Study #1

THE QUERY:

SELECT  ...
FROM T1 JOIN T2
  ON T1.C1 = T2.C1
 AND T1.C2 = T2.C2
WHERE T1.C4 = ‘Something’
 AND T1.C6 > 27

THE ENVIRONMENT:
  • The only available interesting index on T2 is a single column index on C1.
  • Table T2 has 2.4 Billion Rows
  • Column Cardinality (COLCARDF) statistics on T2:
    • C1 = 250
    • C2 = 10,000,000

It is likely that T1 will be the first table accessed for the join since both of the local predicates in the WHERE clause filter that table. The only helpful index on T2 is on C1 only AND C1 only has 250 unique values out of 2.4 billion rows. This likely means that millions of hits on the actual table for T2 will be needed to match or eliminate rows for the join.

LESSON LEARNED:

Even though an index is being used to support the join it could still be a significant performance issue since the entire join key is not being processed through the index. A composite index containing both C1 and C2 would significantly improve the performance of this query. DON’T STOP LOOKING JUST BECAUSE AN INDEX IS BEING USED.


Case Study #2

THE QUERY:

SELECT ...
   FROM T1
  WHERE T1.C4 = ‘Something’
    AND T1.C6 > 27

THE ENVIRONMENT:
  • The only available interesting index on T1 is a composite index on (C6, C4)
  • T1 has 1 Billion Rows

This query has one equal predicate and one range predicate. A range predicate uses comparison operators such as <, >, BETWEEN or LIKE to qualify a range of values. Range predicates can use an index to match the relevant column, however, further matching will not be possible. In this case since the leading column of the index is paired with a range predicate on the query only one column (C6) can be matched and the other column (C1) will be screened (meaning all values for the match will need to be processed).

LESSON LEARNED:

A better index to support this query would lead with the equal predicate (C4) and then include the range predicate (C6).


Conclusion

Both of these queries would benefit from a new index or alteration of an existing index. Be careful when deciding what to do. New indexes will create overhead for inserts and altering existing indexes could affect other workloads that depend on them. Analysis should be done before making these changes.