[DB2-L] Which is better? "BETWEEN" vs "<=" and >"="

Walter Jani&#223;en

[DB2-L] Which is better? "BETWEEN" vs "<=" and >"="
Hi

I think, although the result set will be the same, the predicates are not equivalent, because they have different filtor factors and I don't believe either, that DB2 will rewrite these predicates. May be, DB2 can use multi-index access for the greater/less equal predicate. From time to time, I got different access paths when I use these predicates.

But Terry or Pat will know all that for sure.

Mit freundlichen Grüßen
Walter Janißen

ITERGO Informationstechnologie GmbH
Anwendungsentwicklung
Laufzeitarchitektur
Victoriaplatz 2
40477 Düsseldorf
mailto:[login to unmask email]

Vorsitzender des Aufsichtsrates: Jürgen Vetter
Geschäftsführung: Dr. Bettina Anders (Vorsitzende),
Ina Kirchhof, Dr. Christian Nymphius, Dr. Michael Regauer, Wolfgang Schön.
Sitz: Düsseldorf, Handelsregister: Amtsgericht Düsseldorf, HRB 37996


-----Ursprüngliche Nachricht-----
Von: IDUG DB2-L [mailto:[login to unmask email] Im Auftrag von Phil Grainger
Gesendet: Donnerstag, 14. Januar 2010 16:02
An: [login to unmask email]
Betreff: Re: [DB2-L] Which is better? "BETWEEN" vs "<=" and >"="

I know Terry is out there somewhere, but I'm kind of hoping that in the 21st century DB2 is smart enough to know these are equivalent and would rewrite one into the other or at least treat them the same

Now I'll wait for someone to patiently tell me circumstances when they are NOT equivalent statements!

Phil G
Grainger Database Solutions

-----Original Message-----
From: IDUG DB2-L [mailto:[login to unmask email] On Behalf Of Sheldon Rich
Sent: 14 January 2010 13:41
To: [login to unmask email]
Subject: [DB2-L] Which is better? "BETWEEN" vs "<=" and >"="

As part of our DB2 performance review another interesting question has developed. Which is the most efficient way to retrieve rows that are between two values.
Method 1:
Select * from mytab
where cola between :hv1 and :hv2

Method 2:
Select * from mytab
where cola >= :hv1
and cola <= :hv2

Two possibilities exist: value of cola available in index or table data will be needed to determine value of cola. Is there a difference?

Which is the better and most efficient way to retrieve the data? Are there other methods to retrieve these rows?

Thanks to all,

Sheldon Rich
Bank Tfahot
[login to unmask email]



_____________________________________________________________________

* IDUG North America * Tampa, Florida, * May 10-14 2010 * http://IDUG.ORG/NA * _____________________________________________________________________

http://www.idug.org/db2-content/index.html has THOUSANDS of free technical presentations!
DB2 LUW, DB2 z/OS, Performance, Installation, Tuning, Coding, BI, Warehouses, - among many more categories of help waiting for you!
Whether you are an old hand or a DB2 newbie, we have presentations for every level.
_____________________________________________________________________

If you need to change settings, http://www.idug.org/cgi-bin/wa?A0=DB2-L is the home of IDUG's DB2-L

_____________________________________________________________________

* IDUG North America * Tampa, Florida, * May 10-14 2010 * http://IDUG.ORG/NA * _____________________________________________________________________

http://www.idug.org/db2-content/index.html has THOUSANDS of free technical presentations!
DB2 LUW, DB2 z/OS, Performance, Installation, Tuning, Coding, BI, Warehouses, - among many more categories of help waiting for you!
Whether you are an old hand or a DB2 newbie, we have presentations for every level.
_____________________________________________________________________

If you need to change settings, http://www.idug.org/cgi-bin/wa?A0=DB2-L is the home of IDUG's DB2-L

_____________________________________________________________________

* IDUG North America * Tampa, Florida, * May 10-14 2010 * http://IDUG.ORG/NA *
_____________________________________________________________________

http://www.idug.org/db2-content/index.html has THOUSANDS of free technical presentations!
DB2 LUW, DB2 z/OS, Performance, Installation, Tuning, Coding, BI, Warehouses, - among
many more categories of help waiting for you!
Whether you are an old hand or a DB2 newbie, we have presentations for every level.
_____________________________________________________________________

If you need to change settings, http://www.idug.org/cgi-bin/wa?A0=DB2-L is the home of IDUG's DB2-L

Bernd Oppolzer

Re: AW: [DB2-L] Which is better? "BETWEEN" vs "<=" and >"="
(in response to Walter Janißen)
I, too, think that there is a difference in the calculation of the filter
factors, and, probably more important, there will be different
default filter factors, if proper filter factors can not be calculated
due to the use of host variables and NOREOPT(VARS).

I always recommend to use BETWEEN, if if meets the needs
of the application.

Kind regards

Bernd



Walter Janißen schrieb:
> Hi
>
> I think, although the result set will be the same, the predicates are not equivalent, because they have different filtor factors and I don't believe either, that DB2 will rewrite these predicates. May be, DB2 can use multi-index access for the greater/less equal predicate. From time to time, I got different access paths when I use these predicates.
>
> But Terry or Pat will know all that for sure.
>
> Mit freundlichen Grüßen
> Walter Janißen
>
>

_____________________________________________________________________

* IDUG North America * Tampa, Florida, * May 10-14 2010 * http://IDUG.ORG/NA *
_____________________________________________________________________

http://www.idug.org/db2-content/index.html has THOUSANDS of free technical presentations!
DB2 LUW, DB2 z/OS, Performance, Installation, Tuning, Coding, BI, Warehouses, - among
many more categories of help waiting for you!
Whether you are an old hand or a DB2 newbie, we have presentations for every level.
_____________________________________________________________________

If you need to change settings, http://www.idug.org/cgi-bin/wa?A0=DB2-L is the home of IDUG's DB2-L

David Simpson

Re: AW: [DB2-L] Which is better? "BETWEEN" vs "<=" and >"="
(in response to Bernd Oppolzer)
I just did some experimenting and it appears that even in DB2 9 optimizer will not rewrite the two predicates as a BETWEEN. Perceived costs for the BETWEEN seem to be much lower than writing the two predicates so it would seem to be a good standard to use BETWEEN when possible.

... unless the other one performs better ;)
______________________________________________________________________________
David Simpson | Senior Technical Advisor | Themis Education
123 South Avenue E, Westfield NJ, 07090 | phone 908-233-8900 | [login to unmask email]
For more information about Themis, visit www.themisinc.com


-----Original Message-----
From: IDUG DB2-L [mailto:[login to unmask email] On Behalf Of Bernd Oppolzer
Sent: Thursday, January 14, 2010 11:22 AM
To: [login to unmask email]
Subject: Re: [DB2-L] AW: [DB2-L] Which is better? "BETWEEN" vs "<=" and >"="

I, too, think that there is a difference in the calculation of the filter
factors, and, probably more important, there will be different
default filter factors, if proper filter factors can not be calculated
due to the use of host variables and NOREOPT(VARS).

I always recommend to use BETWEEN, if if meets the needs
of the application.

Kind regards

Bernd



Walter Janißen schrieb:
> Hi
>
> I think, although the result set will be the same, the predicates are not equivalent, because they have different filtor factors and I don't believe either, that DB2 will rewrite these predicates. May be, DB2 can use multi-index access for the greater/less equal predicate. From time to time, I got different access paths when I use these predicates.
>
> But Terry or Pat will know all that for sure.
>
> Mit freundlichen Grüßen
> Walter Janißen
>
>

_____________________________________________________________________

* IDUG North America * Tampa, Florida, * May 10-14 2010 * http://IDUG.ORG/NA *
_____________________________________________________________________

http://www.idug.org/db2-content/index.html has THOUSANDS of free technical presentations!
DB2 LUW, DB2 z/OS, Performance, Installation, Tuning, Coding, BI, Warehouses, - among
many more categories of help waiting for you!
Whether you are an old hand or a DB2 newbie, we have presentations for every level.
_____________________________________________________________________

If you need to change settings, http://www.idug.org/cgi-bin/wa?A0=DB2-L is the home of IDUG's DB2-L

_____________________________________________________________________

* IDUG North America * Tampa, Florida, * May 10-14 2010 * http://IDUG.ORG/NA *
_____________________________________________________________________

http://www.idug.org/solutions-journal.html - home of the IDUG Solutions Journal
Technical atricles from world famous authors in DB2's most prestigious, peer reviewed
magazine now on-line!
_____________________________________________________________________

If you need to change settings, http://www.idug.org/cgi-bin/wa?A0=DB2-L is the home of IDUG's DB2-L

Peter Vanroose

Re: Which is better? "BETWEEN" vs "<=" and >"="
(in response to David Simpson)
A small example, to illustrate the difference in filter factor, and the
potential importance of carefully choosing between either "<= & >=" and
"BETWEEN":

Suppose a column A contains the following values in its 20 rows:
0 5 5 5 5 5 10 10 10 10 15 15 15 15 20 25 25 25 25 25.
Comparing the two predicates "A>= 10 && A <= 15" and "A BETWEEN 10 AND 15":

- the "real" filter factor is 40%, since 8 out of 20 values match.
- without statistics, or with parameter markers instead of the values 10 and
15, the estimated filter factors would be 11.1% and 10%, respectively. (The
FF for a "<=" is 1/3, and a combination of two inequalities are always
considered "independent", hence 1/3 x 1/3 = 1/9.)
- when "normal" statistics are available (i.e., no cardinalities), the
filter factors are 37.5% and 25%, respectively:
LOW2KEY is 5 (not 0!) and HIGH2KEY is 25, hence out of the "range" 5--25,
A>=10 covers 75%, A<=15 covers 50%, and 10<=A<=15 covers 25%.
- with cardinality statistics, the FF would be 40%, of course.

So in this particular case, and with "normal" statistics and no parameter
markers, writing "A>= 10 && A <= 15" (instead of "A BETWEEN 10 AND 15") will
give a FF estimation which is closest to the real 40%.
Of course, with other values than 10 & 15, or with other column content, the
outcome will be completely different...


-- Peter Vanroose,
ABIS Training & Consulting

_____________________________________________________________________

* IDUG North America * Tampa, Florida, * May 10-14 2010 * http://IDUG.ORG/NA *
_____________________________________________________________________

http://www.idug.org/rug/index.html - with almost 150 IDUG Regional User Groups,
there is probably one near you!
Regional User Groups are your local connection to the Worldwide DB2 User Community
_____________________________________________________________________

If you need to change settings, http://www.idug.org/cgi-bin/wa?A0=DB2-L is the home of IDUG's DB2-L

Walter Jani&#223;en

AW: [DB2-L] Which is better? "BETWEEN" vs "<=" and >"="
(in response to Peter Vanroose)
Hello Peter

I think, the correct value for HIGH2KEY in your example must be 20 mustn't it?
I just did a test with a small table containing 1, 2, 2, 2, 6, 10, 10, 10 and I got 6 as HIGH2KEY for this column.

Mit freundlichen Grüßen
Walter Janißen

ITERGO Informationstechnologie GmbH
Anwendungsentwicklung
Laufzeitarchitektur
Victoriaplatz 2
40477 Düsseldorf
Tel.: +49 211 477-2928
Fax: +49 211 477-2615
mailto:[login to unmask email]

Vorsitzender des Aufsichtsrates: Jürgen Vetter
Geschäftsführung: Dr. Bettina Anders (Vorsitzende),
Ina Kirchhof, Dr. Christian Nymphius, Dr. Michael Regauer, Wolfgang Schön.
Sitz: Düsseldorf, Handelsregister: Amtsgericht Düsseldorf, HRB 37996


-----Ursprüngliche Nachricht-----
Von: IDUG DB2-L [mailto:[login to unmask email] Im Auftrag von Peter Vanroose
Gesendet: Samstag, 16. Januar 2010 21:56
An: [login to unmask email]
Betreff: Re: [DB2-L] Which is better? "BETWEEN" vs "<=" and >"="

A small example, to illustrate the difference in filter factor, and the potential importance of carefully choosing between either "<= & >=" and
"BETWEEN":

Suppose a column A contains the following values in its 20 rows:
0 5 5 5 5 5 10 10 10 10 15 15 15 15 20 25 25 25 25 25.
Comparing the two predicates "A>= 10 && A <= 15" and "A BETWEEN 10 AND 15":

- the "real" filter factor is 40%, since 8 out of 20 values match.
- without statistics, or with parameter markers instead of the values 10 and 15, the estimated filter factors would be 11.1% and 10%, respectively. (The FF for a "<=" is 1/3, and a combination of two inequalities are always considered "independent", hence 1/3 x 1/3 = 1/9.)
- when "normal" statistics are available (i.e., no cardinalities), the filter factors are 37.5% and 25%, respectively:
LOW2KEY is 5 (not 0!) and HIGH2KEY is 25, hence out of the "range" 5--25,
A>=10 covers 75%, A<=15 covers 50%, and 10<=A<=15 covers 25%.
- with cardinality statistics, the FF would be 40%, of course.

So in this particular case, and with "normal" statistics and no parameter markers, writing "A>= 10 && A <= 15" (instead of "A BETWEEN 10 AND 15") will give a FF estimation which is closest to the real 40%.
Of course, with other values than 10 & 15, or with other column content, the outcome will be completely different...


-- Peter Vanroose,
ABIS Training & Consulting

_____________________________________________________________________

* IDUG North America * Tampa, Florida, * May 10-14 2010 * http://IDUG.ORG/NA * _____________________________________________________________________

http://www.idug.org/rug/index.html - with almost 150 IDUG Regional User Groups, there is probably one near you!
Regional User Groups are your local connection to the Worldwide DB2 User Community _____________________________________________________________________

If you need to change settings, http://www.idug.org/cgi-bin/wa?A0=DB2-L is the home of IDUG's DB2-L

_____________________________________________________________________

* IDUG North America * Tampa, Florida, * May 10-14 2010 * http://IDUG.ORG/NA *
_____________________________________________________________________

http://www.idug.org/db2-videos.html has hundreds of video presentations!
Did you miss out on attending an IDUG conference?
Many of the presentations were recorded and are available on our website!
_____________________________________________________________________

If you need to change settings, http://www.idug.org/cgi-bin/wa?A0=DB2-L is the home of IDUG's DB2-L

Troy L Coleman

Re: Which is better? "BETWEEN" vs "<=" and >"="
(in response to Walter Janißen)
My experience has shown that BETWEEN has a better filter factor and will influence DB2 to pick a better access path.

-Troy

-----Original Message-----
From: IDUG DB2-L [mailto:[login to unmask email] On Behalf Of Sheldon Rich
Sent: Thursday, January 14, 2010 7:41 AM
To: [login to unmask email]
Subject: [DB2-L] Which is better? "BETWEEN" vs "<=" and >"="

As part of our DB2 performance review another interesting question has
developed. Which is the most efficient way to retrieve rows that are between
two values.
Method 1:
Select * from mytab
where cola between :hv1 and :hv2

Method 2:
Select * from mytab
where cola >= :hv1
and cola <= :hv2

Two possibilities exist: value of cola available in index or table data will be
needed to determine value of cola. Is there a difference?

Which is the better and most efficient way to retrieve the data? Are there
other methods to retrieve these rows?

Thanks to all,

Sheldon Rich
Bank Tfahot
[login to unmask email]



_____________________________________________________________________

* IDUG North America * Tampa, Florida, * May 10-14 2010 * http://IDUG.ORG/NA *
_____________________________________________________________________

http://www.idug.org/db2-content/index.html has THOUSANDS of free technical presentations!
DB2 LUW, DB2 z/OS, Performance, Installation, Tuning, Coding, BI, Warehouses, - among
many more categories of help waiting for you!
Whether you are an old hand or a DB2 newbie, we have presentations for every level.
_____________________________________________________________________

If you need to change settings, http://www.idug.org/cgi-bin/wa?A0=DB2-L is the home of IDUG's DB2-L

Terry Purcell

Re: Which is better? "BETWEEN" vs "<=" and >"="
(in response to Troy L Coleman)
It is true that there are still some differences with how DB2 handles
BETWEEN vs ">= AND <=". And it is still on the to-do list to bring these into
line - it just hasn't bubbled up high enough.

Rather than focusing on Filter Factor estimation differences, I would first
target what gets you matching index access.

For example, if you have a simple case of COL BETWEEN ? AND ?
or "COL >= ? AND COL <= ?", then both provide matching index access.

However, ? BETWEEN COL1 AND COL2 is stage 2, whereas "? >= COL1
AND ? <= COL2" are indexable predicates.

Also, T1.COL BETWEEN T2.COL1 AND T2.COL2 is only indexable if T2 is
accessed before T1, whereas the >=/<= syntax is indexable in either join
direction.

Regards
Terry Purcell


On Thu, 14 Jan 2010 13:41:02 +0000, Sheldon Rich <[login to unmask email]>
wrote:

>As part of our DB2 performance review another interesting question has
>developed. Which is the most efficient way to retrieve rows that are
between
>two values.
>Method 1:
>Select * from mytab
> where cola between :hv1 and :hv2
>
>Method 2:
>Select * from mytab
> where cola >= :hv1
> and cola <= :hv2
>
>Two possibilities exist: value of cola available in index or table data will be
>needed to determine value of cola. Is there a difference?
>
>Which is the better and most efficient way to retrieve the data? Are there
>other methods to retrieve these rows?
>
>Thanks to all,
>
>Sheldon Rich
>Bank Tfahot
>[login to unmask email]
>
>
>
>__________________________________________________________
___________
>
>* IDUG North America * Tampa, Florida, * May 10-14 2010 *
http://IDUG.ORG/NA *
>__________________________________________________________
___________
>
>http://www.idug.org/db2-content/index.html has THOUSANDS of free
technical presentations!
>DB2 LUW, DB2 z/OS, Performance, Installation, Tuning, Coding, BI,
Warehouses, - among
>many more categories of help waiting for you!
>Whether you are an old hand or a DB2 newbie, we have presentations for
every level.
>__________________________________________________________
___________
>
>If you need to change settings, http://www.idug.org/cgi-bin/wa?A0=DB2-L is
the home of IDUG's DB2-L

_____________________________________________________________________

* IDUG North America * Tampa, Florida, * May 10-14 2010 * http://IDUG.ORG/NA *
** Your only source for independent, unbiased, and trusted DB2 information.

_____________________________________________________________________
http://www.IDUG.org/mentor
How can you expand your staff or do succession planning in this economy?
Mentoring is a proven, economical, way to train the next generation of DB2 Users!
_____________________________________________________________________

If you need to change settings, http://www.idug.org/cgi-bin/wa?A0=DB2-L is the home of IDUG's DB2-L