Online reorg with inline colgroup histograms stats BOTH keyword getting ignored

Edmund Joshua

Online reorg with inline colgroup histograms stats BOTH keyword getting ignored

Hello everybody,

kindly someone share your thoughts on the below error.
I ran the online reorg job with inline colgroup histograms stats BOTH keyword getting ignored for INDEX ALL .

Details :-

DB2 Z/os V12

'DSNU073I 150 XX:XX:XX.XX DSNUGPPM - KEYWORD 'BOTH' IGNORED'

stat-index-spec:-

STATISTICS
TABLE(OWNER.TABLE1) SAMPLE 25
COLUMN(ALL)
COLGROUP("C1")
FREQVAL (COUNT 10 BOTH)
HISTOGRAM NUMQUANTILES 100
COLGROUP("C2")
FREQVAL (COUNT 10 BOTH)
HISTOGRAM NUMQUANTILES 100
INDEX(ALL)
FREQVAL NUMCOLS 1 (COUNT 10 BOTH)
FREQVAL NUMCOLS 2 (COUNT 10 BOTH)
FREQVAL NUMCOLS 3 (COUNT 10 BOTH)
FREQVAL NUMCOLS 4 (COUNT 10 BOTH)
FREQVAL NUMCOLS 5 (COUNT 10 BOTH)
HISTOGRAM NUMCOLS 1 NUMQUANTILES 100
HISTOGRAM NUMCOLS 2 NUMQUANTILES 100
HISTOGRAM NUMCOLS 3 NUMQUANTILES 100
HISTOGRAM NUMCOLS 4 NUMQUANTILES 100
HISTOGRAM NUMCOLS 5 NUMQUANTILES 100
REPORT NO


Thanks

 

Roy Boxwell

Online reorg with inline colgroup histograms stats BOTH keyword getting ignored
(in response to Edmund Joshua)
It could be me, but I think putting the „count 10 both“ in brackets is possibly the problem.


Roy Boxwell
SOFTWARE ENGINEERING GmbH and SEGUS Inc.
-Product Development-
Heinrichstrasse 83-85
40239 Düsseldorf/Germany
Tel. +49 (0)211 96149-675
Fax +49 (0)211 96149-32
Email: [login to unmask email]
http://www.seg.de
Link zur Datenschutzerklärung

Software Engineering GmbH
Amtsgericht Düsseldorf, HRB 37894
Geschäftsführung: Gerhard Schubert, Ulf Heinrich

> On 30 May 2019, at 17:04, Edmund Joshua <[login to unmask email]> wrote:
>
> Hello everybody,
>
> kindly someone share your thoughts on the below error.
> I ran the online reorg job with inline colgroup histograms stats BOTH keyword getting ignored for INDEX ALL .
>
> Details :-
>
> DB2 Z/os V12
>
> 'DSNU073I 150 XX:XX:XX.XX DSNUGPPM - KEYWORD 'BOTH' IGNORED'
>
> stat-index-spec:-
>
> STATISTICS
> TABLE(OWNER.TABLE1) SAMPLE 25
> COLUMN(ALL)
> COLGROUP("C1")
> FREQVAL (COUNT 10 BOTH)
> HISTOGRAM NUMQUANTILES 100
> COLGROUP("C2")
> FREQVAL (COUNT 10 BOTH)
> HISTOGRAM NUMQUANTILES 100
> INDEX(ALL)
> FREQVAL NUMCOLS 1 (COUNT 10 BOTH)
> FREQVAL NUMCOLS 2 (COUNT 10 BOTH)
> FREQVAL NUMCOLS 3 (COUNT 10 BOTH)
> FREQVAL NUMCOLS 4 (COUNT 10 BOTH)
> FREQVAL NUMCOLS 5 (COUNT 10 BOTH)
> HISTOGRAM NUMCOLS 1 NUMQUANTILES 100
> HISTOGRAM NUMCOLS 2 NUMQUANTILES 100
> HISTOGRAM NUMCOLS 3 NUMQUANTILES 100
> HISTOGRAM NUMCOLS 4 NUMQUANTILES 100
> HISTOGRAM NUMCOLS 5 NUMQUANTILES 100
> REPORT NO
>
>
>
> Thanks
>
>
>
>
> Site Links: View post online View mailing list online Start new thread via email Unsubscribe from this mailing list Manage your subscription
>
> This email has been sent to: [login to unmask email]
> Visit ESAi at IDUG Charlotte. Enter drawings & learn about BCV4, BCV5, Masking Tool, BPA4DB2...
> great automated solutions for Cloning, Buffer Pool Tuning, Log processing, TDM & more.
> http://www.ESAIGroup.com/idug
>
>
> Use of this email content is governed by the terms of service at:
> http://www.idug.org/p/cm/ld/fid=2
>
Attachments

  • smime.p7s (3.9k)

Ruediger Kurtz

AW: Online reorg with inline colgroup histograms stats BOTH keyword getting ignored
(in response to Edmund Joshua)
Hi,

here’s the syntax we create:

RUNSTATS TABLESPACE <database>.<tabspace>
TABLE (<creator>.<tabname>)
COLGROUP (<colname>
) FREQVAL COUNT 20 BOTH
COLGROUP (<colname#2>
) FREQVAL COUNT 20 BOTH
INDEX (ALL KEYCARD FREQVAL NUMCOLS 1 COUNT 10
FREQVAL NUMCOLS 2 COUNT 10
FREQVAL NUMCOLS 3 COUNT 10
FREQVAL NUMCOLS 4 COUNT 10
FREQVAL NUMCOLS 5 COUNT 10)
HISTOGRAM NUMCOLS 10 NUMQUANTILES 10
SHRLEVEL CHANGE REPORT YES HISTORY ALL FORCEROLLUP YES

No brackets required for “COUNT 20 BOTH”.

Hth

Ruediger


Rüdiger Kurtz
Abteilung Informatik - Betrieb

HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon: 09561 96-44148
Telefax: 09561 96-44104
E-Mail: [login to unmask email]
Internet: www.huk.de
________________________________
HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg
Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021
Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg
Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.
Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Jörg Rheinländer (stv.), Sarah Rössler, Daniel Thomas.
________________________________
Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen.
Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben,
informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht.
Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet.

This information may contain confidential and/or privileged information.
If you are not the intended recipient (or have received this information in error) please notify the
sender immediately and destroy this information.
Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden.
________________________________
Von: Edmund Joshua [mailto:[login to unmask email]
Gesendet: Donnerstag, 30. Mai 2019 17:05
An: [login to unmask email]
Betreff: [DB2-L] - Online reorg with inline colgroup histograms stats BOTH keyword getting ignored


Hello everybody,

kindly someone share your thoughts on the below error.
I ran the online reorg job with inline colgroup histograms stats BOTH keyword getting ignored for INDEX ALL .

Details :-

DB2 Z/os V12

'DSNU073I 150 XX:XX:XX.XX DSNUGPPM - KEYWORD 'BOTH' IGNORED'

stat-index-spec:-

STATISTICS
TABLE(OWNER.TABLE1) SAMPLE 25
COLUMN(ALL)
COLGROUP("C1")
FREQVAL (COUNT 10 BOTH)
HISTOGRAM NUMQUANTILES 100
COLGROUP("C2")
FREQVAL (COUNT 10 BOTH)
HISTOGRAM NUMQUANTILES 100
INDEX(ALL)
FREQVAL NUMCOLS 1 (COUNT 10 BOTH)
FREQVAL NUMCOLS 2 (COUNT 10 BOTH)
FREQVAL NUMCOLS 3 (COUNT 10 BOTH)
FREQVAL NUMCOLS 4 (COUNT 10 BOTH)
FREQVAL NUMCOLS 5 (COUNT 10 BOTH)
HISTOGRAM NUMCOLS 1 NUMQUANTILES 100
HISTOGRAM NUMCOLS 2 NUMQUANTILES 100
HISTOGRAM NUMCOLS 3 NUMQUANTILES 100
HISTOGRAM NUMCOLS 4 NUMQUANTILES 100
HISTOGRAM NUMCOLS 5 NUMQUANTILES 100
REPORT NO


Thanks



-----End Original Message-----

Edmund Joshua

RE: AW: Online reorg with inline colgroup histograms stats BOTH keyword getting ignored
(in response to Ruediger Kurtz)

Hi Kurtz,

I tried with the options that you mention with and without brackets both giving same result. BOTH keyword is getting invalid when i try with the option that you mentioned.

INDEX(ALL 

FREQVAL NUMCOLS 1 COUNT 10 BOTH
DSNU049I 151 XX:XX:XX.XX DSNUGPRS - INVALID OPERAND 'BOTH' FOR KEYWORD 'STATISTICS'
FREQVAL NUMCOLS 2 COUNT 10 BOTH
FREQVAL NUMCOLS 3 COUNT 10 BOTH
FREQVAL NUMCOLS 4 COUNT 10 BOTH
FREQVAL NUMCOLS 5 COUNT 10 BOTH )
HISTOGRAM NUMCOLS 1 NUMQUANTILES 100
HISTOGRAM NUMCOLS 2 NUMQUANTILES 100
HISTOGRAM NUMCOLS 3 NUMQUANTILES 100
HISTOGRAM NUMCOLS 4 NUMQUANTILES 100
HISTOGRAM NUMCOLS 5 NUMQUANTILES 100

Any thoughts on this.

 

Thanks in advance!

Ruediger Kurtz

AW: AW: Online reorg with inline colgroup histograms stats BOTH keyword getting ignored
(in response to Edmund Joshua)
Hi Edmund,

apart from the fact that I do not really like being addressed by my surname (reminds so much of school), please note that I do not have the BOTH in the INDEX-part of the RUNSTATS.

what I have is simply:

RUNSTATS TABLESPACE whatever_tablespace
TABLE (whatever_table)
COLGROUP (whatever_column
) FREQVAL COUNT 20 BOTH
COLGROUP (whatever_column
) FREQVAL COUNT 20 BOTH
COLGROUP (whatever_column
) FREQVAL COUNT 20 BOTH
COLGROUP (whatever_column
) FREQVAL COUNT 20 BOTH
COLGROUP (whatever_column
) FREQVAL COUNT 20 BOTH
INDEX (ALL KEYCARD FREQVAL NUMCOLS 1 COUNT 10 <<= NO BOTH HERE
FREQVAL NUMCOLS 2 COUNT 10
FREQVAL NUMCOLS 3 COUNT 10
FREQVAL NUMCOLS 4 COUNT 10
FREQVAL NUMCOLS 5 COUNT 10)
HISTOGRAM NUMCOLS 10 NUMQUANTILES 10
SHRLEVEL CHANGE REPORT YES HISTORY ALL FORCEROLLUP YES

This appears to be your problem. Try removing the BOTH where indicated (see below) and run the utility again.

Good luck

Rüdiger



Rüdiger Kurtz
Abteilung Informatik - Betrieb

HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon: 09561 96-44148
Telefax: 09561 96-44104
E-Mail: [login to unmask email]
Internet: www.huk.de
________________________________
HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg
Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021
Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg
Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.
Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Jörg Rheinländer (stv.), Sarah Rössler, Daniel Thomas.
________________________________
Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen.
Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben,
informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht.
Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet.

This information may contain confidential and/or privileged information.
If you are not the intended recipient (or have received this information in error) please notify the
sender immediately and destroy this information.
Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden.
________________________________
Von: Edmund Joshua [mailto:[login to unmask email]
Gesendet: Freitag, 31. Mai 2019 08:27
An: [login to unmask email]
Betreff: [DB2-L] - RE: AW: Online reorg with inline colgroup histograms stats BOTH keyword getting ignored


Hi Kurtz,

I tried with the options that you mention with and without brackets both giving same result. BOTH keyword is getting invalid when i try with the option that you mentioned.

INDEX(ALL

FREQVAL NUMCOLS 1 COUNT 10 BOTH
DSNU049I 151 XX:XX:XX.XX DSNUGPRS - INVALID OPERAND 'BOTH' FOR KEYWORD 'STATISTICS'
FREQVAL NUMCOLS 2 COUNT 10 BOTH
FREQVAL NUMCOLS 3 COUNT 10 BOTH
FREQVAL NUMCOLS 4 COUNT 10 BOTH
FREQVAL NUMCOLS 5 COUNT 10 BOTH )
HISTOGRAM NUMCOLS 1 NUMQUANTILES 100
HISTOGRAM NUMCOLS 2 NUMQUANTILES 100
HISTOGRAM NUMCOLS 3 NUMQUANTILES 100
HISTOGRAM NUMCOLS 4 NUMQUANTILES 100
HISTOGRAM NUMCOLS 5 NUMQUANTILES 100

Any thoughts on this.



Thanks in advance!

-----End Original Message-----

Michael Hannan

RE: AW: Online reorg with inline colgroup histograms stats BOTH keyword getting ignored
(in response to Edmund Joshua)

The manual suggests a bracket after the ALL, where Ruediger suggests the other syntax is accepted (with BOTH omitted).

On a Tangent to the question, I would not recommend sites to capture BOTH in general. There might be very specific exceptions, however in general we want DB2 to realize how poor the filtering is for very common values. If we leave out the COUNT phrase, I like that because DB2 decides how many to capture, until data is not particularly skewed any more. Non skewed data would not need to have many value frequencies captured at all.

The problem with Least and Both in mind, is that many values are not populated at all, but get default filtering calculation. Or maybe a very infrequent value was missed altogether by sampling.  DB2 does not really understand the filtering at all for values not found, so why have special data for a value found but extremely low frequency? It seems inconsistent to me. The lowest 10 values actually populated might be very unlikely search values indeed. Even less likely than Not Found values searched for, maybe? If DB2 had a slightly pessimistic estimate for search for a low frequency value, I am not too fussed, usually.

The high frequency value information is much more valuable.

If you have a very specific value you need to fix the Filter Factor for, then great, but I would not go capturing BOTH in general. Not very useful data in my personal opinion. O.K. IBM put it there because someone voted for it, I guess.

Maybe for Index Stats, IBM did not think that BOTH was useful either, especially for multi-column Freq Val Stats. Manual seems to suggest you can, but there are faults in the manuals at times.

It is very handy to know the worst case filtering for a skewed column, and also the frequency of Nulls (if high). Best case filtering is not very useful. Best case true FF always = 0 when a missing value is searched for.

I hope sites do not capture Stats only on Reorg, making it look like the Cluster index is perfect, when perhaps most of the time, it might be degraded, and if Stats are collected on reorg, I hope Sites will not REBIND, to get an optimistic access path seeing a perfectly organised table.

Yes I know my thinking is very different to some others.

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

Edited By:
Michael Hannan[Organization Members] @ May 31, 2019 - 09:08 AM (Europe/Berlin)
Michael Hannan[Organization Members] @ May 31, 2019 - 09:10 AM (Europe/Berlin)

Ruediger Kurtz

AW: AW: Online reorg with inline colgroup histograms stats BOTH keyword getting ignored
(in response to Michael Hannan)
Michael,

I know your standpoint on that matter and I can see you do have a point there – but apart from that, the syntax I have in the Runstats-statement posted does actually run successfully, so I must have been doing something right here ☺

C U

Ruediger


Rüdiger Kurtz
Abteilung Informatik - Betrieb

HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon: 09561 96-44148
Telefax: 09561 96-44104
E-Mail: [login to unmask email]
Internet: www.huk.de
________________________________
HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg
Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021
Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg
Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.
Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Jörg Rheinländer (stv.), Sarah Rössler, Daniel Thomas.
________________________________
Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen.
Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben,
informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht.
Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet.

This information may contain confidential and/or privileged information.
If you are not the intended recipient (or have received this information in error) please notify the
sender immediately and destroy this information.
Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden.
________________________________
Von: Michael Hannan [mailto:[login to unmask email]
Gesendet: Freitag, 31. Mai 2019 09:04
An: [login to unmask email]
Betreff: [DB2-L] - RE: AW: Online reorg with inline colgroup histograms stats BOTH keyword getting ignored


The manual suggests a bracket after the ALL, where Ruediger suggests the other syntax is accepted. I don't really know, which is accepted.

On a Tangent to the question, I would not recommend sites to capture BOTH in general. There might be very specific exceptions, however in general we want DB2 to realize how poor the filtering is for very common values. If we leave out the COUNT phrase, I like that because DB2 decides how many to capture, until data is not particularly skewed any more. Non skewed data would not need to have any value frequencies captured at all.

The problem with Least and Both in mind, is that many values are not populated at all, but get default filtering calculation. Or maybe a very infrequent value was missed altogether by sampling. DB2 does not really understand the filtering at all for values not found, so why have special data for a value found but extremely low frequency? It seems inconsistent to me. The lowest 10 values actually populated might be very unlikely search values indeed. Even less likely than Not Found values searched for, maybe? If DB2 had a slightly pessimistic estimate for search for a low frequency value, I am not too fussed, usually.

The high frequency value information is much more valuable.

If you have a very specific value you need to fix the Filter Factor for, then great, but I would not go capturing BOTH in general. Not very useful data in my personal opinion. O.K. IBM put it there because someone voted for it, I guess.

It is very handy to know the worst case filtering for a skewed column, and also the frequency of Nulls (if high). Best case filtering is not very useful. Best case true FF always = 0 when a missing value is searched for.

I hope sites do not capture Stats only on Reorg, making it look like the Cluster index is perfect, when perhaps most of the time, it might be degraded, and if Stats are collected on reorg, I hope Sites will not REBIND, to get an optimistic access path seeing a perfectly organised table.

Yes I know my thinking is very different to some others.

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

-----End Original Message-----

Michael Hannan

RE: AW: AW: Online reorg with inline colgroup histograms stats BOTH keyword getting ignored
(in response to Ruediger Kurtz)

In Reply to Ruediger Kurtz:

Michael,

I know your standpoint on that matter and I can see you do have a point there – but apart from that, the syntax I have in the Runstats-statement posted does actually run successfully, so I must have been doing something right here ☺

C U

Ruediger

I love to stir up a bit of controversy, rather than just stick to plain facts.  However good if some of you stick to the actual question. Ha Ha.

How many times did I need the LEAST 10 or BOTH to tune an access path? Almost never.  Possible solution looking for a problem. A curiosity where one might use it in a strange case, but mostly not needed. 

 

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

Walter Jani&#223;en

AW: AW: AW: Online reorg with inline colgroup histograms stats BOTH keyword getting ignored
(in response to Michael Hannan)
Especially if one thinks about using runstats profiles, where there is no such option as LEAST or BOTH.
Also with TABLESAMPLE, LEAST values could have been overlooked.

Kind regards
Walter Janißen [standard_IBM+Champ+7+Yr+Analytics]

ITERGO Informationstechnologie GmbH
Anwendungsentwicklung
Technische Anwendungsarchitektur
Victoriaplatz 2
D-40198 Düsseldorf
[login to unmask email]<mailto:[login to unmask email]>

ITERGO Informationstechnologie GmbH
Vorsitzender des Aufsichtsrats: Christian Diedrich
Geschäftsführung: Dr. Bettina Anders (Vorsitzende),
Lothar Engelke, Ina Kirchhof, Dr. Michael Regauer
Sitz: Düsseldorf, Handelsregister: Amtsgericht Düsseldorf HRB 37996

Von: Michael Hannan <[login to unmask email]>
Gesendet: Freitag, 31. Mai 2019 11:57
An: [login to unmask email]
Betreff: [DB2-L] - RE: AW: AW: Online reorg with inline colgroup histograms stats BOTH keyword getting ignored


In Reply to Ruediger Kurtz:
Michael,

I know your standpoint on that matter and I can see you do have a point there – but apart from that, the syntax I have in the Runstats-statement posted does actually run successfully, so I must have been doing something right here ☺

C U

Ruediger

I love to stir up a bit of controversy, rather than just stick to plain facts. However good if some of you stick to the actual question. Ha Ha.

How many times did I need the LEAST 10 or BOTH to tune an access path? Almost never. Possible solution looking for a problem. A curiosity where one might use it in a strange case, but mostly not needed.



Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

-----End Original Message-----
Attachments

  • image001.png (2.6k)

Horacio Villa

AW: Online reorg with inline colgroup histograms stats BOTH keyword getting ignored
(in response to Michael Hannan)
Hi Michael,

you said:

"I hope sites do not capture Stats only on Reorg, making it look like the
Cluster index is perfect, when perhaps most of the time, it might be
degraded, and if Stats are collected on reorg, I hope Sites will not
REBIND, to get an optimistic access path seeing a perfectly organised
table."

And what you hope is exactly what we are doing and makes me re-think the
strategy. I feel you're right.
But I wonder:
¿When would you capture Stats on Reorg and when with Runstats?
¿When would you REBIND?

Thanks,
Horacio Villa


Patrick Bossman

RE: AW: Online reorg with inline colgroup histograms stats BOTH keyword getting ignored
(in response to Michael Hannan)

Hello,

1. I was responsible for the introduction of LEAST and BOTH.  It was for theoretical completeness.  It is something I now regret, because in practice, it's not useful, and it's created confusion and may increase the cost of collection.

2. The collection of multi-column histograms and frequencies shown is probably wasteful.  Those statistics are rarely used.  The internet seems to have swallowed my old articles on this (I thought the internet never forgets?), but our recommendations are baked into STATSFEEDBACK.  I'd build my "statistical base" with single column frequencies, and if I ran into a situation where I needed multi-column frequencies, I'd selectively collect it.

I don't work on PMRs anymore, but in the 10 or so years that I did, over all those years, there were probably less than 10 occasions where the precision of a multi-column frequency was necessary.

So seeing people bake collection of multi-column frequencies into what looks like a template, it looks like collection overkill.

Statistics feedback is where what we'd recommend gets surfaced.

Best regards,

Patrick Bossman

Opinions are my own, YMMV.

 

 

Terry Purcell

RE: AW: Online reorg with inline colgroup histograms stats BOTH keyword getting ignored
(in response to Patrick Bossman)

+1 to Pat's reply.

I would add - that the general collection of multi-column FREQVAL and (especially) multi-column HISTOGRAMs are simply wasting catalog space and increasing the cost of each BIND/REBIND/PREPARE/EXPLAIN because all stats for referenced objects are read in up front (and thus add CPU and catalog getpages to read). Usage of BOTH simply exacerbates this issue.

Regards

Terry Purcell

In Reply to Patrick Bossman:

Hello,

1. I was responsible for the introduction of LEAST and BOTH.  It was for theoretical completeness.  It is something I now regret, because in practice, it's not useful, and it's created confusion and may increase the cost of collection.

2. The collection of multi-column histograms and frequencies shown is probably wasteful.  Those statistics are rarely used.  The internet seems to have swallowed my old articles on this (I thought the internet never forgets?), but our recommendations are baked into STATSFEEDBACK.  I'd build my "statistical base" with single column frequencies, and if I ran into a situation where I needed multi-column frequencies, I'd selectively collect it.

I don't work on PMRs anymore, but in the 10 or so years that I did, over all those years, there were probably less than 10 occasions where the precision of a multi-column frequency was necessary.

So seeing people bake collection of multi-column frequencies into what looks like a template, it looks like collection overkill.

Statistics feedback is where what we'd recommend gets surfaced.

Best regards,

Patrick Bossman

Opinions are my own, YMMV.

 

 

Michael Hannan

RE: AW: Online reorg with inline colgroup histograms stats BOTH keyword getting ignored
(in response to Patrick Bossman)

In Reply to Patrick Bossman:

2. The collection of multi-column histograms and frequencies shown is probably wasteful.  Those statistics are rarely used.  The internet seems to have swallowed my old articles on this (I thought the internet never forgets?), but our recommendations are baked into STATSFEEDBACK.  I'd build my "statistical base" with single column frequencies, and if I ran into a situation where I needed multi-column frequencies, I'd selectively collect it.

I don't like to see lot of histogram stats unless have range predicates with: literal values, Dynamic without parameter markers, or REOPT. So I expect not used often. Even then hard to think why a multi-column histogram stat would be useful. 

For multi-col Freq Value stats, they could get used on a multi-column match for REOPT VARS or dynamic I guess. I don't mind having just the top 1 or 2 values, because it helps show me the very worst case filtering on a multi-column match and degree of skew. This can be especially handy if there is mixed use of a table (multi purpose - not good), and one index has say 3 columns, but 30% of rows have these 3 columns all blank or zero, or Null. Even if this stat did not get used by the Optimizer, it is still useful to know just how bad the filtering could be on a 3 column match, in the absolute worst case. I am not going to go to 10 columns of course. That would be over the top. 3 or 4 cols would be a normal max for me.

Hopefully collecting top 2 values for a multi-col Freq Value stat is not too expensive. Db2 can examine as it scans index data sequentially, and throw away any value no longer in the top 2, and we get offload to zIIP these days. The stats won't do much harm, and can be useful info at times. I don't want to see the Catalog filled up with crap data though, that will never be used by a human or by the Optimizer.

In some cases we find the first column of many indexes is a COMPANY_CODE, with COLCARD=1. In this case, having Freq Value Stats for just the first col of an index is useless. 2 cols would be more useful, but single col Freq Val stats on just the 2nd column would also be more useful, if the Optimizer would actually use it in case Matchcols=2 (probably not).

I do like to have single col Freq Value Stats for all skewed cols, to know what is going on with the filtering. 

I hate to see any single col Freq Value stats cleaned out, even if old (or stale). Replace them with more up to date is O.K. (once in a while) but old is better than nothing.

I think the good thing about Freq Val Stats of the TOP and single column variety, is they tend to get stale very slowly so for most tables don't need to be collected too often. Near enough is good enough.

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

Michael Hannan

RE: AW: Online reorg with inline colgroup histograms stats BOTH keyword getting ignored
(in response to Horacio Villa)

In Reply to Horacio Villa:

Hi Michael,

you said:

"I hope sites do not capture Stats only on Reorg, making it look like the
Cluster index is perfect, when perhaps most of the time, it might be
degraded, and if Stats are collected on reorg, I hope Sites will not
REBIND, to get an optimistic access path seeing a perfectly organised
table."

And what you hope is exactly what we are doing and makes me re-think the
strategy. I feel you're right.
But I wonder:
¿When would you capture Stats on Reorg and when with Runstats?
¿When would you REBIND?

Thanks,
Horacio Villa

This is a tough subject. First of all, access paths built for Stats showing perfectly Reorged tables, is fine, if the site really keeps its tables well Reorged, and I mean Clusterratios typically 99% and above on the cluster index, and keeps INDREF down, etc.

Once tables get to 97% clustered they are already degrading performance, and cost estimates for access paths will be affected if Stats updated and if REBIND is done.

It is a tough subject because for random probing type access paths, a bit of disorganisation may not hurt much, but for scans on degraded indexes and tables, it can cost you a lot more Getpages. It is easy to see SQLs where it might matter when performance numbers show very large Getpages per Select or per Fetch in a Cursor. These are cases when a lot of scanning is probably going on. Then check the access paths.

So my first recommendation is to Reorg often, and REBIND rarely. I don't like to see access paths changing many times per year. It gets chaotic. If you have a good access path now, keep it for a while. Once it is not good, change it. I like APREUSE(WARN) to be used where possible.

There are access path compare tools, but human expertise is very limited to try to understand better than the Optimizer does. Access path change can be risky.

If a site is going to REBIND for uncontrolled access path change, then I would like to see a system in place that collects performance data and checks if performance was better or worse after REBIND access path change, then can REBIND SWITCH back to old path if needed. Often at the site, it is not apparent  the performance went bad after REBIND, unless it is a severe blowup.

Does any site site keep every past access path with a record of how it performed so they could consider which is likely best after a degradation.

REORGs done together with REBIND only makes it more confusing, what caused a performance change, Reorg or REBIND?

I know REBIND is needed sometimes.

If REBIND will be rare and REORG will be frequent, then Collect stats that are needed to determine REORG. So I would rather collect Stats before the Reorg to show the typical worst clustering of indexes etc. for when the REBIND really does happen.

An access path suited to slightly disorganised tables will still perform good after the Reorg, but the reverse may not be true. 

Dynamic SQL without parameter markers is very different since Db2 has more information to choose a better access path, and reasonably up to date stats may be helpful.

So I would propose regular Stats before Reorgs, regular Reorgs, and not regular REBINDs, and use APREUSE heavily.

I don't particularly like the idea of inline Stats in a REORG. I am different. Most people will like this.

It really needs an in depth study to show if very frequent access path change from REBIND could make real savings, or be more trouble than it is worth, and might be site dependent.

I recently see a site with certain expensive SQLs that keep chop and changing the table join sequence, and the indexes selected. Performance goes from moderate to really bad, is all over the place. I would like them to stabilize the access paths, starting with an OPTHINT or Selectivity Hints, and then maybe using APREUSE on all REBINDs of the problem areas, or stop REBINDing. Bad performance is caused by random access on joins, but trying to get the Optimizer to understand which predicates do the good filtering and are on indexes that minimize the random access is tricky. Too bad if OPTHINTs does not appeal to the site but they REBIND frequently. REBIND access path change chaos in some Packages.

The complexity of this topic leads to not having very definitive recommendations other than:

* Reorg often to keep table very well organised, if you can afford it, and have scanning access paths where it matters.

* Collect Stats, but consider that collecting only with Reorg might be giving an unfair picture, unless Reorgs are always kept up to date. Consider having Stats that show the typical disorganisation level of a table or the worst, rather than most optimistic. This means have to understand Stats like NLEVELS, DATAREPEATFACTORF, CLUSTERRATIOF etc. I don't know the complete answer answer here. Maybe the problem is not really solvable, and therefore keeping tables well Reorged becomes vital.

* Don't REBIND too often unless forced. Try not to change access paths too often, and maybe review performance of changes. Be able to use APREUSE and also SWITCH back to an old path when needed.

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd