Tool for building views?

Tim Hare

Tool for building views?

I'm sure this has been asked before (maybe even by me but I can't remember doing so on this list). 
Are there tools that can aid in developing complex views?  We have some applications which have, in my opinion, bad patterns:  they do a SELECT on a table, loop through a FETCH on those rows issuing SELECTs on other tables, sometimes even going three or five layers deep.  These are created by a code-generating program, so modifying the source isn't really an option; the code has to be regenerated.  It is my belief that a VIEW accomplishing what those loops do would bring the DB2 optimizer into play in a much better way, and I'd like to benchmark that, but as the VIEW would need to include 30+ tables (if even possible) it becomes kind of hard to code manually (and keep track of nested parentheses etc.). 

We have CA's database tools but I haven't seen a 'view builder' utility;  I haven't looked into the 'encyclopedia' for the code-generator or the CA Repository tool yet to see if those have aids for this.
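To illustrate the kind of thing I mean (table and column names invented here, purely a sketch), one layer of the generated SELECT/FETCH loop pattern could collapse into a join view:

    CREATE VIEW ORDER_DETAIL_V AS
      SELECT C.CUST_ID, C.CUST_NAME, O.ORDER_NO, I.ITEM_NO, I.QTY
      FROM   CUSTOMER  C
             INNER JOIN ORDERS   O ON O.CUST_ID  = C.CUST_ID
             INNER JOIN ORD_ITEM I ON I.ORDER_NO = O.ORDER_NO;

    SELECT * FROM ORDER_DETAIL_V WHERE CUST_ID = :hv-cust-id;

so the program opens one cursor against the view instead of three nested FETCH loops, and the optimizer gets to choose the join order.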

steen rasmussen

Tool for building views?
(in response to Tim Hare)
Tim - you don't have to look any further: we don't have a tool that assists in building views. In fact I have never bumped into such a solution, so maybe it's something I can do when I retire, to contribute to my 401K :-)
Steen Rasmussen 


Michael Hannan

RE: Tool for building views?
(in response to Tim Hare)

Tim,

Essentially you are asking for a complex SQL writer, I think. The SQL Generators generally handle very cut-down functionality; for example, years ago I worked with COOL:Gen (later called Advantage:Gen). It could do joins, but no Outer Joins and no subqueries. I became well aware of its limitations. When it could not produce the desired SQL, one could code a real SQL separately.

Novel idea to put all the SQL into a View, and have the product use that as though it were a table.

Writing of complex SQLs (with good performance) is a skill. A book on how to do it well might be called for. There was a recent book, but I don't think it goes far enough. Automating the writing of complex SQLs would be worth a fortune perhaps if it could avoid all the pitfalls, and if the results could be human modified when necessary.

Now comes a big problem. A View is not the same as a complex SQL. A complex SQL can have predicates with Host Variables buried deep down inside a subquery possibly nested several levels deep. When you code predicates on  a View, the DB2 Optimizer is quite smart and attempts to push down the predicates as far as it can without changing the meaning of the SQL. Still it may not be capable of pushing the predicates down to the correct subquery desired for good performance. 
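A tiny invented example (names hypothetical) of where pushdown normally works and where it can stop:

    CREATE VIEW CUST_TOTALS AS
      SELECT CUST_ID, SUM(AMOUNT) AS TOTAL
      FROM   ORDERS
      GROUP  BY CUST_ID;

    SELECT TOTAL FROM CUST_TOTALS WHERE CUST_ID = :hv;

Here the Optimizer can usually push CUST_ID = :hv down inside the GROUP BY. With several layers of nesting, or predicates on derived expressions, the pushdown may not happen, and the whole view result gets materialized before the predicate is applied.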

Table Functions are an alternative to Views, where you call the Table Function with arguments that can supply the Host Variable values. There are still limitations with the Optimizer chosen access paths, although they have been improved. Not many sites are using Table Functions as yet. I have tried them but my experience with them is still limited, mainly because I can't always guarantee I will be given privileges to create them. I did wish to use them for very complex SQLs. They would scare many sites, just from being something unusual.
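A minimal sketch of the idea, with invented names: an inline SQL Table Function that takes the would-be Host Variable as an argument:

    CREATE FUNCTION CUST_ORDERS (P_CUST_ID INTEGER)
      RETURNS TABLE (ORDER_NO INTEGER, AMOUNT DECIMAL(11,2))
      LANGUAGE SQL
      READS SQL DATA
      RETURN
        SELECT ORDER_NO, AMOUNT
        FROM   ORDERS
        WHERE  CUST_ID = P_CUST_ID;

    SELECT T.ORDER_NO, T.AMOUNT
    FROM   TABLE(CUST_ORDERS(:hv-cust-id)) AS T;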

I think Views are very well suited to simpler SQLs without too many layers of sub-queries, where you can guarantee the predicate push down. For complex SQLs they are not really suitable. So a complex View might be untunable due to the predicate pushdown problem.

Writing of complex SQLs can be done well, if the writer understands the Optimizer well enough. I recommend using a series of Common Table Expressions to simplify each stage of the process rather than the old style Nested Subqueries.
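For example (an invented rewrite, not a real application SQL), instead of nesting subqueries several levels deep:

    WITH RECENT_ORDS AS
      (SELECT CUST_ID, ORDER_NO
       FROM   ORDERS
       WHERE  ORDER_DATE > CURRENT DATE - 30 DAYS),
    BIG_CUSTS AS
      (SELECT CUST_ID
       FROM   RECENT_ORDS
       GROUP  BY CUST_ID
       HAVING COUNT(*) > 10)
    SELECT C.CUST_ID, C.CUST_NAME
    FROM   CUSTOMER C
           INNER JOIN BIG_CUSTS B ON B.CUST_ID = C.CUST_ID;

Each Common Table Expression can be read and checked on its own before the final join.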

Indeed one way to make SQL as efficient as possible is to do all things required for the query transaction in a single Cursor, as far as is practical. Reduction in the total number of SQLs executed can be important for very high volume transactions, but not necessary beyond your top x number of Packages for CPU consumption. When all your indexes are perfect, savings can still be made by SQL call reductions. Multiple SQL calls are allowed and quite suitable for lower cost processes.

Getting back to your aims: you don't want the program doing the table joins, or accessing one table at a time as if it were a VSAM File of the old days. Program joins that filter are "Stage 3" predicates (Terry Purcell's term for the worst type of filtering). So you could certainly use Views to incorporate joins, if your SQL Generator product does not do that. If the View includes Outer Joins, predicates on the left-hand tables will be fine, but any predicates on the Outer Join tables won't be easily handled, if they can be handled at all.

So use Views, yes, with joins, but in a limited way for not-so-complex SQLs, or even with subqueries if there are no host-variable predicates required in the subquery, e.g. a correlated EXISTS. Be careful in case your View gets a worse access path than the standard SQL would. You can use EXPLAIN to check that.

For performance critical stuff, consider using native SQL instead of the application generator.

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd


Peter Backlund

Tool for building views?
(in response to Michael Hannan)
Michael,

If memory serves me right, the expression "Stage 3" was coined by Bonnie Baker.

Best regards,

Peter

--

+--------------------------------+---------------------------------+
| Peter G Backlund               | E-mail: [login to unmask email] |
+--------------------------------+---------------------------------+
| Phone: 0046(0)70 - 764 7554    | Skype: BacklundDB2              |
+--------------------------------+---------------------------------+
| May - September                | October - April                 |
+--------------------------------+---------------------------------+
| Råsundavägen 94                | Calle Alella 55, Apt 2B         |
| S-169 57 SOLNA                 | 03185 Torrevieja                |
| Sweden                         | Spain                           |
+--------------------------------+---------------------------------+
| Playing with DB2 since 1981 ... and forever!                     |
+------------------------------------------------------------------+

bernd oppolzer

Tool for building views?
(in response to Peter Backlund)
"Stage 3" is so obvious ...

I myself used this to describe programs doing the filtering, when I was in charge of DB2 classes (including DB2 performance) here in Germany ... and I never heard anyone using it before, IIRC.

BTW:

"You don't want the program doing the table joins, or accessing one table at a time like it was a VSAM File of the old days."

We once (mid 1990s) had a real problem with a purchased application ... OK, it was Oracle, not DB2, but anyway. The application was written in Fortran, and the SLA was that a certain action would take no more than 20 seconds to complete. But it took 50 seconds (wait time for the user).

We found out that there was a join operation involved, but the join was implemented as nested Fortran loops. It turned out that the application had been moved from a (sort of) indexed file system to Oracle, without changing the access functions :-)

We changed the Fortran join to a real Oracle join, and the elapsed time dropped to 5 seconds :-)

Kind regards

Bernd




Michael Hannan

RE: Tool for building views?
(in response to Peter Backlund)


Thanks Peter,

I would have to ask Terry Purcell whether he made it up himself independently or got it from Bonnie Baker. I was attending Terry's presentations instead, I guess. I remember learning from Bonnie, though, that one reason for the high overhead of passing rows to Stage 2 processing was the Data Manager latching and unlatching the page(s) for each row passed back to RDS. A latch has to be of very short duration, since there is only one at a time (normally, I think) and hence no deadlatch detection. Latch wait has to be kept very low. That was a big highlight for me. Ha ha. Maybe I am weird, but I like technical detail. 

Bernd points out that different people could have coined the term independently too, quite easily.

I often used the term "Death by random I/O" years ago, but have lost all memory of who I pinched that term from. Could have been Joel Goldstein or Chuck Hoover, or ......? Now that we have such large BPools, I tend to modify it a bit to "death by random Getpage". Ha ha. Any I/Os are just a by-product.

On another Tangent:
Who first invented the scroll reposition method (I first saw it in an early IBM redbook):

AND  COL1 >= :hv1
AND  NOT (COL1 = :hv1 AND COL2 < :hv2)
etc.

This technique is superseded now that we have NR access paths, but I would still have liked to know, as it was useful in the early days to avoid List Prefetch. It still appears in applications.

I still see some very old suspect techniques in application SQL, like a column compared equal to itself, and get a surprise.

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd


Joe Geller

RE: Tool for building views?
(in response to Michael Hannan)

Hi Michael,

You asked:

"

Who first invented the scroll reposition method (I first saw it in an early IBM redbook):

AND  COL1 >= :hv1
AND  NOT (COL1 = :hv1 AND COL2 < :hv2)
etc.

This technique is superseded now that we have NR access paths, but I would still have liked to know, as it was useful in the early days to avoid List Prefetch."

There are several ways to code scroll repositioning (when you want to fetch a small number of rows - say 20 - at a time), but they pre-dated Db2 (or even DB2).  It would be the standard way to reposition on a multi-column/field key of a VSAM file, IMS database, etc.  And they did not avoid List Prefetch; they pre-date List Prefetch in Db2. In fact, List Prefetch defeated this scroll technique when using a non-clustered index.  To avoid List Prefetch (which reads all the index entries and sorts them), Db2 invented OPTIMIZE FOR 1 ROW ONLY.

Joe




Michael Hannan

RE: Tool for building views?
(in response to Joe Geller)




Thanks Joe,

I did not know it was that old. Yes, I recall the poor performance with List Prefetch. This technique, I had thought, was specifically to avoid an MX/MU with List Prefetch access path, so I had not thought of List Prefetch defeating it. But yes, OPTIMIZE FOR n ROWS was to avoid List Prefetch with MX/MU paths, and n=1 to avoid any possibility of List Prefetch, since we needed the Optimizer to know that a limited set of rows would be fetched. I was never in favor of OPTIMIZE FOR 1 ROW as a general technique, only of coding the true estimated number of rows. 

Why would the negative form of coding (with NOTs) be useful for VSAM file access (since it has no List Prefetch or MX/MU access paths)? The more straightforward ways of coding the reposition predicates would seem suited to VSAM. It looked to me like someone invented this technique (applying De Morgan's law) specifically against MX/MU access paths.
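For reference (my own expansion, to make the De Morgan connection explicit), the negative form is logically equivalent, for non-null columns, to the straightforward positive form:

    AND (   COL1 >  :hv1
         OR (COL1 = :hv1 AND COL2 >= :hv2))

since NOT (COL1 = :hv1 AND COL2 < :hv2) rewrites to (COL1 <> :hv1 OR COL2 >= :hv2), which combined with COL1 >= :hv1 gives the OR form above.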

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

Walter Janißen

AW: Tool for building views?
(in response to Michael Hannan)
Hi Michael

I learned Stage3 from Bonnie Baker in one of her presentations, but I also don't know if she was the inventor. I don't believe it was Terry. Bonnie was great at explaining things, which I can recall even after twenty years or so. One of her wonderful explanations was the difference between Stage1 and Stage2: she used a big box, which she filled up with balls and left at the airport, and then some of the students had to fetch a ball without knowing which one. She said: you are not smart enough that I can tell you.

Regarding the repositioning technique, which was introduced by an IBM'er and which I can't get rid of: it was something like this:
WHERE COL1!!COL2!!COL3… > :H
and if a column is of type INTEGER, DIGITS was applied to that column. This technique is Stage2 and prevents NR access from being chosen.

Kind regards
Walter Janißen



Michael Hannan

RE: AW: Tool for building views?
(in response to Walter Janißen)

In Reply to Walter Janißen:

Regarding the repositioning technique, which was introduced by an IBM'er and which I can't get rid of: it was something like this:
WHERE COL1!!COL2!!COL3… > :H
and if a column is of type INTEGER, DIGITS was applied to that column. This technique is Stage2 and prevents NR access from being chosen.

Kind regards
Walter Janißen

Funny analogy about the balls at the airport. I'm not a big fan of analogies. I do recall Bonnie on Prefetch: a bricklayer sending his assistant to get one brick, or to use a wheelbarrow to get a stack of bricks in one go. The problem was that the story took up a fair portion of the hour, and my learning time available was short.

Yes, that is a particularly bad one if used on its own, not even being an indexable predicate on the first column. I have also seen repositioning predicates and ORDER BYs stuffed up by nullable columns and wanting the null columns to appear first in the sort order.

If COL1 had high cardinality, this might not be too bad:

WHERE COL1!!COL2!!COL3… > :H
AND       COL1 >= :HV1

given the Stage 2 predicate does low filtering. High filtering Stage 2 predicates are the bad ones.

For one-off quick-and-dirty use, where performance did not matter, I might even have coded something like that myself in the early days:

AND COL1!!COL2!!COL3 =
    (SELECT MAX(COL1!!COL2!!COL3) FROM table WHERE .....)

but would never actually recommend it.

Now there are better ways available.

Fortunately V12 allows the better syntax: AND (COL1, COL2, COL3) >= (:hv1, :hv2, :hv3) for data-dependent pagination with NR access paths. See chapter 7.1.2 of the Performance Topics Redbook, for those who have not read it yet.
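For those who have not seen the V12 syntax in context, a sketch of a repositioning cursor using it (table and column names are invented; note the strict > rather than >= when repositioning past the last row already fetched):

```sql
-- Data-dependent pagination, Db2 12 style: the row-value predicate
-- matches the ORDER BY columns, so an NR access path can reposition
-- directly on the index without rescanning entries already fetched.
DECLARE SCROLL_CUR CURSOR FOR
  SELECT COL1, COL2, COL3, DATA_COL
    FROM MY_TABLE
   WHERE (COL1, COL2, COL3) > (:HV1, :HV2, :HV3)
   ORDER BY COL1, COL2, COL3
```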

Getting back to the views for complex SQLs: the predicates can end up being applied not to the base table as intended, but to the result of a materialized subquery if pushdown is not possible, resulting in something like Stage 3 predicates.

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

Edited By:
Michael Hannan[Organization Members] @ May 20, 2019 - 11:19 AM (Europe/Berlin)
Michael Hannan[Organization Members] @ May 20, 2019 - 11:23 AM (Europe/Berlin)
Michael Hannan[Organization Members] @ May 20, 2019 - 11:24 AM (Europe/Berlin)

Phil Grainger

Tool for building views?
(in response to Michael Hannan)
Whenever I’ve used “Stage 3 predicate” I’ve always attributed it to Bonnie

Phil Grainger
Principal Enablement Manager

[BMC Exchange 2019 - Global Event Series - REGISTER] https://www.bmc.com/ami

Direct

+44 1189 218 000

Mobile

+44 7808 643 479

Email

[login to unmask email]

E2, Eskdale Road
Winnersh
Berkshire
United Kingdom
RG41 5TS



From: Michael Hannan [mailto:[login to unmask email]
Sent: 18 May 2019 08:46
To: [login to unmask email]
Subject: [EXTERNAL] [DB2-L] - RE: Tool for building views?


In Reply to Peter Backlund:
Michael,

If memory serves me right, the expression "Stage 3" was coined by Bonnie Baker

Best regards,

Peter

Thanks Peter,

I would have to ask Terry Purcell if he made it up himself independently or got it from Bonnie Baker. I was attending Terry's presentations instead, I guess. I remember learning from Bonnie, though, that one reason for the high overhead of passing rows to Stage 2 processing was the Data Manager latching and unlatching the page(s) for each row passed back to RDS. A latch has to be held for such a short duration. That was a big highlight for me. Ha ha. Maybe I am weird.

I often used the term "Death by random I/O" years ago, but had lost all memory of who I pinched that term from. Could have been Joel Goldstein or Chuck Hoover, or ......? Now we have such large BPools, I tend to modify it a bit to "death by random Getpage". Ha ha. Any I/Os are just a by-product.

On another Tangent:
Who first invented the scroll reposition method (I first saw it in an early IBM redbook):

AND COL1 >= :hv1
AND NOT (COL1 = :hv1 AND COL2 < :hv2)
etc.

This technique is superseded now that we have NR access paths, but I would still have liked to know, as it was useful in the early days to avoid List Prefetch. It still appears in applications.

I still see some very old suspect techniques in Application SQL, like a column equals itself, and get a surprise.

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

-----End Original Message-----
BMC Software Limited Registered Office: Building E2, Eskdale Road, Winnersh, Wokingham, Berkshire, United Kingdom, RG41 5TS Registered in England No. 1927903 The content of this email is confidential. If you are not the addressee, you may not distribute, copy or disclose any part of it. If you receive this message in error, please delete this from your system and notify the sender immediately.

Joe Geller

RE: Tool for building views?
(in response to Michael Hannan)

Hi Michael,

The original way it was taught was very straightforward (logically):

AND  COL1 > :hv1
OR (COL1 = :hv1 AND COL2 > :hv2)

The problem of course is that the OR prevented Db2 from using the index structure. Instead it had to begin at the beginning of the index each time. (I say prevented in the past tense because of the new-ish NR access path).

That's why the negative format was invented:

AND  COL1 >= :hv1
AND  NOT (COL1 = :hv1 AND COL2 < :hv2)

This would always start the positioning of the index at the first entry with COL1 >= :hv1.  Better, but not perfect if there are many duplicate values of COL1 (e.g. city = 'NEW YORK').

A third way to process the data is the way a person would do it - Find the first entry for COL1 = :hv1 AND COL2 > :hv2, then keep going.  An example of how sometimes we were smarter than Db2.

A fourth way (the one I used) was a UNION ALL with no ORDER BY (and hoping that Db2 didn't introduce a sort):

SELECT ...
WHERE COL1 = :hv1 AND COL2 > :hv2
UNION ALL
SELECT ...
WHERE COL1 > :hv1

effectively doing what I wanted to do.  If you are worried about a sort, then you could declare multiple cursors and, if the first runs out before your planned limit, open the second one (more overhead because of the extra open and an extra fetch).

Then came NR - YAY! - and Db2 did multiple index probes for each part of the OR.  And then the new syntax - same access path but so much easier to code (especially for more than 2 columns) and more intuitive.

As for VSAM, I was speaking generally about scrolling techniques, but it would still apply.  The VSAM index would find the first one where COL1 >= :hv1, then the program would have to skip the ones for :hv1 that you have already done.  Certainly not as good as Db2 doing it for you.

Joe



In Reply to Michael Hannan:



In Reply to Joe Geller:

Hi Michael,

You asked:

"

Who first invented the scroll reposition method (I first saw it in an early IBM redbook):

AND  COL1 >= :hv1
AND  NOT (COL1 = :hv1 AND COL2 < :hv2)
etc.

This technique is superseded now that we have NR access paths, but I would still have liked to know, as it was useful in the early days to avoid List Prefetch."

There are several ways to code scroll repositioning (when you want to fetch a small number - say 20 - of rows at a time), but they pre-dated Db2 (or even DB2).  It would be the standard way to reposition on a multi-column/field key of a VSAM file, IMS database, etc.  And they did not avoid List Prefetch. They pre-date List Prefetch in Db2. In fact, List Prefetch defeated this scroll technique when using a non-clustered index.  To avoid List Prefetch (which reads all the index entries and sorts them), Db2 invented OPTIMIZE FOR 1 ROW.

Joe

Thanks Joe,

I did not know it was that old. Yes I recall the poor performance with List Prefetch. This technique, I had thought, was specifically to avoid an MX/MU with List Prefetch access path. So I had not thought of List Prefetch defeating it, but yes OPTIMIZE FOR n ROWS was to avoid the List Prefetch with MX/MU paths, and n=1 to avoid all List Prefetch possibility, since we needed the Optimizer to know that a limited set of rows would be fetched. I was never in favor of OPTIMIZE FOR 1 ROW as a general technique. Only for coding the true estimated number of rows. 

Why would the negative form of coding (with NOTs) be useful for VSAM file access, since VSAM does not have List Prefetch or MX/MU access paths? The more straightforward ways of coding the reposition predicates would seem suited to VSAM. It looked to me like someone invented this technique (applying De Morgan's law) specifically against MX/MU access paths.

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

Michael Hannan

RE: Tool for building views?
(in response to Joe Geller)

In Reply to Joe Geller:

Hi Michael,

The original way it was taught was very straightforward (logically):

AND  COL1 > :hv1
OR (COL1 = :hv1 AND COL2 > :hv2)

The problem of course is that the OR prevented Db2 from using the index structure. Instead it had to begin at the beginning of the index each time. (I say prevented in the past tense because of the new-ish NR access path).

That's why the negative format was invented:

AND  COL1 >= :hv1
AND  NOT (COL1 = :hv1 AND COL2 < :hv2)

This would always start the positioning of the index at the first entry with COL1 >= :hv1.  Better, but not perfect if there are many duplicate values of COL1 (e.g. city = 'NEW YORK').

A third way to process the data is the way a person would do it - Find the first entry for COL1 = :hv1 AND COL2 > :hv2, then keep going.  An example of how sometimes we were smarter than Db2.

A fourth way (the one I used) was a UNION ALL with no ORDER BY (and hoping that Db2 didn't introduce a sort):

select ...

WHERE COL1 = :hv1 AND COL2 > :hv2

   UNION ALL

select ....

WHERE COL1 > :hv1      

effectively doing what I wanted to do.  If you are worried about a sort, then you could declare multiple cursors and, if the first runs out before your planned limit, open the second one (more overhead because of the extra open and an extra fetch).

Then came NR - YAY! - and Db2 did multiple index probes for each part of the OR.  And then the new syntax - same access path but so much easier to code (especially for more than 2 columns) and more intuitive.

As for VSAM, I was speaking generally about scrolling techniques, but it would still apply.  The VSAM index would find the first one where COL1 >= :hv1, then the program would have to skip the ones for :hv1 that you have already done.  Certainly not as good as Db2 doing it for you.

Joe

The original simple solution, before I became aware of the negative approach, was:
AND COL1 >= :hv1
AND (COL1 > :hv1
     OR (COL1 = :hv1 AND COL2 > :hv2)
    )
Unfortunately this allowed MX/MI access paths (later these could be disabled with the OPTIMIZE clause) and scanned a huge amount of unnecessary index when COL1 cardinality was low, with one value targeting a lot of rows.

So the UNION ALL approach would have been nice in DB2 for z/OS, except that any ORDER BY clause would cause the entire result set to be materialized. It did not merge the two sides in a pipe, retaining the sequence provided by the index. Very unfortunate, and ORDER BY was not available on subqueries originally. So one had to code two separate SQLs, each with an ORDER BY, instead of the UNION ALL, if materializing the result set was to be avoided - usually the case, since the end point of the list retrieved could be very far away. UNION ALL would work for cases where materialization was unavoidable anyway, with the ORDER BY spread across multiple tables, particularly when the first ORDER BY column was not on the table with the reposition predicates - hopefully rare. ORDER BY was not exactly safe to omit, in case DB2 chose List Prefetch or decided to resequence the UNION ALL legs: a supposedly working SQL could potentially stop working in the future simply because no sequence was specified. So it was probably worth the trouble to code two SQLs instead.

'NR' access paths solved these problems, provided one had used ORs and not the AND NOT approach; materialization can be avoided as long as the ORDER BY matches the index columns of the index used by the NR steps. 'NR' can be Index Only as well.

It would have been nice if IBM internally converted the AND NOT approach back to ORs so that NR could be used; however, Db2 would not know in advance that an 'NR' access path would be selected for certain, I guess, and transformation typically happens very early, before access paths are chosen.

I have found that multiple OR sets enclosed in a bracket group can be transformed, using De Morgan's laws and the distributive law, to get 'NR' access paths too, and a large number of resulting NR steps are possible.

'NR' is a generic solution to handle OR on groups of predicates on the same index, avoiding index entry rescanning, whether we have an ORDER BY or not, and not necessarily for scroll pagination purposes.
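As an illustration of that transformation (hypothetical two-column reposition key), here is the AND NOT form alongside its OR equivalent after applying De Morgan's laws and distributing:

```sql
-- Negative form: blocks NR, since NR wants the OR shape.
-- (<= excludes the last row already fetched.)
WHERE COL1 >= :HV1
  AND NOT (COL1 = :HV1 AND COL2 <= :HV2)

-- De Morgan: NOT (A AND B) == (NOT A) OR (NOT B), then simplify
-- against COL1 >= :HV1. This OR form is NR-eligible and is
-- logically identical to the negative form above:
WHERE COL1 > :HV1
   OR (COL1 = :HV1 AND COL2 > :HV2)
```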

Michael Hannan,

DB2 Application Performance Specialist
CPT Global Ltd

Michael Hannan

RE: Tool for building views?
(in response to Peter Backlund)

In Reply to Peter Backlund:

Michael,

If memory serves me right, the expression "Stage 3" was coined by Bonnie Baker

Best regards,
Peter 

Peter,

Terry was using it in 2001 or earlier, but he did not recall where it came from. So Bonnie Baker could be quite right. If he had invented it himself, I think he would have remembered. I certainly remember things I came up with by myself.

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

Tim Hare

RE: Tool for building views?
(in response to Michael Hannan)

Thanks for the help.  Yes, this is related to CA:Gen, but it is also complicated by the fact that the developers don't want to get involved in coding real SQL "action blocks" as they are trying to just keep this maintained - this shop is minimizing spending on the mainframe while they start to maximize their expenses by moving off to other platforms (a little snarky, I know; we can table that discussion).  So the best thing we can do to help performance is build views or MQTs and get them to regenerate the application around that.  At least with the view/MQT method the DBAs have a better chance of affecting performance outside of application rework (after the initial re-gen, of course), which management is loath to assign resources to.

I will also look into Common Table Expressions, that sounds interesting.

Jørn Thyssen

RE: Tool for building views?
(in response to Tim Hare)

Am I the only one who’s concerned about creating a view with a 30-way join and assuming the performance will be good?

In addition, the maintainability of such a view would be problematic even if everything is natural joins.


I’ve seen some horrific views with 5-way join, CTEs, sub-selects and more, wondering if anyone could actually make changes to that view 12 months from now.

Best regards,

Jørn Thyssen

Rocket Software
77 Fourth Avenue • Waltham, MA • 02451 • USA
E: [login to unmask email] • W: www.rocketsoftware.com 

2019 IBM Champion.

Views are personal. 

Philip Sevetson

Tool for building views?
(in response to Jørn Thyssen)
Jørn, I endorse your concerns.

Any automatically generated code should be presumed to need examination for performance, 100% of the time. If you don’t inspect it during development, you’ll do so when it runs, probably at 3AM when the operations team calls you to figure out why the five-minute program has been running for four hours.

-phil
>28 years DB2 z/OS database administration

From: Jørn Thyssen [mailto:[login to unmask email]
Sent: Tuesday, May 21, 2019 4:17 PM
To: [login to unmask email]
Subject: [DB2-L] - RE: Tool for building views?


Am I the only one who’s concerned about creating a view with a 30-way join and assuming the performance will be good?

In addition, the maintainability of such a view would be problematic even if everything is natural joins.



I’ve seen some horrific views with 5-way join, CTEs, sub-selects and more, wondering if anyone could actually make changes to that view 12 months from now.

Best regards,

Jørn Thyssen

Rocket Software
77 Fourth Avenue • Waltham, MA • 02451 • USA
E: [login to unmask email] • W: www.rocketsoftware.com

2019 IBM Champion.

Views are personal.

-----End Original Message-----
**This e-mail, including any attachments, may be confidential, privileged, or otherwise legally protected. It is intended only for the addressee. If you received this e-mail in error or from someone who was not authorized to send it to you, do not disseminate, copy, or otherwise use this e-mail or its attachments. Please notify the sender immediately by reply e-mail and delete the e-mail from your system.**

Patrick Bossman

RE: Tool for building views?
(in response to Jørn Thyssen)

The transformed query annotation in Query Tuner was invented because of this type of view.  To make it really fun, nest the views and change the column names along the way.  Ugh.  I am having flashbacks now.

Michael Hannan

RE: Tool for building views?
(in response to Tim Hare)

In Reply to Tim Hare:

Thanks for the help.  Yes, this is related to CA:Gen, but it is also complicated by the fact that the developers don't want to get involved in coding real SQL "action blocks" as they are trying to just keep this maintained - this shop is minimizing spending on the mainframe while they start to maximize their expenses by moving off to other platforms (a little snarky, I know; we can table that discussion).  So the best thing we can do to help performance is build views or MQTs and get them to regenerate the application around that.  At least with the view/MQT method the DBAs have a better chance of affecting performance outside of application rework (after the initial re-gen, of course), which management is loath to assign resources to.

I will also look into Common Table Expressions, that sounds interesting.

I hope "views" in the original question referred to DB2 views and not CA:Gen views, which are a totally different animal.

CA:Gen already supports joins (though not outer joins), and does not support subqueries in general. If you want to use a DB2 view as if it were an Entity, then you will have to define it to the Encyclopedia. Views can be a lot more powerful than CA:Gen Reads, but watch out that the view is written for one purpose and is not trying to be many things, and make sure your program predicates can push down properly to the table where they are supposed to filter.

Is it going to end up simpler than an External Action Block with proper SQL? I am not so sure. You will know best. I haven't written an External Action Block myself.

MQTs are probably not very applicable, since you have to keep them in sync with triggers, and they are only used by dynamic SQL. So the view idea, plus denormalised data tables (updated by triggers), is better.

Vanilla CA:Gen Reads will be just fine for most processing, so just identify the very high frequency, high cost processing that needs better SQL than what CA:Gen can generate.

I once presented at IDUG on getting performance from CA:Gen database access. Many mistakes happen at the data modelling phase. A data model is not a good DB design; it cannot be plugged straight in if you want performance. A data model is a database design, yes, but a bad one to varying degrees, depending on how abstract the modeller gets and whether it fails the KISS principle (Keep It Simple, Stupid).

Identifying relationships are good for performance, where part of the parent key is inherited by the child. Every entity having its own unique generated key is bad for performance: it promotes a lot of joins. Mutually exclusive relationships are very bad and imply multi-purpose entities. Mandatory or optional one-to-one relationships are generally bad. Optional columns in a table are a much better idea with DB2 compression.
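A small sketch of the difference, with invented tables: an identifying relationship lets the child carry the parent key, so the most common access needs no join at all.

```sql
-- Identifying relationship: ORDER_LINE inherits ORDER_NO from the
-- parent, so "all lines for an order" is a single matching index scan.
CREATE TABLE ORDER_LINE
  (ORDER_NO  INTEGER  NOT NULL,
   LINE_NO   SMALLINT NOT NULL,
   ITEM_NO   INTEGER  NOT NULL,
   PRIMARY KEY (ORDER_NO, LINE_NO));

-- Versus every entity getting its own generated key: finding the
-- lines for an order now forces a join through the surrogate keys.
CREATE TABLE ORDER_LINE_SURR
  (LINE_ID   BIGINT GENERATED ALWAYS AS IDENTITY,
   ORDER_ID  BIGINT  NOT NULL,
   ITEM_NO   INTEGER NOT NULL,
   PRIMARY KEY (LINE_ID));
```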

Automatically implementing referential integrity everywhere is both costly and duplicates the checking already in the application logic. Maybe worth avoiding for high cost Inserts and Updates.

I guess every site already has its bad DB design to live with, but denormalization or re-normalization can help. Recognize all those wasteful joins in the application that should not have been needed.

Jørn said: "I’ve seen some horrific views with 5-way join, CTEs, sub-selects and more, wondering if anyone could actually make changes to that view 12 months from now."

Complex SQLs appear in applications. This is reality. Putting them in a view is no worse for maintainability in 12 months' time; in fact, it is possibly better. We almost never see great documentation comments explaining why the SQL does what it does, whether in the view creation source file or in the program. Explanation or comment doco text can help a lot. Like: why did a DISTINCT get inserted? Perhaps the coder always uses DISTINCT, ha ha. Even noting which predicates are expected to do strong filtering could be helpful. I have to try to document my own SQLs well, since later I may not be able to understand why I wrote them that way.

Common Table Expressions can be used in a view, where I don't think they can be coded in a subquery.  CTEs have been compared to views as well, but there are some major differences in behaviour.
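As a sketch of the idea (table and column names invented), a CTE inside a view body:

```sql
-- Hypothetical view whose body uses a CTE for a grouped aggregate,
-- keeping the aggregation step readable and separately testable.
CREATE VIEW CUST_RECENT_TOTALS AS
  WITH RECENT (CUST_ID, TOTAL_AMT) AS
    (SELECT CUST_ID, SUM(AMOUNT)
       FROM ORDERS
      WHERE ORDER_DATE > CURRENT DATE - 30 DAYS
      GROUP BY CUST_ID)
  SELECT C.CUST_ID, C.CUST_NAME, R.TOTAL_AMT
    FROM CUSTOMER C, RECENT R
   WHERE R.CUST_ID = C.CUST_ID;
```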

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

Edited By:
Michael Hannan[Organization Members] @ May 22, 2019 - 10:40 AM (Europe/Berlin)

Phil Grainger

Tool for building views?
(in response to Jørn Thyssen)
I remember a LONG time ago Andy Lankester talking about views

I don’t remember the NUMBERS accurately, but it went something like “90% of views that join 5 or more tables together do NOT return the results that the person writing the SQL is wanting – AND THEY DON’T KNOW”

That last part of the sentence is my big concern with SQL. Provided you write something syntactically valid, it will execute AND return results. But are they the RIGHT results???

Phil Grainger
Principal Enablement Manager




From: Jørn Thyssen <[login to unmask email]>
Sent: 21 May 2019 21:17
To: [login to unmask email]
Subject: [EXTERNAL] [DB2-L] - RE: Tool for building views?


Am I the only one who’s concerned about creating a view with a 30-way join and assuming the performance will be good?

In addition, the maintainability of such a view would be problematic even if everything is natural joins.



I’ve seen some horrific views with 5-way join, CTEs, sub-selects and more, wondering if anyone could actually make changes to that view 12 months from now.

Best regards,

Jørn Thyssen

Rocket Software
77 Fourth Avenue • Waltham, MA • 02451 • USA
E: [login to unmask email] • W: www.rocketsoftware.com

2019 IBM Champion.

Views are personal.

-----End Original Message-----

Daniel Luksetich

Tool for building views?
(in response to Phil Grainger)
I think it’s important to say that a complex SQL is generally going to be more efficient than application logic combined with multiple calls to the data server to get the same result (i.e. Stage 3). That’s why I encourage people to write a complex SQL in pieces, and unit test the pieces. Then put the pieces together. Will that produce an efficient statement? Maybe, but maybe not. However, the statement is exposed to the database engine, which may transform it, and it is also exposed to the DBA, who can assist with optimization. If it’s not in the database then there is no way to know whether it is efficient or not. I like to share a story, one of dozens of stories…



DBA: why are you here?

Dan: I’m supposed to tune your bread and butter transaction

DBA: Well, every single SQL statement in the transaction is perfectly tuned, so F$#k off!

Dan: Son of a gun, Mr. DBA, you are correct: there is nothing I can do to make any one of these statements faster. However, there are 300 statements in the transaction, and 294 of them aren’t needed and can be merged or eliminated.



7 million USD in savings.



Moral of the story… SQL is part of the program. I usually start any conversation about SQL with a drawing of 2 boxes.



 --------------------------------        ---------------------------------
|                                |      |                                 |
|          application           | ---> |           data server           |
|                                |      |                                 |
 --------------------------------        ---------------------------------



That line between the two represents the largest contributor to application latency and data server resource consumption. Complex SQL reduces both. That is what Bonnie was teaching us, and we still try to carry that torch.



Cheers,

Dan





Daniel L Luksetich

DanL Database Consulting



IBM GOLD Consultant

IBM Champion for Analytics

IDUG Content Committee Past-Chairman

IDUG DB2-L Administrator

IBM Certified Database Administrator – DB2 11 DBA for z/OS

IBM Certified System Administrator – DB2 11 for z/OS

IBM Certified Application Developer – DB2 11 for z/OS

IBM Certified Advanced Database Administrator – DB2 10.1 for Linux UNIX and Windows



From: Grainger, Phil <[login to unmask email]>
Sent: Wednesday, May 22, 2019 3:40 AM
To: [login to unmask email]
Subject: [DB2-L] - RE: Tool for building views?



I remember a LONG time ago Andy Lankester talking about views



I don’t remember the NUMBERS accurately, but it went something like “90% of views that join 5 or more tables together do NOT return the results that the person writing the SQL is wanting – AND THEY DON’T KNOW”



That last part of the sentence is my big concern with SQL. Provided you write something syntactically valid, it will execute AND return results. But are they the RIGHT results???




Phil Grainger
Principal Enablement Manager








From: Jørn Thyssen <[login to unmask email] <mailto:[login to unmask email]> >
Sent: 21 May 2019 21:17
To: [login to unmask email] <mailto:[login to unmask email]>
Subject: [EXTERNAL] [DB2-L] - RE: Tool for building views?



Am I the only one who’s concerned about creating a view with a 30-way join and assuming the performance will be good?

In addition, the maintainability of such a view would be problematic even if everything is natural joins.



I’ve seen some horrific views with 5-way join, CTEs, sub-selects and more, wondering if anyone could actually make changes to that view 12 months from now.

Best regards,

Jørn Thyssen

Rocket Software
77 Fourth Avenue • Waltham, MA • 02451 • USA
E: [login to unmask email] • W: www.rocketsoftware.com

2019 IBM Champion.

Views are personal.



-----End Original Message-----


-----End Original Message-----

Tim Hare

RE: Tool for building views?
(in response to Michael Hannan)

Yes, I want to build a DB2 view to let the CA:Gen developer have something that may work better.  I am an old sysprog and haven't done the data modelling - but the pattern I see coming out of many of their Gen programs is fetch loop within fetch loop within fetch loop...   which is kind of a "manual" join, I guess, but we're sure not seeing efficiency.  We have also lost most of the people who might understand the performance implications of various options or ways of doing things in Gen, so here's what happens:  performance is not good, office management complains, application management turns to the DBAs with a demand for a fix, but don't ask us to assign a resource to modify a program.

And I agree with Dan - we're trying to reduce that line between the application and the 'data server'.   We're also, with some of these, going to reduce our 4-hour average, because some of these are real 'hogs'. When I had an MVS performance class in Boston, a lot of it was about finding and tuning the "hahgs".


Edited By:
Tim Hare[Organization Members] @ May 22, 2019 - 03:08 PM (America/Eastern)

Tim Hare

RE: Tool for building views?
(in response to Tim Hare)

Also - someone mentioned MQTs.  We wouldn't have to maintain one of these with triggers; for this data it's enough to refresh the MQT once per day.
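A sketch of such an MQT (names invented), refreshed by a nightly batch job rather than maintained by triggers:

```sql
-- Deferred-refresh MQT: Db2 can rewrite eligible dynamic queries
-- against it; a nightly REFRESH TABLE keeps it one day current.
CREATE TABLE DAILY_SUMMARY AS
  (SELECT ACCT_ID, SUM(BALANCE) AS TOT_BAL, COUNT(*) AS N_POS
     FROM POSITION
    GROUP BY ACCT_ID)
  DATA INITIALLY DEFERRED
  REFRESH DEFERRED
  MAINTAINED BY SYSTEM
  ENABLE QUERY OPTIMIZATION;

-- In the nightly batch window:
REFRESH TABLE DAILY_SUMMARY;
```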

Michael Hannan

RE: Tool for building views?
(in response to Phil Grainger)

In Reply to Phil Grainger:

I remember a LONG time ago Andy Lankester talking about views

I don’t remember the NUMBERS accurately, but it went something like “90% of views that join 5 or more tables together do NOT return the results that the person writing the SQL is wanting – AND THEY DON’T KNOW”

That last part of the sentence is my big concern with SQL. Provided you write something syntactically valid, it will execute AND return results. But are they the RIGHT results???

Phil Grainger
Principal Enablement Manager

The numbers seem exaggerated, but the point is still valid. When I tune excessively complex SQLs running in a production system, I am always wondering if they actually work as intended, and sometimes they don't.

Too many times I see data pagination (scroll reposition) predicates out of sync with the ORDER BY, and one can see that in some circumstances they won't work.

I also see cases of missing brackets around an OR construct, with a SELECT DISTINCT at the top to get rid of the cartesian product duplicates. Laughable stuff in production.

Then there must be lots of other cases where the faults are not so obvious and are data dependent.

Not sure if the views part of the statement was critical or not. I am not a big fan of complex views at all; I would rather have the SQL code inline. When we see a very complex SQL using several views, and those views contain both unions and joins, then the final result can be a horrible mess for the access path and performance. I was trying to analyse one of these performing badly early this year. Whether the final SQL actually worked or not was too far beyond the analysis; I just wanted it to go faster.

I write extremely complex SQLs myself to query the DB2 Catalog and Explain tables, and recognise that the results are mostly good enough, but not perfection for every possible scenario, and there will be ongoing faults found to fix. When the Unicode Catalog came, I had a lot of faults to fix. These are not intended for others to maintain! Just expediency.

Having said that, a complex program can have faults equally well. I make complex SQLs purely because I can, and I can enhance them regularly. They don't have to be installed; for ad hoc use, I just run them. I use Common Table Expressions to make them modular, and can test one CTE stage at a time.

To write programs to produce the same reports would take me a lot longer, although with REXX I might get close to creating the functionality as quickly. The power of SQL is a help, and can be rope to hang yourself with too.

Michael Hannan,

DB2 Application Performance Specialist
CPT Global Ltd

Edited By:
Michael Hannan[Organization Members] @ May 23, 2019 - 05:04 AM (Europe/Berlin)

Michael Hannan

RE: Tool for building views?
(in response to Daniel Luksetich)

Further to Dan's comment, there are no perfectly tuned programs from an SQL point of view. At least I never saw a program with SQL close to optimal.

The simple reason is that for all the SQLs coded, the access path may well be perfect, but when the SQL is overly simple, not doing joins or subqueries, the tuning method of last resort is to reduce the number of SQL calls. Each SQL call has an overhead, and Stage 3 filtering is much worse still. Moving application logic into the SQL, and hence reducing the number of SQL calls, can make a good saving, provided the reduction in calls is very large.

SQL call reduction is the tuning method when we get desperate to save a bit more. It gets tougher when each program already has only one SQL section, since then we have to look at combining Packages that are used together.

SQL call reduction works well when the total number of calls to be eliminated is really huge. Don't bother to code Multi Row Fetch unless the total number of fetches per day is really large. A Fetch might cost 2 microseconds of overhead, so saving 1 million Fetches is only 2 seconds of CPU; just not worth it. If you don't have hundreds of millions of the same Fetch, don't bother. The old saying that MRF can save up to 50% of a query's cost is misleading and meaningless: the saving is per Fetch, and is not related to the cost of obtaining the result set before fetching.
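The arithmetic can be sanity-checked directly against DB2's dummy table (the 2-microsecond figure is the rough per-Fetch estimate above):

```sql
-- Back-of-envelope CPU saving from eliminating Fetch overhead,
-- assuming ~2 microseconds (2E-6 secs) saved per Fetch eliminated.
SELECT 1000000   * 2E-6 AS cpu_secs_saved_1m,    -- about 2 CPU secs: not worth it
       100000000 * 2E-6 AS cpu_secs_saved_100m   -- about 200 CPU secs: starts to matter
FROM   sysibm.sysdummy1;
```

The point of the sketch: the saving scales only with the number of Fetches eliminated, not with how expensive the underlying query is.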

Once a Sysprog suggested we turn on Multi Row Fetch for QMF at the site and then measure the savings. Sorry, but the total site Fetches in QMF per day were not that large; we estimated a saving of less than 1 MIPS, unmeasurable against a varied workload and ad hoc queries. I often see Multi Row Fetch coded where it is of little benefit, even cases of only 1 row fetched per OPEN, on average.

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

Michael Hannan

RE: Tool for building views?
(in response to Tim Hare)

In Reply to Tim Hare:

We have also lost most of the people who might understand the performance implications of various options or ways of doing things in Gen, so here's what happens: performance is not good, office management complains, and application management turns to the DBAs with a demand for a fix, but without assigning a resource to modify a program.

And I agree with Dan - that we're trying to reduce that line between application and the 'data server'.   We're also, with some of these, going to reduce our 4-hour average because some of these are real 'hogs'. When I had an MVS performance class in Boston a lot of it was about finding and tuning the "hahgs" . 

Tim,

If the CA:Gen READs are coded very unwisely, not making good use of the DB2 engine, then there is only a limited amount that can be done with index tuning without having the CA:Gen code changed.

CA:Gen can use joins, and CA:Gen Views can be "starved" to allow only required columns to be selected, allowing for Index Only access paths.

The way arrays are passed between CA:Gen routines is also critical. You can end up with large-scale, unneeded data-initialisation overheads if large arrays are passed around.

Coding the CA:Gen stuff wisely should be the first step. Email me and I can send you a copy of my old IDUG presentation on getting good SQL out of the CA:Gen code generator; I am sure it is all still relevant. I had quite a number of years' experience tuning SQL at more than one COOL:Gen site (before it became CA:Gen).

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd
