Arithmetic operations on a DB2 query

Ebubekir Büyüktosun

Arithmetic operations on a DB2 query

Hi everybody,

I am working on a query performance tuning. In my case, the select query is called everyday about 15 million times from a pgm. The query has a structure as below :

SELECT 
  SUM ( T1.A - T1.B )
FROM 
  TABLE T1
WHERE ....

Here, substraction and cumulative addition operations are done on each result set row of the query. That means :

15.000.000 x (10 substractions) x (10 additions) = 1.500.000.000 arithmetic operations approximately

 

These operations directly affect the value "in DB2 cpu time". On the other hand, this operations can be done on program side calls the query. In that case programmer will create a loop and make the calculations iteratively.

I generally know that the operations are sent to cpu queue to process either by DB2 or the program. So, any difference sending the operations from db2 query or program ? Can I suggest the programmer to make the calculations on the program side for this case ?

Thanks for your interest,

Ebubekir

Walter Janißen

AW: Arithmetic operations on a DB2 query
(in response to Ebubekir Büyüktosun)
Hi

There is a rule of thumb: What Db2 can do, you should leave to Db2, but in this case I think, only a benchmark can say the truth

Kind regards
Walter Janißen [standard_IBM+Champ+7+Yr+Analytics]

ITERGO Informationstechnologie GmbH
Anwendungsentwicklung
Technische Anwendungsarchitektur
Victoriaplatz 2
D-40198 Düsseldorf
[login to unmask email]<mailto:[login to unmask email]>

ITERGO Informationstechnologie GmbH
Vorsitzender des Aufsichtsrats: Christian Diedrich
Geschäftsführung: Dr. Bettina Anders (Vorsitzende),
Lothar Engelke, Ina Kirchhof, Dr. Michael Regauer
Sitz: Düsseldorf, Handelsregister: Amtsgericht Düsseldorf HRB 37996

Von: Ebubekir Büyüktosun [mailto:[login to unmask email]
Gesendet: Donnerstag, 23. November 2017 09:38
An: [login to unmask email]
Betreff: [DB2-L] - Arithmetic operations on a DB2 query


Hi everybody,

I am working on a query performance tuning. In my case, the select query is called everyday about 15 million times from a pgm. The query has a structure as below :

SELECT
SUM ( T1.A - T1.B )
FROM
TABLE T1
WHERE ....

Here, substraction and cumulative addition operations are done on each result set row of the query. That means :

15.000.000 x (10 substractions) x (10 additions) = 1.500.000.000 arithmetic operations approximately



These operations directly affect the value "in DB2 cpu time". On the other hand, this operations can be done on program side calls the query. In that case programmer will create a loop and make the calculations iteratively.

I generally know that the operations are sent to cpu queue to process either by DB2 or the program. So, any difference sending the operations from db2 query or program ? Can I suggest the programmer to make the calculations on the program side for this case ?

Thanks for your interest,

Ebubekir

-----End Original Message-----
Attachments

  • image001.png (2.6k)

James Campbell

AW: Arithmetic operations on a DB2 query
(in response to Walter Janißen)
In this case the overheads of the 15.000.000 cross memory data transfers will far outweigh
whatever benefits of having the arithmetic calculations done on the client side.

But, as Walter says, try it and see.

James Campbell


On 23 Nov 2017 at 9:20, Walter Jani&#223;en wrote:

>
> Hi
>  
> There is a rule of thumb: What Db2 can do, you should leave to Db2, but in this case I think, only
> a benchmark can say the truth
>  
> Kind regards
> Walter Janißen  standard_IBM+Champ+7+Yr+Analytics
>
>  
> Von: Ebubekir Büyüktosun [mailto:[login to unmask email]
> Gesendet: Donnerstag, 23. November 2017 09:38
> An: [login to unmask email]
> Betreff: [DB2-L] - Arithmetic operations on a DB2 query
>  
> Hi everybody,
> I am working on a query performance tuning. In my case, the select query is
> called everyday about 15 million times from a pgm. The query has a structure as
> below :
> SELECT 
>   SUM ( T1.A - T1.B )
> FROM 
>   TABLE T1
> WHERE ....
> Here, substraction and cumulative addition operations are done on each result set
> row of the query. That means :
> 15.000.000 x (10 substractions) x (10 additions) = 1.500.000.000 arithmetic
> operations approximately
>  
> These operations directly affect the value "in DB2 cpu time". On the other hand,
> this operations can be done on program side calls the query. In that case
> programmer will create a loop and make the calculations iteratively.
> I generally know that the operations are sent to cpu queue to process either by
> DB2 or the program. So, any difference sending the operations from db2 query or
> program ? Can I suggest the programmer to make the calculations on the program
> side for this case ?
> Thanks for your interest,
> Ebubekir
>  

Ebubekir B&#252;y&#252;ktosun

RE: AW: Arithmetic operations on a DB2 query
(in response to James Campbell)

Thank you very much guys. I will test both cases and then decide.

Shay Miller

RE: AW: Arithmetic operations on a DB2 query
(in response to Ebubekir Büyüktosun)

Hello.

 

I also think that it's better to do as much in DB2 rather than in the application.

 

I have another idea for you,

Sum each column separately and do the calculation on them:

SELECT 
  SUM ( T1.A) - SUM(T1.B )

 

I did a benchmark on a 9M table at my DB2.

Both queries are executed as tablespace scan(no where clause)

SUM (COL1-COL2)             --> ELAPSED 12.39 sec ; CPU 10.69 sec

SUM(COL1) - SUM(COL2) --> ELAPSED 6.98 sec ; CPU 5.99 sec

 

Check if that helps you.

 

Regards,

Shay

Daniel Luksetich

AW: Arithmetic operations on a DB2 query
(in response to Shay Miller)
Shay,

I am fascinated by your benchmark. Did you flush the buffers in between the tests? I have performed similar benchmark tests on various Db2 functions, and largely rely on these sorts of test more so than access path analysis and cost estimates. Especially the case where you have the same access path and likely the same optimizer cost estimate, but yet your test shows a significant difference.

Cheers,

Dan



Daniel L Luksetich

DanL Database Consulting



IBM GOLD Consultant

IBM Champion for Analytics

IDUG Content Committee Past-Chairman

IBM Certified Database Adminstrator – DB2 11 DBA for z/OS

IBM Certified System Administrator – DB2 11 for z/OS

IBM Certified Application Developer – DB2 11 for z/OS

IBM Certified Advanced Database Administrator – DB2 10.1 for Linux UNIX and Windows



From: Shay Miller [mailto:[login to unmask email]
Sent: Monday, November 27, 2017 8:27 AM
To: [login to unmask email]
Subject: [DB2-L] - RE: AW: Arithmetic operations on a DB2 query



Hello.



I also think that it's better to do as much in DB2 rather than in the application.



I have another idea for you,

Sum each column separately and do the calculation on them:

SELECT
SUM ( T1.A) - SUM(T1.B )



I did a benchmark on a 9M table at my DB2.

Both queries are executed as tablespace scan(no where clause)

SUM (COL1-COL2) --> ELAPSED 12.39 sec ; CPU 10.69 sec

SUM(COL1) - SUM(COL2) --> ELAPSED 6.98 sec ; CPU 5.99 sec



Check if that helps you.



Regards,

Shay



-----End Original Message-----

Terry Purcell

RE: AW: Arithmetic operations on a DB2 query
(in response to Daniel Luksetich)

In one case the subtraction is performed before the SUM, and the other after the SUM. Performing this after the SUM results in significantly fewer subtractions - but also, from the perspective of the Db2 code - performing a simply SUM is far more efficient than performing arithmetic within the SUM column function. This is true for all column functions.

Depending on the datatype, it is possible however that there may be differences in the results due to when any datatype truncation occurs.

Regards

Terry Purcell

In Reply to Daniel Luksetich:

Shay,

I am fascinated by your benchmark. Did you flush the buffers in between the tests? I have performed similar benchmark tests on various Db2 functions, and largely rely on these sorts of test more so than access path analysis and cost estimates. Especially the case where you have the same access path and likely the same optimizer cost estimate, but yet your test shows a significant difference.

Cheers,

Dan



Daniel L Luksetich

DanL Database Consulting



IBM GOLD Consultant

IBM Champion for Analytics

IDUG Content Committee Past-Chairman

IBM Certified Database Adminstrator – DB2 11 DBA for z/OS

IBM Certified System Administrator – DB2 11 for z/OS

IBM Certified Application Developer – DB2 11 for z/OS

IBM Certified Advanced Database Administrator – DB2 10.1 for Linux UNIX and Windows



From: Shay Miller [mailto:[login to unmask email]
Sent: Monday, November 27, 2017 8:27 AM
To: [login to unmask email]
Subject: [DB2-L] - RE: AW: Arithmetic operations on a DB2 query



Hello.



I also think that it's better to do as much in DB2 rather than in the application.



I have another idea for you,

Sum each column separately and do the calculation on them:

SELECT
SUM ( T1.A) - SUM(T1.B )



I did a benchmark on a 9M table at my DB2.

Both queries are executed as tablespace scan(no where clause)

SUM (COL1-COL2) --> ELAPSED 12.39 sec ; CPU 10.69 sec

SUM(COL1) - SUM(COL2) --> ELAPSED 6.98 sec ; CPU 5.99 sec



Check if that helps you.



Regards,

Shay



-----End Original Message-----

Tony Andrews

AW: Arithmetic operations on a DB2 query
(in response to Daniel Luksetich)
Look at the explains on both of these, specifically the field COLUMN_FN_EVAL in the PLAN_TABLE. This states
when DB2 is evaluating the function (in this case SUM). It will be done either during retrieval of the data or
after data retrieval (more expensive). If the COLUMN_FN_EVAL = ‘R’ then the function is being calculated
during retrieval time (less expansive). If the COLUMN_FN_EVAL = ‘ ‘, then the final answer is not calculated until
after data retrieval (more expensive).

My guess is that the SUM(C1 – C2) is more expensive because the mathematics within then SUM is causing DB2 to
evaluate and calculate after retrieval, and with more subtractions going on. This one probably is showing
COLUMN_FN_EVAL = ‘ ‘.

Tony Andrews

From: Daniel L Luksetich [mailto:[login to unmask email]
Sent: Monday, November 27, 2017 9:56 AM
To: [login to unmask email]
Subject: [DB2-L] - RE: AW: Arithmetic operations on a DB2 query

Shay,
I am fascinated by your benchmark. Did you flush the buffers in between the tests? I have performed similar benchmark tests on various Db2 functions, and largely rely on these sorts of test more so than access path analysis and cost estimates. Especially the case where you have the same access path and likely the same optimizer cost estimate, but yet your test shows a significant difference.
Cheers,
Dan

Daniel L Luksetich
DanL Database Consulting

IBM GOLD Consultant
IBM Champion for Analytics
IDUG Content Committee Past-Chairman
IBM Certified Database Adminstrator – DB2 11 DBA for z/OS
IBM Certified System Administrator – DB2 11 for z/OS
IBM Certified Application Developer – DB2 11 for z/OS
IBM Certified Advanced Database Administrator – DB2 10.1 for Linux UNIX and Windows

From: Shay Miller [mailto:[login to unmask email]
Sent: Monday, November 27, 2017 8:27 AM
To: [login to unmask email]<mailto:[login to unmask email]>
Subject: [DB2-L] - RE: AW: Arithmetic operations on a DB2 query


Hello.



I also think that it's better to do as much in DB2 rather than in the application.



I have another idea for you,

Sum each column separately and do the calculation on them:

SELECT
SUM ( T1.A) - SUM(T1.B )



I did a benchmark on a 9M table at my DB2.

Both queries are executed as tablespace scan(no where clause)

SUM (COL1-COL2) --> ELAPSED 12.39 sec ; CPU 10.69 sec

SUM(COL1) - SUM(COL2) --> ELAPSED 6.98 sec ; CPU 5.99 sec



Check if that helps you.



Regards,

Shay

-----End Original Message-----

-----End Original Message-----

Shay Miller

RE: AW: Arithmetic operations on a DB2 query
(in response to Tony Andrews)

Hello everyone.

The thought was exactly like Terry said - less subtractions (only one at the end) and simpler input to the SUM function .

 

Dan, I didn't flush the buffer between runs but I did ran the first select 2 times.

One to load the data into the buffer and one for the measurement.

 

The difference between the two runs was practically non existent.

 

One final remark, the test was done on V10 NFM.

 

Shay. 

Edited By:
Shay Miller[Organization Members] @ Nov 28, 2017 - 09:08 AM (Asia/Jerusalem)
Shay Miller[Organization Members] @ Nov 28, 2017 - 09:09 AM (Asia/Jerusalem)

Ebubekir B&#252;y&#252;ktosun

RE: AW: Arithmetic operations on a DB2 query
(in response to Shay Miller)

Hi Shay,

That is really good and effective solution for my case, thank you very much. I'll suggest the approach to programmer.

Your solution is also valid for DB2 V11 NFM according to the my tests.
 
In Reply to Shay Miller:

Hello.

 

I also think that it's better to do as much in DB2 rather than in the application.

 

I have another idea for you,

Sum each column separately and do the calculation on them:

SELECT 
  SUM ( T1.A) - SUM(T1.B )

 

I did a benchmark on a 9M table at my DB2.

Both queries are executed as tablespace scan(no where clause)

SUM (COL1-COL2)             --> ELAPSED 12.39 sec ; CPU 10.69 sec

SUM(COL1) - SUM(COL2) --> ELAPSED 6.98 sec ; CPU 5.99 sec

 

Check if that helps you.

 

Regards,

Shay

Shay Miller

RE: AW: Arithmetic operations on a DB2 query
(in response to Ebubekir Büyüktosun)

Great.

 

Glad to help :-)

 

Cheers,

Shay.

In Reply to Ebubekir Büyüktosun:

Hi Shay,

That is really good and effective solution for my case, thank you very much. I'll suggest the approach to programmer.

Your solution is also valid for DB2 V11 NFM according to the my tests.
 
In Reply to Shay Miller:

Hello.

 

I also think that it's better to do as much in DB2 rather than in the application.

 

I have another idea for you,

Sum each column separately and do the calculation on them:

SELECT 
  SUM ( T1.A) - SUM(T1.B )

 

I did a benchmark on a 9M table at my DB2.

Both queries are executed as tablespace scan(no where clause)

SUM (COL1-COL2)             --> ELAPSED 12.39 sec ; CPU 10.69 sec

SUM(COL1) - SUM(COL2) --> ELAPSED 6.98 sec ; CPU 5.99 sec

 

Check if that helps you.

 

Regards,

Shay