SQL LEFT JOIN BUG?

Cees Slot

SQL LEFT JOIN BUG?

Hi all,

we are on zOS DB2 12 since two months. We have noticed some peculiar behavior of SQL SELECT with LEFT JOIN.

I have simplified the problem SQL to:

 

SELECT *                                                                  
FROM THEPRS E
LEFT JOIN
(
SELECT DISTINCT ENTI_KEY_1
FROM THEPRS
) S
ON E.ENTI_KEY_1 = S.ENTI_KEY_1
WHERE S.ENTI_KEY_1 IS NULL
---------+---------+---------+---------+---------+---------+---------+----
VK_CODE ENTI_KEY_1 DAT_TYD_MUT ENTI_KEY_1
---------+---------+---------+---------+---------+---------+---------+----
DV AH08702680 2018-04-17-20.24.20.269820 ----------
DSNE610I NUMBER OF ROWS DISPLAYED IS 1

 Clearly, the result of this query should be no rows.

Table THEPRS has about 28 million rows,

When an ORDER BY E.ENTI_KEY_1 is added to the query no rows are returned as expected !

I have tried this on a few tables. The problem seems to exist for large tables only (but not all large tables exhibit the error), and the problem is there are a few rows in the result set that should not be there.

Has anyone suffered from the same problem?

Looks like we are looking at a bug here i.m.o.

Thoughts?

Regards,

Cees 

Another example: 

 

SELECT *                                                                        
FROM THEREK E
LEFT JOIN
(
SELECT DISTINCT ENTI_KEY_1
FROM THEREK
) S
ON E.ENTI_KEY_1 = S.ENTI_KEY_1
WHERE S.ENTI_KEY_1 IS NULL
---------+---------+---------+---------+---------+---------+---------+---------+
VK_CODE ENTI_KEY_1 DAT_TYD_MUT ENTI_KEY_1
---------+---------+---------+---------+---------+---------+---------+---------+
GR 000000010568677 2018-02-11-12.22.17.372554 ----------
ZS 000000010568677 2013-04-06-20.50.21.136923 ----------
DSNE610I NUMBER OF ROWS DISPLAYED IS 2
Edited By:
Cees Slot[Organization Members] @ May 22, 2020 - 02:03 PM (Europe/Amsterdam)
Cees Slot[Organization Members] @ May 22, 2020 - 02:03 PM (Europe/Amsterdam)
Cees Slot[Organization Members] @ May 24, 2020 - 12:07 PM (Europe/Amsterdam)

bernd oppolzer

SQL LEFT JOIN BUG?
(in response to Cees Slot)
Hi,

the select distinct subquery could IMO build a workfile for the result,
then (with some delay) the left join (nested loop join) uses this
workfile to check for occurences of the key in the workfile.

What if a parallel running process inserts records into table THEPRS in
the meantime?
IMO, nothing protects you from such inserts.

If you use ORDER by on the outer SELECT, maybe both sorts (for DISTINCT
and for ORDER BY)
are done in the same run (that is: only one sort, only one workfile, the
same workfile used for both),
and so there is no problem with parallel running inserts in this case.

HTH, kind regards

Bernd



Am 22.05.2020 um 13:56 schrieb Cees Slot:
>
> Hi all,
>
> we are on DB2 12 since two months. We have noticed some peculiar
> behavior of SQL SELECT with LEFT JOIN.
>
> I have simplified the problem SQL to:
>
> *SELECT * * *  FROM THEPRS E * *LEFT JOIN * *( * *SELECT DISTINCT
> ENTI_KEY_1 * *  FROM THEPRS * *) S * *ON E.ENTI_KEY_1 = S.ENTI_KEY_1 *
> *WHERE S.ENTI_KEY_1 IS NULL *
> *---------+---------+---------+---------+---------+---------+---------+----*
> *VK_CODE  ENTI_KEY_1  DAT_TYD_MUT                 ENTI_KEY_1 *
> *---------+---------+---------+---------+---------+---------+---------+----*
> *DV       AH08702680  2018-04-17-20.24.20.269820  ---------- *
> *DSNE610I NUMBER OF ROWS DISPLAYED IS 1*
>
>  Clearly, the result of this query should be no rows.
>
> Table THEPRS has about 28 million rows,
>
> When an ORDER BY E.ENTI_KEY_1 is added to the query no rows are
> returned as expected !
>
> I have tried this on a few tables. The problem seems to exist for
> large tables only (but not all large tables exhibit the error), and
> the problem is there are a few rows in the result set that should not
> be there.
>
> Has anyone suffered from the same problem?
>
> Looks like we are looking at a bug here i.m.o.
>
> Thoughts?
>
> Regards,
>
> Cees
>
> Another example:
>
> SELECT *
> FROM THEREK E
> LEFT JOIN
> (
> SELECT DISTINCT ENTI_KEY_1
> FROM THEREK
> ) S
> ON E.ENTI_KEY_1 = S.ENTI_KEY_1
> WHERE S.ENTI_KEY_1 IS NULL
> ---------+---------+---------+---------+---------+---------+---------+---------+
> VK_CODE ENTI_KEY_1 DAT_TYD_MUT ENTI_KEY_1
> ---------+---------+---------+---------+---------+---------+---------+---------+
> GR 000000010568677 2018-02-11-12.22.17.372554 ----------
> ZS 000000010568677 2013-04-06-20.50.21.136923 ----------
> DSNE610I NUMBER OF ROWS DISPLAYED IS 2
>
> -----End Original Message-----

Cees Slot

RE: SQL LEFT JOIN BUG?
(in response to bernd oppolzer)

Hi Bernd,

I am running these queries in an environment with no other users, so the tables being queried do not change.

Ik can also reproduce the results by running the same query again. The same erroneous rows area returned consistently.

Cees 

Larry Kintisch

SQL LEFT JOIN BUG?
(in response to Cees Slot)
Hi Cees,
You said this was a simplification, perhaps a
fragment of SQL? It is not clear what you are
trying to do: Are you looking for NULL values in
the "S" key? Or in either key? Or is this
fragment just to show the supposed anomaly?

In the SQL ref, it says: Nulls in a join operation
Nulls need special handling in join operations.
If you perform a join operation on a column that
can contain null values, consider using an outer join.

It seems to me that the LEFT JOIN "ON" clause
can't match if one or both are NULL. Perhaps you
need a nested table expression on the first table
as well, with the NULL or NOT NULL explicit in each table expression?

Hope this helps.

Larry Kintisch [login to unmask email]

At 10:46 AM 5/22/2020, you wrote:

>Hi Bernd,
>
>I am running these queries in an environment
>with no other users, so the tables being queried do not change.
>
>Ik can also reproduce the results by running the
>same query again. The same erroneous rows area returned consistently.
>
>CeesÂ
>

Walter Janißen

RE: SQL LEFT JOIN BUG?
(in response to Cees Slot)

Hi Cees

What happens, if you use GROUP BY instead of DISTINCT?

Cees Slot

RE: SQL LEFT JOIN BUG?
(in response to Walter Janißen)

This problem was given to me by a developer who was checking keys in another table. All keys were there, yet one was reported missing. To illustrate the point, I have joined the table with itself which clearly should never report any missing keys.

LEFT JOIN by the way is an OUTER join.

As suggested, I have tried the GROUP BY on another table, this table has around 8 million rows. Also tried the common table expression. Both methods show the same anomaly. Looks like a problem in the DB2 engine.

WITH S AS (                              
SELECT ENTI_KEY_1
FROM THEVIB
GROUP BY ENTI_KEY_1
)
SELECT E.ENTI_KEY_1
FROM THEVIB E
LEFT JOIN S
ON E.ENTI_KEY_1 = S.ENTI_KEY_1
WHERE S.ENTI_KEY_1 IS NULL
---------+---------+---------+---------+-
ENTI_KEY_1
---------+---------+---------+---------+-
AH080008601994092510481734
DSNE610I NUMBER OF ROWS DISPLAYED IS 1

 

 

SELECT E.ENTI_KEY_1                     
FROM THEVIB E
LEFT JOIN
(
SELECT ENTI_KEY_1
FROM THEVIB
GROUP BY ENTI_KEY_1
) S
ON E.ENTI_KEY_1 = S.ENTI_KEY_1
WHERE S.ENTI_KEY_1 IS NULL
---------+---------+---------+---------+
ENTI_KEY_1
---------+---------+---------+---------+
AH080008601994092510481734
DSNE610I NUMBER OF ROWS DISPLAYED IS 1
Edited By:
Cees Slot[Organization Members] @ May 22, 2020 - 07:55 PM (Europe/Amsterdam)
Cees Slot[Organization Members] @ May 22, 2020 - 07:56 PM (Europe/Amsterdam)

bernd oppolzer

SQL LEFT JOIN BUG?
(in response to Cees Slot)
Ok, thanks.

If there are really no concurrent INSERTs, IMO this is an error in DB2
processing, too.
(Or in some of the indexes involved).

I would suggest to try the following alternate SQLs:

First:

SELECT *
  FROM THEPRS E
   LEFT JOIN
       (SELECT ENTI_KEY_1
          FROM THEPRS) S
    ON E.ENTI_KEY_1 = S.ENTI_KEY_1
  WHERE S.ENTI_KEY_1 IS NULL;

that is: without DISTINCT

Second:

SELECT *
  FROM THEPRS E
 WHERE NOT EXISTS
      (SELECT 1
         FROM THEPRS
        WHERE ENTI_KEY_1 = E.ENTI_KEY_1);

and: I would try to rebuild the index on ENTI_KEY_1 (I guess there is one)
on THEPRS and see if the error disappears. (The workfile from the subquery
is IMO built using an indexonly scan, so an inconsistency between index and
table would explain it, too).

Kind regards

Bernd


Am 22.05.2020 um 16:46 schrieb Cees Slot:

> Hi Bernd,
>
> I am running these queries in an environment with no other users, so
> the tables being queried do not change.
>
> Ik can also reproduce the results by running the same query again. The
> same erroneous rows area returned consistently.
>
> Cees
>
>
> -----End Original Message-----

bernd oppolzer

SQL LEFT JOIN BUG?
(in response to bernd oppolzer)
Additional question(s):

are the keys shown always at the low end of the key range (lowest key in
the table)?
Or somewhere in the middle? Or at the high end (don't think so, given
the key values in your examples)?

Could you do a SELECT COUNT (UNIQUE Key) and compare it to the SELECT
COUNT (*)
on the tables, which have the problem?

Kind regards

Bernd


Am 22.05.2020 um 20:11 schrieb Bernd Oppolzer:
>
> Ok, thanks.
>
> If there are really no concurrent INSERTs, IMO this is an error in DB2
> processing, too.
> (Or in some of the indexes involved).
>
> I would suggest to try the following alternate SQLs:
>
> First:
>
> SELECT *
>   FROM THEPRS E
>    LEFT JOIN
>        (SELECT ENTI_KEY_1
>           FROM THEPRS) S
>     ON E.ENTI_KEY_1 = S.ENTI_KEY_1
>   WHERE S.ENTI_KEY_1 IS NULL;
>
> that is: without DISTINCT
>
> Second:
>
> SELECT *
>   FROM THEPRS E
>  WHERE NOT EXISTS
>       (SELECT 1
>          FROM THEPRS
>         WHERE ENTI_KEY_1 = E.ENTI_KEY_1);
>
> and: I would try to rebuild the index on ENTI_KEY_1 (I guess there is one)
> on THEPRS and see if the error disappears. (The workfile from the subquery
> is IMO built using an indexonly scan, so an inconsistency between
> index and
> table would explain it, too).
>
> Kind regards
>
> Bernd
>
>
> Am 22.05.2020 um 16:46 schrieb Cees Slot:
>
>> Hi Bernd,
>>
>> I am running these queries in an environment with no other users, so
>> the tables being queried do not change.
>>
>> Ik can also reproduce the results by running the same query again.
>> The same erroneous rows area returned consistently.
>>
>> Cees
>>
>>
>> -----End Original Message-----
>
>
> -----End Original Message-----
--

Oppolzer-Informatik
Dipl. Inf. Bernd Oppolzer
Bärenhofstraße 23
70771 Leinfelden-Echterdingen
—————————————————————
Tel.: +49 711 7949591
priv.: +49 711 7949590
mobil: +49 151 75005359
eMail: [login to unmask email] <mailto:[login to unmask email]>
Web: http://bernd-oppolzer.de/job.htm
—————————————————————
Für Umsatzsteuerzwecke:
SteuerNr.: 97 076 / 29921
USt-ID-Nr.: DE 147 700 393
—————————————————————
Oppolzer-Informatik 1983 - 2019
36years of experience in computer science**



Philip Sevetson

SQL LEFT JOIN BUG?
(in response to bernd oppolzer)
Cees, Bernd,

I don’t know how widespread the problem is in your system. One experiment worth the time, I think, is to establish that the problem exists for some given table with relevant index; do a CHECK INDEX; rebuild the index (whether or not problems are indicated); do the CHECK INDEX again; rerun the query to see if the results are unchanged (or what the change is).

Two EXPLAINs would probably also be useful, both in existing statistics and immediately after a REORG/RUNSTATS/REBIND (if it’s static).

-phil


Philip Sevetson
Computer Systems Manager
FISA-OPA
5 Manhattan West
New York, NY 10001
[login to unmask email]<mailto:[login to unmask email]>
917-991-7052 m
212-857-1659 f
[cid:[login to unmask email]

From: Bernd Oppolzer <[login to unmask email]>
Sent: Friday, May 22, 2020 2:12 PM
To: Cees Slot <[login to unmask email]>
Subject: [DB2-L] - RE: SQL LEFT JOIN BUG?


Ok, thanks.

If there are really no concurrent INSERTs, IMO this is an error in DB2 processing, too.
(Or in some of the indexes involved).

I would suggest to try the following alternate SQLs:

First:

SELECT *
FROM THEPRS E
LEFT JOIN
(SELECT ENTI_KEY_1
FROM THEPRS) S
ON E.ENTI_KEY_1 = S.ENTI_KEY_1
WHERE S.ENTI_KEY_1 IS NULL;

that is: without DISTINCT

Second:

SELECT *
FROM THEPRS E
WHERE NOT EXISTS
(SELECT 1
FROM THEPRS
WHERE ENTI_KEY_1 = E.ENTI_KEY_1);

and: I would try to rebuild the index on ENTI_KEY_1 (I guess there is one)
on THEPRS and see if the error disappears. (The workfile from the subquery
is IMO built using an indexonly scan, so an inconsistency between index and
table would explain it, too).

Kind regards

Bernd



Am 22.05.2020 um 16:46 schrieb Cees Slot:

Hi Bernd,

I am running these queries in an environment with no other users, so the tables being queried do not change.

Ik can also reproduce the results by running the same query again. The same erroneous rows area returned consistently.

Cees

-----End Original Message-----


-----End Original Message-----
**This e-mail, including any attachments, may be confidential, privileged, or otherwise legally protected. It is intended only for the addressee. If you received this e-mail in error or from someone who was not authorized to send it to you, do not disseminate, copy, or otherwise use this e-mail or its attachments. Please notify the sender immediately by reply e-mail and delete the e-mail from your system.**
Attachments

  • image001.png (3.3k)

Cees Slot

RE: SQL LEFT JOIN BUG?
(in response to Philip Sevetson)

Did a final test:

Created a new universal tablespace like the tablespaces that contains THEVIB and a table THEVIB9 identical to THEVIB. There are no indexes on table THEVIB9.

Populated THEVIB9 with INSERT INTO THEVIB9 SELECT * FROM THEVIB;

SELECT *                      
FROM THEVIB9 E
LEFT JOIN
(
SELECT DISTINCT ENTI_KEY_1
FROM THEVIB9
) S
ON E.ENTI_KEY_1 = S.ENTI_KEY_1
WHERE S.ENTI_KEY_1 IS NULL

The result is the same anomalous row as before.

Then I added 1 row to the table and run the query again. Again 1 rows is reported, but a different row from the one reported before, but not de row I added.

Then I reorganized the new tablespace (suppose this doesn't change a lot because there is no clusterindex) and run the query again. Makes no difference, still one row is reported.

I changed the compression from COMPRESS YES to NO, did a REORG and run the query again. Makes no difference, the same row as before is reported.

Created an index, run the query again, no difference (makes sense, the index is not used) 

REORGed the tablespace once more, since there now is a clusterindex, the rows will be reordered. Run the query again. Makes no difference at all.

DROP the index, changed ENTI_KEY_1 ASC to ENTI_KEY_1 DESC, CREATE INDEX, REORG tablespace. 

Run the query again: NO MISSING ROWS ANYMORE !

DROP the index, changed ENTI_KEY_1 DESC to ENTI_KEY_1 ASC, CREATE INDEX, REORG tablespace.

(So this is a situation we have seen before). 

Run the query again: NO MISSING ROWS ANYMORE !

I think I now have enough material to take this to IBM.

Thanks for your input

Cees 

 

Edited By:
Cees Slot[Organization Members] @ May 23, 2020 - 05:27 PM (Europe/Amsterdam)

Joe Geller

RE: SQL LEFT JOIN BUG?
(in response to Cees Slot)

As Phil suggested, Explains (of the various test cases) are very important and I am sure IBM will ask you for them.  If it is a bug it may show itself only with certain access paths and not others.

Joe

In Reply to Cees Slot:

Did a final test:

Created a new universal tablespace like the tablespaces that contains THEVIB and a table THEVIB9 identical to THEVIB. There are no indexes on table THEVIB9.

Populated THEVIB9 with INSERT INTO THEVIB9 SELECT * FROM THEVIB;

SELECT *                      
FROM THEVIB9 E
LEFT JOIN
(
SELECT DISTINCT ENTI_KEY_1
FROM THEVIB9
) S
ON E.ENTI_KEY_1 = S.ENTI_KEY_1
WHERE S.ENTI_KEY_1 IS NULL

The result is the same anomalous row as before.

Then I added 1 row to the table and run the query again. Again 1 rows is reported, but a different row from the one reported before, but not de row I added.

Then I reorganized the new tablespace (suppose this doesn't change a lot because there is no clusterindex) and run the query again. Makes no difference, still one row is reported.

I changed the compression from COMPRESS YES to NO, did a REORG and run the query again. Makes no difference, the same row as before is reported.

Created an index, run the query again, no difference (makes sense, the index is not used) 

REORGed the tablespace once more, since there now is a clusterindex, the rows will be reordered. Run the query again. Makes no difference at all.

DROP the index, changed ENTI_KEY_1 ASC to ENTI_KEY_1 DESC, CREATE INDEX, REORG tablespace. 

Run the query again: NO MISSING ROWS ANYMORE !

DROP the index, changed ENTI_KEY_1 DESC to ENTI_KEY_1 ASC, CREATE INDEX, REORG tablespace.

(So this is a situation we have seen before). 

Run the query again: NO MISSING ROWS ANYMORE !

I think I now have enough material to take this to IBM.

Thanks for your input

Cees 

 

Michael Hannan

RE: SQL LEFT JOIN BUG?
(in response to Joe Geller)

Cees,

I would be interested to know the datatype of you join column. Is it VARCHAR, by any chance and defined as EBCDIC data, Bit Data or what? and your Function Level.

I have hit a self join incorrect result myself, Left Join not necessary, where the problem join column was COLGROUPCOLNO in SYSCOLDIST  which is VARCHAR for BIT DATA.    On Db2 12 zOS FL100 (compatible with V11).

In my case  I incorrectly got no rows in the result. Added an extra filter condition WHERE clause, then rows appeared.

Also when I changed my join predicate:   AND X.COLGROUPCOLNO = Y.COLGROUPCOLNO

to 

AND  STRIP(X.COLGROUPCOLNO) = STRIP(Y.COLGROUPCOLNO)

then i got some rows. Using HEX function on both sides also got some rows (not necessarily all rows expected).
On a V11 subsystem, all was working fine, as far as I could detect.

I was thinking about raising it with IBM. I suppose I may as well publish my incorrect results SQL query (on V12 FL100 not New Function):

  with cd as                                             
(SELECT *
FROM SYSIBM.SYSCOLDIST
where tbowner like 'ENV%'
and TBname = 'table name'
)
,cdj as
(select x.type, x.name
from cd y
join cd x
on x.tbname = y.tbname
and x.tbowner = y.tbowner
-- Apparent bug joining this column V12.
-- join on hex works.
and x.colgroupcolno = y.colgroupcolno
-- and rtrim(x.colgroupcolno) = rtrim(y.colgroupcolno)
-- and hex(x.colgroupcolno) = HEX(y.colgroupcolno)
and x.name = y.name
and x.numcolumns = y.numcolumns
and x.type = y.type
and x.colvalue = y.colvalue
)
select * FROM cdj
with ur;

 

It is easy to see this self join should return rows, if first subquery finds rows (I changed the real table name), but it failed to find rows, until I adjusted the join predicate (to versions currently commented out) or added further WHERE predicates.

My original query was much more complex but I simplified it as far as possible while still getting wrong results. This may help to reproduce the problem somewhere else. The original query had been working for years, so was a surprise.

I wasn't particularly concerned about the access path for my incorrect results. I seemed to get incorrect results with a variety of access paths in different version of the query while I took out various parts of it till getting to the minimum that failed. So I think the problem is not likely to be related to the access path, in my case (you never know). Internal flags? 

I will get this raised with IBM.

P.S. Just tested my query at another V12 different FL site. Fails to get correct results there too. No rows returned, but changing the problem join predicate on COL GROUPCOLNO to use HEX function on both sides, then returned some rows , but just 21 rows.  431 rows in first subquery.  If I change to a Y LEFT JOIN X, with original join predicates (no HEX function), got 431 rows of Nulls, which should not be Null. Changed join predicate to use HEX again, got most rows with Nulls, 21 with data, totalling 431 rows. Weird every row should match itself with no mismatched rows.   Adding DISTINCT to first subquery made no difference to results. I am pretty much convinced there is a fault in there somewhere (at 2 sites). It may relate to unusual data types.

Last SQL incorrect output I ever found many releases ago related to FLOAT(4) data which hardly anyone ever uses. Most of my past problems were SQLs hitting errors (and failing) when I could not explain why, but found some way to get around it.

Michael Hannan,

DB2 Application Performance Specialist

Edited By:
Michael Hannan[Organization Members] @ May 24, 2020 - 03:38 PM (Australia/Melbourne)
Michael Hannan[Organization Members] @ May 24, 2020 - 03:44 PM (Australia/Melbourne)
Michael Hannan[Organization Members] @ May 24, 2020 - 04:38 PM (Australia/Melbourne)
Michael Hannan[Organization Members] @ May 24, 2020 - 04:40 PM (Australia/Melbourne)
Michael Hannan[Organization Members] @ May 24, 2020 - 05:02 PM (Australia/Melbourne)

Cees Slot

RE: SQL LEFT JOIN BUG?
(in response to Michael Hannan)

Hi Michael,

the columns used are EBCDIC CHAR FOR SBCS. Nothing special there.

We're on function level V12R1M100.

One more final test :-)

- I have unloaded my table having the anomaly using DSNTIAUL to file A.

- Then I created a new PBR tablespace TSA  and table TBA  without indexes.

-  Then I loaded file A into the table TBA with the DB2 LOAD utility (so an empty tablespace is loaded)

- Run query on table TBA, the anomaly is still there.

So I suppose in can give the (un)load file A and the DDL for TSA and TBA to IBM and hope they can recreate the problem.

Cees  

Edited By:
Cees Slot[Organization Members] @ May 24, 2020 - 12:34 PM (Europe/Amsterdam)
Cees Slot[Organization Members] @ May 24, 2020 - 12:37 PM (Europe/Amsterdam)

Michael Hannan

RE: SQL LEFT JOIN BUG?
(in response to Cees Slot)

My SQL seems to get incorrect results on Db2 zOS V11 as well on recent test. So maybe not version specific.
Hopefully will get the Sysprog to report it shortly.

I think my problem is specific to VARCHAR for BIT DATA cols. CONTOKEN is BIT DATA but fixed length and no join issues there.

Michael Hannan,
DB2 Application Performance Specialist

Edited By:
Michael Hannan[Organization Members] @ May 25, 2020 - 01:52 PM (Australia/Melbourne)
Michael Hannan[Organization Members] @ May 25, 2020 - 01:53 PM (Australia/Melbourne)

Michael Hannan

RE: SQL LEFT JOIN BUG?
(in response to Michael Hannan)

I can confirm my problem is access path dependent, the Common Table Expression causes materialization to a workfile. The two workfiles (actually just 1) are joined by Merge Scan Join.

If I don't use a CTE or a workfile, problem goes away.

Michael Hannan,
DB2 Application Performance Specialist

Daniel Luksetich

SQL LEFT JOIN BUG?
(in response to Michael Hannan)
Please let us know if someone is going to open an incident with IBM. Otherwise I can see if I can do it.

Thanks,

Dan



+--------------------------------------+-----------------------------------------------------------+

| Daniel L Luksetich | IBM Certified Advanced Database Administrator – |

| IBM GOLD Consultant | Db2 10.1 for Linux UNIX and Windows |

| IDUG Content Committee Past-Chairman | IBM Certified Database Adminstrator – Db2 12 for z/OS |

| IDUG DB2-L Administrator | IBM Certified System Administrator – Db2 11 for z/OS |

| URL: https://db2expert.com https://db2expert.com | IBM Certified Application Developer – Db2 11 for z/OS |

+--------------------------------------+-----------------------------------------------------------+





From: Michael Hannan <[login to unmask email]>
Sent: Monday, May 25, 2020 1:18 AM
To: [login to unmask email]
Subject: [DB2-L] - RE: SQL LEFT JOIN BUG?



I can confirm my problem is access path dependent, the Common Table Expression causes materialization to a workfile. The two workfiles (actually just 1) are joined by Merge Scan Join.

If I don't use a CTE or a workfile, problem goes away.

Michael Hannan,
DB2 Application Performance Specialist



-----End Original Message-----

Attachments

  • image001.png (9.6k)
  • image002.png (14k)
  • image003.png (10.4k)
  • image004.png (11.7k)
  • image005.png (13.7k)
  • image006.png (13.1k)
  • image007.png (14.2k)
  • image008.png (7.5k)

Cees Slot

RE: SQL LEFT JOIN BUG?
(in response to Daniel Luksetich)

Hi Daniel,

I have asked the service provider that is responsible for our DB2 installation and maintenance to report the problem to IBM. I have an unload file and a job which easily recreates the problem, so I hope IBM will be able to recreate the problem  too.
I think it is very important to inform IBM about problems like these.

Cees

Michael Hannan

RE: SQL LEFT JOIN BUG?
(in response to Cees Slot)

Cees,

I am also  getting Sysprog to open an incident, since my problem may be different. IBM wanted DSN1COPY but SYSCOLDIST is in a large tablespace of several tables so I have created a small extract table with just 361 rows. Now easy to send them an Unload and a DSN1COPY. I was surprised that IBM claimed could not reproduce it as I have reproduced on 3 subsystems all at different Db2 levels. I reckon many sites could probably reproduce my problem easily.

I have found that changing join to non materialisation access path is also not producing correct results in some cases. So my original thought about not access path dependent maybe closer to the mark. 

I am thinking its an obscure problem in my case related to VARCHAR FOR BIT DATA in Joins. That would be unusual.

I may also try out Data Studio's "Capture Query Environment" under red  Start Tuning button. It produces a series of diagnostic files, but needs possibly all the Explain tables or a good subset of them. I was missing DSN_QUERY_TABLE at last try. In that subsystem had only created the "interesting" explain tables on my userid. Ha ha.

If anyone knows a lot about this "Capture Query Environment" feature and what it captures, would be interested.

On a tangent, to the topic, found that Tablespaces in Catalog still have multiple tables per tablespace. Is this likely to change in the future? Also found 3 tables from Directory or Catalog with Type blank, Segsize 0 and Partitions 0. Are these "Simple" Tablespaces? or a special case? Can we expect them to change soon?

Michael Hannan,

DB2 Application Performance Specialist

Edited By:
Michael Hannan[Organization Members] @ May 27, 2020 - 09:39 AM (Australia/Melbourne)
Michael Hannan[Organization Members] @ May 27, 2020 - 09:42 AM (Australia/Melbourne)

Daniel Luksetich

SQL LEFT JOIN BUG?
(in response to Michael Hannan)
Michael,

The capture query environment is the equivalent of service sql on z/OS. Hopefully it works for you as I have found IBM Data Studio bugs have increased with Db2 12 for z/OS. I have not tried the capture query environment myself under v12. Basically, what it does is captures a bunch of information from the EXPLAIN tables in a series of files on your workstation, and them gives you the ability to upload them to the PMR. However, not sure if that functionality still works. I can try it myself if I get the chance.

Cheers,

Dan



+--------------------------------------+-----------------------------------------------------------+

| Daniel L Luksetich | IBM Certified Advanced Database Administrator – |

| IBM GOLD Consultant | Db2 10.1 for Linux UNIX and Windows |

| IDUG Content Committee Past-Chairman | IBM Certified Database Adminstrator – Db2 12 for z/OS |

| IDUG DB2-L Administrator | IBM Certified System Administrator – Db2 11 for z/OS |

| URL: https://db2expert.com https://db2expert.com | IBM Certified Application Developer – Db2 11 for z/OS |

+--------------------------------------+-----------------------------------------------------------+





From: Michael Hannan <[login to unmask email]>
Sent: Tuesday, May 26, 2020 6:34 PM
To: [login to unmask email]
Subject: [DB2-L] - RE: SQL LEFT JOIN BUG?



Cees,

I am also getting Sysprog to open an incident, since my problem may be different. IBM wanted DSN1COPY but SYSCOLDIST is in a large tablespace of several tables so I have created a small extract table with just 361 rows. Now easy to send them an Unload and a DSN1COPY. I was surprised that IBM claimed could not reproduce it as I have reproduced on 3 subsystems all at different Db2 levels.

I have found that changing join to non materialisation access path is also not producing correct results in some cases. So my original thought about not access path dependent maybe closer to the mark.

I am thinking its an obscure problem in my case related to VARCHAR FOR BIT DATA in Joins. That would be unusual.

I may also try out Data Studio's "Capture Query Environment" under red Start Tuning button. It produces a series of diagnostic files, but needs possibly all the Explain tables or a good subset of them. I was missing DSN_QUERY_TABLE at last try. In that subsystem had only created the "interesting" explain tables on my userid. Ha ha.

If anyone knows a lot about this "Capture Query Environment" feature and what it captures, would be interested.

Michael Hannan,

DB2 Application Performance Specialist



-----End Original Message-----

Attachments

  • image001.png (9.6k)
  • image002.png (14k)
  • image003.png (10.4k)
  • image004.png (11.7k)
  • image005.png (13.7k)
  • image006.png (13.1k)
  • image007.png (14.2k)
  • image008.png (7.5k)

Michael Hannan

RE: SQL LEFT JOIN BUG?
(in response to Daniel Luksetich)

In Reply to Daniel Luksetich:

The capture query environment is the equivalent of service sql on z/OS. Hopefully it works for you as I have found IBM Data Studio bugs have increased with Db2 12 for z/OS. I have not tried the capture query environment myself under v12. Basically, what it does is captures a bunch of information from the EXPLAIN tables in a series of files on your workstation, and them gives you the ability to upload them to the PMR. However, not sure if that functionality still works. I can try it myself if I get the chance.

Thanks Dan, I decided to go with the "Service SQL"  from samplib member DSNTEJ6I, since could waste more time fiddling with the Data Studio feature, this time.  The "Service SQL" is a slightly strange piece of jargon for me (lacking clear meaning just a little), since it is a Stored Proc that executes presumably various queries to capture  Information IBM need. Perhaps in the old days it was not a Stored Proc. However Service SQL seems to be the IBM jargon.

Was amusing to me that it outputs a very large amount of info about Accelerated Tables enabled to IDAA even though they have no relevance to my query at all. I guess the mod to handle Accelerated Tables in the info is a bit basic. I could have picked a subsystem with no IDAA, to reduce the mass of data produced. 

It is possible my issue is confined to Merge Scan Join paths, and with VARCHAR FOR BIT DATA join columns.

Michael Hannan,
DB2 Application Performance Specialist

Michael Hannan

RE: SQL LEFT JOIN BUG?
(in response to Michael Hannan)

IBM are working on my join problem TS003741596 which I would characterize as a Merge Scan Join problem (most likely - work files involved) on VARCHAR FOR BIT DATA columns, can be inner join or left join, V11 or V12. 

Michael Hannan,
DB2 Application Performance Specialist