Clustering (was Oracle to DB2 z/OS conversion)

Phil Grainger

Clustering (was Oracle to DB2 z/OS conversion)
That’s also why I’ve been trying to wean people OFF the knee-jerk “clusterratio reorg” metric (hands up – I used to do this very thing when I was a DBA), but take into account how the data is accessed as well – which is made much easier with the advent of REORGCLUSTERSENS.

If CLUSTERRATIO is low AND so is REORGCLUSTERSENS, then it’s probably safe to skip a reorg (or perhaps investigate whether you have the right clustering key in the first place!)
________________________________

Phil Grainger

Principal Enablement Manager

[login to unmask email]

Direct



+44 (0)118 921 8000

Mobile



+44(0)7808 643 479


E2, Eskdale Road
Winnersh
Berkshire
RG41 5TS


[http://media.cms.bmc.com/images/corp_signature_bmclogo_2014.jpg] http://www.bmc.com

[https://acclaim-production-app.s3.amazonaws.com/images/2429c3cd-a1de-44fc-b4f3-bc762bb2f963/IBM%2BChampion%2B-%2BAnalytics%2B2018.png]





From: Gallagher,Bill R [mailto:[login to unmask email]
Sent: 07 December 2018 12:40
To: [login to unmask email]
Subject: [DB2-L] - RE: Oracle to DB2 z/OS conversion

That’s a fascinating article about chaotic storage.

And I agree with your comment about clustering indexes . . . if there is no defined or identified processing need for sequential access, then I see no advantage of specifically defining a clustering index. Particularly if it causes you to perform unnecessary REORGs just to maintain a high cluster ratio which, in the end, provides little to no benefit.

Bill Gallagher | Senior Systems Engineer, DBA | Data Administration

From: Grainger, Phil <[login to unmask email]<mailto:[login to unmask email]>>
Sent: Friday, December 7, 2018 7:25 AM
To: [login to unmask email]<mailto:[login to unmask email]>
Subject: [DB2-L] - RE: Oracle to DB2 z/OS conversion

If data access is TOTALLY random (ie NO sequential access at all) then why worry about clustering (except, of course, Db2 will choose a clustering index even if we don’t)

And there IS a cost in attempting to maintain clustering – perhaps for no advantage

Reminds me of something I only recently learned about how Amazon stocks its warehouses - http://www.systemid.com/learn/why-chaotic-storage-is-perhaps-the-best/ https://urldefense.proofpoint.com/v2/url?u=http-3A__www.systemid.com_learn_why-2Dchaotic-2Dstorage-2Dis-2Dperhaps-2Dthe-2Dbest_&d=DwMFaQ&c=UrUhmHsiTVT5qkaA4d_oSzcamb9hmamiCDMzBAEwC7E&r=EAGrd_qzLADPfI8dgytr8sbCG7_U9QfXwQMLgK1Zo30&m=yem_24TeyEDJtmesJBhx10KtYAqmmiWTKKSFIF9OEjA&s=N4sHZvo6FwIINGUA2WMtKQESOSlLBOuUhMOAlK29bbQ&e=
-----End Original Message-----
BMC Software Limited Registered Office: Building E2, Eskdale Road, Winnersh, Wokingham, Berkshire, United Kingdom, RG41 5TS Registered in England No. 1927903 The content of this email is confidential. If you are not the addressee, you may not distribute, copy or disclose any part of it. If you receive this message in error, please delete this from your system and notify the sender immediately.
-----End Original Message-----


________________________________
This message (including any attachments) may contain confidential, proprietary, privileged and/or private information. The information is intended to be for the use of the individual or entity designated above. If you are not the intended recipient of this message, please notify the sender immediately, and delete the message and any attachments. Any disclosure, reproduction, distribution or other use of this message or any attachments by an individual or entity other than the intended recipient is prohibited.

TRVDiscDefault::1201
-----End Original Message-----
BMC Software Limited Registered Office: Building E2, Eskdale Road, Winnersh, Wokingham, Berkshire, United Kingdom, RG41 5TS Registered in England No. 1927903 The content of this email is confidential. If you are not the addressee, you may not distribute, copy or disclose any part of it. If you receive this message in error, please delete this from your system and notify the sender immediately.
Attachments

  • image001.jpg (8k)
  • image002.png (9.3k)

Michael Hannan

RE: Clustering (was Oracle to DB2 z/OS conversion)
(in response to Phil Grainger)

In Reply to Phil Grainger:

That’s also why I’ve been trying to wean people OFF the knee-jerk “clusterratio reorg” metric (hands up – I used to do this very thing when I was a DBA), but take into account how the data is accessed as well – which is made much easier with the advent of REORGCLUSTERSENS.

If CLUSTERRATIO is low AND so is REORGCLUSTERSENS, then it’s probably safe to skip a reorg (or perhaps investigate whether you have the right clustering key in the first place!)

I agree in some aspects. Yes there is little point to Reorg to maintain Cluster Index that serves no usefullness. It is  commonly seen that a single column randomised unique key was made the cluster index. There is little point to this clustering if only a single row is ever accessed, and not a group of data rows in the cluster sequence.

However we can use performance monitors to detect access paths via a poorly clustered cluster index (or non clustered index), where the index has relatively low Getpages, and the the data pages have vastly higher Getpages due to chaotic non-clustered access. Sometime the solution to this performance problem is to make Index Only access , completely eliminating data Getpages, and sometimes the solution is to Reorg to improve the Clusterratio from perhaps 98% to 100%, and likewise improve the more representative Datarepeatfactorf. I use 98% as an example of poor clustering for a cluster index, in case people think 80% or 0.8 is the mark for Reorg.

Sites have joins between major tables by a key column(s). These joins can perform vastly better if the tables are not chaotic by the join keys.

So important to decide which index clustering, would benefit performance a lot, and only then worry about keeping it Reorged.

Wrongly chosen Cluster indexes are of not much benefit, whether table is Reorged or not. 

Also worth noting that a Cluster index will never reach 100% clustered if it is opposed to the partitioning scheme so that successive rows in the cluster sequence are typically not in the same partition.  That wrecks Data getpages for processes using the index sequentially. That tends to be nearly as bad as totally chaotic, and can mean the partitioning scheme was not chosen to suit performance of business processes.

The criteria for Reorging a table, should not be based on any fixed clusterratiof or datarepeatfactorf deterioration. Different requirements for different tables, depending on how the applications use data access via the cluster index, or any other index with high clustering similar to the cluster index.

Yes I know many sites do not use Clusterratiof of Datarepeatfactorf at all to determine possible need for Reorg, since those stats maybe out of date, and hence Realtime stats are used in preference, like Unclustered Inserts. Unfortunately there is no equivalent to finding deteriorated Datarepeatfactorf in the realtime stats. It is an indicator of the blowout in Data Getpages via the cluster index when List Prefetch is not used, due to disorganisation. That is why they introduced it.

REORGCLUSTERSENS seem to be the number of times that data has been read sensitive to the Clustering of the index. I have not had time to investigate, what one "time" means as yet. One SQL call? or perhaps one query block execution. This column maybe quite difficult to interpret, if a non zero value. Does Db2 maintain it accurately? 

If was an SQL count, would be hard to get meaning, as some SQLs do trivial things and others do massive joins.So a Getpage based number would be more meaningful for me.

Normally I complain about performance problems due to badly disorganised tables (when disorganisation is causing the very high Getpages and not purely random access), usually in multi table joins, and the topic of reorging too often hardly comes up unless a site is Reorging the wrong objects frequently. So I rarely care about trivial objects Reorged too often. Certainly for large objects that are difficult to Reorg, we want to understand why it is going to help, even plot the change of performance of key Packages to show change before and after Reorg.

Not a fan of Chaos (random distribution) at all in Db2 Tables.

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

Edited By:
Michael Hannan[Organization Members] @ Dec 09, 2018 - 11:54 AM (Europe/Berlin)