Partition By Growth Table Spaces - Partition 2, Living with PBGs

In Part 1 of this blog we looked at the different ways to create or convert to Partition by Growth (PBG) table spaces. In this part we will review the advantages and disadvantages of PBGs, when it might be appropriate to use them, and making changes to existing PBGs.

Size

PBGs can provide a huge capacity at 128Tb, the maximum possible size depends on the particular combination of pagesize, DSSIZE and MAXPARTITIONS. As with all universal table spaces, the relationship between the maximum possible number of partitions, the size of each partition and the pagesize for the table space is an important one to get right. In addition to MAXPARTITIONS, we have to consider the pagesize most suitable for the table space and then decide on the size of each partition with the DSSIZE parameter which leads us to the maximum number of partitions possible and ultimately the maximum size possible for that combination.

The following table is a cut-down version of what you can find in the CREATE TABLE SPACE section of the SQL Reference. It shows the various combinations and results of pagesize, DSSIZE, maximum possible partitions and maximum possible table space size. Bear in mind that these sizes are just for the base table space, if you have LOBs then the total possible size of the LOB table space would be 254 Petabytes (4096 partitions x 256Gb (DSSIZE) x 254 datasets)

The sizes with an asterisk are the largest size for that range of DSSIZEs, for example with pagesize 4K, DSSIZE 1G, MAXPARTITONS 4096, the total size of that table space would be 4Tb. Or a 32K page, with DSSIZE 4G and MAXPARTS of 4096, the total size of that table space would be 16Tb.

 Pagesize K DSSIZE G MAXPARTS Total Size Tb 4 1-4 4096 16* 4 8 2048 16 4 16 1024 16 4 32 512 16 4 64 256 16 4 128 128 16 4 256 64 16 8 1-8 4096 32* 8 16 2048 32 8 32 1024 32 8 64 512 32 8 128 256 32 8 256 128 32 16 1-16 4096 64* 16 32 2048 64 16 64 1024 64 16 128 512 64 16 256 256 64 32 1-32 4096 128* 32 64 2048 128 32 128 1024 128 32 256 512 128

Insert Performance

Universal Table Spaces (UTS) in conjunction with Member Cluster provide excellent concurrent insert performance. Member Cluster manages space for inserts in a data sharing environment on a member-by-member basis. If you have PBGs and heavy sequential insert processes from multiple members and clustering order is either not particularly important, or if you can REORG before the application needs to query the table, then PBG with Member Cluster could be a really good choice for optimum performance.

Speed and Simplicity of Implementation

Increasingly often, we know less about the physical requirements for table spaces, and we might not be given sufficient information about the potential size and any possible limit keys which could be used in Range Partitioning. Sometimes there aren’t suitable limit keys, or there might be an ever ascending key. In this scenario, a PBG might be the best tablespace option to support the application’s requirements (or lack of!).

Many new features and functions of Db2 require that the table space be of type UTS, in particular, online schema changes. This means that many structural changes which would have previously required a drop and recreate, or “offline” change, can now be achieved with an SQL ALTER followed by an online (SHRLEVEL CHANGE or REFERENCE) REORG to materialise the change. This is not possible with non-partitioned segmented table spaces or classic partitioned table spaces.

Also, if we want to use inline LOBs, this is another feature which requires the table space to be UTS, and if there’s no need or possibility for range partitioning, then PBG could be the way to facilitate inline LOBs.

In these circumstances, and because of the IBM recommendation to create table spaces as UTS rather than non-partitioned segmented, PBG is a great option as they are so quick and easy to set up and, potentially, forget about. In part 1 of this blog, we saw that it is easy to either set up implicit or explicit PBG table spaces with very simple DDL.

XML Columns

If you have XML columns in your table, then there is one prime advantage to creating the table space as (or converting the table space to) Partition by Growth rather than segmented. This is because universal table spaces can support multiple XML versions, which is a feature to optimise concurrency through lock avoidance and memory usage.

Size

Size is both an advantage and a potential disadvantage to PBGs. They can become huge with very little set up, but “large” PBGs can be difficult to manage and maintain, particularly if you need to REORG them.

PBGs with LOBs can become particularly cumbersome, especially if you need to do a table space level REORG (e.g. to materialise Pending Definition Changes such as converting a segmented table space with a large amount of LOB data to PBG, or to increase the DSSIZE at Db2 11).

Up to and including Db2 11 if your table space has non-zero values for PCTFREE and FREEPAGE, you may find that a partition level REORG fails. This is expected – if you specify a single partition, or a subset of partitions and the data that is unloaded does not fit back into those partitions once the free space has been re-introduced, the REORG will fail. IBM suggests 2 things, either REORG at table space level, rather than partition level, or alter the PCTFREE and FREEPAGE values to 0, the latter of these is what the REORG manual states as the preference. At Db2 11 there is an additional zParm which is deprecated at Db2 12, REORG_IGNORE_FREESPACE, and when set to YES means that Db2 will always ignore the PCT FREE and FREEPAGE for partition level PBG REORGs. But this applies to all PBG REORGs at partition level, which might not always be what you want or need.

Db2 12 resolves this issue completely by being able to create a new partition to overflow into if required during REORG, even at partition level (up to the MAXPARTITIONS limit).

Indexes are NOT Partitioned, Ever

It is not possible to have a partitioning index or data partitioned secondary indexes (DPSIs) on a PBG, all indexes are non-partitioning indexes (NPIs), so elapsed times for REBUILDs either as part of REORG, LOAD or RECOVER could be high due to the lack of parallelism that you might normally get with partitioned indexes.

Under-Estimating MAXPARTITIONS Value

In the first part of the blog, we talked about choosing a realistic value for MAXPARTITIONS to avoid unnecessarily wasting thread storage, however, things can change and you might find yourself in a position where you need to increase the value of MAXPARTITIONS.

This SQL statement is an immediate ALTER:

`ALTER TABLE SPACE dbname.tsname MAXPARTITIONS xxxx;`

but, it will invalidate all packages dependent on the table and therefore requires a REBIND (even at Db2 12). If you have packages associated with the table of your PBG table space, you might want to consider a REBIND with EXPLAIN ONLY and APCOMPARE(WARN) prior to running the ALTER so that you can establish if there may be any negative effect on the access path after the change.

Conversely, if you decide to decrease your MAXPARTITIONS value, then you can do so with an immediate ALTER which does NOT invalidate packages.

Limitations with Common Db2 Utilities

In addition to the comments above on REORG and data not fitting back into the partition (prior to Db2 12), there are a couple of other important utility considerations.

You can’t run a REORG REBALANCE against a PBG, however, a table space level REORG (without a LOB) will condense the data into as few partitions as possible (including any freespace that you have specified in your table space/index DDL). If you have a LOB, then you will need to specify AUX YES to include the associated LOB table spaces and to permit data to move between partitions. If you don’t specify AUX YES then data will be REORG’d but within each partition where the data already exists, it cannot move between partitions if the auxiliary table space is not in scope.

You cannot run the LOAD utility at partition level, only at table space level.

If you convert from a simple or segmented table space to PBG, you cannot RECOVER to a point prior to the materialising REORG, similarly, you cannot ALTER back to non-partitioned segmented table space.

Other Restrictions of PBG

It’s not possible to ROTATE partitions with a PBG table space, nor is it possible to issue an ALTER PARTITION, although you may ALTER the table and ADD new partitions.

At Db2 10, surplus partitions cannot be dropped once they have been created. You may occasionally find that you have a number of unwanted empty partitions. There are a few ways that this might happen. There could have been a decrease in data volume possibly due to either an SQL DELETE process or REORG DISCARD or DUMMY LOAD REPLACE that has run against the table. Or perhaps NUMPARTS may have been incorrectly specified meaning that more partitions than necessary were originally defined. A third, but less likely, scenario is that someone manually added extra partitions to the table using the ALTER TABLE ADD PARTITION statement. You can’t run an ALTER to DROP these partitions.

When to PBG?

The latest recommendation from IBM is that PBGs shouldn’t be used without some deliberation, but they can be considered in these situations:

• When there are no obvious limit keys to support Partition by Range. With PBGs there is no need to specify a partitioning key or limit key values, so if the application design doesn’t provide a suitable key you can still enjoy some of the benefits of partitioning with a PBG table space.
• When you have an ever-ascending key. Tables that have ever-ascending keys are great candidates for PBG, e.g. a sequence number, or timestamp as the key. It’s also useful to be PBG if the oldest data is archived or purged, then a REORG can be used to move data into a fewer number of partitions. At Db2 11 there is a zParm option to enable the dropping of empty partitions during REORG, or at Db2 12 this is enhanced and give us an option in the REORG to drop empty partitions (these zParms are covered in more detail later). This gives an advantage over PBR because the creation of new and removal of old partitions can be handled very easily and dynamically with REORG without having to consider limit keys.
• When clustering isn’t hugely important to the application or table. It’s perhaps uncommon for clustering to not be important, but sometimes applications have bucket or black hole table, where data needs to be inserted very quickly but is rarely read or updated, or performance of the read/update isn’t as important as the speed of getting the data into the table in the first place. In this case PBG with Member Cluster in datasharing is good performance choice.
• When the application has direct access to a few rows only. For example, index access with = or IN predicate. With PBGs page range screening isn’t possible as rows could be in any partition, so PBGs should probably only be considered where page range screening is not required. This could mean that online transaction tables where the application accesses a few rows only rather than tables that are heavily used by batch processes are better candidates for PBG.
• Avoid very large table spaces being PBG if they require a lot of maintenance (especially REORG). There’s always a lot of discussion on how we define “large”, and it’s not a surprise to say that “it depends”. If you have a small maintenance window which your REORGs must complete within, then you will need to work out approximate number of Gb that you can REORG in that window. That might be something like 100Gb.
• When the tablespace is unlikely to grow beyond a single partition. Many of the catalog tables in Db2 11 and 12 are defined with DSSIZE 64G and MAXPARTITIONS 1, aka a segmented table space! Maybe this is a good pattern to follow for PBGs?

Considerations When Converting to PBG and Backout Options

There might be situations where you have run the ALTER for MAXPARTITIONS (and possible SEGSIZE and DSSIZE too) and then either changed your mind or maybe the materialising REORG failed. At this point you can clear reverse the ALTERs, or Pending Definition Changes (PDCs).

The SYSIBM.SYSPENDINGDDL table holds the unmaterialised changes. If you want to run the materialising REORG later and leave the ALTERs in place ready for that, then you can, but be aware that any REORG with SHRLEVEL CHANGE or REFERENCE for the entire table space (including LOB/XML if they exist) will potentially materialise the change. If you want to mitigate that risk and control when the conversion takes place, you can drop the PDCs.

You can’t run an SQL DELETE against the SYSIBM.SYSPENDINGDDL catalog table, which means you can’t selectively DELETE some pending changes and leave others. You have to run another ALTER against the table space, and it’s an all or nothing approach:

`ALTER TABLESPACE dbname.tsname DROP PENDING CHANGES;`

After the ALTER the table space will be in AREOR until the next REORG completes. However, you can use the  REPAIR utility to reset AREOR status:

`REPAIR OBJECT SET TABLE SPACE dbname.tsname NOAREORPEND`

As we mentioned earlier, after the materialising REORG completes there is no “online” backout for PBGs back to segmented table spaces. If for any reason it is necessary to backout, you would have to unload the data, drop and recreate the object as segmented and reload the data to it.

As simple table spaces have been deprecated for some time now, a backout from a PBG to a simple table space is impossible. The closest you could achieve would be to unload, drop, recreate as segmented, and then reload.

Making Changes to Existing PBGs

For parameters like DSSIZE, SEGSIZE, MEMBERCLUSTER and BUFFERPOOL (if it is to a different pagesize), these are all Pending Definition Changes (PDCs), so as long as you have a valid combination of parameters, then you can run the ALTER which will put the table space into Advisory REORG status and then REORG with SHRLEVEL CHANGE or REFERENCE, remember that as with all Pending Definition Changes, you need to plan for dependent packages being invalidated at the end of the materialising REORG.

If you need to change the value of MAXPARTITIONS and the table has dependent packages (potentially program, trigger, stored proc and function packages) bear in mind that it is an immediate ALTER (no REORG required) and an increase will invalidate dependent package and therefore requires a REBIND (which could be an auto rebind), but a decrease does not invalidate any packages so no rebind is required.

Dropping unwanted partitions (even if empty) is not possible at Db2 10.  At Db2 11 if you have zParm REORG_DROP_PBG_PARTS set to ENABLE then any REORG of a PBG where there are candidate partitions to drop, will be dropped during the UTILTERM phase. At Db2 12 if you have either the zPARM REORG_DROP_PBG_PARTS set to ENABLE or include DROP_PART YES in your REORG TABLESPACE statement, then empty partitions will be dropped during the UTILTERM phase of the REORG (SHRLEVEL CHANGE or REFERENCE only).

As with all UTS table spaces, from Db2 11, changes which require a materialising REORG can take advantage of the syntax SORTDATA NO RECLUSTER NO. The result is likely to be a greatly improved elapsed time, however, the data will not be any more clustered after the REORG. This syntax can also be used in the materialising REORG when converting to PBG.

Summary

We can see by looking at the advantages and disadvantages, that there’s a time and a place for Partition by Growth table spaces, and establishing that is one of the trickiest things. They can be really useful to help us to get the benefits of UTS in place very quickly and without requiring too much analysis, and we have a reasonable amount of flexibility in getting them to work for us, and grow with our applications. But, we need to remember that with great size, comes painful maintenance, which can be made more painful with PBGs.