DB2 10 For LUW: What’s In It for Me?

Whenever a new DB2 version is announced I get asked “Should I be thinking about upgrading?” My answer is always “Yes.” That doesn’t mean that I think you should upgrade your production systems the minute DB2 10 becomes generally available.  The key thing is that you take account of the fact that you will be upgrading at some stage and that when that time comes you are as ready for it as possible.

There are two aspects to being ready. The one which most people think about is knowing about new features so that these can perhaps be exploited. Perhaps more important is understanding what has changed or, most importantly, what isn’t there anymore. Not using a new feature won’t cause a production outage, although it may mean you are less efficient than you could be. Not realising that “Feature A”, which sits at the centre of “Core Business Application,” won’t be there any more is going to cause you serious issues.

Not that I’m underestimating the important of understanding the new features. We’ll come to those shortly. I just want to make sure you don’t skip over the “Deprecated and Discontinued” pages of the “What’s New” guide. Among the old operating systems and development tools no longer supported there’s something that I’ve hoped for but was starting to think would never happen – Control Center has finally been laid to rest, and with it the DAS (DB2 Admin Server) which was needed to use some of its features.   If you’ve not got to grips with IBM Data Studio, now is a good time to get started (I’ll have more on this later).

The next question I get asked is “When should I upgrade?” The answer to that depends on a lot of different factors, very few of which are technical. Some people only upgrade to keep themselves in supported territory: remember that a new version of DB2 being available also normally means that the oldest one which is currently supported is about to become unsupported. Many have to wait on guidance or clearance from their third party software vendor and this can take some time. And then there are those who see a use for something in the new version which will prompt an early upgrade.

So let’s turn to what has been catching my attention in DB2 10. This is very much a personal list. Not all the items you’ll find here are “big ticket” items from the IBM press releases, although some of them are there. But they are all features that I believe I’ll be able to make use of right away, and some of them are features that I wish I’d had many years ago although I probably didn’t realise it at the time.

The first feature I want to look at IS definitely a “big ticket” item, and one which deserves to be at the top of the list. It’s “Time Travel Query”: great feature, terrible name (apart from anything else, “query” implies SELECT only when the impact of this feature is across the whole CRUD spectrum). Having worked on many systems which manually implement temporal solutions and especially on projects which retrospectively apply data corrections, I have known the pain and complexity that is involved.  Having had some input to the design of the DB2 solution, which has also already been shipped in DB2 10 for z/OS, I’ve been waiting for this to finally make its appearance on LUW with eager anticipation. I’m now thinking not only about where I can use this feature in new systems, but also what would be involved in migrating some of the existing manual implementations to use this new feature. Certainly this is something we’ll be hearing a lot more about in the future.

Next on my list of great new features comes under the pureXML banner. The feature which has attracted the most attention is the “binary XML” format, which allows data transfer between DB2 and a Java (JDBC or SQLJ) application carrying out SAX or StAX parsing without actually having to parse the document.   While this may be useful in some cases it isn't nearly as generally useful in my opinion as another feature which has really not had much attention paid to it.

The pureXML feature I'm referring to is the ability to create functional indexes. This means that the XPath expression defining the index can contain functions as well as XML elements.  The two functions supported are fn:upper-case, which converts a text node to the its upper case character representation (useful for doing case insensitive searches) and fn:exists, which returns a boolean value depending on whether a particular element exists or not.   It is this latter one that is particularly useful.  Checking whether an element exists is a very common requirement and at present the only way to achieve this is via a “trick” which indexes the column and then uses an expression such as -

XMLEXISTS('$MYCOL/*:path/*:to[*:element > “”]')

There are two problems with this. Firstly it only works for leaf elements. Secondly if the text node associated with the element is very large then the index could get very large too, or even potentially could not be created. This new facility, where index entries will simply be (very small) boolean values, will be very efficient both in terms of storage and performance. The rest of the package of pureXML new features, including  the ability to evaluate prefix predicates using fn:starts-with in XML VARCHAR indexes and the ability to create XML indexes with INTEGER and DECIMAL data types, are all useful but the functional index support is for me the cream of a good overall crop.   

Next on my list of favourite features are a couple which come under the general category of data movement facilities. The first is the highlighted “continuous ingest” feature, which allows a Business Intelligence environment to be updated in near real time with minimal impact on query users.   It will be interesting to see how this facility is used in the real world. The second feature in this space has not been nearly so widely publicized but is really worth its weight in gold.  This is the ability to specify “DATA CAPTURE CHANGES” at a schema level, so that every new table created within this schema is set up for replication.   In contrast to the previous situation where each new table had to be explicitly set up for replication this is a real time saver.

The various new storage management features including the new storage groups, multi-temperature data storage and adaptive compression are also worthy of a mention.  However once again the new feature which attracted my attention the most was another less well publicized one : an online index reorg facility to allow the reclamation of storage freed up after a large scale delete exercise.

I never thought I'd mention new security features in an article covering “my favourite things”. However for once I believe that RCAC (Row and Column Access Control) is actually a good thing, mainly because it is relatively simple to use.  It is obvious a lot of thought and consultation has went into the development of this feature, so as not to repeat the mistakes of the earlier LBAC (Label Based Access Control) facility which I've never managed to find anyone actually using.

My final favourite new feature isn't even in the main DB2 server itself. It is the latest incarnation of IBM Data Studio. Don't be fooled by the fact that the version has only moved from 3.1 to 3.1.1: this is a significant upgrade.  For me, the best thing about it is not that it supports many of the new features in DB2 10 including Time Travel Query support, but that the underlying Eclipse architecture has been moved up from the 2008-era Eclipse 3.4 to the more modern (albeit still over a year old) Eclipse 3.6.  Why is this significant ?  Data Studio, as with all Eclipse-based tools, is extensible and most developers will add a variety of their favourite plugins to the base install.  Many of the plugins commonly used by developers were either just not available at Eclipse 3.4 level or had been greatly improved since. Among the signficant plugins not available at all was support for the Git source code control system which is rapidly becoming the standard for source control in large parts of the development community, partly because of its advanced capabilities for distributed development and partially because of the popularity of web-based repository hosting services such as Github and Bitbucket. The development community tends to be much more progressive than the database administration community but also tends to be fiercely loyal to their favourite tools. So it is important for fostering the use of DB2 in the wider development community outside the large corporates, and therefore foster the availability of DB2 skills in the future which is so vital to DB2's ongoing success, that   the DB2 tooling is kept up to date.

So that's me reached the end of the list of “my favourite things” in DB2 10. Obviously others will have a different list. Hopefully there will be something in there that you hadn't spotted before and which will encourage you to seriously look at being an early adopter of this important new release.

As always “Don't just do it, DB2 (10) it” !!!

Recent Stories
An Introduction to IBM Data Studio

The Basics of SQL Performance and Performance Tuning

Index Decluttering Opportunities in DB2 for Linux, UNIX, and Windows