
Do You Really Need That Extra Copy of the Database?

By Joe Geller posted Oct 07, 2014 08:46 AM

How many copies of your databases do you have? I’m talking about both test databases with test data and full copies of your production data.  In many shops, developers are well behaved and get along nicely. They share test databases. If they have to have their own data to play with (ok, work with – this is a professional organization), they work out a scheme with different key ranges for each developer. In other shops, they don’t play well together. They do not know how to share (perhaps because they haven’t taken the time to really learn the database model), and each one requests their own database to test with.

Of course, the needs of each organization are different. Batch jobs that update records throughout a table will change data that may be needed by a different developer, so sharing is more difficult.

Even worse are the requests for copies of the production data. There are many issues to contend with here:

  • Data security issues – it is common today to have regulations controlling access to production data (government, audit or just internal company rules). Even if access to production is strictly controlled, you probably do not have the same controls for test environments.  If you just do a straight copy, you’ve exposed all of the sensitive data.  It becomes essential to have some type of data masking in place to prevent this. Of course, doing so places limitations on the possible methods of cloning the data.
  • Data storage space – production databases are often huge. The more copies you make, the more disk space you need.
  • The time to make the copy. There are a number of different ways to copy databases. Some are fairly quick, but others (especially the more flexible methods) can take hours to run.  You have to balance the cost, the time to make the data available (it’s probably needed ASAP) and the DBA’s time to run the process.
  • The cost – many of the best solutions are vendor products and these, of course, cost money.
  • The consistency of the data. If you copy directly from the live database (the most up-to-date data), it is really not feasible to ensure that the copies of all of the tables are in synch with each other.  It is therefore probably necessary to use a (consistent) backup and copy the data as of a specific point in time. This affects whether you need an offline backup, or a vendor product that can take an online backup and then transform it into a consistent offline backup (using the logs).
  • What point in time do you need the data to be as of?  If your reason for making this copy is to investigate the state of the database as of xxx time (or to see the effect of a program change on the data) then you need to retain your backups for a period that will cover all of your scenarios.

If you need to go back in time, you need a mechanism that can handle evolving schemas.

  • If the table structure has changed, you cannot simply use DSN1COPY to quickly copy an image copy to the target tablespace.
  • Similar issues prevent the use of Flash Copy to overlay the DASD.
  • If a table or tablespace has been dropped and recreated in the interim, an unload from an image copy cannot be done with the IBM Unload (though it can with some vendor products), because DB2 will have assigned new internal object IDs; the catalog queries below show where those IDs live.
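The internal IDs in question live in the DB2 catalog. Queries along these lines (MYDB and MYTS are hypothetical names) show the current values; after a drop/recreate they will differ from the IDs recorded inside an older image copy:

    -- database and tablespace internal IDs (DBID, PSID)
    SELECT DBNAME, NAME, DBID, PSID
      FROM SYSIBM.SYSTABLESPACE
     WHERE DBNAME = 'MYDB' AND NAME = 'MYTS';

    -- table OBIDs within that tablespace
    SELECT NAME, DBID, OBID
      FROM SYSIBM.SYSTABLES
     WHERE DBNAME = 'MYDB' AND TSNAME = 'MYTS' AND TYPE = 'T';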

What are Good Reasons and Poor Reasons for Another Copy?

Let’s first look at some use cases that are probably valid for making a copy of production data.

There is a need to see the exact state of the database as of a prior point in time. √

  • This may be to verify what the data looked like at that time.
  • If you are using system-period temporal tables, you may be able to accomplish this without making a copy from a prior backup at all; a sample query follows this list.
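For example, if POLICY_INFO were defined with system-period data versioning (the table name, column and timestamp here are hypothetical), a query like this returns the rows exactly as they existed at that moment, with no restore involved:

    SELECT *
      FROM POLICY_INFO FOR SYSTEM_TIME AS OF TIMESTAMP('2014-09-30-10.00.00')
     WHERE POLICY_ID = 'A1234';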

There is a need to rerun a set of jobs or transactions using different input parameters or a modified program. √

  • An error may have been discovered, but before implementing a fix in production you may need to test the results.

** Note that these two use cases may require the exact production data – not masked for security – in order to reproduce the desired results.  Security measures must be put in place to ensure that only authorized personnel can view this data.

The developer(s) need to test their programs and they don’t understand the database and data well enough to create their own test data. Using a copy of production data will give them a good base of data to work with. X

  • Our first thought is that this is a terrible reason. Just because the developer is too lazy or unconcerned to take the time to learn what the data means, how the entities relate to each other, and how the data relates to their program does not mean they should be hand-fed copies of production data.  In addition, since this is only for development purposes, it is essential that all sensitive data be masked.
  • Our second thought is that in today’s rapid development environment, where upper management considers developers (and DBAs too) to be commodities with interchangeable parts, there is pressure to develop quickly with any available programmer. Knowledge and understanding of the database is not part of the job. There is even a trend to consider the database simply as a place to stick (persist) the data from the program rather than as an essential core of the application. Therefore providing a ready-made test database may be what management wants.
    • This is not necessarily a completely new development. Many years ago (in the days long before Agile development) I had a manager who wanted a couple of weeks of analysis immediately followed by coding.  This was not Agile iterative development; it was monolithic coding of the entire program.
  • On the third hand, knowledge of the data model and data is an essential, integral part of the development process.  Most code today follows an object-oriented paradigm.  In this paradigm, objects are really not just program objects embedded within the code.  An object should model the real world and consists of the real world data along with the processes that will access and manipulate that data. Many developers have not fully understood that concept and think the world revolves around their program in isolation.
  • As part of that third point, even if a developer has a production copy of the database to work with, there is absolutely no guarantee that without understanding that data, they will be able to tell whether or not their program is functioning correctly. They may not know how to query related data and they may not know how to verify that the changes made to the data by their program are what they intended (or even what the business analyst intended).

The bottom line is that copies of the database (whether copies of production or copies of test data) should not and cannot replace the need for the developers to understand the data well enough to create their own test data.

Where Do These Copies Come From?

For the purpose of creating a test copy, you could just copy the data directly from the source database to the target. However, if the source is being concurrently updated, it will not be possible to make a coherent copy by selecting or unloading the data one table at a time. To ensure that all of the target tables are in synch with each other, the source needs to be a consistent copy of the source database.  There are a number of ways of making that consistent copy and using it. A plain SHRLEVEL CHANGE image copy (with full concurrent read/write access by the applications) will not suffice.
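One simple way to get a set of tablespace copies that are in sync with each other is a single COPY run with SHRLEVEL REFERENCE, taken while the updating applications are held off for the duration of the job (the object names and COPYDDN DD names below are hypothetical):

    --  both spaces copied in one run; updaters are drained, so the pair of
    --  copies reflects a single point in time
    COPY TABLESPACE MYDB.TSCUST COPYDDN(COPY1)
         TABLESPACE MYDB.TSORDR COPYDDN(COPY2)
         SHRLEVEL REFERENCE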

I’m not going to give you detailed recommendations on what your strategy should be – because it all depends on your needs.  Instead I will present some possible approaches and some general guidelines. There are actually three different reasons to do backups:

  • “Normal” recovery scenarios.  Actually, normal recovery is handled automatically by DB2 with backout for system and program failures.  This involves rollback, and the backups are not even needed.  The kind of recovery I am talking about here is not routine but can occur from time to time – DASD failure, data corruption due to a program error, etc.
  • Disaster recovery – In this situation, you need to recreate your entire database (or at least the parts you can’t live without), often at a remote site.
  • Cloning the database or a subset of tables, as discussed in this article.

You may actually need multiple strategies, types of backups and retention policies to handle all those requirements. Online backups will suffice for normal recovery, but may not (by themselves) produce a consistent set of copies for cloning. Conversely, for production recovery (normal or disaster), you do not have to do any data masking, but for copies of the data, you might.

Backup Types

  • Full database backup – this will copy the entire database.
    • On DB2 LUW you issue a BACKUP DATABASE command.  In addition to all of the tablespaces (which contain the tables and indexes), it will back up the DB2 catalog tablespaces and the DB2 logs (if you are running an online backup).  Since the logs are part of the backup, you can recover the entire database to a point in time (a minimal command sketch follows this list).
    • On DB2 for z/OS, you can do a Flash Copy of all of the database DASD, or let the BACKUP SYSTEM utility back up the entire subsystem, invoking Flash Copy for you.  If you take the volume-level Flash Copy yourself, you ensure consistency between the tables by first issuing the SET LOG SUSPEND command, which pauses the writing of records to the DB2 log.  After the Flash Copy is started, you then SET LOG RESUME.  This is the fastest way to back up the entire database.
  • Image copies – on z/OS these are done at the tablespace level.  Indexes can also be backed up, but that is not essential as they can be rebuilt from the tables.  On LUW, you can also back up individual tablespaces.
  • Unloads – you can do an unload (EXPORT on LUW) of the individual tables.  This gives you the most flexibility in that you can (depending on the tool) unload the exact row image, subsets of the columns, or expressions, and also do selective unloads.  This is also the slowest method because it works at the row level (some unload tools use SQL to select each row; others read at the page level and bypass SQL to process the rows).
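Here is a minimal sketch of the LUW path, assuming a hypothetical source database PRODDB, a target name TESTDB, a backup directory /db2/backups and a log extraction directory /db2/testlogs; a redirected restore may also be needed if the container paths differ on the target server:

    -- online backup that packages the logs needed to reach a consistent point
    BACKUP DATABASE PRODDB ONLINE TO /db2/backups INCLUDE LOGS

    -- restore that image into a new database, extracting the included logs
    RESTORE DATABASE PRODDB FROM /db2/backups TAKEN AT 20141007124500 INTO TESTDB LOGTARGET /db2/testlogs

    -- roll forward through those logs so all tables land at the same point
    ROLLFORWARD DATABASE TESTDB TO END OF BACKUP AND STOP OVERFLOW LOG PATH (/db2/testlogs)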
Now that you’ve got some backups, you have some choices for making the copy:
  • The simplest, fastest way is to use the full database backup and copy it to a new database.  Again, Flash Copy methods are the fastest – quite a bit faster than even dataset-level copies.  On LUW you can restore the backup to a new database.  On z/OS you can copy to a new set of DASD on a different subsystem.  An important consideration is the dataset names: the DB2 catalog has the names of the tablespace datasets, and you obviously cannot have two datasets with the same name on the same system.
  • It is possible to copy/restore individual tablespaces from a full database backup if you do not need every table.
  • From individual tablespace backups, you can copy the backup dataset to a new tablespace.  On z/OS you could use DSN1COPY, for example.  Of course it is not that simple.  The database and tablespace IDs are embedded in the copy and will be different from the IDs on the target subsystem, so you have to specify a translation in the DSN1COPY control cards (see the sample job after this list).  There are vendor tools that can help automate this for you; some shops have written their own automated process.
  • If you run online backups, but need consistent tables in your copy, there are vendor tools that can take the online image copies and apply the logs to produce a set of consistent offline copies.
  • You can use the Unload utility (z/OS) against the image copy to create an unload file, which can then be loaded into the target table.  The advantage of this is that the unloads will be against a set of consistent image copies.  This is the slowest method, but the most flexible (a sample UNLOAD statement also follows this list).
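To make the ID translation concrete, here is a minimal DSN1COPY job sketch.  Every dataset name and every DBID/PSID/OBID value is a hypothetical placeholder (the real values come from catalog queries like the ones shown earlier, run on both the source and target subsystems), and the target tablespace must be stopped before the job runs:

    //COPYSTEP EXEC PGM=DSN1COPY,PARM='FULLCOPY,OBIDXLAT,RESET'
    //STEPLIB  DD DISP=SHR,DSN=DSNB10.SDSNLOAD
    //SYSPRINT DD SYSOUT=*
    //*  SYSUT1: the consistent image copy taken on the source subsystem
    //SYSUT1   DD DISP=SHR,DSN=PROD.IMAGCOPY.MYDB.MYTS
    //*  SYSUT2: the stopped target tablespace dataset on the test subsystem
    //SYSUT2   DD DISP=OLD,DSN=TESTCAT.DSNDBC.TESTDB.MYTS.I0001.A001
    //*  SYSXLAT: source,target pairs - DBIDs, then PSIDs, then table OBIDs
    //SYSXLAT  DD *
    260,301
    2,4
    3,9
    /*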
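And a minimal sketch of the Unload-from-image-copy route (object and dataset names again hypothetical).  SYSREC and SYSPUNCH are DD names in the utility job; SYSREC receives the unloaded rows and SYSPUNCH receives generated LOAD statements that you then run against the target table:

    UNLOAD TABLESPACE MYDB.MYTS
           FROMCOPY PROD.IMAGCOPY.MYDB.MYTS
           PUNCHDDN SYSPUNCH
           UNLDDN   SYSREC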
Flexibility is important for several reasons:
  • You may not need to copy all of the data.  Unloads and Loads (import/export) can selectively copy the data.
  • Data masking.  If you don’t mask the data while unloading it, you will need a process to do the masking at some point before the sensitive data can be viewed by unauthorized personnel (a trivial example follows this list).
  • Evolving schemas.  If the table definition has changed since the time of the backup (quite likely if you need to go back in time), dataset-level copies will not work (although again, there are vendor tools which can handle some situations).  If you’ve dropped and recreated the tablespace or table (even with the exact same column definitions), you cannot do a simple DSN1COPY or even an Unload from image copy (with the IBM Unload) because new object IDs will have been assigned by DB2.
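On the masking point, here is a deliberately trivial sketch of the kind of scrub you might run on the cloned copy before anyone gets access; real masking generally warrants a proper tool or a repeatable process, and the table and column names here are hypothetical:

    -- run on the TEST copy only, never against production
    UPDATE TESTDB.CUSTOMER
       SET SSN        = '000000000',
           LAST_NAME  = 'CUSTOMER' CONCAT STRIP(CHAR(CUSTOMER_ID)),
           EMAIL_ADDR = 'cust' CONCAT STRIP(CHAR(CUSTOMER_ID)) CONCAT '@example.com';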

In summary, you’ve got your work cut out for you to research and determine the best strategy for your needs.  It won’t be cheap – either vendor tools or homegrown processes; tape and dasd for the copies, processing costs, etc.  My best advice – don’t make that copy if you don’t really need it.

 


Comments

Oct 07, 2014 07:25 PM

Great blog article that addresses the challenges of managing test environments

Joe covers the topic well.  There is considerable effort and coordination involved in maintaining test and QA environments.  Long Unload/Load run times, masking, and consistency (including consistency across objects) are a challenge... not to mention the DBA having to babysit long, tedious processes.  For large copies there are ISV tools that are probably better to use than Unload/Load, in both elapsed time and effort.

FYI, there is a webinar on October 16th at 2 PM EDT that addresses some of these items... especially 'consistent' copies from production.  Webinar registration & info can be found at this webinar link.

Rob