Using Physical Data Models in IBM Data Studio - by Ludovic Janssens and Dan Luksetich

By Daniel Luksetich posted Jul 27, 2015 10:13 AM


Why Data Studio?

IBM’s Data Studio holds a lot of features that are not really commonly known by the DB2 audience. Many people regard it as the replacement of the old Control Center, but as such this isn’t true, as Control Center held features that are no longer in Data Studio and Data Studio has got a lot of goodies that weren’t included in the Control Center.

In this article we would like to focus on the usage of physical data models. Physical Data Models provide a snapshot of a database setup on a given time and date. They allow you to design new changes without actually implementing them immediately.

From our experience as DB2 for z/OS DBAs, we are mostly acquainted with green screens with no graphical elements whatsoever. DB2 for LUW guys will probably have been confronted more often with graphical representations of their databases.

Data Studio has got it all: ERD Diagrams for those who like a graphical touch and text-based tree models that represent the physical implementation of our database objects.

Getting Started

If you want to create a physical data model for your database, you have to open the DATA perspective.

In the upper left corner, you will find the ‘Data Project Explorer’ view, just above the ‘Data Source Explorer’, if you haven’t changed the default setup. In this view you can create a new DATA DESIGN PROJECT, using file > new > data design project.

1 - The layout of a data design project

Creating a Physical Data Model

To create a new physical model you can either choose and existing project or create a new project in the ‘Data Project Explorer’ view. Once you have selected a project simply open the project and right click on ‘Data Diagrams’ where you will be presented with the option to create a new physical model. 


This action will open a dialog that will let you create a model using a template or from reverse engineering, giving you the ability to create a model from scratch or from an already existing database.

Option 1: from reverse engineering

Have you ever found yourself in a situation where you were suddenly required to produce a physical database model based upon existing structures to support a new project or an audit? This happens a lot, and as always management wants the documentation immediate. Data Studio can be used to produce this documentation very quickly. The more complete your database implementation, especially including database defined (enforced or not enforced) referential integrity, the more quickly a good looking model can be produced.

There are two choices for reverse engineering a physical data model, from DDL or from a database. The selection you make depends upon the number of table you would like to model. If you choose to reverse engineer from the database it is important to note that the only filter available is by schema, and if the schema selected contains a large number of objects the process of pulling in all those object definitions can take a long time and can also be more susceptible to errors. So, if only a few tables are needed it may be more efficient to generate the DDL from the ‘Data Source Explorer’ and then use the generated file as input to the reverse engineering.

Once the definitions have been imported you can select to generate an overview diagram. If you choose not to generate an overview diagram you can create a blank diagram and then begin pulling in tables one at a time. An explorer view exists under the schema or schemas from which you pulled in the definitions in the ‘Data Project Explorer’ under your data diagram. You can simply drag tables into the model editor as needed. Here is an example of an overview diagram generated from DDL input in a test schema called ‘DOLADU1’. Some DB2 sample tables were copied into this test schema for application testing purposes and the developers running the tests wanted a data model.


The overview diagram contains all of the objects and relationships pulled in from the DDL, but contain little detail, such as primary keys, foreign keys, and columns. You can expand these objects by right clicking on them, selecting ‘filters’ and choosing to show various ‘compartments’.


Objects can be moved about and adjusted until the model suits the requirements. If there are relationships missing they can easily be created by moving your cursor next to a table. A couple of small arrows will pop up that can be dragged to the target of the relationship.  


Clicking on any object in the diagram will populate the ‘properties’ view, exactly how you would see it for any object you click on in the data source explorer. Once inside the properties view you can edit the properties, including keys, columns, indexes, relationships and more.


Option 2: from scratch

When you start a data model from a blank template you can simply right click on the blank diagram and begin adding objects. Use can use the properties view to enter all of the characteristics for the object, or enter them directly into the object you are creating in the model.

Once you begin entering objects you can interact with the model and objects much in the same way that is described in the reverse engineering section above.

From scratch of not from scratch?

Probably, you will not often have to create a new physical data model from scratch. If you need to do so, just use it to model the very first version of your database before deployment.

As models are linked to the resource they are reverse engineered from, it is best that, once you have deployed your data model to the database, you work on a newly reverse engineered physical data model.

One reason to continue to work on the model created out of scratch however would be to have a central reference for several environments. Having a model without any linkage to an environment could be useful to compare to other reverse engineered models that need to follow the standard set by the central model.

Using Physical Data Models in a release cycle

Comparing Models

Once you have created a model, you can use it in various ways. An interesting feature of Data Studio is the inclusion of an Object Comparison Tool. As with other object comparison software, you can compare one data model with another one and generate the DDL to make one environment look like the other.


Compare with original source or another data object

If you reverse engineered a model from a given data source, the physical data model is automatically linked to the source you used. This allows Data Studio to provide you with the option to compare your altered model to the database you originally created your model from.

Right click any database object and choose ‘Compare with > original source’ from the context menu.  This will start a wizard in which you can choose in which detail you would like to do the comparison. When you hit ‘finish’, the comparison editor appears. In ‘Compare with > another data object’, you can choose the database object with which you want to do the comparison.


3 - Comparison Editor with the copy and generate delta DDL buttons marked

In the comparison editor, you can synchronize your target environment with the model you have created by clicking ‘copy from left to right’ or ‘copy from right to left’.  Once you have synchronized an item you can generate the DDL for either side.

The DDL comparison quality is moderate. On many occasions DDL with errors is generated, but once you know which type of errors that are to be expected, they are easy to deal with. For DB2 for z/OS, I opened quite a few PMRs the past two years. Most of them were fixed by IBM within the scope of the next release. I would advise everyone to do the same as this will improve the quality of this free product vastly.

Compare with each other

Instead of comparing a model with a database, you can also compare physical data models with each other. This is ideal when your deployment is not immediate or when your database is quite large.

Having your delta DDL upfront, allows you to include it in any change and configuration management tool.

Large databases require a lot of processing and decrease the performance of Data Studio. This performance issue is resolved when the compare is done locally between data models, as you omit the JDBC processing.

Generate Diff reports

In the compare editor an option exists to generate difference reports. You can generate reports on the differences or the synchronizations in various formats: XML, Excel, PDF, html … This is ideal when communicating dba actions to the development/maintenance teams.

You can find the buttons to generate the difference reports above the comparison matrix in the comparison editor. (See the above screendump).

Generate Delta DDL

When generating the delta DDL, there is an option to execute the DDL immediately. Do not use this option; it is better to open the DDL in the SQL editor. Thus you can validate the quality of the script and amend your DDL whenever required.

As mentioned above, the quality of the delta varies. Make sure everything is correct before executing the code.


Change Management

As Data Studio is Eclipse based and models are stored in XML format, any change management repository that is compatible with Eclipse is usable.

Personally, I have been using Subversive, the official SVN plugin of Eclipse, to keep track of changes in my data model. Obviously, you could choose alternatives, such as EGIT or even proprietary tools that are eclipse based.

The idea is that you integrate the change management in a view in Eclipse that can do check in and check out of the Data Design Project you use for you database administration.

Configuration Management

Data Studio has configuration management capabilities. If you are interested in how they work, take a look at the documentation on deployment groups.

As with change management, many alternatives exist. If you have a large Java development in your company, check with the teams whether they make use of software like Maven. Proprietary tools that do configuration management with Eclipse projects could do a similar job.


Data Studio holds a lot of hidden features that enable you to work structurally. It includes all functionality a data developer/designer needs. Unlike the old Control Center, it is not meant for database system administration; IBM released some new tools with this regard for both DB2 z/OS and DB2 LUW.

The tool still needs to mature, but thanks to the platform it is using, everything you create can be integrated in your own enterprise processes. For large companies it can collaborate with existing infrastructure. For small companies, Data Studio could be the ideal starter software to establish a change and configuration management at a very low cost.