Introduction to XML in DB2 11 for z/OS - Part 1

Introduction to XML in DB2 11 for z/OS – Part 1

Have you ever thought about using XML in DB2 for z/OS, but were afraid that you would need to study a lot of material before you could actually begin? Don’t worry! You do not need to know much before you start. You can get by with basic DB2 for z/OS knowledge. This is the first part of a series of blog posts that will take you from the very beginning to a solid foundation in XML in DB2 for z/OS. We will start with basic XML document manipulation. Then we will continue through elements of internal structures, until we get to the use of XML Schema and XPath and XQuery for more complex queries and updates.

Let’s review some key aspects of XML we will use throughout the series

XML stands for eXtensible Markup Language. It is the standard textual data format defining a set of rules for encoding documents. The format is both human and machine readable. XML was designed to describe data. Its main use is for exchanging information between various devices, systems and applications.

An XML document consists of elements and their attributes. Each element consists of a start tag, an end tag, and a value. The maximum length of an element or attribute name in DB2 (as of version 11) is 1000 bytes. Elements can contain nested elements to express relationships between the elements and to build hierarchical structures. You can express one-to-many relationships between elements with nested and repeating elements. Element attributes are used to add additional information to elements. Each element can contain multiple attributes within their start tag. You can picture an XML document as a tree. The following is a very simple example of an XML document and its tree representation.

<?xml version= "1.0" ?>
<blog id="3011975">
      <title>Cook Diary</title>
      <author>
            <nickname>Iloita</nickname>
      </author>
      <url>cookdiary.com</url>
</blog>

xml_structure.png
Color legend:

  • Gray – document node
  • Dark blue – element
  • Orange – attribute
  • Light blue – element value

Refer to https://en.wikipedia.org/wiki/XML or to www.w3.org (advanced) to get more information about XML documents and standards. 

How do I get my XML document into my DB2 database?

XML is a regular data type in DB2 (since DB2 V9, when IBM introduced pureXML). So you store your XML document similarly to how you store any other data type – in a column of the table. Of course it’s a little more complicated under the covers. The XML document is actually stored in its own table. When you create or add an XML column, DB2 implicitly creates the following objects:

  • One additional column in the base table, to store the DB2-generated document ID of the XML document
  • A table for the XML column and a tablespace for this XML table
  • Two indexes on the XML table

We will discuss how DB2 internally stores XML data in more detail in a future post.

Let’s create a simple table with two columns: one to store our XML documents, and the other one to represent the documents’ identification numbers. This document ID will identify the document in any WHERE clauses of SQL statements that apply to these documents.  

CREATE TABLE BLOGS (ID INTEGER GENERATED ALWAYS AS IDENTITY, XML_DATA XML);

You have several options of how to insert your XML document into the created table. The largest document size that can be stored in the DB2 table is 2GB, but not every option allows you to do that. Each option has its own specifics. We will focus on SQL in this article and cover other options in the next one.

Manipulating an XML document

The most straightforward way to insert an XML document into your table is to use simple interactive SQL, or static SQL from your application program. The XML document is represented by a string constant and is enclosed in quotes. This imposes a size limit of 32k on the XML document. If you try to insert a bigger document, you will get an SQL -102 error.

INSERT INTO BLOGS (DATA) VALUES ('<root>my xml document</root>');

When you run the statement above, the XML document is parsed by the DB2 XML parser. Only well-formed XML documents are accepted. A well-formed element fulfills a list of syntax rules provided in the XML document specification, such as rules about the characters that can appear in names, case-sensitiveness of tags, proper nesting of elements without overlaps, etc. The document is well formed when one single element known as the root element contains all of the other elements in the document, and all these elements are themselves well formed. In short, a well-formed document can be represented as a tree with one root element.

If you try to insert an XML document which is not well-formed, you will most likely get a -20398 SQL error. For example, if the XML document has more than one root element:

DSNT408I SQLCODE = -20398, ERROR:  ERROR ENCOUNTERED DURING XML PARSING
          AT LOCATION nn Multiple document elements not allowed.RC=000C-
          ,RSN=3065.                                                   

Or if the XML document breaks a syntax rule:

DSNT408I SQLCODE = -20398, ERROR:  ERROR ENCOUNTERED DURING XML PARSING
         AT LOCATION nn An incorrect character is found within        
         markup.RC=000C,RSN=3029.                                     

But what if you need to insert an XML document that is larger than 32k with SQL? You can use a host variable and insert an XML document as a LOB. Consider the following program in C, which inserts an XML document up to 4M in size:

#include "stdio.h"
#include "string.h"

EXEC SQL INCLUDE SQLCA;          
EXEC SQL BEGIN DECLARE SECTION;          
    SQL TYPE IS XML AS CLOB(4M) xml_data;
    /*  Which results in the following DB2-generated structure:    
    struct {                  
        unsigned long length;
        char data[4194304];
    } xml_data; */
EXEC SQL END DECLARE SECTION;
int main(void) {
    strcpy(xml_data.data, "<root>my xml document</root>");  
    xml_data.length = strlen(xml_data.data);
    EXEC SQL  INSERT INTO BLOGS (XML_DATA) VALUES (:xml_data);    
    return(sqlca.sqlcode);                       
} //main

As with inserting an XML document into a table, you can use SQL to retrieve, update or delete an XML document.

Some SQL data manipulation statement examples follow:

  • SELECT XML_DATA FROM BLOGS WHERE ID = 10;
  • UPDATE BLOGS SET XML_DATA='<root>updated xml document</root>' 
    WHERE ID = 1;
  • DELETE FROM BLOGS WHERE ID = 25;

And with host variables:

  • EXEC SQL BEGIN DECLARE SECTION;
        SQL TYPE IS XML AS CLOB(4M) xml_data;
        long int id;
    EXEC SQL END DECLARE SECTION;
    id = 1;
    EXEC SQL SELECT XML_DATA INTO :xml_data FROM BLOGS WHERE ID = :id;
  • EXEC SQL BEGIN DECLARE SECTION; 
        SQL TYPE IS XML AS CLOB(4M) xml_data;
        long int id;
    EXEC SQL END DECLARE SECTION;
    strcpy(xml_data.data, "<root>updated xml document</root>"); 
    length= strlen(xml_data.data);
    id = 2;
    EXEC SQL UPDATE BLOGS SET XML_DATA = :xml_data WHERE ID = :id;

Notice that an XML document is treated as an integral element in all the operations above. The ID column is used to determine the appropriate document.

You might want to have more flexibility in your SQL statements than host variables can provide. Dynamic SQL can help in this situation. Let’s look at an example in the following paragraph.   

Dynamic SQL

If a program must use so many different types and structures of SQL statements that it cannot contain a model for each one, consider using dynamic SQL. 

Unlike static SQL statements, dynamic SQL statements cannot contain host variables. The only way to use host variables is to use parameter markers. A parameter marker is designated with a “?” sign within the prepare statement. When included in a CAST function, it is called “typed.” The second type of parameter marker, without the CAST function, is untyped.  Using the typed parameter marker is highly recommended. It is also more efficient. The following program modifies the previous example to insert an XML document with dynamic SQL:  

#include "stdio.h"
#include "string.h"

EXEC SQL INCLUDE SQLCA;           
EXEC SQL BEGIN DECLARE SECTION;  
    long int id;
    SQL TYPE IS CLOB(2M) stmt;         
    SQL TYPE IS XML AS CLOB(4M) xml_data;
EXEC SQL END DECLARE SECTION;
int main(void) {
    sprintf(stmt.data, "INSERT INTO BLOGS (XML_DATA) 
VALUES CAST(? AS CLOB(4M))"); 
    stmt.length = strlen(stmt.data);                               
    EXEC SQL PREPARE ins1 FROM :stmt;                              
    strcpy(xml_data.data, "<root>my xml document</root>");         
    xml_data.length= strlen(xml_data.data);                       
    EXEC SQL EXECUTE ins1 USING :xml_data;                        
    EXEC SQL COMMIT;                                               
    return(sqlca.sqlcode);                       
} //main

Conclusion

We have discussed how you can insert, retrieve, update, or delete an XML document in DB2 for z/OS using static and dynamic SQL. There are other methods which might be more effective in certain situations. What if you need to insert hundreds or even thousands of XML documents? What if you cannot hold the entire document in the memory? LOAD and UNLOAD utilities and file reference variables can be very helpful in those cases. We will discuss them in the next article.

References:

1 Comment
4 Likes

Very good introductory content.

April 13, 2016 06:11 AM by Nicolás R. Ascanio Pena


I like your comment , I'll do get other people.

Successes .!!!!

Recent Stories
Introduction to Db2 for z/OS System Profiles

10 questions to review and understand your active log data set configuration

DB2 for z/OS query support over FTP