Introduction to XML in DB2 11 for z/OS – Part 2

Introduction to XML in DB2 11 for z/OS – Part 2

In the previous post, we discussed how you can insert, retrieve, update, or delete an XML document in DB2 for z/OS using static and dynamic SQL. However, in certain situations those methods are not effective. For example, those methods are insufficient if you need to insert hundreds or even thousands of XML documents at once, or if your application cannot hold the entire document in the memory when manipulating it. You need more powerful approaches for such situations. LOAD and UNLOAD utilities or file reference variables can help you with these tasks. Let’s take a closer look how these options can be used in today’s installment of this Introduction to XML in DB2 11 for z/OS series.

LOAD

The LOAD utility is the preferred option when you need to insert many XML documents at once, or when you need to insert an XML document that is too big for other methods. You can use the utility to load an XML document directly from an input record. Alternatively, you can insert an XML document from a referenced file. This second option allows you to insert an XML document up to 2GB in size, which is the DB2 limit for XML documents. The first option imposes the same 32k limit as with static SQL, because an XML document in an input record is treated as a VARCHAR. Note the document is also parsed when inserted and only well-formed documents are accepted.

Two simple examples of LOAD utility statements follow. The first example shows how to insert an XML document from the input record. The second one shows how to insert it using the reference file.

  • Example of LOAD where the XML comes from the input record of a delimited file. The PRESERVE WHITESPACE parameter causes the whitespaces are not removed when the XML document is parsed.
    LOAD DATA LOG NO INDDN SYSREC REPLACE
    FORMAT DELIMITED
    INTO TABLE BLOGS
    (
    NAME VARCHAR,
    XML_DATA XML PRESERVE WHITESPACE
    )

    Where SYSREC contains:

    First,<root>my first xml document</root>
    Second,<root>my second xml document</root>
  • Example of LOAD where the XML comes from a reference file. Note the CLOBF parameter which indicates that the input field contains the name of a CLOB file.
    LOAD DATA LOG NO INDDN SYSREC REPLACE 
    FORMAT DELIMITED
    INTO TABLE BLOGS
    (
    NAME VARCHAR,
    XML_DATA VARCHAR CLOBF
    )

    Where SYSREC contains:

    First,HLQ.XMLDATA1
    Second,HLQ.XMLDATA2
    the HLQ.XMLDATA1 dataset contains:
    <root>my first xml document</root>
    the HLQ.XMLDATA2 dataset contains:
    <root>my second xml document</root>

UNLOAD

Use the UNLOAD utility to unload rows from an entire tablespace, a partition, or a table. You can also select specific columns of a table. As in the case of LOAD, you have two options: Unload an XML document into an output record, or unload an XML document into a separate file. Both options have the same limits as LOAD does. Consider the following examples:

  • Unloading the XML document directly to the output record
    UNLOAD DATA FROM TABLE BLOGS
    (NAME VARCHAR, XML_DATA XML)
    DELIMITED

    After the unload the SYSREC dataset contains:

    "First","<?xml version="1.0" encoding="IBM500"?><root>my first XML document</root>"
    "Second","<?xml version="1.0" encoding="IBM500"?><root>my second XML document</root>"
  • Unloading the XML document into a file
    Use the TEMPLATE control statement to create the XML output file and filename. If the data set type is not specified on the template, UNLOAD will use PDS. Note the CLOBF parameter which indicates that the output field contains the name of a file to which the XML document is to be unloaded.
  • TEMPLATE XMLTEMPL DSN 'HILVL.&DB..&TS..UNLD' 
    UNIT SYSDA
    SPACE (1,1) CYL
    UNLOAD DATA FROM TABLE BLOGS
    (NAME VARCHAR,
    XML_DATA VARCHAR CLOBF XMLTEMPL)
    DELIMITED

    After the unload the SYSREC data set contains:

    "First","HILVL.DSN02739.XBLO0000.UNLD(FN0Y7CE7)"
    "Second","HILVL.DSN02739.XBLO0000.UNLD(FN0Y7CGO)"

    the HILVL.DSN02739.XBLO0000.UNLD(FN0Y7CE7) data set contains:

    <?xml version="1.0" encoding="IBM500"?><root>my first XML document</root>

    and the HILVL.DSN02739.XBLO0000.UNLD(FN0Y7CGO) data set contains:

    <?xml version="1.0" encoding="IBM500"?><root>my second XML document</root>

To improve I/O operations performance during unload processing, consider unloading the data in spanned record format. This is especially helpful with large XML documents. Using spanned record format, all XML data is written to an individual sequential file. Spanned record format can be also used for LOAD. The binary format of XML data also improves performance. We will cover this topic in more detail later.

File reference variables

If your application cannot hold an XML document in memory when manipulating it, or you need an efficient way to insert XML documents received from outside the application, consider using file reference variables. DB2 offers the use of file reference variables for XML data in the same manner as for LOBs.

File reference variables are variables that point to documents in the file system. You can use them to refer to the documents without reading the contents of the documents into the memory of the application. They can efficiently insert XML documents that you receive from outside the application (if you do not need to manipulate the contents before inserting). You can also use them to retrieve documents that you want to pass on because they are unmodified. Note the document is also parsed, and only well-formed XML documents are accepted for insertion into the DB2 table.

The following example shows how you can use the file reference variables in a C program to insert and retrieve an XML document from/to a file.

#include "stdio.h"
#include "string.h"

EXEC SQL INCLUDE SQLCA;          
EXEC SQL BEGIN DECLARE SECTION;          
    SQL TYPE IS XML AS CLOB_FILE xml_file;
    /*  Which results in the following DB2-generated structure:    
    struct {
        unsigned long name_length;  // File name length
        unsigned long data_length;  // Data length
        unsigned long file_options; // File options
        char name [255];            // File name
    } xml_file;
    */
EXEC SQL END DECLARE SECTION;

int main(void) {
    EXEC SQL CONNECT TO database;

    /* Insert document */
    strcpy(xml_file.name, "HQL.XML.IN");
    xml_file.name_length = strlen(xml_file.name);
    xml_file.file_options = SQL_FILE_READ; 

    EXEC SQL INSERT INTO BLOGS (XML_DATA) VALUES(:xml_file);

    /* Retrieve document */
    strcpy(xml_file.name, "HQL.XML.OUT");
    xml_file.name_length = strlen(xml_file.name);
    xml_file.file_options = SQL_FILE_CREATE; 

    EXEC SQL SELECT XML_DATA INTO :xml_file FROM BLOGS WHERE ID=n;
   
    return(sqlca.sqlcode);                       
} //main

The value of the file_options variable determines the operation performed with an XML document. The following table list all available options:

  • SQL_FILE_READ (constant value is 2)
    Read from a file – input from the application.
  • SQL_FILE_CREATE (constant value is 4)
    Write to a file – output from the database. If the file already exists, an error is returned.
  • SQL_FILE_APPEND (constant value is 8)
    Write to a file – output from the database. Appends the retrieved data into an existing file (new file is created, if one does not exist).
  • SQL_FILE_OVERWRITE (constant value is 16)
    Write to a file – output from the database. Overwrites any existing file (new file is created, if one does not exist).

Conclusion

In this post, we have discussed how you can insert or retrieve an XML document in DB2 for z/OS using LOAD and UNLOAD utilities and file reference variables. These approaches complement the use of static and dynamic SQL that was described in the first post. Our XML document was still an integral entity which could be manipulated only as a whole.

But what if you need to update only XML documents that contain a particular value in a particular element? Or you would like to select only a small part of the document?

You need a more robust construct than what we have just discussed. Before we move to the basics of XPath and XQuery, which will allow you to perform such operations, we will cover the storage concept of XML and how DB2 processes XML documents. Storage is a theoretical subject but it greatly helps in understanding how to use XPath or XQuery to work with XML documents.

References:

Recent Stories
Introduction to Db2 for z/OS System Profiles

10 questions to review and understand your active log data set configuration

DB2 for z/OS query support over FTP