Introduction to XML in DB2 11 for z/OS – Part 4 – XML schema

The last post about XML internals mentioned one item that deserves further discussion and this post is dedicated to it. The topic this time is an overview of XML schema validation in DB2 for z/OS. 

What is an XML schema?

When you exchange XML documents between different systems or applications, you may need a construct that ensures your applications will “understand” the XML documents that are being exchanged. An XML schema is such a construct. It describes the type and structure of an XML document. Its purpose is to define the valid building blocks of an XML document, such as the data type of elements and attributes, which elements and attributes can appear in the document, the number (and order) of their child elements, and so on. So one can create their own markup language based on XML. Many such languages exist, define in XML schemas. An XML schema language (XML Schema Definition language (XSD) in case of DB2, which is an XML in fact) is used to express a set of rules an XML document must follow in order to be considered "valid", according to that schema. Document Type Definition (DTD) language is also supported, but XSD is preferred. An XML schema consists of XML schema documents that contain the source language definitions of all schema components.

Refer to https://en.wikipedia.org/wiki/XML_Schema or to www.w3.org (advanced) to get more information about XML schema documents, standards and languages.

DB2 XML storage is independent of XML schema. XML schema are optional in DB2 and you do not need to validate an XML document against an XML schema in order to store the document into the DB2 database. But what if you want to validate XML documents? An XML schema must be registered in the DB2 XML Schema Repository before it can be used in an XML schema validation process. 

XML Schema Repository and registration

The XML Schema Repository (XSR) is a set of DB2 tables that store XML schemas used for the XML schema validation process. The key tables are SYSIBM.XSROBJECTS, SYSIBM.XSROBJECTCOMPONENTS, SYSIBM.XSROBJECTHIERARCHIES and SYSIBM.XSRANNOTATIONINFO. Note that all XML schema documents must be in the Unicode encoding scheme. To begin the registration process, you register the first XML schema document. Than you add any additional XML schema document that extend the existing XML. Finally, when all XML schema documents has been registered, you complete the registration. DB2 offers several ways how this process can be done:

  • Use the DB2-supplied stored procedures (available from DB2 10)
    • XSR_REGISTER, SYSPROC.XSR_ADDSCHEMADOC and SYSPROC.XSR_COMPLETE
  • Invoke the com.ibm.db2.jcc.DB2Connection.registerDB2XmlSchema JDBC method
    • Performs the SYSPROC.XSR_REGISTER, SYSPROC.XSR_ADDSCHEMADOC and SYSPROC.XSR_COMPLETE functions
  • Use the REGISTER, ADD and COMPLETE commands from the command line processor
  • You can also use IBM Data Studio, which simplifies registering the XML schema.

To remove an XML schema from the DB2 XSR, you can

  • Use the DB2-supplied stored procedure SYSPROC.XSR_REMOVE
  • Invoke the com.ibm.db2.jcc.DB2Connection.deregisterDB2XMLObject JDBC method

The following examples register the BLOG2016 XML schema.

  • Stored procedures example:
  •  CALL SYSPROC.XSR_REGISTER (
    'SYSXSR',
        'BLOG2016',
        'www.blogs.cz/Blog2016.xsd',
        :schema_content,
        :schema_property)
    CALL SYSPROC.XSR_COMPLETE
        'SYSXSR',
        'BLOG2016',
        :schema_property, 0)
  • Command line processor example:
  • REGISTER XMLSCHEMA www.blogs.cz/Blog2016.xsd 
    FROM file:///u/user/blogs/Blog2016.xsd AS BLOG2016
    COMPLETE
  • Data Studio example:
    DS1.jpg
    DS2.jpg

Refer to Command Line Processor on DB2 for z/OS article on this blog, to get more information how to use CLP.

Once an XML schema is properly registered in the DB2 XSR, it can be used for the validation process for any XML document that is being stored in the DB2 database (on the same subsystem). The following section describes how the validation can be done. 

XML schema validation

Generally, there are two different ways to validate an XML document against the XML schema that is stored in the DB2 XSR: Automatically, whenever the XML document is inserted or updated; or manually, on request for a particular insert or update statement.

Whether an XML document is to be validated automatically is determined at the column creation time by adding an XML type modifier to an XML column definition. Use the XML type modifier to associate an XML schema (or several XML schemas) with the XML data type. This XML type modifier enforces validation of all XML documents that are stored in this XML column against one of the XML schemas that are specified by this type modifier. The XML schema is specified by either its registered XML schema name (ID), or the target namespace URI and the location URI. Remember that whatever is used, the specification must uniquely identify the registered XML schema in XSR. If, for example, two XML schemas are registered with the same target namespace URI, the location URI must also be used to identify the schema that applies.    

Consider the following registered schemas and examples:

XML_schema1.jpg

  • CREATE TABLE BLOGS (
    ID INTEGER GENERATED ALWAYS AS IDENTITY,
        XML_DATA XML(XMLSCHEMA URI 'blogs.cz/Blog'));
    An SQL error occurs when this statement is executed, as the namespace URI identifies two registered schemas. 
  • CREATE TABLE BLOGS (
    ID INTEGER GENERATED ALWAYS AS IDENTITY,
        XML_DATA XML(XMLSCHEMA URI 'blogs.cz/Blog'                                    
        LOCATION 'www.blogs.cz/Blog2016.xsd'));
    In order to fix the error, add a location URI to uniquely identify the BLOG2016 schema. 
  • ALTER TABLE BLOGS (
    ALTER XML_DATA XML(
        XMLSCHEMA IDBLOG2015);
    If you want your documents to conform to a different schema, you can ALTER the XML type modifier. The registered ID uniquely identifies the BLOG2015 schema.  

If the XML type modifier is not used when creating or altering the XML column, the DSN_XMLVALIDATE function can be used to request schema validation. The XML schema is identified similarly to how it works with an XML type modifier, by either the registered schema ID, or the namespace URI and location URI. Using the same schemas above, consider the following examples to request schema validation on insert:

  • INSERT INTO BLOGS VALUES(
    DSN_XMLVALIDATE('<root>my xml document</root>',
                    'SYSXSR.BLOG2016'));  
  • INSERT INTO BLOGS VALUES(
    DSN_XMLVALIDATE('<root>my xml document</root>',
                     'blogs.cz/Blog',
                     'www.blogs.cz/Blog2016.xsd'));

If the XML document is not valid against the requested XML schema, it is not stored in the database and DB2 issues SQL -20399 ERROR: ERROR ENCOUNTERED DURING XML VALIDATION. To check whether an XML document that is stored in a table has previously been validated, use the SQL XMLXSROBJECTID scalar function. It returns the registered XML schema identifier of the XML schema that was used to validate the XML document, or 0 if the document has not been validated.

Note that XML schema validation introduces a performance overhead, so the different situations and use cases require different approaches. If you are working mainly with XML documents from trusted applications, you probably do not want to use automatic schema validation due to the impact on performance. Or you may consider using the binary XML format (to be covered later). 

Conclusion

An XML schema is an integral part of any implementation of XML document usage. It ensures that all applications and database subsystem work with valid XML documents according to a particular need. We have covered the basics of how DB2 stores XML schemas and how to validate an XML document against the schema registered in DB2 Schema Repository. For more information about XML Schemas and schema validation in DB2, refer to the links below.

References:

pureXML Guide

https://en.wikipedia.org/wiki/XML_Schema or to www.w3.org (advanced) to get more information about XML schema documents and standards.

Recent Stories
Introduction to XML in DB2 11 for z/OS – Part 6 – XQuery basics

DB2 11 Fundamentals for z/OS

Introduction to XML in DB2 11 for z/OS – Part 5 – XPath basics