Pattern Matching using Regular Expression and Utilizing Services outside Db2 for z/OS

With decade’s development of Db2 for z/OS, it provides a lot of wonderful features in each release.  Among these features, some of them are extremely useful but less “well-known”. Pattern matching using regular expression is one of these hidden gems. In the following sections, I will give a short introduction on this topic.

While Db2 for z/OS is evolving, same thing happen in the outside world. In this era, we cannot close our eyes and ignore whatever happen outside Db2. It is critical to combine the strength of both sides to build a better product or service. In the following sections, I will show you how you can leverage these outside services and information from Db2 using REST.

Pre-requisite

  • Pattern matching using regular expression : Db2 10 for z/OS
  • DB2 REST UDFs: HTTPGETCLOB (for sending services outside Db2): Db2 11 for z/OS. Some configuration is required, see Resources section for details.
  • JSON SQL APIs: Db2 11 for z/OS

Use SQL to do pattern matching using Regular Expression

Db2 for z/OS provides LIKE predicate to do pattern matching, in which we use

  • the underscore character(_) represents any single character
  • the percent sign(%) represents a string of zero or more characters

However, this may not be sufficient in certain scenarios or when the query become very complicated to achieve what we are looking for.

For examples, let’s have a look in SYSIBM.SYSDATABASE catalog table:

SELECT NAME, DBID FROM SYSIBM.SYSDATABASE;
   +--------------------------------------+
   |           NAME           |   DBID    |
   +--------------------------------------+
 1_| DSN00001                 |       266 |
 2_| DSN00002                 |       267 |
 3_| DSNOPTDB                 |       276 |
 4_| DSNATPDB                 |       256 |
 5_| DSN5JSDB                 |       277 |
 6_| DSNADMDB                 |       259 |
 7_| DSNDB01                  |         1 |
 8_| DSNDB04                  |         4 |
 9_| DSNDB06                  |         6 |
10_| DSNMQDB                  |       262 |    
(Result truncated)

Suppose we want to find all the NAME and DBID with NAME start with “DSN”, but end with a number. How to write the SQL statement? NAME LIKE ‘DSN%’, then what? We probably need a lot of AND/OR?

Fortunately, the XML features in Db2 for z/OS allow us to do pattern matching using regular expression. To achieve what we are looking for above, we use an XPATH function, fn:matches , as follows:

SELECT NAME, DBID
FROM SYSIBM.SYSDATABASE
WHERE XMLEXISTS
(  '$newDoc[fn:matches(., "^(DSN).*[0-9]$")]' PASSING
      XMLQUERY('<doc>{$NameCol}</doc>' PASSING NAME as "NameCol")
      as "newDoc");

fn:matches only takes XML as input. So, we first create an XML document using the NAME column content using XQuery Constructor (see highlighted above). Then, we pass the constructed XML (“newDoc”) as input to fn:matches function. The second parameter of fn:matches is pattern we search for. It supports regular expression.

Output:

    +--------------------------------------+
    |           NAME           |   DBID    |
    +--------------------------------------+
  1_| DSN00001                 |       266 |
  2_| DSN00002                 |       267 |
  3_| DSN00005                 |       278 |
  4_| DSN00006                 |       279 |
  5_| DSN00007                 |       280 |
  6_| DSN00008                 |       281 |
  7_| DSN00009                 |       282 |
  8_| DSNDB01                  |         1 |
  9_| DSN00011                 |       284 |
 10_| DSN00010                 |       283 |
 (Result truncated)

As you can see from this example, XPATH function, fn:matches further enhance the pattern matching in Db2 by supporting regular expression.

Now, we can do pattern matching using regular expression in DB2!!

Utilizing Services outside Db2 for z/OS

To utilize the services outside Db2 for z/OS, we need to send out a service request and then consume the service response.

Use SQL to send a REST request

HTTPGETCLOB is a Db2 UDF that sends a request to a specified URL through an HTTP GET request. It takes two parameters: URL and an optional http header. Output is a CLOB (5M). There are other similar UDFs, like HTTPGETBLOB, etc. See Resources section for details.

Suppose we are asking for a stock quote for IBM to Yahoo. In a web browser (like Firefox), we can enter the following url:

https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.quotes%20where%20symbol%20in%20(%22IBM%22)&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys

You may see the following result:

pic1.png

(Result truncated)

Now, let’s write our SQL to do the same thing. It is simple enough, just put the url (the one we enter in the web browser above) as the first parameter to DB2XML.HTTPGETCLOB function; the second parameter is not required in this case.

SELECT DB2XML.HTTPGETCLOB(  CAST 
('https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.quotes%20where%20symbol%20in%20(%22IBM%22)&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys' AS VARCHAR(255)),
  CAST(NULL AS CLOB(1K)))
FROM SYSIBM.SYSDUMMY1;

You may see the following output:

{"query":{"count":1,"created":"2017-08-13T04:38:40Z","lang":"en-US","diagnostics":{"url":[{"execution-start-time":"1","execution-stop-time":"1","execution-time":"0","content":"http://www.datatables.org/yahoo/finance/yahoo.finance.quotes.xml"},{"execution-start-time":"4","execution-stop-time":"5","execution-time":"1","content":"http://download.finance.yahoo.com/d/quotes.csv?f=aa2bb2b3b4cc1c3c4c6c8dd1d2ee1e7e8e9ghjkg1g3g4g5g6ii5j1j3j4j5j6k1k2k4k5ll1l2l3mm2m3m4m5m6m7m8nn4opp1p2p5p6qrr1r2r5r6r7ss1s7
  1 record(s) selected

(Output has been truncated.)

Use SQL to consume a REST response in JSON format

As you may notice above, the format of REST response is JSON. We can use the Db2 JSON features to extract the fields we want. Suppose we want to extract the symbol, DaysLow, DaysHigh, YearLow, and YearHigh from the response (see red circle in the output above), we can use the following SQL statement:

With TABLEA as(
SELECT DB2XML.HTTPGETCLOB(
  CAST ('https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.quotes%20where%20symbol%20in%20(%22IBM%22)&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys' AS VARCHAR(255)),
  CAST(NULL AS CLOB(1K))) AS RESPONSE
FROM SYSIBM.SYSDUMMY1)
SELECT  JSON_VAL(SYSTOOLS.JSON2BSON(RESPONSE), 'query.results.quote.symbol', 's:20') as symbol,
  JSON_VAL(SYSTOOLS.JSON2BSON(RESPONSE), 'query.results.quote.DaysLow', 'f') as DaysLow,
  JSON_VAL(SYSTOOLS.JSON2BSON(RESPONSE), 'query.results.quote.DaysHigh', 'f') as DaysHigh,
  JSON_VAL(SYSTOOLS.JSON2BSON(RESPONSE), 'query.results.quote.YearLow', 'f') as YearLow,
  JSON_VAL(SYSTOOLS.JSON2BSON(RESPONSE), 'query.results.quote.YearHigh', 'f') as YearHigh
FROM TABLEA;

Output:

SYMBOL       DAYSLOW       DAYSHIGH     YEARLOW     YEARHIGH
IBM          141.52        142.59       141.19      182.79
  1 record(s) selected

 

We first use an inline table (common table expression) TABLEA to store the REST output as RESPONSE. With this approach, we don’t need to store the REST response/result in a physical table. Then, we use JSON2BSON function to convert RESPONSE to BSON (binary JSON) format. Finally, we use JSON_VAL function to extract symbol, DaysLow, DaysHigh, YearLow, and YearHigh value. If the values we want to extract is inside a array, JSON_TABLE function may be needed.

Use SQL to consume a REST response in XML format

 Although JSON is the most common REST response format, occasionally, the response may be in XML format.

 For the following request:

https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.quotes%20where%20symbol%20in%20(%22IBM%22)&format=xml&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2FalltableswithkeysYou may get the following XML output:

pic2.png

Suppose we want to extract the symbol, DaysLow, DaysHigh, YearLow, and YearHigh from the XML response (see red circle in the output above), there are two options. The first one does not use XMLTABLE function while the second one uses XMLTALBE function.

-- not use XMLTABLE
With TABLEA as(
SELECT DB2XML.HTTPGETCLOB(
  CAST ('https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.quotes%20where%20symbol%20in%20(%22IBM%22)&format=xml&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys' AS VARCHAR(255)),
  CAST(NULL AS CLOB(1K))) AS RESPONSE
FROM SYSIBM.SYSDUMMY1)
SELECT
  XMLCAST(XMLQUERY('/query/results/quote/@symbol' PASSING XMLPARSE(RESPONSE)) AS VARCHAR(10)) as symbol,
  XMLCAST(XMLQUERY('/query/results/quote/DaysLow' PASSING XMLPARSE(RESPONSE)) AS DOUBLE) as DaysLow,
  XMLCAST(XMLQUERY('/query/results/quote/DaysHigh' PASSING XMLPARSE(RESPONSE)) AS DOUBLE) as DaysHigh,
  XMLCAST(XMLQUERY('/query/results/quote/YearLow' PASSING XMLPARSE(RESPONSE)) AS DOUBLE) as YearLow,
  XMLCAST(XMLQUERY('/query/results/quote/YearHigh' PASSING XMLPARSE(RESPONSE)) AS DOUBLE) as YearHigh
FROM TABLEA;

Similar to the way we consume JSON output, we first use an inline table (common table expression) TABLEA to store the REST XML output as RESPONSE.  We pass the XML response (RESPONSE) to XMLPARSE function to make it XML datatype, then use XMLQUERY function to extract the fields we want. Finally, we use XMLCAST function to cast the result to SQL type we want.

 

The second option is to use XMLTABLE function to extract the XML fields we want.

-- use XMLTABLE
With TABLEA as(
SELECT DB2XML.HTTPGETCLOB(
  CAST ('https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.quotes%20where%20symbol%20in%20(%22IBM%22)&format=xml&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys' AS VARCHAR(255)),
  CAST(NULL AS CLOB(1K))) AS RESPONSE
FROM SYSIBM.SYSDUMMY1)
SELECT XT.* FROM
TABLEA, XMLTABLE(
  '/query/results/quote' PASSING XMLPARSE(RESPONSE)
  COLUMNS
    symbol   VARCHAR(10)   PATH '@symbol',
    DaysLow  DOUBLE        PATH 'DaysLow',
    DaysHigh DOUBLE        PATH 'DaysHigh',
    YearLow  DOUBLE        PATH 'YearLow',
    YearHigh DOUBLE        PATH 'YearHigh') XT;

Output:

SYMBOL     DAYSLOW                DAYSHIGH               YEARLOW      YEARHIGH
IBM        141.51999999999998     142.58999999999997     141.19     182.79
  1 record(s) selected

Summary

There are a lot of practical SQLs for developer. In this article, we talk about pattern matching using regular expression, how to invoke external REST services using HTTPGETCLOB, and how to consume the REST response in JSON and XML using the JSON or XML features in Db2 for z/OS. Hope you can take advantage of them in your application.

 

Resources

Creating REST user-defined functions
https://www.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/inst/src/tpc/db2z_createrestudfs.html

HTTPGETBLOB and HTTPGETCLOB
https://www.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/sqlref/src/tpc/db2z_udf_httpget.html

Regular Expression
https://www.ibm.com/support/knowledgecenter/SSEPEK_10.0.0/xml/src/tpc/db2z_regularexpression.html

Use a SQL Interface to handle JSON data in DB2 11 for z/OS
http://www.ibm.com/developerworks/data/library/techarticle/dm-1501sql-json-db2/index.html

JSON_VAL
http://www.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/json/src/tpc/db2z_bif_jsonval.html

JSON_TABLE
http://www.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/json/src/tpc/db2z_bif_jsontable.html

Convert JSON data to Relational Format
http://www.worldofdb2.com/profiles/blogs/convert-json-data-to-relational-format

1 Like
Recent Stories
Db2 for z/OS locking basics

A Beginners Guide to Locks, Latches, Claims and Drains in DB2 for z/OS by Steve Thomas

Fun with Date Ranges