LOAD FROM CURSOR Using Named Pipes

How many times have you wondered how you are going to move data from one database to another? Should you ask for yet another way-too-slow NFS mount between two servers? Will you first dump the data on the disks of the source server, perhaps compress it, copy the file to the target server, and uncompress it before you can use it? Do source and target even have enough disk space to hold the data file in the first place?

All those questions came up again a while ago when I was challenged to copy a 2.7 TB database to a 500 GB disk. This is of course an impossible task: a 500 GB target disk will never be big enough to hold both the extracted data files and the database into which the data is copied. I thought I would have to come up with a way to copy just a subset of the data.

I came across a feature that was new to me: feeding a LOAD through a named pipe. I found out that this "new" feature has been around for quite some time, but that doesn't mean it isn't worth mentioning! With this method I do not have to dump the data to any kind of disk first; instead I open a data stream on the source side and load the data directly into the target database. Another plus is that you do not need to flip the FEDERATED database manager configuration parameter to YES (and stop/start the instance), which you would have to do if you wanted to use LOAD FROM CURSOR against federated nicknames. There are multiple ways to get the job done using this method, but these are the steps I used – all on the target database server:

1. Create a file – using SQL – with a list of tables I needed

As some tables have definitions a load should know about, I first needed an SQL script that dumped the necessary information. The other scripts use this information to come up with the correct syntax, e.g. a LOAD command with the appropriate keywords for tables using identity columns, LOBs or XML. Check the doGetTables.sql script included below.
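
The query's output feeds the two scripts further down, which split each line with cut. A minimal sketch of the expected format, assuming a hypothetical TESTER.EMPLOYEES table with an identity column:

```shell
# One hypothetical line as doGetTables.sql would emit it, in the form
#   TABSCHEMA:TABNAME:IDENTITY:GENERATED:HAS_LOBS[:NUMBER_OF_RECORDS]
echo 'TESTER:EMPLOYEES:Y:N:N' > tableList.txt

# The pusher/listener scripts read individual fields with cut; here
# field 3 ('Y') tells the load to add the IDENTITYOVERRIDE modifier.
cut -d ':' -f3 tableList.txt
```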

2. Catalog the source database locally

First catalog the source database server as a node, then catalog the source database for that node.
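
As a sketch, the node name, host, port and remote database name below are placeholders to be adapted to your environment:

```shell
# SRCNODE, src-host.example.com, port 50000 and database SAMPLE are
# placeholders; SRC_SAMPLE is the local alias used further down.
db2 catalog tcpip node SRCNODE remote src-host.example.com server 50000
db2 catalog database SAMPLE as SRC_SAMPLE at node SRCNODE
db2 terminate

# Verify both entries:
db2 list node directory
db2 list database directory
```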

3. Create a script to open up a communication pipe channel and start the export

At first I wanted a script that could load from multiple cursors at the same time, and therefore I created a pipe per table. During the testing phase of my script it turned out that the number of concurrently open pipes is limited. I kept the approach of opening an individual pipe per exported table, but eventually did not pursue loading through multiple parallel sessions. The script startExportPusher.ksh is included below as well. An example of such an export, stripped of any overhead, could be:

db2 connect to SRC_SAMPLE user ... using ...
# You have total freedom in choosing the name of the pipe,
#   here 'copyData.pipe'
mkfifo copyData.pipe
db2 "EXPORT TO copyData.pipe OF IXF MESSAGES copyData.out.msg SELECT * FROM SRC_INST.EMPLOYEES"
rm -f copyData.pipe
db2 connect reset

These few lines open a communication channel, start the export and wait for the receiving script to transfer all the data. Once all data has been copied, the communication channel is removed and the connection to the source database is closed.
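
The pipe mechanics can be illustrated with plain shell tools, no Db2 needed – here cat stands in for the EXPORT on the writer side and the LOAD on the reader side:

```shell
# Create the named pipe (a FIFO special file, not a regular file).
mkfifo demo.pipe

# Writer: streams data into the pipe in the background; the write
# blocks until a reader attaches, just like the EXPORT does.
printf 'row1\nrow2\nrow3\n' > demo.pipe &

# Reader: consumes the stream; no intermediate copy of the data
# ever lands on disk.
cat demo.pipe > received.txt

wait            # let the background writer finish
rm -f demo.pipe # remove the pipe, as the script does after the EXPORT
```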

4. Create a second script listening on the communication channel and load the data from cursor

With one script pulling data from the source database and pushing it through the pipe, we need another script to pick up the data stream and load it into the target database. The script startLoadListener.ksh does exactly that. A clean example could be:

db2 connect to TRG_SAMPLE
db2 "LOAD FROM copyData.pipe OF IXF MESSAGES copyData.in.msg REPLACE INTO DB2INST1.EMPLOYEES NONRECOVERABLE"
db2 connect reset

Afterwards you'll still need to perform some checks:

    • Is the number of exported rows (found in copyData.out.msg) the same as the number of committed rows (found in copyData.in.msg)?
    • Was any data lost along the way? Verify the integrity of the copied data.
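
The first check can be scripted. A sketch, assuming the usual wording of the EXPORT summary (message SQL3105N) and the LOAD summary line – the two message files below are synthetic stand-ins for illustration, the real ones are written by the utilities:

```shell
# Synthetic message files; real content depends on your Db2 version.
cat > copyData.out.msg <<'EOF'
SQL3105N  The Export utility has finished exporting "1250" rows.
EOF
cat > copyData.in.msg <<'EOF'
Number of rows committed   = 1250
EOF

# Pull the row counts out of both files and compare them.
exported=$(grep -o '"[0-9][0-9]*"' copyData.out.msg | tr -d '"')
committed=$(grep -o '[0-9][0-9]*' copyData.in.msg | tail -1)
if [ "$exported" = "$committed" ]; then
  echo "row counts match: $exported"
else
  echo "MISMATCH: exported $exported, committed $committed"
fi
# prints: row counts match: 1250
```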

SQL and Script Files

doGetTables.sql

with config(TABSCHEMA, NUMBER_OF_RECORDS) AS (
  VALUES ('TESTER', 0)
),

getTableInfo AS (
  SELECT
    t.TABSCHEMA,
    t.TABNAME
  FROM
    SYSCAT.TABLES t,
    config
  WHERE t.TABSCHEMA = config.TABSCHEMA
    AND t.TBSPACE NOT IN ('SYSTOOLSPACE')
    AND t.TABNAME NOT LIKE 'ADVISE_%'
    AND t.TABNAME NOT LIKE 'EXPLAIN_%'
    AND t.TYPE = 'T'
),

getColumnInfo AS (
  SELECT
    t.TABSCHEMA,
    t.TABNAME,
    c.IDENTITY,
    c.GENERATED
  FROM
    getTableInfo t,
    SYSCAT.COLUMNS c
  WHERE c.TABSCHEMA = t.TABSCHEMA
    AND c.TABNAME = t.TABNAME
  GROUP BY
    t.TABSCHEMA, t.TABNAME, c.IDENTITY, c.GENERATED
),

decisionMatrix AS (
  SELECT
      t.TABSCHEMA
    , t.TABNAME
    , COALESCE( ( SELECT 'Y'
                  FROM getColumnInfo c
                  WHERE c.TABSCHEMA = t.TABSCHEMA
                    AND c.TABNAME = t.TABNAME
                    AND c.IDENTITY = 'Y'
                ), 'N' ) as IDENTITY
    , CASE
        WHEN COALESCE( ( SELECT 1
                         FROM getColumnInfo c
                         WHERE c.TABSCHEMA = t.TABSCHEMA
                           AND c.TABNAME = t.TABNAME
                           AND c.IDENTITY = 'N'
                           AND c.GENERATED IN ('A','D')
                       ), 0 ) = 1
        THEN ( SELECT c.GENERATED
               FROM getColumnInfo c
               WHERE c.TABSCHEMA = t.TABSCHEMA
                 AND c.TABNAME = t.TABNAME
                 AND c.IDENTITY = 'N'
                 AND c.GENERATED IN ('A','D')
             )
        ELSE 'N'
      END as GENERATED_1
    , CASE
        WHEN COALESCE( ( SELECT 1
                         FROM getColumnInfo c
                         WHERE c.TABSCHEMA = t.TABSCHEMA
                           AND c.TABNAME = t.TABNAME
                           AND c.IDENTITY = 'Y'
                           AND c.GENERATED = 'D'
                       ), 0 ) = 1
        THEN 'D'
        ELSE 'N'
      END as GENERATED_2
    , COALESCE( ( SELECT DISTINCT 'Y'
                  FROM SYSCAT.COLUMNS c
                  WHERE c.TABSCHEMA = t.TABSCHEMA
                    AND c.TABNAME = t.TABNAME
                    AND ( c.TYPENAME LIKE '%LOB%'
                       OR c.TYPENAME LIKE '%XML%' )
                ), 'N' ) HAS_LOBS
  FROM
    getTableInfo t
)

SELECT
  TRIM(dM.TABSCHEMA) || ':' || TRIM(dM.TABNAME) || ':' ||
  TRIM(dM.IDENTITY) || ':' ||
  CASE WHEN dM.GENERATED_1 = 'N' AND dM.GENERATED_2 <> 'N'
       THEN dM.GENERATED_2
       ELSE dM.GENERATED_1
  END || ':' ||
  dM.HAS_LOBS ||
  CASE WHEN c.NUMBER_OF_RECORDS > 0
       THEN ':' || TRIM(CAST(c.NUMBER_OF_RECORDS as VARCHAR(10)))
       ELSE ''
  END
FROM
  decisionMatrix dM,
  config c
ORDER BY
  dM.TABSCHEMA, dM.TABNAME
@
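
A possible way to run this script and capture its output – the script ends its statement with '@', so the CLP needs a matching terminator, and -x suppresses the column headers so only the data rows land in the file:

```shell
db2 connect to SRC_SAMPLE user ... using ...
db2 -x -td@ -f doGetTables.sql > tableList.txt
db2 connect reset
```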

startLoadListener.ksh

#!/bin/ksh

typeset lDatabase=

function stopWorking {
  db2 connect reset > /dev/null 2>&1
  exit
}

typeset lTableDefs=$( cat tableList.txt | egrep -v '^$|^#|^\-\-' )
typeset -u lToets="Next"

typeset -i lNumberOfTables=$( echo "${lTableDefs}" | wc -l | tr -d ' ' )
typeset -i lCurrentTableNo=1

db2 connect to ${lDatabase}
for lTableDef in ${lTableDefs}
do
  typeset lSchema=$( echo "${lTableDef}" | cut -d ':' -f1 )
  typeset lTable=$( echo "${lTableDef}" | cut -d ':' -f2 )
  typeset lIdentity=$( echo "${lTableDef}" | cut -d ':' -f3 )
  typeset lGenerated=$( echo "${lTableDef}" | cut -d ':' -f4 )
  typeset lHasLobs=$( echo "${lTableDef}" | cut -d ':' -f5 )
  typeset lNumberOfRecords=$( echo "${lTableDef}" | cut -d ':' -f6 )

  echo "${lSchema}.${lTable}"
  rm -f ${lSchema}_${lTable}.in.msg
  typeset lLoadCmd="load from datapipe.${lSchema}.${lTable} of ixf
modified by ##Identity## ##Generated## ##Lobs##
messages ${lSchema}_${lTable}.in.msg
REPLACE INTO "${lSchema}"."${lTable}"
NONRECOVERABLE
"

  ##
  ## Decision table
  ##
  ## IDENTITY COLUMN | GENERATED | IdentityOverride | GeneratedOverride |
  ## ----------------+-----------+------------------+-------------------+
  ## Yes             | Always    |                  |                   |
  ## -               | Always    | Yes              | Yes               |
  ## -               | Always    |                  |                   |
  ## ----------------+-----------+------------------+-------------------+
  ## Yes             | Default   | No               | No                |
  ## ----------------+-----------+------------------+-------------------+
  ## Yes             | Always    | Yes              | No                |
  ## ----------------+-----------+------------------+-------------------+
  ## -               | Always    | No               | Yes               |
  ## ----------------+-----------+------------------+-------------------+
  ##

  if [ "${lIdentity}" == "Y" ] ; then
    if [ "${lGenerated}" == "D" ] ; then
      lLoadCmd=$( echo "${lLoadCmd}" | sed -e 's/##Identity##//g' -e 's/##Generated##//g' )
    fi
    if [ "${lGenerated}" == "A" ] ; then
      lLoadCmd=$( echo "${lLoadCmd}" | sed 's/##Generated##/GENERATEDOVERRIDE/g' )
    fi
    lLoadCmd=$( echo "${lLoadCmd}" | sed 's/##Identity##/IDENTITYOVERRIDE/g' )
  fi
  if [ "${lGenerated}" == "A" ] ; then
    lLoadCmd=$( echo "${lLoadCmd}" | sed 's/##Generated##/GENERATEDOVERRIDE/g' )
  fi
  if [ "${lHasLobs}" == "Y" ] ; then
    lLoadCmd=$( echo "${lLoadCmd}" | sed 's/##Lobs##/LOBSINFILE/g' )
  fi

  lLoadCmd=$(
    echo "${lLoadCmd}" \
    | sed -e 's/ ##[A-Za-z]*##//g' \
    | grep -iv '^[ ]*modified by[ ]*$'
  )

  echo "--"
  echo "(${lCurrentTableNo}/${lNumberOfTables}) ${lLoadCmd}:"
  db2 "${lLoadCmd}"

  [[ ${lCurrentTableNo} -lt ${lNumberOfTables} ]] && read lToets?"Press a key to continue"
  [[ "${lToets}" == "S" ]] && stopWorking
  lCurrentTableNo=$(( lCurrentTableNo + 1 ))
done
stopWorking

startExportPusher.ksh

#!/bin/ksh

typeset lDatabase=
typeset lUser=
typeset lPwd=

function stopWorking {
  db2 connect reset > /dev/null 2>&1
  rm -f datapipe.*
  exit
}

typeset lTableDefs=$( cat tableList.txt | egrep -v '^$|^#|^\-\-' )
typeset -u lToets="Next"

typeset -i lNumberOfTables=$( echo "${lTableDefs}" | wc -l | tr -d ' ' )
typeset -i lCurrentTableNo=1

rm -f datapipe.*
db2 connect to ${lDatabase} user ${lUser} using ${lPwd} > /dev/null 2>&1
for lTableDef in ${lTableDefs}
do
  typeset lSchema=$( echo "${lTableDef}" | cut -d ':' -f1 )
  typeset lTable=$( echo "${lTableDef}" | cut -d ':' -f2 )
  typeset lIdentity=$( echo "${lTableDef}" | cut -d ':' -f3 )
  typeset lGenerated=$( echo "${lTableDef}" | cut -d ':' -f4 )
  typeset lHasLobs=$( echo "${lTableDef}" | cut -d ':' -f5 )
  typeset lNumberOfRecords=$( echo "${lTableDef}" | cut -d ':' -f6 )
  typeset lOrderBy=$( echo "${lTableDef}" | cut -d ':' -f7 )

  mkfifo datapipe.${lSchema}.${lTable}
  rm -f ${lSchema}_${lTable}.out.msg

  typeset lExportCmd="export to datapipe.${lSchema}.${lTable} of ixf
##Lobs##
messages ${lSchema}_${lTable}.out.msg
SELECT * FROM "${lSchema}"."${lTable}"
##OrderBy##
##Fetch##
WITH UR
FOR READ ONLY
"
  if [ "${lHasLobs}" == "Y" ] ; then
    lExportCmd=$( echo "${lExportCmd}" | sed "s/##Lobs##/LOBS TO ./g" )
  fi
  if [ "${lNumberOfRecords}" -gt 0 ] ; then
    lExportCmd=$( echo "${lExportCmd}" | sed "s/##Fetch##/FETCH FIRST ${lNumberOfRecords} ROWS ONLY/g" )
  fi
  if [ "${lOrderBy}" != "" ] ; then
    lExportCmd=$( echo "${lExportCmd}" | sed "s/##OrderBy##/ORDER BY ${lOrderBy}/g" )
  fi
  lExportCmd=$( echo "${lExportCmd}" | egrep -v "##Fetch##|##Lobs##|##OrderBy##" )

  echo "--"
  echo "${lSchema}.${lTable} (${lCurrentTableNo}/${lNumberOfTables})"
  echo "${lExportCmd}"
  db2 "${lExportCmd}"
  rm -f datapipe.${lSchema}.${lTable}

  [[ ${lCurrentTableNo} -lt ${lNumberOfTables} ]] && read lToets?"Press a key to continue"
  [[ "${lToets}" == "S" ]] && stopWorking
  if [ "${lHasLobs}" == "Y" ] ; then
    rm -f datapipe.${lSchema}.${lTable}.*.lob
  fi
  lCurrentTableNo=$(( lCurrentTableNo + 1 ))
done
stopWorking
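
One possible way to run the pair: two terminal sessions on the target server, side by side, since both scripts prompt between tables. Start the pusher first – it creates the pipes, and each EXPORT simply blocks until the listener attaches a reader:

```shell
# Terminal 1 - connects to the source database and pushes the exports;
# each EXPORT blocks until the listener starts reading its pipe.
./startExportPusher.ksh

# Terminal 2 - connects to the target database and loads from the pipes.
./startLoadListener.ksh
```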
