Db2: Loading and unloading of data on remote clients with external tables
Unload Db2 data to remote client
Today, I am going to take a look at another variation of external tables. They can be used to load data from a remote Db2 client or unload data to that client – directly from or into a file. I tested it with the Db2 CLP and with a Python script and it was quite easy.
Remote clients
In my previous posts on Db2 External Tables, I showed examples where the files with the external data were located either on local file system available to the Db2 database server or on remote storage. Remote clients now add a third location type, the local file system available to a Db2 client.
The CREATE EXTERNAL TABLE statement has an option REMOTESOURCE. Its default value is LOCAL. It basically means "not remote" and that files are on the database server. Other possible values are "YES, "GZIP", and "LZ4". You can also specify "JDBC", "ODBC", or "OLE-DB" instead of "YES". In all cases it means that the data file is not on the database server, but resides on a different (remote) machine, typically the database client/app server. The values "GZIP" and "LZ4" indicate the compression algorithm to be applied before/after transfering the data. If only "YES" is specified, the data is not compressed on the wire.
Loading and unload data with external tables
In my last post on external tables, I explained named and transient external tables. You can use transient tables, those without a catalog entry, to load data from and unload data to a Db2 client, here shown for the Db2 Command Line Processor (CLP).
db2 "create external table 'cattables.csv' using (remotesource yes nolog true) as select tabname, tabschema from syscat.tables"
The above statement would result in a file on the client with data sourced from the SELECT statement (unload). Similarly, it is possible to load data from a file on the Db2 client and INSERT it into a database table:
db2 "insert into mytable select * from external 'mydata.csv' using (remotesource yes nolog true)"
The above, would also work when done in an app. The Db2 documentation has pages dedicated to Java and unloading or loading data from remote clients. I tested it with a Python script which I connected to Db2 on Cloud on IBM Cloud and "from outside" to Db2 Community Edition running in a container.
import ibm_db |
|
|
|
# connect to Db2 CE |
|
conn_str='database=testdb2;hostname=localhost;port=50000;protocol=tcpip;uid=db2inst1; |
|
ibm_db_conn = ibm_db.connect(conn_str,'','') |
|
|
|
|
|
# create table if not yet present and make sure it is empty |
|
ibm_db.exec_immediate(ibm_db_conn,'create table if not exists testcat(n varchar(128),s varchar(128))') |
|
ibm_db.exec_immediate(ibm_db_conn,'delete from testcat') |
|
|
|
# check that table is empty |
|
res=ibm_db.exec_immediate(ibm_db_conn,'select count(*) from testcat') |
|
if res is not None: |
|
row = ibm_db.fetch_tuple(res) |
|
if row is not None: |
|
print("Count: {}\n".format(row[0])) |
|
|
|
# load from local file on client to the (remote) Db2 server |
|
ibm_db.exec_immediate(ibm_db_conn, "insert into testcat select * from external 'my_cat_tables.csv' using (REMOTESOURCE YES)") |
|
|
|
# check that data was inserted |
|
res=ibm_db.exec_immediate(ibm_db_conn,'select count(*) from testcat') |
|
if res is not None: |
|
row = ibm_db.fetch_tuple(res) |
|
if row is not None: |
|
print("Count: {}\n".format(row[0])) |
|
|
|
# close connection |
|
ibm_db.close(ibm_db_conn) |
view rawdb2_remote_external_table.py hosted with ❤ by GitHub
In another script version I used a CREATE EXTERNAL TABLE statement similar to the one above to unload and download the data to the client. No handling of CSV files is necessary in the Python script, all done by the Db2 code.
About the Author
Dr. Henrik Loeser (“Data Henrik”) is an Open Group certified Distinguished IT Specialist. Based in Germany, he helps customers tackle data and AI-related problems and has been around the IT scene for some decades.
Original Post:
Data Henrik: Db2: Loading and unloading of data on remote clients with external tables
If you have feedback, suggestions, or questions about this post, please reach out to me on Mastodon (@data_henrik@mastodon.social) or LinkedIn.