Insert performance

Ivanna Vieira

Insert performance

Hello, I've always heard that using parameter markers in an insert operation performs better than inserting with literals. But I had the opposite experience, which I'd like to understand.

 

I'm using DB2 11.1.3.3 on Linux.

 

I needed to insert 57000 rows into an empty table from a CSV file. I am not allowed to use the load utility in the target database.

 

I used anonymous blocks like the ones below, one for every 5000 rows (at first I tried putting all the rows in a single block, but it exceeded the command length limit):

 

BEGIN
declare v_insert_text varchar(32672);
declare v_insert_stmt statement;
set v_insert_text = 'insert into <schema>.<target_table> (<list_of_columns>) values (?, ?, ?)';
prepare v_insert_stmt from v_insert_text;
execute v_insert_stmt using value1, value2, value3;
...
execute v_insert_stmt using value1, value2, value3;
commit;
END@
...
BEGIN
declare v_insert_text varchar(32672);
declare v_insert_stmt statement;
set v_insert_text = 'insert into <schema>.<target_table> (<list_of_columns>) values (?, ?, ?)';
prepare v_insert_stmt from v_insert_text;
execute v_insert_stmt using value1, value2, value3;
...
execute v_insert_stmt using value1, value2, value3;
commit;
END@

 

I put all the commands in an SQL file and executed the script using the DB2 CLP. It took 01:33:26 to complete. During execution, I noticed on the monitor that, as far as I could tell, DB2 was spending most of the time preparing the statement.

 

(Please find attached the monitor screen)

 

Then I truncated the table and did the same task using row-by-row inserts with literals, again with a commit every 5000 rows. It took 00:03:11 to complete.

 

Can anyone help me understand what happened?

 

Best Regards,

Ivanna Vieira

Attachments

  • db2top.txt (3.7k)

Michael Hannan

RE: Insert performance
(in response to Ivanna Vieira)

Ivanna,

I am mainly a DB2 for z/OS person; however, it should be pointed out that preparing a dynamic SQL statement may well cost a lot more than the simple insert itself, especially if there are very few indexes.

For inserts with parameter markers, you should be executing the Prepare once (in a normal application program), followed by the Insert many times (5000) with different parameter values, although an intervening Commit may cause a need to re-prepare.
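
For illustration, a minimal sketch of that prepare-once, execute-many pattern, assuming Python with the ibm_db driver (the connection string, file name, schema, and column names are placeholders, not taken from the original script):

import csv
import ibm_db

# Placeholder connection string; adjust for your environment.
conn = ibm_db.connect("DATABASE=mydb;HOSTNAME=myhost;PORT=50000;PROTOCOL=TCPIP;UID=me;PWD=secret;", "", "")
ibm_db.autocommit(conn, ibm_db.SQL_AUTOCOMMIT_OFF)

# Prepare the parameterized INSERT once; only EXECUTE is repeated per row.
stmt = ibm_db.prepare(conn, "INSERT INTO myschema.target_table (col1, col2, col3) VALUES (?, ?, ?)")

with open("your_data_file.csv", newline="") as f:
    for i, row in enumerate(csv.reader(f), start=1):
        ibm_db.execute(stmt, tuple(row))   # bind the three CSV fields to the markers
        if i % 5000 == 0:
            ibm_db.commit(conn)            # commit every 5000 rows, as in the original script

ibm_db.commit(conn)
ibm_db.close(conn)

With the LUW CLI driver the prepared statement normally remains usable across the intermediate commits, so the prepare really is done only once here.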

I would not suggest a commit after every row.

These are general principles for dynamic SQL, not really related to your means of executing the script (I don't know what it does behind the scenes).

So are you executing Prepare too many times when you use markers instead of literals?

Michael Hannan,

DB2 Application Performance Specialist
CPT Global Ltd

Ivanna Vieira

RE: Insert performance
(in response to Michael Hannan)

Hello Michael,

Thank you for the analysis.

I'm not committing after each row but every 5000 rows. And yes, the statement is being prepared many times, since I'm starting a new anonymous block every 5000 rows. I'm doing this because I could not insert all 57000 rows in a single block.

Best Regards,

Ivanna Vieira

Walter Janißen

RE: Insert performance
(in response to Ivanna Vieira)
Hi

Did you try KEEPDYNAMIC (YES) bind-option?

Kind regards
Walter Janißen

ITERGO Informationstechnologie GmbH

Michael Hannan

RE: Insert performance
(in response to Ivanna Vieira)

If you only prepare once per 5000 insert statements, with the prepared statement then destroyed by a COMMIT, then Prepare is not likely to dominate your costs. KEEPDYNAMIC retains the prepared path for longer, but it relies on the Prepare not being re-executed; an explicit PREPARE should always do a prepare. So my understanding is that a program must be written with KEEPDYNAMIC in mind to be able to benefit from it.

So I am a bit baffled as to where the cost is going.

In Reply to Ivanna Vieira:

Hello Michael,

Thank you for the analysis.

I'm not committing after each row but at every 5000 rows. And yes, the statement is being prepared many times since I'm starting a new anonymous block every 5000 rows. I'm doing this because I could not insert all 57000 rows in a single block.

Best Regards,

Ivanna Vieira


Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

Edited By:
Michael Hannan @ Nov 27, 2018 - 02:38 PM (Europe/Berlin)

bernd oppolzer

Insert performance
(in response to Ivanna Vieira)
Hello Ivanna,

I'm doing (almost) the same as you: prepare once and execute often (thousands of times, until a COMMIT; the commit frequency is controlled by a global parameter), then prepare again, and so on. I observed a significant performance benefit in my situation by doing this instead of using literals in every insert and preparing every time (which is what I did in the beginning).

But there are some differences:

a) I am doing this in a C program; I don't know what kind of programming environment you are using.

b) I don't use parameter markers; I use an SQLDA (SQL descriptor area) that holds the addresses of the variables for the INSERT. The SQLDA is built during the prepare process, and the variables are filled in before every INSERT by the application program.

(that is: EXECUTE STMT USING DESCRIPTOR :sqlda)

Kind regards

Bernd



Michael Hannan

RE: Insert performance
(in response to bernd oppolzer)

Ivanna,

Is it possible that your application does something strange "behind the scenes" that a normal program might not do, such as a Describe, Commit, or Prepare? Are you really sure that the high cost is in Prepare? If so, it would be nice to get the Db2 accounting (101) data for the thread, which includes the number of Prepares, Describes, etc. executed and breaks CPU time down by type.
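
On Db2 for LUW, one way to check where the time is going is to look at the dynamic statement entries in the package cache, which record executions and prepare time per statement. A minimal sketch, assuming the MON_GET_PKG_CACHE_STMT table function and the Python ibm_db driver (the connection string is a placeholder):

import ibm_db

# Placeholder connection string; adjust for your environment.
conn = ibm_db.connect("DATABASE=mydb;HOSTNAME=myhost;PORT=50000;PROTOCOL=TCPIP;UID=me;PWD=secret;", "", "")

# For each dynamic statement still in the package cache, show how often it ran
# and how much time went into compiling it (PREP_TIME) versus executing it.
sql = ("SELECT NUM_EXECUTIONS, PREP_TIME, STMT_EXEC_TIME,"
       "       CAST(SUBSTR(STMT_TEXT, 1, 80) AS VARCHAR(80)) AS STMT"
       "  FROM TABLE(MON_GET_PKG_CACHE_STMT('D', NULL, NULL, -2))"
       " ORDER BY PREP_TIME DESC")
result = ibm_db.exec_immediate(conn, sql)
row = ibm_db.fetch_assoc(result)
while row:
    print(row)
    row = ibm_db.fetch_assoc(result)
ibm_db.close(conn)

If the INSERT shows up many times (once per anonymous block) with a large share of time in PREP_TIME, that would confirm the suspicion that the repeated prepares are the cost.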

 

Michael Hannan,
DB2 Application Performance Specialist
CPT Global Ltd

Greg Palgrave

RE: Insert performance
(in response to Ivanna Vieira)

Hi Ivanna,

I can understand you cannot use the LOAD utility as it tends to leave things in pending states, but are you able to use the IMPORT utility with the INSERT option?

Assuming there are no generated columns or defaults that you need to worry about, something like this would work fairly quickly from the Db2 command line:

 

CONNECT TO <database>;

IMPORT FROM "your_data_file.csv" OF DEL MESSAGES "your_data_file.import.txt" INSERT INTO <schema>.<target_table>;

CONNECT RESET;

 

57,000 rows isn't really that many for an import, but you could add in a COMMITCOUNT if you are concerned about the commits/logging:

IMPORT FROM "your_data_file.csv" OF DEL COMMITCOUNT 5000 MESSAGES "your_data_file.import.txt" INSERT INTO <schema>.<target_table>;

It's all documented: https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0008304.html

 

Cheers

Greg
 
In Reply to Ivanna Vieira:

I'm using DB2 11.1.3.3 on Linux.

 

I needed to insert 57000 rows into an empty table from a CSV file. I am not allowed to use the load utility in the target database.

Ivanna Vieira

RE: Insert performance
(in response to Walter Janißen)

It was a one-time execution task, so I did not find it necessary to have a program to do this. But it's a good tip I'll keep in mind for next time. Thank you.

Best Regards,

Ivanna Vieira

Ivanna Vieira

RE: Insert performance
(in response to Greg Palgrave)

Hello Greg,

Yes, IMPORT would be a good alternative, especially with the partial commits. Thanks for the tip.

Best Regards,

Ivanna Vieira