DB2 - L

Expand all | Collapse all

Db2 v12 High frequence parallell insert

  • 1.  Db2 v12 High frequence parallell insert

    Posted 12 days ago

    Hi,

    We have an application team planning for a feature and have requested a table to be redesigned to use for high frequency, 24/7, single row inserts by multiple transactions in parallel.

    This particular table will function as sort of a logging table where you in certain cases can backtrack the feature. So the main workload for the table will be single inserts (with rare selects). No updates or deletes.

    Some table details
    - The first column is a DEC(16,0) and will be a application controlled sequence, counting upwards 1, 2, 3....
    - There is only a few additional columns, no XML/LOB/CLOB etc.
    - There is a unique and clustering index on the first column
    - There is no good partition key on current columns

    Some considerations I have is regarding performance, locking contention and whether to go for PGB or PBR.

    PBG or PBR?
    My take is that PBR should have an advantage over PBG in regard to parallelism.

    One could create a sequence object, add a new column as a partitioning column and in the insert use the sequence objects number to generate a unique value that's also good for partitioning (you reformat the sequence number a little). Would this validate using PBR over PBG? It there any other methods or preferences to take into consideration in this scenario?

    Locking contention
    As I understand it, when Db2 does an insert it looks for a suitable page, puts it into the BP, X lock it on row level and writes to that page inside the BP. Later, depending on some stuff, it will be written to disk.

    Since it will be a lot of parallel inserts, all wanting to be placed at the end of the table (or possible end of partition in case of PBR), is there a danger of page locking contention or lock escalation? I'm pondering if LOCKSIZE ROW is a good idea to avoid lock escalation.

    Also I'm thinking about APPEND YES. Due to the unique index on first column, who happen to act as a sequence, all new columns will want to end up at the end. So is it more efficient to use APPEND YES to avoid Db2 looking for a suitable place to put the row and just have it forcefully put in the end?

    Performance
    All design decisions have some performance implication and LOCKSIZE ROW, as well as APPEND YES, could be considered an expensive in a lot of cases. Always the risk-reward...


    Looking forward to some great insights!

    Kind regards
    Johan



    ------------------------------
    JohanSundborgSwedbank AB
    ------------------------------


  • 2.  RE: Db2 v12 High frequence parallell insert

    Posted 12 days ago
    I have significant experience with this design concept. Here is a short list of recommendations. I should mention that you can test these performance choices yourself. Take a look at my blog at db2expert.com for testing ideas.

    1) PBR RPN. If you can find a way to identify the threads, or coordinate with application developers to see if they can add something to identify the threads then you can split the work up across multiple partitions. That may be a stretch so the second choice would be to see if you can generate something, like you mentioned, where the inbound insert can be directed by Db2 to a distinct partition. If you do this make a lot of partitions.
    2) table space defined with MEMBER CLUSTER PCTFREE 0 FREEPAGE 0. If the rows are going to be updated and varying length then you may want to consider PCTFREE FOR UPDATE > 0.
    3) table defined with APPEND YES.
    4) I have NOT used INSERT ALGORITHM 2, but it might be an option. I don't know it's current status, but if all the processes will run on a single member then it might be worth testing. By all means test it before you use it.
    5) A minimum of indexes, preferably 1 Partitioned index, if you can manage that. Index should be a larger page size and PCTFREE 0 FREEPAGE 0 if you will be adding entries on the end. If you have additional NPIs then you should set PCTFREE such that you do not incur page splits between index REORGs. Use FREEPAGE if you cannot avoid page splits between index REORGs. I have an article on index design on my web site.

    Using a design similar to this I have hit insert rates upwards of about 57,000 per second on a single table in a data sharing environment. The entire application itself achieved over 80,000 data change operations per second.

    Now, with data sharing you may want to consider group buffer pool cache none for the table space and index space since it is just a log and won't really be shared by this high volume application. However, you should be careful and test it.

    Your results may and likely will vary so do this at your own risk and TEST, TEST, TEST! Keep us posted as to your progress!

    Dan

    +--------------------------------------+-----------------------------------------------------------+
    | Daniel L Luksetich | IBM Certified Advanced Database Administrator – |
    | IBM GOLD Consultant | Db2 10.1 for Linux UNIX and Windows |
    | IDUG Content Committee Past-Chairman | IBM Certified Database Adminstrator – Db2 12 for z/OS |
    | IDUG DB2-L Administrator | IBM Certified System Administrator – Db2 11 for z/OS |
    | URL: https://db2expert.com | IBM Certified Application Developer – Db2 11 for z/OS |
    +--------------------------------------+-----------------------------------------------------------+




  • 3.  RE: Db2 v12 High frequence parallell insert

    Posted 12 days ago
    From my experience on highly updated / inserted tables : 
    - Minimize access to the space map pages , one of the possible bottleneck, remove trackmod ddl option if possible.
    - Eliminate unnecessary search for space to insert, in addition of what have been said,  allocate the space quantity needed so you don't have to search for extensions. 
    - Reduce probability of contention : less pages/ rows handled by a space map, less rows in pages ... 










  • 4.  RE: Db2 v12 High frequence parallell insert

    Posted 6 days ago
    Hi Dan! Thanks for great ideas, I've read your article about index design too and got some more inspiration.

    I got some more info from my developer and he estimates that when this goes live the table till receive roughly 150.000 inserts per day, where most of it will be during daytime. A fair guess could be
    08 - 18 (10h): 100.000 inserts. 10k per hour or 2,8 per sec (with some random peak times)
    18 - 08 (14h): 50.000 inserts. 1 per sec

    This is not quite the high volume I was expecting, and way less than yours 57-80.000 inserts per second.

    1) I think it would be hard to identify the threads and it might also be overkill for this scenario. The key that the developers want to use will add up into 3-5 partitions with even insert spread. That's less than what I had in mind, but I'm wondering if that still might be enough for this type of volume, what do you think?
    2) There will be no update, delete of columns with varchar, so PCTFREE 0, FREEPAGE 0, MEMBER CLUSTER and LOCKSIZE ROW might be a good way to go.
    3) And also include APPEND YES to avoid Db2 wasting time at looking for free space instead of just putting all at the end.
    4) We are not allowed to use insert algorithm 2 in out shop for now, so I'll skip this :)
    5) We are planning on going for 1 index that will be clustering and unique on the first column. Using a larger than normal page size is a good idea, I hadn't considered the cost of page slits but they of cause adds up.

    /Johan

    ------------------------------
    Johan Sundborg, Swedbank AB
    ------------------------------



  • 5.  RE: Db2 v12 High frequence parallell insert

    Posted 6 days ago
    I think all your choices are reasonable given the projected insert rate. Remember to test your choices! I have an article on that as well.
    Cheers,
    Dan

    +--------------------------------------+-----------------------------------------------------------+
    | Daniel L Luksetich | IBM Certified Advanced Database Administrator – |
    | IBM GOLD Consultant | Db2 10.1 for Linux UNIX and Windows |
    | IDUG Content Committee Past-Chairman | IBM Certified Database Adminstrator – Db2 12 for z/OS |
    | IDUG DB2-L Administrator | IBM Certified System Administrator – Db2 11 for z/OS |
    | URL: https://db2expert.com | IBM Certified Application Developer – Db2 11 for z/OS |
    +--------------------------------------+-----------------------------------------------------------+




  • 6.  RE: Db2 v12 High frequence parallell insert

    Posted 5 days ago
    Hi Johan!
    Just en few additional details.
    Use TRACKMOD NO to avoid space map page contention.
    If definting the tablespace with COMPRESS YES would give you a good compression ratio this could be a good idea as this could help as you write shorter records
    If you set PCTREE FOR UPDATE to a value that corresponds to the lengt of one row rounded up to Db2 would mark the pages as full just beforde they really are full. This would stop Db2 to try to insert rows on a page where they do not fit. Say that the uncopressed row lengh is 100 bytes and you use 4k pages, set PCTFREE FOR UPDATE to 100*(100/4056) rounded up = 3
    A good solution could be to use UTS PBR and partition by a hidden ROWID.
    Use a large PRIQTY for both the tablespace and the index and not -1
    I do not think that IAG2 would be a good solution in this case, but I may be missing something in your description.
    Have a good weekend!
    Regards Martin

    ------------------------------
    MartinÅlundHandelsbanken
    ------------------------------



  • 7.  RE: Db2 v12 High frequence parallell insert

    Posted 2 days ago
    Hi Martin!

    Indeed, TRACKMOD NO is likely the way to go, but have to skip incremental image copy and only use full image copies then.

    Compress might help with space, but doesn't it also slow down the inserts?

    Interesting idea with PCTFREE for update, but doesn't that just lower the roof of possible amount if inserts per page by 1 while presenting Db2 with the same issue where is must check if any space is available for one more row or not? Or does Db2 have a smarter handling of the last piece of space if it's ear marked to UPDATE and not the hard roof of the page?

    ROWID is not without complications and will not persist between tables (yes you can have the receiving table as GENERATED BY DEFAULT, but in a 'reorg-discard from base and load into archive'-relationship you might run into duplicate ROWIDs from the base table). It also makes recovery more restricted in regard to options available. What strengthens do you see with using ROWID as partitioning key?

    Totally agree with preallocating a large PRIQTY, should improve performance.

    Thanks, hope your weekend was good!
    Regards
    Johan

    ------------------------------
    Johan Sundborg, Swedbank AB
    ------------------------------



  • 8.  RE: Db2 v12 High frequence parallell insert

    Posted 2 days ago
    Hello Johan,

    you can still use incremetal copies but it will look on every page for some changes instead using the spacemap-pages.
    You can run some tests to see if the (cpu-)costs for iincremental copy on TRACKMOD NO Tablespace change a lot vs. a TRACKMOD YES-TS.

    Kind regards, Rolf

    ------------------------------
    RolfDreesFinanz Informatik GmbH & Co. KG
    ------------------------------



  • 9.  RE: Db2 v12 High frequence parallell insert

    Posted 2 days ago
    Hi Johan!

    Feels a little strange to write in English to you...

    You can still use incremental copy with TRACKMOD NO. You will get a copy with just the changes in a smaller data set than for en full copy. The back side is that it will use aproximately the same amount of CPU and elapse time as i full copy as it has to read all the pages and not just the space map pages to find which rows are new or updated.

    As compress use hardware compression, the time for compressing the rows is not an issue in most cases. Do not use it if you have a low compression ratio.

    The old-fashioned way to make Db2 mark the pages as full when there isn't any room for more pages is to use MAXROWS. With variable length rows and/or compression it is impossible to calculate the right number for MAXROWS. Using PCTFREE FOR UPDATE is a "smarter" way to get the same result. You want Db2 to mark the page as full when there isn't room enough for another row. If you for example use PCTFREE FOR UPDATE 10 Db2 will reserve 406 bytes that can only be used by UPDATE operations, when the UPDATE causes the row to grow. But that isn't all true. If you have filled 3500 bytes using insert you can still insert another row that occupy space from the reserved 406 bytes. When that row is inserted the page is marked as full with just a "bit". At the next INSERT Db2 will look at that bit and see that this page is full. If you set PCTFREE FOR UPDATE 0 there is no reserved space. If you have filled 4000 bytes with data en tries to insert another row, Db2 will have to try to insert the row in that page and then realise that there isn't enough space. It is much better if Db2 can see directly that the page is full by just looking at one bit.

    The idea of using a hidden ROWID for partitioning is that you get better parallelism with UTS PBR than with UTS PBG. That is an alternative when you do not have a good partitioning key in the data. You will never use the value in any way, it is just for partitioning. It is not a problem if you do not get the same ROWID when recovering using undo och redo records.

    Regards Martin

    ------------------------------
    MartinÅlundHandelsbanken
    ------------------------------



  • 10.  RE: Db2 v12 High frequence parallell insert

    Posted 10 days ago
    Hello Johan,

    I know that when it comes to high-volume Inserts everyone loves APPEND YES. But there is one point that you may want to consider. When DB2 locks a page for insert, it takes an unconditional lock. So if the page is already locked by parallel running insert, it will not wait until the lock is released. Instead DB2 will look for the next page and tries to X-lock that page. 
    In the result you may get for parallel running insets pages that holds only 1 row (depending on how many UoWs run parallel), what is not good for further processing (what you don't have) and may result in high space-consumption of the table space..

    Kind regards, Rolf



    ------------------------------
    RolfDreesFinanz Informatik GmbH & Co. KG
    ------------------------------



  • 11.  RE: Db2 v12 High frequence parallell insert

    Posted 10 days ago
    Edited by Gareth Copplestone-Jones 10 days ago
    Hello Johan.

    To answer your original question is a round-about way, Dan Luksetich comes up with some good points about MEMBER CLUSTER, FREEPACGE, FREESPACE and APPEND.  

    That needs to be supplemented with Rolf Drees's point about the potential for rapid growth of the tablespace in terms of size. For that reason, with MEMBER CLUSTER and APPEND, you should use LOCKSIZE ROW. The overhead is minimised because, when using MEMBER CLUSTER, there will be no page P-lock negotiation provided there is no access from the other members. You are right that it is it more efficient to use APPEND YES to avoid Db2 looking for a suitable place to put the row. Much more efficient.

    In your case, when you have SELECT activity, then there will be occasions when the data sharing overhead is increased. Be aware, however, that MEMBER CLUSTER and APPEND are likely to lead to poorer performance for applications that read the data sequentially, because clustering is not honoured.

    One final point, not relevant in your case, is that any update or delete activity from the other members will nullify the MEMBER CLUSTER benefits and is likely to lead to high levels of contention and data sharing overhead.

    And one more final point - you should not use IAG2, at least until APAR PH26498 is closed and you have applied the fix.

    Kind regards, Gareth


    ------------------------------
    Gareth Copplestone-Jones
    Triton Consulting
    ------------------------------



  • 12.  RE: Db2 v12 High frequence parallell insert

    Posted 9 days ago
    Hi,

    Just to add one point from my own experience of using this technique (MC00 with APPEND) to facilitate high INSERT activity.  I would normally ensure that frequent REORG of these objects is in place too.
    (Apologies if somebody has already made this point!).

    Regards,
    Paul





  • 13.  RE: Db2 v12 High frequence parallell insert

    Posted 2 days ago
    Hi Rolf and Gareth,

    Yes the space part is very interesting and I tried to search the web for some examples and details regarding the issue, but found only very few and old blog posts, good to know that it still works about the same in V12.

    I wont matter if the selects gets bad performance since they will only occur in case of someone want to use the table for troubleshooting (expected to be rare).

    Thanks for the clarification around MEMBER CLUSTER and APPEND, you should use LOCKSIZE ROW :)

    /Johan

    ------------------------------
    Johan Sundborg, Swedbank AB
    ------------------------------