DB2 - L


Reset Log RBA in DB2 Z/OS 10

  • 1.  Reset Log RBA in DB2 Z/OS 10

    Posted 10 days ago
    We are getting close to the end of the 6-byte RBA range in DB2 10, and we have followed the instructions published by IBM at
    https://www.ibm.com/support/knowledgecenter/SSEPEK_10.0.0/admin/src/tpc/db2z_subsytemdatashrrba.html to solve the problem.
    We did this successfully in a test environment, but there are some points where you might be able to help.

    First, we encountered some minor issues, such as having to take an immediate full image copy of the SYSRTSTS catalog table space to clear a STOPE status during DB2 startup. This is not mentioned in the IBM procedure.
    I would like to know whether any of you have done this in a real production environment and can share your experience, so that we can avoid any unforeseen pitfalls.

    Second, we want to reduce downtime in our production (non-data-sharing) environment. One idea is to use DSN1COPY beforehand to reset PGLOGRBA in large table spaces and index spaces that receive no updates, and then exclude them from the COPY utility run in maintenance mode. So I would like to know whether we can rely on DSN1COPY, and whether it is a safe utility to use in production. If not, is there a better way to do this?

    Many Thanks.


  • 2.  RE: Reset Log RBA in DB2 Z/OS 10

    Posted 10 days ago
    Edited by Michael Hannan 10 days ago

    You are also very close to end of Service for V10. Sorry, that was a bad joke.

    I am guessing most people would recommend going to Db2 V11 and converting to the larger RBA. I seem to recall that could be done in Compatibility Mode rather than New Function Mode. It's hard to think why upgrading from V10 was not the way to go in the past. It may be too urgent now, I don't know. V10 was already end of service some time ago? Just guessing without checking. I seem to recall V11 End of Service was extended a bit.

    Do you get absolutely top level of IBM help if you are still on V10?   Even the slow sites should be just about on V12 since it was GA maybe late 2016.

    So going to V12 now is very conservative indeed. 

    I don't know the technical details to solve the RBA problem in V10 (not my area of expertise), but I assume it needs significant planning, as would going to V11. Possibly not many Listserv people will have already had to solve this problem. However hope you get lucky and find an expert.
    It's rare, though, that we get great technical detail for difficult questions on here. We get a flood of answers to the easy questions. You never know though, some guys can handle nearly any topic.

    ------------------------------
    Michael Hannan
    Bankwest
    ------------------------------


  • 3.  RE: Reset Log RBA in DB2 Z/OS 10

    Posted 10 days ago
    Thanks Michael,

    But this is a managerial decision, and we have had no success in convincing our manager to upgrade to DB2 V11. For now we have to stick to this plan, and we hope to draw on the experience of IDUG members to get it done in a short period of time.

    ------------------------------
    Steffen Richter
    Security Sporting Goods
    ------------------------------



  • 4.  RE: Reset Log RBA in DB2 Z/OS 10

    Posted 8 days ago

    Steffen,

     

    If I were in the position which you're in, I'd say to propose the following to your boss (and do not rely on the generosity of strangers to save your business!!!):

    TEST THIS:
      - Create a new sandbox subsystem.
      - Copy all of the DDL to the new system, including GRANTs.
      - Copy all existing data to the new system, using UNLOAD and LOAD.
      - Create all code objects (stored procedures, UDFs, packages).
      - Run cycles of programs until you're satisfied that it all works.

    GO LIVE:
      - Capture all DDL.  TWO COPIES, OR THREE.
          - Be sure that stored procedure and UDF definitions are captured.
          - Include all settings for BINDs.
          - Verify that all code recompiles and rebinds are included.
      - Verify by inspection that all DDL has been captured.
      - Start the production outage.
      - Run summary reports for production data and catalogs.
      - Unload all data.  TWO COPIES, OR THREE.
      - Verify by inspection that all UNLOADs contain data.
      - Drop all objects.
      - Cold-start the system with the RBA back at zero (I'm not sure I'm saying that correctly! I've never done it.)
      - Create all data objects.
      - Reload all data.
      - RUNSTATS on all data.
      - Create all code objects (stored procedures, UDFs, packages).
      - Run summary reports for production data and catalogs. Compare with previous run and explain any discrepancies.
      - Back up the system!!!
      - Run reports.  Run the most urgent production and inspect all results.
      - End the production outage.
      - Restart the system.
      - Senior coders and DBAs should probably live at work for the first two weeks and the first monthend after this.

     

    I think this will work.  It covers all of the bases of which I'm aware.  It ought to scare the crap out of any sane manager, though.
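
    The "run summary reports ... compare with previous run and explain any discrepancies" step in the go-live list above could be sketched roughly as follows. This is only a minimal illustration in Python, assuming the summary reports have already been reduced to per-table row counts; the table names and counts are hypothetical and nothing here comes from an IBM procedure.

```python
def compare_row_counts(before, after):
    """Return {table: (count_before, count_after)} for every table whose
    row count differs, or that exists in only one of the two reports.
    A missing table is reported with a count of None."""
    discrepancies = {}
    for table in sorted(set(before) | set(after)):
        count_before = before.get(table)
        count_after = after.get(table)
        if count_before != count_after:
            discrepancies[table] = (count_before, count_after)
    return discrepancies


if __name__ == "__main__":
    # Hypothetical row counts captured before the outage and after the reload.
    before = {"PROD.ORDERS": 1200345, "PROD.CUSTOMERS": 88012, "PROD.AUDIT": 560}
    after = {"PROD.ORDERS": 1200345, "PROD.CUSTOMERS": 88010}
    for table, (b, a) in compare_row_counts(before, after).items():
        print(f"{table}: before={b} after={a}")
```

    Row counts alone will not catch changed column values, so a real check would probably add checksums or spot queries on top of this.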

     

    -phil

     

     

    Philip Sevetson

    Computer Systems Manager

    FISA-OPA

    5 Manhattan West

    New York, NY 10001

    psevetson@fisa-opa.nyc.gov

    917-991-7052 m

    212-857-1659 f

     

     






  • 5.  RE: Reset Log RBA in DB2 Z/OS 10

    Posted 8 days ago
    I am not very intuitive, so I can't tell how ironic Philip's post is. It got a chuckle out of me, though. Normally irony doesn't work on me.

    ------------------------------
    Michael Hannan
    DB2 SQL and Performance Specialist
    ------------------------------



  • 6.  RE: Reset Log RBA in DB2 Z/OS 10

    Posted 7 days ago
    Michael,

    It was my off-the-cuff design for a system reset with maximum safety, but "maximum" in this case is still a really low level of safety.

    I think they're headed for just a heap of trouble (and my idiom here has been censored for the open list), but if the manager won't authorize the correct/safest process, the remaining choices are (1) do something like this or (2) resign.

    -phil


    Philip Sevetson
    Computer Systems Manager
    FISA-OPA




  • 7.  RE: Reset Log RBA in DB2 Z/OS 10

    Posted 5 days ago
    Dear Philip,

    First, thank you for the detailed explanation. We already have several test subsystems on which the reset was successful, but as you know, production is a different story. We also have mirrored disks in production and could recover the data if something goes wrong, though I hope it does not come to that.
    As for using the UNLOAD/LOAD utilities for this task, I think that would cost us a lot of time, and the COPY utility is faster, unless you have a good reason to prefer UNLOAD/LOAD. Your checkpoints and backup strategy are a great help, and we will include them in our runsheet.

    ------------------------------
    Steffen Richter
    Security Sporting Goods
    ------------------------------



  • 8.  RE: Reset Log RBA in DB2 Z/OS 10

    Posted 4 days ago
    Steffen,

    Briefly - my strategy recommendation involved dropping tablespace level objects. In that situation, COPY datasets can only be restored with DSN1COPY; there are a couple of limitations to DSN1COPY across similar objects, which we don't have to go into here, unless you're considering this as a method of resetting RBAs.

    -phil


    Philip Sevetson
    Computer Systems Manager
    FISA-OPA




  • 9.  RE: Reset Log RBA in DB2 Z/OS 10

    Posted 4 days ago
    Edited by Steffen Richter 4 days ago
    Dear Philip,

    We do not want to rely totally on DSN1COPY, but we have several archive tables that do not have any updates. With DSN1COPY we can reset and prepare them before the fireworks. But DSN1COPY has no awareness of SMS or storage management, so we would have to handle that manually, and it does not guarantee data integrity or safety. Do you have any experience with this tool and any buggy behavior? Could we, for example, use RUNSTATS to make sure everything is fine?

    ------------------------------
    Steffen Richter
    Security Sporting Goods
    ------------------------------



  • 10.  RE: Reset Log RBA in DB2 Z/OS 10

    Posted 8 days ago
    Hi Steffen,

    Many years ago when I worked for IBM I had a customer that ran into the same problem, so I did some research. I found a few customers that had actually gone through the process successfully. My customer ultimately decided that the risk was too high and the outage too long, so they took a different approach*.

    You (or rather your manager) will have to consider:
    a) Do we migrate to V11 as 1000s of other Db2 installations have already done. Then convert to extended RBAs. The outage is minimal and the risk is near-zero. Bonus: you'll now be running a supported level of Db2 11 albeit not for long (Db2 11 EOS is April 2021).
    - or -
    b) Do we execute a RBA reset procedure that maybe 5 customers have ever executed? The production outage can be very long depending on how much data you have and the risk potentially high. Db2 10 is long out of support, so if you run into a problem you're basically *censored*. Even if you have successfully done this in test it likely has much less data and a different lineage. I would seriously consider cloning production and running the RBA reset procedure on the clone with lots of application testing (essentially a full regression test of your entire application portfolio).


    [*] They migrated to 1-way data sharing (DS), added a new member, and quiesced the original member. The feasibility of this approach depends on how close you are to the end of the RBA range, as the conversion to DS might result in an LRSN delta if your RBA > D90000000000. Running out of LRSN is much worse than running out of RBA!!! This customer already had other Db2 subsystems running data sharing, so the conversion was familiar and simple.

    ------------------------------
    Jørn Thyssen
    Rocket Software
    2020 IBM Champion
    ------------------------------



  • 11.  RE: Reset Log RBA in DB2 Z/OS 10

    Posted 4 days ago
    Dear Jørn,

    How would migrating to 1-way DS help? Does it grow more slowly than the RBA, or does it have other benefits? Could you explain further?

    Thanks


    ------------------------------
    Steffen Richter
    Security Sporting Goods
    ------------------------------



  • 12.  RE: Reset Log RBA in DB2 Z/OS 10

    Posted 4 days ago
    Hi Steffen,

    The conversion in itself does not help you, but once you've migrated to 1-way DS, you can then add a new member to the group (so you're now 2-way), and then quiesce the original member (back to 1-way).

    The new member will have RBAs starting at zero. However, there are several issues with this approach:
    a) If you don't already have data sharing and sysplex experience, it will be complicated
    b) If your Db2 has passed RBA D90000000000, then you will end up with a so-called LRSN STCK delta, which can accelerate running out of LRSNs, which is really bad
    c) data sharing does add some CPU overhead, even for 1-way
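
    To make point b) concrete, here is a small Python sketch that checks a current log RBA against the D90000000000 threshold quoted above and reports how much of the 6-byte range is already consumed. The function name and the sample RBA values are hypothetical; the only facts it relies on are that a 6-byte RBA tops out at 2**48 and the threshold value given in this thread.

```python
RBA_LIMIT_6_BYTE = 2 ** 48                      # 6-byte RBA: values 0 .. 0xFFFFFFFFFFFF
LRSN_DELTA_THRESHOLD = int("D90000000000", 16)  # threshold value quoted in this thread


def rba_status(current_rba_hex):
    """Summarize where a 6-byte RBA (given as a hex string) sits in its range."""
    rba = int(current_rba_hex, 16)
    if not 0 <= rba < RBA_LIMIT_6_BYTE:
        raise ValueError(f"not a valid 6-byte RBA: {current_rba_hex!r}")
    return {
        "fraction_used": rba / RBA_LIMIT_6_BYTE,
        "past_threshold": rba > LRSN_DELTA_THRESHOLD,
    }


if __name__ == "__main__":
    # Hypothetical current RBA values, for illustration only.
    for sample in ("800000000000", "D90000000001", "F00000000000"):
        status = rba_status(sample)
        print(sample, f"{status['fraction_used']:.1%}", status["past_threshold"])
```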

    My recommended solution is still:
    1) Migrate to V11 NFM
    2) Convert to extended RBAs
    It has been tested to exhaustion by almost all IBM Db2 customers, so very little risk.

    ------------------------------
    Jørn Thyssen
    Rocket Software
    2020 IBM Champion
    ------------------------------



  • 13.  RE: Reset Log RBA in DB2 Z/OS 10

    Posted 4 days ago
    Many thanks Jørn for your recommendation,
    I get the point, but we also have space constraints and cannot bring up another instance. Sysplex, on the other hand, is another chapter, and one we are not experienced with.
    As for DB2 V11, we do not have enough time to run complete tests before migrating; that is our plan for the future.

    Thanks again

    ------------------------------
    Steffen Richter
    Security Sporting Goods
    ------------------------------



  • 14.  RE: Reset Log RBA in DB2 Z/OS 10

    Posted 7 days ago
    Steffen,

    I have successfully performed this procedure a couple of times, basically for the same reason you're looking at it.
    Most recently was less than a year ago.  We're currently on V11 CM and so it's basically still a V10 system, and I
    used the V10 doc.  I did ask IBM for any guidance on this and they do seem to have limited experience but did help
    some.  The process was successful both times.  Previous time was 4 or 5 years ago.  
    Some suggestions:
     
       - Practice the process on all of your non-production systems.  We seemed to find a new issue with every subsystem

       - We used the steps from IBM which did not include using DSN1COPY.

      -  I wrote some REXX execs to capture every object that wasn't in RW mode since we have quite a few.
            The REXX was to build commands to Start them in  RW,  and to start them back to RO or UT after the process was complete.
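
    Fred's REXX execs are not shown in the thread. As a rough illustration of the same idea, here is a Python sketch that takes a list of objects with their current access mode and builds two sets of -START DATABASE commands: one to switch everything to RW before the reset, and one to put each object back afterwards. The input list is assumed to have been collected already (for example from DISPLAY DATABASE output), and the database and space names are hypothetical.

```python
VALID_RESTORE_ACCESS = {"RO", "UT"}


def build_start_commands(objects):
    """objects: iterable of (database, spacenam, current_access) tuples.

    Returns (to_rw, to_original): the commands to run before the reset and
    the commands to run afterwards. Objects already in RW are skipped."""
    to_rw, to_original = [], []
    for database, space, access in objects:
        if access == "RW":
            continue
        if access not in VALID_RESTORE_ACCESS:
            raise ValueError(f"unexpected access mode {access!r} for {database}.{space}")
        target = f"-START DATABASE({database}) SPACENAM({space})"
        to_rw.append(f"{target} ACCESS(RW)")
        to_original.append(f"{target} ACCESS({access})")
    return to_rw, to_original


if __name__ == "__main__":
    # Hypothetical objects and access modes.
    found = [("DBARCH1", "TSHIST01", "RO"), ("DBPROD1", "TSORDERS", "RW"), ("DBARCH1", "IXHIST01", "UT")]
    before, after = build_start_commands(found)
    print("\n".join(before))
    print("\n".join(after))
```

    Keeping both lists from the same pass means the "put it back" commands cannot drift out of sync with what was actually changed.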


    Bottom line, is that the process is pretty simple, but you have to be extremely careful and thorough.

    I think our system is around 4TB, (or is it 6?), of data so we're not huge.  It's all based on how fast you can run image copies.  We ran several jobs in parallel and wrote to disk instead of virtual tape to speed things up.  If everything goes smoothly, you could probably do that in 2 or 3 hours with decent hardware.  That's best case for us.  We had a few problems with auto ops starting DB2 and such and I think it took us about 4 hours this last time.
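
    Since the outage length comes down to how fast the image copies run, a back-of-the-envelope calculation may help with planning. The Python sketch below only does the arithmetic: given a data volume and a target window, it returns the aggregate throughput needed and the share per parallel job. The figure of 8 parallel jobs is a made-up example, not something from Fred's setup, and 1 TB is taken as 1024 * 1024 MB.

```python
MB_PER_TB = 1024 * 1024  # assumption: binary terabytes


def required_throughput_mb_per_s(data_tb, window_hours):
    """Aggregate copy throughput (MB/s) needed to copy data_tb within window_hours."""
    if data_tb <= 0 or window_hours <= 0:
        raise ValueError("data_tb and window_hours must be positive")
    return data_tb * MB_PER_TB / (window_hours * 3600)


def per_job_throughput_mb_per_s(data_tb, window_hours, parallel_jobs):
    """Throughput each job must sustain if the work is split evenly across jobs."""
    if parallel_jobs <= 0:
        raise ValueError("parallel_jobs must be positive")
    return required_throughput_mb_per_s(data_tb, window_hours) / parallel_jobs


if __name__ == "__main__":
    # 4 TB in 2 hours is the best case mentioned above; 8 jobs is hypothetical.
    total = required_throughput_mb_per_s(4, 2)
    each = per_job_throughput_mb_per_s(4, 2, 8)
    print(f"aggregate: {total:.0f} MB/s, per job: {each:.0f} MB/s")
```

    An even split is optimistic, since one very large table space can dominate the elapsed time, so the longest single copy is worth estimating separately.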

    It's definitely a pucker time, and I wouldn't recommend doing it alone.  It's really nice to have someone to talk to in the middle of the night
    when something doesn't go as expected.  And I wouldn't try this if you're not a fairly seasoned DB2 person.  

    And we did not have a great fallback plan.  Basically, we would have been in disaster recovery mode.  We do take volume backups right before, so our storage team would have had to restore all the DB2 volumes for us to get up and running.  Lots of extra hours of downtime to do that.  And we'd still be in the same boat with the RBA heading toward the limit.


    And lastly, it will take a year or two off your life expectancy.  I plan to retire before I ever have to do that again.  

    Fred
     





  • 15.  RE: Reset Log RBA in DB2 Z/OS 10

    Posted 4 days ago
    Edited by Steffen Richter 4 days ago
    Dear Fred,

    I am glad to find someone who has taken the risk and done it, not once but several times. Could you please explain, if possible, what kinds of issues you encountered on the different subsystems?
    Our data is more than 10 TB, and we can do the COPY in 3-4 hours at best. We will also take backups of all DB2 disks to have a recovery plan in case of failure (hopefully not needed).
    It seems that you didn't trust DSN1COPY either. We would like to use it to reduce the downtime even further, but this mechanical tool offers no guarantee of data safety.
    Another question: have you encountered any issues after resetting the RBA? Does DB2 run fine after the surgery?

    Many Thanks.


    ------------------------------
    Steffen Richter
    Security Sporting Goods
    ------------------------------



  • 16.  RE: Reset Log RBA in DB2 Z/OS 10

    Posted 4 days ago
    Steffen,

    I really can't remember all the specific issues I had between subsystems, but they all had to do with not being thorough
    enough in the preparation.  You need to have a clean subsystem before you begin.  No utilities,  No indoubt or postponed 
    threads, no objects in restricted states.

    Two issues I do remember:  
       1)  We missed copying a few datasets.  If I remember right they were indexes and fortunately the Rebuild utility handled that.
       2)  Our auto-Ops software intercepted my Start DB2 ACCESS MAINT command and started DB2 normally.  We have apps that hit DB2 as soon as it's available so some of those datasets threw errors and DB2 put them in STOPE or some state.  I'm pretty sure we were able to 
    just start them access force and copy them after we got DB2 up in maintenance mode.  I could have avoided this if I had just checked to make sure DB2 was coming up correctly.

    I chose to follow IBM's procedure rather than come up with my own because I would have no one to turn to for help if I needed it, and they
    were able to clear up some questions I had about their procedure to the point I felt pretty confident that it would work.


    A few more random thoughts and recollections:
    -  I can't over emphasize the importance of being thorough in your preparation.  The heat of the battle is no time to deal with things you forgot or didn't do completely or correctly.  Each surprise extends your down time and it's a big deal for us and only adds to the stress.
    -  Running DSNJU003 to set the STARTRBA and ENDRBA to 0 was the most stressful part.  I would say a prayer before this whether you're a religious person or not. :-)
    -  I can't remember if this is part of the procedure.  I allocated new datasets, (with slightly different names), for the Directory and Catalog datasets that get initialized during the process ahead of time to make sure I didn't have any issues during GO-time.  Then I had IDCAMS jobs set up to rename the old prod datasets and rename my new datasets to the production names.  So during the actual process I just had to submit those 2 rename jobs.  Just one more way to reduce possible surprises. 
    -  I created a zparm module with a different name ahead of time with the parm set to reset the RBA with COPY.  Just one more thing I could do ahead of time so I didn't have to deal with any assembler syntax errors during the down time.
    -  I would recommend using DSN1PRNT to check a couple/few random datasets early in the COPY process to make sure the RBA has indeed been reset.
    -  Don't forget that copying the Directory and Catalog objects have some special requirements.
    -  I was surprised at how high the current log RBA was when we were done.  I don't remember the number--just that it was higher than I expected.
    -  I know I said it before, but bears repeating.  Have someone knowledgeable there with you during the process.
    -  It's strange, but every time after a successful RBA reset I couldn't help but think it's really not that big/dangerous of a deal if you're prepared.
    -  I don't remember the details, but since we're at V11 CM we couldn't convert to the longer RBA format.  We have some apps or something that is keeping us from continuing to NFM.  I guess they're too busy writing new stuff to convert the old stuff.  Managers make the decisions and we have to live with them.

    It's kind of anticlimactic when you're done.  All that stress and work and then DB2 just works like it did before the reset.  The only small bit of satisfaction came from seeing the new current log RBA in Omegamon.  It wasn't long before I was calculating how long it would last.  We did it in Feb of this year and we've used about 25% of the RBA limit.  Hopefully management will let us upgrade before we would have to do that again.

    Fred
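
    The "how long will it last" calculation Fred mentions can be approximated with a simple linear projection. The Python sketch below assumes log consumption continues at the same average rate; the 6 months elapsed in the example is a hypothetical figure, since the thread only says the reset happened in February and about 25% of the range has been used.

```python
def months_remaining(fraction_used, months_elapsed):
    """Linear projection of months left before the RBA range is exhausted,
    assuming the average consumption rate so far stays constant."""
    if not 0 < fraction_used <= 1:
        raise ValueError("fraction_used must be in (0, 1]")
    if months_elapsed <= 0:
        raise ValueError("months_elapsed must be positive")
    return months_elapsed * (1 - fraction_used) / fraction_used


if __name__ == "__main__":
    # 25% used is from the post above; 6 months elapsed is a hypothetical value.
    print(months_remaining(0.25, 6))
```

    Log volume rarely grows linearly, so a projection like this is best treated as an upper bound and recomputed regularly.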