Improving Application Performance by Eliminating Inefficient or Redundant Processing
by Joe Geller
This week IDUG continues its theme of improving application performance by reducing the number of calls to DB2. A couple of weeks ago IDUG featured Linda Claussen’s presentation (http://www.idug.org/p/bl/ar/blogaid=372) on saving CPU by combining multiple statements into one (e.g. using multi-row fetch, the Merge statement, etc.). John Maenpaa showed (http://www.idug.org/p/bl/ar/blogaid=375) how using SQL’s set processing rather than single-row selects can give substantial CPU reduction. Sometimes, though, you can do a whole lot more by carefully examining the application logic to find inefficient or redundant processing. I will look at some examples of programs that
- Retrieve more data than they really need (give me everything for that object)
- Retrieve data sooner than it is needed (give it to me upfront in case I need it later)
- Retrieve the same data multiple times (maybe I got it, maybe I didn’t, don’t know where I put it, who cares, I’ll just ask for it again).
- Ask for one thing at a time over the network (“Business logic should be in the app layer, so no logic in the stored procedure – each one should have a single select statement”)
It’s amazing. With all of the progress in computer hardware and speed and with application development methodologies, techniques and software systems, we keep revisiting the same performance issues.
Phase 1 – in the early days of DB2, the majority of online access was through CICS (or IMS). Everything was on the mainframe. Even user terminals were on a high-speed network. If the application was returning hundreds (or thousands) of rows, there was no smooth scrolling as in browsers or client based applications. Instead, the user had to page forward to get the next screen’s worth of data. Even without network delays to transmit data, we learned that we shouldn’t fetch all of the rows up front. Instead, the program had to be written to fetch one screen’s worth of data and end the task (thread). Then, if the user needed more, the program would reopen the cursor, positioning after the last row shown.
Fetching all of the rows may cause excessively long first-screen response time and is wasteful of resources, because in many cases it is not likely that the user will scroll all the way through the data. With locally connected applications, users expected sub-second response times.
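The "fetch one screen and reposition" technique can be sketched with a small example. The Java snippet below simulates keyset-style repositioning over an in-memory list; the class, method and table names are hypothetical, and in a real program the repositioning predicate would live in the DB2 cursor, as shown in the comment:

```java
import java.util.*;

public class KeysetPaging {
    // In DB2, the equivalent cursor might look like (hypothetical table/columns):
    //   SELECT ACCT_ID, ... FROM ACCOUNTS
    //   WHERE CUST_ID = ? AND ACCT_ID > ?   -- position after the last row shown
    //   ORDER BY ACCT_ID
    //   FETCH FIRST 20 ROWS ONLY
    static List<Integer> fetchPage(List<Integer> orderedRows, int afterKey, int pageSize) {
        List<Integer> page = new ArrayList<>();
        for (int key : orderedRows) {
            if (key > afterKey) {                    // reposition after the last row shown
                page.add(key);
                if (page.size() == pageSize) break;  // one screen's worth, then end the task
            }
        }
        return page;
    }

    public static void main(String[] args) {
        List<Integer> table = new ArrayList<>();
        for (int i = 1; i <= 55; i++) table.add(i);

        List<Integer> screen1 = fetchPage(table, 0, 20);  // first screen: rows 1..20
        List<Integer> screen2 = fetchPage(table, screen1.get(screen1.size() - 1), 20);
        System.out.println(screen1.get(0) + " " + screen2.get(0));  // prints "1 21"
    }
}
```

Each interaction fetches only one screen's worth, so the first-screen cost stays constant no matter how large the full result set is.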
Phase 2 – client server applications with the application logic on the user’s PC. The relevant characteristics of this design were a scroll bar, and network overhead for transmitting the result set – which automatically increased response time. Now, there was a design conflict. Scroll bars make a better user experience than page forwarding, but also work best if the full set of rows is available. So of course programs were now designed and written to fetch all of the rows. Back to square one with first screen performance.
Phase 3 – the Internet and browser-based applications. The internet added more data transmission delay. It also opened up the user community to be customers, and not just internal users. While internet-based applications are inherently slower than CICS-based ones, customers were used to multi-second internet response times. The database access was no longer the biggest contributor.
Phase 4 – mobile applications. The internet got faster, but smart phones were not as fast as PCs. But, they are getting faster all of the time. With Phases 3 and 4, the design pattern of fetching, on the first interaction, everything that might ever be needed so as to speed up subsequent actions became prevalent.
What does “everything” mean? In addition to fetching all of the rows needed by the initial screen for a user, it often means retrieving all of the related data that will be needed if the user clicks on a link, an option, a detail button, etc. We are now talking about a very large amount of data, most of which the user may never look at.
It is a design tradeoff. Especially for mobile phones, speeding up most of the page accesses can be very important (but will become less important as the mobile network gets faster). There are techniques, such as using XML (see http://www.idug.org/p/bl/ar/blogaid=341 by Fred Sobotka for example), that can speed up the transmission of large amounts of data by reducing the number of calls to DB2, but you really have to give some thought as to what is most appropriate for your application and user base.
The purpose of this article is not to argue which is the better approach, but to show how the implementation can easily go wrong and result in excess calls and data within these frameworks. Avoiding these pitfalls can greatly improve the overall performance of your system.
Complexity is the Enemy
Many modern application architectures are designed to “simplify” the application code by hiding details of the database and of SQL coding. Many Java developers are not familiar with SQL, and it is viewed as complex. SQL is really not complex for most access and is far simpler than handling the data relationships in program logic. The “simplification” involves a combination of
- black box modules (“service” routines).
- frameworks whose goal is to simplify application development by coding common functions as a service to be used by all parts of the application. This hides the details of database connectivity, connection pooling and access to core functions from the individual developer.
- views or modules that return everything related to an entity (meaning they join many related tables). Even though views are often not materialized, a complex one may force materialization. In many cases, the program only needs a subset of what is retrieved via the view or module.
The result is too many layers. Instead of simplifying the coding effort, it makes it more complex. While the approach has merits, it can result in so many layers of abstraction that the developer has no real clue what is happening under the covers. Simple code at the top layer results in an excessive amount of data being accessed. We’ll see what “excessive” means with a few examples.
Case 1: The Application Architecture
The application was built in Java around such a framework. In the case of this application, several performance issues arose. These were eventually identified and corrected by thoroughly monitoring and tracing the application flow.
Problem 1 - Retrieve more data than they really need (give me everything for that object). This banking system had a user profile which saved user preferences, user functional components (i.e. credit cards, savings accounts, loans) and user authorizations. It also had the list of individual accounts the user had and the type of access allowed for each one. When the user logs on to the system, the first thing done is to retrieve and cache the entire user profile.
Problem 2 - Retrieve data sooner than it is needed (give it to me upfront in case I need it later). The user may not plan on looking at their credit card accounts today, but that information is retrieved anyway. Many of the preferences and authorizations are only needed if the user clicks on a particular function or individual account, but all of that has already been retrieved at login time.
Problem 3 - Retrieve the same data multiple times (maybe I got it, maybe I didn’t, don’t know where I put it, who cares, I’ll just ask for it again). Caching is good. But, with all of the software layers and a lack of good understanding of them, there were many cases where a program called the service that went to the database to build the profile, rather than the higher layer that would check whether it was already cached.
Too much, too often
The stored procedure to build the profile had the highest execution frequency (by far) of all stored procedures in the system. Since it also retrieved the most data, the CPU time used by this SP was a substantial fraction of the total. Not only was each call of the SP costly, but the total load on the system was high. Close examination of the number of occurrences indicated that the SP was being called more times than seemed reasonable compared to the rest of the programs. By tracing the executions of various functions, we found two situations that led to this overabundance.
Situation 1 – This was an example of Problem 3. When the application needed some part of the profile, it was not properly checking whether that data was already cached. As a result, a call to load the profile was issued again.
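A simple cache-aside guard avoids this. In the sketch below (the names are hypothetical; a HashMap stands in for the real cache and a counter for the profile SP), every caller goes through one method that checks the cache before touching the database:

```java
import java.util.*;

public class ProfileCache {
    private final Map<String, String> cache = new HashMap<>();
    int dbCalls = 0;  // counts trips to the (simulated) profile stored procedure

    // Hypothetical loader standing in for the real profile SP call.
    private String loadProfileFromDb(String userId) {
        dbCalls++;
        return "profile-for-" + userId;
    }

    // The single entry point: check the cache first, hit DB2 only on a miss.
    public String getProfile(String userId) {
        return cache.computeIfAbsent(userId, this::loadProfileFromDb);
    }

    public static void main(String[] args) {
        ProfileCache cache = new ProfileCache();
        cache.getProfile("user1");  // miss: loads from the database
        cache.getProfile("user1");  // hit: no database call
        System.out.println("db calls: " + cache.dbCalls);  // prints "db calls: 1"
    }
}
```

The key point is that lower layers never call the loader directly; if any code path bypasses getProfile, the duplicate loads of Situation 1 come right back.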
Situation 2 – If a list has a set of items (accounts, transactions, etc.) and a user selects one of them, the profile data and other details related to the selected item are needed. Rather than getting that detail only for the individual selected item, the program called the services to get the detail or the profile for every item in the list as it built the list. If there were 50 transactions (lines on the screen), the SPs were called 50 times – once for each. In all likelihood, the user would only need details on one or a few of those transactions.
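The fix for Situation 2 is to defer the detail call until a row is actually selected. A minimal sketch, with hypothetical names and a counter standing in for the detail SP:

```java
import java.util.*;

public class LazyDetails {
    int detailCalls = 0;  // counts calls to the (simulated) detail stored procedure

    // Stands in for the detail SP; in Situation 2 this ran once per list row.
    String fetchDetail(int txnId) {
        detailCalls++;
        return "detail-" + txnId;
    }

    // Build the list from the list query alone - no per-row detail calls.
    List<Integer> buildList(int rowCount) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 1; i <= rowCount; i++) rows.add(i);
        return rows;
    }

    // Fetch detail only when the user actually selects a row.
    String onSelect(int txnId) {
        return fetchDetail(txnId);
    }

    public static void main(String[] args) {
        LazyDetails app = new LazyDetails();
        app.buildList(50);  // 50 list rows, zero detail calls
        app.onSelect(7);    // user clicks one row
        System.out.println("detail calls: " + app.detailCalls);  // prints "detail calls: 1"
    }
}
```

With 50 rows on the screen, this turns 50 detail-SP calls into one per user click.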
Eliminating all of the cases for the above two situations cut the calls to the profile SP by one third. That is a significant savings just from removing unnecessary work.
Incremental Retrieval, Incremental Updates
If the application is likely to need a substantial part of the pre-loaded data, then this design may be beneficial (see for example the article by John Maenpaa earlier this month). If not, then it is okay to pre-load those parts that will often be needed, but then only retrieve the other parts when they are needed. Each application is different, but the point is that thought has to be given to the tradeoffs.
There is also the issue of how to handle changes to the pre-loaded data. If this profile or application data can be updated during the user session (i.e. the user changed their preferences, added a new account, authorized a relative to review the account, etc.), then any cached data has to be refreshed. Rather than reloading the entire profile, the system should be built with the ability to retrieve and re-cache just the updated portion. This of course results in a more complex architecture, but may be important if there are frequent updates.
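One way to sketch this is to cache the profile by section, so that an update refreshes only the changed portion rather than reloading everything. The names below are hypothetical, and a counter stands in for the database calls:

```java
import java.util.*;

public class SectionCache {
    private final Map<String, String> cache = new HashMap<>();
    int dbCalls = 0;  // counts trips to the (simulated) database

    private String key(String user, String section) {
        return user + "/" + section;
    }

    // Hypothetical loader for one section of the profile (preferences, accounts, ...).
    private String loadSection(String user, String section) {
        dbCalls++;
        return section + "-data-for-" + user;
    }

    public String get(String user, String section) {
        return cache.computeIfAbsent(key(user, section), k -> loadSection(user, section));
    }

    // After an update, refresh only the changed section; other sections stay cached.
    public void refreshSection(String user, String section) {
        cache.put(key(user, section), loadSection(user, section));
    }

    public static void main(String[] args) {
        SectionCache cache = new SectionCache();
        cache.get("user1", "preferences");             // load
        cache.get("user1", "accounts");                // load
        cache.refreshSection("user1", "preferences");  // reload only this section
        cache.get("user1", "preferences");             // hit - no reload
        System.out.println("db calls: " + cache.dbCalls);  // prints "db calls: 3"
    }
}
```

The tradeoff the article describes is visible here: the per-section keys and the refresh path add complexity, but an update to preferences no longer forces a reload of accounts, authorizations and everything else.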
Case 2: The View that “Simplifies” a Complex Set of Related Tables
At another company, a generic “attribute” set of tables was created to easily allow the addition of new attributes and allowable values for customers, organizations and other entities. This allowed flexibility without continual schema and program changes. That can be very good. But it also makes for complex data access. There are over half a dozen tables needed to put together the attributes, values and associated entities.
Views were created to find all of the attributes for a specific entity. The application programs would query the view, with predicates on the entity id and the specific attribute (and optionally value) needed. Sounds ok?
The problem was that the view was complex enough, and the attribute name was a string value in one of the tables, that DB2’s access path retrieved all attributes for the entity id and then applied the predicates to filter down to the one that was needed. Much more data was processed within the database than the program actually needed.
The quick solution was to use the attribute tables directly instead of the views and to use the specific attribute id in the Where clause rather than the attribute name. This enabled index access and a good overall access path. From a table design perspective, the generic tables are very good for many attributes. But for those attributes that are frequently used in predicates, it might be better to add specific columns to the owning entity (e.g. customer entity).
Bottom Line - Know Thyself
The bottom line is that you have to really understand what your application needs and the application flow (i.e. what it is doing). Monitoring and tracing are essential, especially when the application is built on many layers of abstraction. Each individual SQL statement may perform very well, but the aggregate may not. Ensuring that the right sequence of actions is being done and the “right sized” amount of data is retrieved can be the key to good performance.