Statistics Event Monitors - terrible name, terrific feature

With a name like Statistics Event Monitors, they've got to be good

Before I really knew what statistics event monitors were, I'd hear the name and think "statistics" - as in cardinality and runstats and the like (wrong!). And I'd think "event monitors" - as in those things that most people kind of hate (also mostly wrong). The name doesn't exactly create a stampede to find out more about them.

They are, in fact, an amazingly useful feature of DB2, and when I finally figured that out, I had to tell the world (or at least the small part of the world that reads this blog, or comes to my sessions at IDUG conferences.)

In short, statistics event monitors provide a way to automatically collect monitoring information at regular intervals. But the best part is that they also automatically calculate delta values when they do it. This is important because most of the underlying counters, timers, etc. that DB2 returns through the mon_get() table functions just increase steadily with use after database activation. Unlike with old-school, outdated snapshot monitors, you can't reset them. If you're using a monitoring tool like Data Server Manager, SpeedGain, Brother-Panther, SAP DBA Cockpit, etc., it will take care of the delta calculations under the covers. But if you're collecting monitor data directly from the native DB2 mon_get() functions, then without a way to reset them at the beginning of a monitoring interval, it's difficult to know at the end of that interval how much of the activity happened within the interval, and how much before. So I and others have shared various fairly complex methods for calculating delta values from the monitor table functions. Monitoring with delta values is important and manageable - but it's not really easy or intuitive - so many people don't bother doing it, and they end up with less accurate monitoring data.
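To make the problem concrete, here's a minimal sketch (not from the original post) of the manual approach: snapshot the cumulative counters into a scratch table at the start of an interval, then subtract them from a fresh sample at the end. The table name baseline and the single metric shown are just for illustration.

```sql
-- Illustrative sketch only: manual delta calculation against mon_get_service_subclass.
-- Capture a baseline of the cumulative counters at the start of the interval:
create table baseline as
   (select service_subclass_name, member, total_app_commits
    from table(mon_get_service_subclass(null, null, -2))) with data;

-- ... wait for the monitoring interval to elapse, then subtract:
select c.service_subclass_name,
       c.total_app_commits - b.total_app_commits as commits_in_interval
from table(mon_get_service_subclass(null, null, -2)) c
join baseline b
   on b.service_subclass_name = c.service_subclass_name
  and b.member = c.member;
```

Multiply that by the dozens of metrics you actually care about, and the appeal of having DB2 do the deltas for you becomes obvious.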

As an alternative to collecting & processing monitor data from mon_get_service_subclass (MGSS for short) or mon_get_workload (MGW), you can instead set up a statistics event monitor which will periodically collect it and store the delta values in a table for you to work with.   At the moment, statistics event monitors don't drill down to anything more specific, like SQL statements, tablespaces, locks, etc.   But MGSS and MGW are extremely useful on their own (and there's an added bonus - OS metrics - we'll get to that in a bit.)

Setting up a statistics event monitor

Statistics event monitors are a "WLM thing", which can scare some people off if they haven't got Workload Manager set up or haven't used it before.  The fact is that DB2 implicitly uses WLM all the time (at least in a fairly basic way), whether or not you've explicitly set up workloads, service classes, etc.  If you're not explicitly using WLM, all user activity (i.e., application SQL, most utilities, etc.) goes through SYSDEFAULTSUBCLASS, so that's where we'll attach our statistics event monitor.  If you do have service classes defined in your environment, you can be more surgical.
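If you're curious how work is being routed to service classes on your system, the catalog shows the mapping (a hedged sketch - column names are from my reading of the SYSCAT.WORKLOADS view; verify against your DB2 level):

```sql
-- Sketch: see how workloads map to service classes.
-- On a system with no explicit WLM setup, expect to see SYSDEFAULTUSERWORKLOAD
-- routed to SYSDEFAULTSUBCLASS under SYSDEFAULTUSERCLASS.
select workloadname, serviceclassname, parentserviceclassname
from syscat.workloads;
```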

Step 1:
db2 "alter service class sysdefaultsubclass
     under sysdefaultuserclass
     collect aggregate activity data base"

Step 1 simply enables collection of aggregated activity data - basically, summing up counters, etc. on activities (SQL statements, LOAD, etc.) that execute in this service subclass.
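If you want to confirm the change took effect, the service class catalog view records the collection setting (a sketch - the COLLECTAGGACTDATA column and its value codes are from my reading of the SYSCAT docs; it should no longer show 'N' for "none" after the ALTER):

```sql
-- Sketch: confirm aggregate activity data collection is enabled
-- for the default user subclass.
select serviceclassname, parentserviceclassname, collectaggactdata
from syscat.serviceclasses
where serviceclassname = 'SYSDEFAULTSUBCLASS'
  and parentserviceclassname = 'SYSDEFAULTUSERCLASS';
```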

Step 2:
db2 "create tablespace monTBS
     managed by automatic storage
     autoresize yes
     initialsize 32 M maxsize 1 G"

We're going to set up the statistics event monitor to write into ordinary tables back in the database, so Step 2 creates a tablespace to house those tables.  The tables produced will grow (more about that later) but they will grow quite slowly, so we don't need a particularly large tablespace.

Step 3:
db2 "create event monitor myevmon
     for statistics
     write to table
     scmetrics ( table scmetrics in monTBS ),
     scstats   ( table scstats   in monTBS ),
     osmetrics ( table osmetrics in monTBS )
     autostart"

Step 3 creates the event monitor, in this case with three tables.  Scmetrics looks just like MGSS, which is one of the prime table functions for monitoring either at the service subclass level (if you're using WLM) or, more commonly, at the overall system level (through SYSDEFAULTUSERCLASS.)  Scstats is interesting, as it brings in a number of high water mark and aggregate metrics based on service subclass activity that might be interesting to some - e.g., ACT_CPU_TIME_TOP, REQUEST_EXEC_TIME_AVG.   Scstats also includes an XML column, scstats.metrics, which contains the same metrics as scmetrics.  So if you're an XML fan, you could get everything from scstats - however, I find that most of the time, configuring only the scmetrics relational table and skipping scstats works for me.  (Note that the other XML document, scstats.details_XML, which contains non-delta values for the MGSS metrics, has been deprecated in v11.)
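If you do go the XML route, one way to avoid hand-writing XQuery is DB2's MON_FORMAT_XML_METRICS_BY_ROW table function, which flattens a metrics document into name/value pairs. A hedged sketch (not from the original post - I'm assuming the metric names in the XML match the relational column names):

```sql
-- Sketch: flatten the scstats.metrics XML document into rows of
-- (metric_name, value) pairs, picking out a couple of metrics.
select s.statistics_timestamp,
       t.metric_name,
       t.value
from scstats s,
     table(mon_format_xml_metrics_by_row(s.metrics)) t
where t.metric_name in ('TOTAL_APP_COMMITS', 'TOTAL_WAIT_TIME')
order by s.statistics_timestamp;
```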

I mentioned a bonus earlier on.  The osmetrics table is extremely handy because it contains useful machine-level metrics, like CPU, memory and swap usage.  This data dovetails really well with the DB2-level data from scmetrics. Not long ago, the only practical way to collect system-level information was to run a separate tool like vmstat or nmon in parallel with your DB2 monitoring.  With statistics event monitors, it's trivially easy to get both regularly collected, ready to use.

If you go look at the Knowledge Center you'll see that statistics event monitors can also collect other tables like HISTOGRAMBIN, QSTATS, WCSTATS, WLSTATS, etc.  I'm not setting up collection of any of those here because they're mostly suitable for environments where WLM has been configured, workloads defined, thresholds in use, etc.   For general monitoring use like we're looking at here, you could go down the Workload route (wlmetrics instead of scmetrics as I did above.)  Either is fine, but unless you've implemented WLM explicitly, there's not much value in getting both.
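Not sure which logical data groups a given event monitor is actually writing, and to which tables? The catalog records that too (a sketch - I'm assuming the SYSCAT.EVENTTABLES view at your DB2 level):

```sql
-- Sketch: list the logical data groups and target tables for our monitor.
-- Event monitor names are stored in uppercase in the catalog.
select evmonname, logical_group, tabschema, tabname
from syscat.eventtables
where evmonname = 'MYEVMON';
```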

Step 4:
db2 "update database configuration for <dbname> using WLM_COLLECT_INT 5"
db2 "set event monitor myevmon state=1"

Step 4 is where we throw the switches to enable data collection.  WLM_COLLECT_INT specifies how frequently to collect statistics event monitor data, in minutes.  So here, I'm going to get service subclass and OS metrics collected automatically every 5 minutes, and dumped into DB2 tables for me to analyze (more about that later.)   You could set collection to run less frequently, e.g. once per hour, but every 5 minutes is the fastest automatic collection rate available.
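To double-check that the interval is in effect, you can read it back from the database configuration (a sketch using the SYSIBMADM.DBCFG administrative view, which I'm assuming is available at your level):

```sql
-- Sketch: confirm the collection interval (value is in minutes).
select name, value
from sysibmadm.dbcfg
where name = 'wlm_collect_int';
```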

We don't want collecting to turn into hoarding

Collecting information every 5 minutes is fairly slow, but the data volume will build up over time unless we do something about that.  It's best to automate the pruning, and DB2 can help with the Administrative Task Scheduler (ATS.)   First, we define a simple SQL stored procedure which deletes everything more than a week old (or a day, or a month - whatever your preference.)

CREATE OR REPLACE PROCEDURE PRUNE_STATS_MONITOR()
    SPECIFIC PRUNE_STATS_MONITOR
    LANGUAGE SQL
BEGIN
    delete from scmetrics where statistics_timestamp < current timestamp - 7 days;
    delete from scstats   where statistics_timestamp < current timestamp - 7 days;
    delete from osmetrics where statistics_timestamp < current timestamp - 7 days;
END@

Then, we set up ATS to run this once per day.  (Note that since the procedure body contains semicolons, it needs to be created with an alternate statement terminator - hence the '@' above.)

db2 "CALL SYSPROC.ADMIN_TASK_ADD
        ('Prune stats monitor once a day',
        NULL,                  -- no starting timestamp
        NULL,                  -- no finishing timestamp
        NULL,                  -- no max executions
        '0 0 * * *',           -- run once per day at midnight
        '<proc_schema>',       -- your procedure schema
        'PRUNE_STATS_MONITOR',
        NULL,                  -- no input parameters
        NULL,                  -- no options
        NULL )"                # no remarks

db2set DB2_ATS_ENABLE=YES # ATS doesn't run until this is turned on

And just like that, we have a week's supply of system-level monitor data, automatically collected and ready to use with whatever queries or analysis we want.

Sample query #1: system-level throughput

How fast is the system running, in terms of activities, commits, rollbacks, etc?

-- Query 1 - system level throughput
with metrics (
   statistics_timestamp,
   ts_delta,
   member,
   act_completed_total,
   total_app_commits,
   total_app_rollbacks,
   deadlocks,
   select_sql_stmts,
   uid_sql_stmts,
   rows_returned,
   rows_inserted,
   pkg_cache_inserts,
   pool_data_p_reads,
   pool_index_p_reads )
   as
   ( select
   c.statistics_timestamp,
   (select timestampdiff(2, char(c.statistics_timestamp - max(p.statistics_timestamp)))
        from scmetrics p where p.statistics_timestamp < c.statistics_timestamp
        and p.partition_key = c.partition_key
        and p.partition_number = c.partition_number
        and p.service_class_id = c.service_class_id
   ),
   c.partition_key,
   act_completed_total,
   total_app_commits,
   total_app_rollbacks,
   deadlocks,
   select_sql_stmts,
   uid_sql_stmts,
   rows_returned,
   rows_inserted,
   pkg_cache_inserts,
   pool_data_p_reads,
   pool_index_p_reads
   from scmetrics c
   where
        service_subclass_name = 'SYSDEFAULTSUBCLASS'
        and service_superclass_name = 'SYSDEFAULTUSERCLASS'
   )
select
   statistics_timestamp,
   integer(ts_delta) as ts_delta,
   smallint(member) as member,
   decimal((act_completed_total / float(ts_delta)),10,1) as act_per_s,
   decimal((total_app_commits / float(ts_delta)), 10, 1) as cmt_per_s,
   decimal((total_app_rollbacks / float(ts_delta)), 10, 1) as rb_per_s,
   decimal((deadlocks / float(ts_delta)), 10, 1) as ddlck_per_s,
   decimal((select_sql_stmts / float(ts_delta)),8,1) as sel_p_s,
   decimal((uid_sql_stmts / float(ts_delta)),8,1) as uid_p_s,
   decimal((rows_returned / float(ts_delta)),8,1) as rows_ret_p_s,
   decimal((rows_inserted / float(ts_delta)),8,1) as rows_ins_p_s,
   decimal((pkg_cache_inserts / float(ts_delta)),8,1) as pkg_cache_ins_p_s,
   decimal(((pool_data_p_reads+pool_index_p_reads) / float(ts_delta)) , 10, 1) as p_rd_per_s
from metrics
where ts_delta > 0
order by statistics_timestamp,member

Running the above query on my system gives something like this:

[Screenshot: sample output of query 1]

Sample query #2: system-level wait time

If we notice there are throughput or response time issues, one of the most important things to look at is where time is spent - especially waiting.

-- query 2: wait time breakdown at system level
select
   statistics_timestamp,
   smallint(partition_key) as member,
   integer(total_rqst_time) as total_rqst_tm,
   case when total_rqst_time = 0 then null else
      decimal((total_wait_time / float(total_rqst_time)) * 100, 5, 1) end as pct_rq_wt,
   case when total_rqst_time = 0 then null else
      decimal((lock_wait_time / float(total_rqst_time)) * 100, 5, 1) end as pct_lck,
   case when total_rqst_time = 0 then null else
      decimal((total_extended_latch_wait_time / float(total_rqst_time)) * 100, 5, 1) end as pct_ltch,
   case when total_rqst_time = 0 then null else
      decimal((log_disk_wait_time / float(total_rqst_time)) * 100, 5, 1) end as pct_lg_dsk,
   case when total_rqst_time = 0 then null else
      decimal((reclaim_wait_time / float(total_rqst_time)) * 100, 5, 1) end as pct_rclm,
   case when total_rqst_time = 0 then null else
      decimal((cf_wait_time / float(total_rqst_time)) * 100, 5, 1) end as pct_cf,
   case when total_rqst_time = 0 then null else
      decimal((pool_read_time / float(total_rqst_time)) * 100, 5, 1) end as pct_pool_rd,
   case when total_rqst_time = 0 then null else
      decimal((direct_read_time / float(total_rqst_time)) * 100, 5, 1) end as pct_dir_rd,
   case when total_rqst_time = 0 then null else
      decimal((direct_write_time / float(total_rqst_time)) * 100, 5, 1) end as pct_dir_wrt,
   case when total_rqst_time = 0 then null else
      decimal((fcm_recv_wait_time / float(total_rqst_time)) * 100, 5, 1) end as pct_fcm_rcv,
   case when total_rqst_time = 0 then null else
      decimal((fcm_send_wait_time / float(total_rqst_time)) * 100, 5, 1) end as pct_fcm_snd
from scmetrics
where
        service_subclass_name = 'SYSDEFAULTSUBCLASS'
        and service_superclass_name = 'SYSDEFAULTUSERCLASS'
order by statistics_timestamp, member

On the same system, at the same time as query 1, this is how the wait time breaks down by percent:

[Screenshot: sample output of query 2]

Sample query #3: OS-level resource usage

Depending on throughput and wait time fluctuations, getting an idea of CPU and memory saturation at matching times can help you understand what's going on.

-- query 3: operating system metrics
with metrics(
   statistics_timestamp,
   member,
   memory_total,
   memory_free,
   virtual_mem_total,
   virtual_mem_free,
   cpu_user,
   cpu_system,
   cpu_iowait,
   cpu_idle,
   cpu_total)
as (
select
   statistics_timestamp,
   partition_key,
   memory_total,
   memory_free,
   virtual_mem_total,
   virtual_mem_free,
   cpu_user,
   cpu_system,
   cpu_iowait,
   cpu_idle,
   cpu_user+cpu_system+cpu_iowait+cpu_idle as cpu_total
from osmetrics )

select
   statistics_timestamp,
   smallint(member) as member,
   case when cpu_total = 0 then null else decimal((cpu_user / float(cpu_total)) * 100, 8, 2) end as pct_cpu_user,
   case when cpu_total = 0 then null else decimal((cpu_system / float(cpu_total)) * 100, 8, 2) end as pct_cpu_system,
   case when cpu_total = 0 then null else decimal((cpu_iowait / float(cpu_total)) * 100, 8, 2) end as pct_cpu_iowait,
   case when cpu_total = 0 then null else decimal((cpu_idle / float(cpu_total)) * 100, 8, 2) end as pct_cpu_idle,
   case when memory_total = 0 then null else decimal(((memory_total-memory_free) / float(memory_total)) * 100, 8, 2) end as pct_phys_mem_used,
   case when virtual_mem_total = 0 then null else decimal(((virtual_mem_total-virtual_mem_free) / float(virtual_mem_total)) * 100, 8, 2) end as pct_virt_mem_used
from metrics
order by statistics_timestamp, member

[Screenshot: sample output of query 3]

A few more details

You can drive extra (unscheduled) data collections by calling mon_collect_stats.  When you do that, an extra collection of data is dumped into the collection tables and the counters are reset.  The regular automated collection will proceed from there - on its original schedule.   So if you call mon_collect_stats 1 minute into a 5-minute interval, an automatic collection will follow 4 minutes after your call. This is why sample query #1 jumps through some hoops to calculate rates - the number of seconds in each collection interval isn't guaranteed to be 5 minutes' worth, since someone could call mon_collect_stats and rock the boat a bit.  In fact, if you look closely at the timestamps of collections in the examples above, you can see I did just that, injecting an extra sample between 15:55 and 16:00.  And you'll probably have figured out that the 5 minute minimum on automatic collection isn't much of a limitation, since you could always script up calls to mon_collect_stats every minute if you wanted to.

What about overhead?  Well, frankly, I've been unable to measure it.  Statistics event monitors are just collecting pretty lightweight metrics, and rolling up at a maximum rate of once per 5 minutes, which is a pretty stately pace.

I really like these, but they don't completely replace a more powerful monitoring infrastructure, like a set of scripts that goes deep on statements, tablespaces, bufferpools, sort activity, etc., or an external tool like DSM.  But for something that you can have up and running in about 2 minutes on any DB2 system, with no other tooling or investment required - that's pretty cool.
