DB2 Compression on RVA boxes

Michael Bancale

DB2 Compression on RVA boxes
I have hundreds of decision support tables that are defined with DB2
compression. These objects reside on IBM RVA and EMC DASD. Do I need to
turn off DB2 compression if I am using RVA DASD?

Thanks,


Michael Bancale
[login to unmask email]
Texas Farm Bureau Insurance Company
www.txfb-ins.com
Waco, Texas



Sundar Sundaresan

Re: DB2 Compression on RVA boxes
(in response to Michael Bancale)
Michael,

I do not know about EMC, but for the RVA and SVA, refer to the redbook
DB2 for OS/390 and Data Compression, SG24-5261-00.

It has what you need. Also see the comparisons of DASD savings between
DB2 compression, RVA compression, and RVA plus DB2 compression in the
redbook Using RVA and SnapShot for BI with OS/390 and DB2, SG24-5333-00
(specifically chapter 3), which should put to rest many an argument
between DBAs and Storage Administrators.

I use compression with RVAs and SVAs and can give you details of the
savings if you like.

Thanks
Sundar Sundaresan
Siemens






Richard A Yevich

Re: DB2 Compression on RVA boxes
(in response to Sundar Sundaresan)
Michael,

No. These are two entirely different technologies, not competing but
complementary. The RVA compression is algorithmic, internal to the
hardware, and can be viewed as just DASD. DB2 compression is row level:
it allows more rows per page, better hit ratios in the buffer pools, and
so on. The RVA compression has nothing to do with DB2 or its performance.
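
To make the rows-per-page point concrete, here is a minimal Python
sketch of the arithmetic; the 4 KB page size is standard, but the row
length, compression ratio, table size, and buffer pool size are invented
illustration values, not measurements:

# Back-of-envelope only: how row-level compression yields more rows per
# page and better buffer pool coverage. All inputs are assumptions.
PAGE_SIZE = 4096          # 4 KB data page
ROW_LEN = 200             # assumed uncompressed row length in bytes
RATIO = 0.5               # assume rows shrink to 50% on average

def rows_per_page(row_len, page_size=PAGE_SIZE):
    return page_size // row_len   # ignores page header overhead

table_rows = 10_000_000
bufferpool_pages = 60_000

for label, rpp in (("uncompressed", rows_per_page(ROW_LEN)),
                   ("compressed", rows_per_page(int(ROW_LEN * RATIO)))):
    pages = -(-table_rows // rpp)               # ceiling division
    coverage = min(1.0, bufferpool_pages / pages)
    print(f"{label}: {rpp} rows/page, {pages} pages, "
          f"buffer pool covers {coverage:.0%} of the table")

With these made-up numbers the compressed table needs half the pages, so
the same buffer pool covers twice as much of it, which is where the
better hit ratios come from.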

Regards,
Richard
+===+===+===+===+====+
[login to unmask email]





[login to unmask email]

Re: DB2 Compression on RVA boxes
(in response to Richard A Yevich)
If you are not CPU-constrained then use DB2 compression. Your data gets
compressed before it is externalized to the DASD subsystem -- all the
benefits of DB2 compression, like reduced I/O, still apply.

With DASD subsystem products like EMC you get a further "structural"
compaction of the logical record, such as elimination of interrecord
gaps, end-of-track waste, and unused allocated tracks. You also get
further data compression of the already-compressed data, plus
compression of the DB2 catalog, directory, indexes, and work files.

Kevin Mulcahy
CheckFree Corporation



Leslie Pendlebury-Bowe

Re: DB2 Compression on RVA boxes
(in response to kmulcahy@CHECKFREE.COM)
Michael,

It all depends: why are you doing compression?

One example is that you may wish to obtain more addresses in your RVA,
i.e. you do not have the x.82 fix to obtain 1024 addresses and have hit
the ceiling on use of available space. Then you may wish to compress
the DB2 storage and obtain:

- more rows per page
- more data retained in your virtual pools
- more space in the RVA for additional UCBs to use

Assuming you have the fix in the RVA for 1024 addresses, you have no
worries on UCBs; then you need to consider whether you want more rows
per page in the virtual pools. Are these tables mostly read-only? If
so, leave them compressed. Since they are DS tables, I assume they are
read-heavy and more than likely reside in their own virtual pool (??),
so I would leave them compressed.

regards

Leslie Pendlebury-Bowe
DB2 SAP OS390
DB2 UDB certified etc etc



craig patton

Re: DB2 Compression on RVA boxes
(in response to Leslie Pendlebury-Bowe)
One small comment on being CPU constrained: DB2 compression, when done
with hardware microcode, costs in the range of about .5% additional CPU
(yes, that is 'point' 5 percent). I would use compression even if CPU
constrained, as the I/O savings (each GETPAGE uses CPU cycles) will more
than make up for the additional CPU cycles. If your processor does NOT
support hardware compression, then I would be much more cautious.

Craig Patton
DB2 DBA
Patton Research Group, Inc.




Jim Lewandowski

Re: DB2 Compression on RVA boxes
(in response to craig patton)
As stated before, my MVS TCB inspect monitoring shows between 33% and
67% MORE CPU than non-compressed (this is hardware compression via the
X'B263' instruction, CMPSC).

Remember, compression was designed primarily for DASD savings. If you
are already getting a 99% (for example) buffer hit ratio on
non-compressed data, compression will buy you virtually nothing.
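
This break-even argument can be sketched in a few lines of Python; every
constant below (per-row decompression cost, I/O cost, rows per page, the
assumed halving of reads) is an invented illustration value, not a
measured DB2 cost:

# At what buffer hit ratio does decompression CPU outweigh avoided I/O?
DECOMP_US_PER_ROW = 2.0    # assumed CPU microseconds to decompress a row
IO_US = 2000.0             # assumed cost of one physical page read
ROWS_PER_PAGE = 40         # assumed rows on a compressed page

def net_benefit_us(hit_ratio):
    """Positive means compression pays off; negative means pure overhead."""
    io_saved = (1.0 - hit_ratio) * IO_US * 0.5   # assume half the misses vanish
    cpu_spent = DECOMP_US_PER_ROW * ROWS_PER_PAGE
    return io_saved - cpu_spent

for hr in (0.50, 0.90, 0.99):
    print(f"hit ratio {hr:.0%}: net {net_benefit_us(hr):+.0f} us per getpage")

With these assumed costs the benefit is positive at a 50% hit ratio and
negative at 99%, which is exactly the shape of the argument: the closer
the buffer pool already is to a 100% hit ratio, the less I/O there is
left for compression to save.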

Jim Lewandowski





Richard Yevich

Re: DB2 Compression on RVA boxes
(in response to Jim Lewandowski)
Jim

Your statement that DB2 compression is 33% to 67% CPU
overhead is ludicrous and incredibly misleading.
Regardless of what your monitor showed, you need to be
much more specific. Was that on compressing data or on
decompressing it? And at what level of compression,
and over how many rows and pages? Also, what was the
CPU reduction due to the reduction of physical I/O?

On the 99% hit ratio... which formula? And what is the
difference where the data is purely randomly accessed
and compressed versus sequentially accessed and
compressed?

And finally, DB2 compression was not purely for DASD
savings. One more note: in general, with mixed usage
(select/insert/update/delete, random/sequential
average), there is less than 1% CPU overhead across a
large number of benchmarks.

For the rest of this, I sincerely hope Roger and Santa
Teresa get into this thread.

Regards,
Richard Yevich
==============
[login to unmask email]




Jim Lewandowski

Re: DB2 Compression on RVA boxes
(in response to Richard Yevich)
Richard Yevich wrote:
>
> Jim
>
> Your statement that DB2 Compression is 33% to 67% CPU
> overhead is ludicrous and incredibly misleading.

I had stated in the past that this is what Candle's MVS monitor showed
during TCB inspect sampling. I would like to have a product like STROBE
active to see how its results compare with MVS Omegamon.

I would like others to try monitoring using Landmark's TMON et al. to
see whether their results are consistent with mine; I have never seen
anyone do this or report back. So all I can say is what I see. Could
there be a problem with MVS Omegamon? Possibly, but I can't say either
way.


> Regardless of what your monitor showed, you need to be
> much more specific. Was that on compressing data, or
> on decompressing the data?

This was analyzing a thread TCB during retrieval of rows
(decompressing).

> And if so, on what level
> of compression, and how many rows and pages. Also,
> what was the cpu reduction due to the reduction of
> physical I/O?

I had no way to know what the "before" picture was as far as number of
physical I/Os, so I cannot address that.

>
> On the 99% hit ratio...which formula?

No particular formula. The point I was trying to make is: if only 1% of
the time do I need a physical I/O, does decompressing all the other rows
for this thread cost more CPU/elapsed time than just going ahead and
doing the 1% physical I/Os?

> And what is the
> difference where the data is purely random accessed
> and compressed versus sequentially accessed and
> compressed?
>
> And finally, DB2 Compression was not purely for DASD
> savings. And one more note -- in general, with mixed
> usage, select/insert/update/delete random/sequential
> average, there is less than 1% cpu overhead in a large
> number of benchmarks.
>

Is 1% quoted for the total/summarized overhead for a given SQL statement
or is it 1% to access (decompress) each row for an SQL?


> For the rest of this, I sincerely hope Roger and Santa
> Teresa get into this thread.
>
> Regards,
> Richard Yevich



Jim Lewandowski

Re: DB2 Compression on RVA boxes
(in response to Richard Humphris)
Since the DBAs think I'm just a "DASD guy", I was never involved in the
benchmarks. I am ONLY able to look at it post-compression-decision.
Here is the pertinent info:

LARGE table (about 1000 cylinders; I don't know exactly, as I believe
it is partitioned)

90+% buffer hit ratio (neither random nor sequential access is far from
the other, i.e. both are high), as we have/had about 60,000 pages for
this compressed table.

As the SQL started, a lot of the data pages were in the bufferpool. I
would have to look at NPAGES for all the data partitions to know what %
of the table is sitting in the bufferpool (60,000 pages) at any point
in time.

From memory, these queries were totally CPU bound (virtually no
physical I/O at all, i.e. one complete CP in use in the complex).

Since I am using MVS Omegamon, I can ONLY look at (INSPECT) the thread
TCB CPU when it is a long-running query (semi-ugly join or high # of
data rows hit).

When using INSPECT, I see 33% to 67% (depending on what else the SQL may
be doing) of the CPU right on the CMPSC X'B263' instruction. As I
clearly stated in older posts, it COULD be an issue of SRB
scheduling/sampling that always "happens" to be showing the thread TCB
PSW on the CMPSC instruction, but I tried different sampling rates to
see if it changed. I did not see any noticeable difference.

I had also asked others who have STROBE or TMON to report what THEY see
on an SQL statement that runs long enough to get some numbers.

Obviously, running before and after with compression off/on would be
ideal, but I was not allowed to see the benchmarking behind the
compression decision.

Jim Lewandowski



Leslie Pendlebury-Bowe

Re: DB2 Compression on RVA boxes
(in response to Michael Ebert)
>"If you think compression is so good, why aren't all your tablespaces
>compressed?".

One reason is that you do not wish to bloat your DBM1 address space
with compression dictionaries (4K to 64K each) if you are DBM1
constrained (max 2032 MB allowed for the address space), and 64-bit
addressing is not coming for another 12 to 24 months or so.
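
A rough Python estimate of that DBM1 footprint; the 64 KB worst-case
dictionary size and the 2032 MB DBM1 ceiling are taken from this post,
while the data set counts are made-up inputs:

# Each open compressed data set keeps its dictionary in DBM1.
DICT_BYTES = 64 * 1024       # worst-case dictionary size per data set
DBM1_LIMIT_MB = 2032         # DBM1 address-space ceiling cited above

def dict_footprint_mb(open_compressed_datasets, dict_bytes=DICT_BYTES):
    return open_compressed_datasets * dict_bytes / (1024 * 1024)

for n in (500, 2000, 10000):
    mb = dict_footprint_mb(n)
    print(f"{n} data sets -> ~{mb:.0f} MB "
          f"({mb / DBM1_LIMIT_MB:.1%} of the {DBM1_LIMIT_MB} MB limit)")

At ten thousand open compressed data sets the worst case is over 600 MB,
a noticeable bite out of a 2 GB address space.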

this is just one reason I can think of straight away ...

:-))

Les



Michael Ebert

Re: DB2 Compression on RVA boxes
(in response to Jim Lewandowski)
Hello all,

there have been many exchanges about DB2 compression (look in the
archives), and there are two factions. I'm on record as belonging to
the "con-compression" faction. Of course there are many possible
scenarios, each with its own "right" answer. If you only access your
data in an OLTP way, i.e. reading/updating/inserting at the
single-record (or few-record) level, with a key that allows you to
access the record directly, then the added overhead of
compression/decompression may be lost in all the other computation for
setting up/maintaining the connection, accessing the desired row, doing
data formatting, data transmission, and whatnot. Also, here the added
CPU will be distributed over the whole day and many threads and will
not be noticeable.

It's a completely different picture if you're doing batch access
(processing more or less the entire tablespace). REORG is probably a
worst-case scenario, as it bypasses most of the CPU-intensive DB2
machinery. I'm quite convinced (maybe I'll try it someday) that REORG
with compression causes an increase on the order of 100% CPU, even
though many phases (like SORTKEYS, BUILD) are not impacted by
compression. Using KEEPDICTIONARY, I think it will still be quite a
noticeable increase (you still have to decompress columns for sorting
and key building). The same applies to well-designed batches accessing
properly designed tables.

Here I'd like to repeat something I've said before: the more
inefficient your processing is, the less (percentage-wise) the CPU
increase due to compression will be, and vice versa.
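
That inverse relationship is easy to see with a toy calculation; the
per-row costs below are invented units chosen only to show how the same
fixed decompression cost is diluted by expensive processing and
dominates cheap processing:

# The same fixed decompression cost as a percentage of different workloads.
DECOMP_COST = 5.0   # assumed fixed CPU cost to decompress one row

scenarios = {
    "OLTP call (connect, access path, formatting, ...)": 2000.0,
    "well-tuned batch fetch": 20.0,
    "REORG-style raw unload": 5.0,
}

for name, base in scenarios.items():
    print(f"{name}: +{DECOMP_COST / base:.1%} CPU from decompression")

Under these assumptions the overhead is a fraction of a percent per
OLTP transaction but on the order of 100% for a raw unload, matching
the REORG worst case described above.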

Somebody (possibly Jim) also asked a very good question, which I wish I'd
thought of:
"If you think compression is so good, why aren't all your tablespaces
compressed?".

Any takers for that one?

Dr. Michael Ebert
DB2 Database Administrator
aMaDEUS Data Processing
Erding / Munich, Germany





Leslie Pendlebury-Bowe

Re: DB2 Compression on RVA boxes
(in response to Adrian Savory)
>By the way, DB2 for AS400 automatically compresses data. One has no
>choice in turning it on or off.

Ouch!

I have never worked on this platform (though I have on Unix/NT/OS2,
etc.), and this would be a real pain.

Leslie Pendlebury-Bowe




Lynne Flatley

Re: DB2 Compression on RVA boxes
(in response to Leslie Pendlebury-Bowe)
It's my opinion that you wouldn't want to compress 'everything' because
some tablespaces are so small that the compression dictionary would be
larger than the space required to hold the data. I have seen compression
cut a batch cycle in half (two hours to one) in an application's
database that contained multi-million-row tables with insert, update,
and delete processing.

By the way, DB2 for AS400 automatically compresses data. One has no choice
in turning it on or off.



Adrian Savory

Re: DB2 Compression on RVA boxes
(in response to Lynne Flatley)
A few more reasons....

First and foremost, compression is used to make the object smaller,
either to save DASD or for performance reasons, i.e. better coverage
within your buffer pools. (If you are using RAID then you may not be so
concerned about the former, though you can still save more space than if
you just use RAID compression; see chapter 2 of the 'Using RVA and
SnapShot...' redbook, SG24-5333-00, for some figures for RVA.) This may
be stating the obvious, but it is possible to implement compression and
end up using more space, as you need to accommodate the compression
dictionary. Run DSN1COMP to make sure that compression is even worth
considering.
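
A hypothetical decision helper in the spirit of that advice, written as
a Python sketch rather than actual DSN1COMP JCL; the input figures and
the 16-page dictionary allowance are illustrative assumptions, not
utility output:

# Do the estimated savings survive the dictionary overhead?
DICT_PAGES = 16   # assume the dictionary costs roughly 16 pages

def worthwhile(pages_uncompressed, pct_saved):
    pages_compressed = int(pages_uncompressed * (1 - pct_saved)) + DICT_PAGES
    return pages_compressed < pages_uncompressed

print(worthwhile(12, 0.40))      # tiny table: False, dictionary eats the gain
print(worthwhile(60000, 0.40))   # large table: True

The real utility reports much richer estimates; the point of the sketch
is only that a small table can compress well in percentage terms and
still come out larger once the dictionary pages are added.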

If DSN1COMP shows that compression should be considered, then look at
how the tablespace is accessed. If you have a large, static tablespace
that is randomly accessed, then this is an obvious candidate, as there
will be little overhead. You may even see an overall reduction in CPU
and I/O when you take into account the savings made when running image
copies, for example. On the other hand, you would not consider
compressing a small, volatile tablespace, and probably not a large
tablespace that is accessed mainly sequentially (though it depends!).

DB2 compression is not a panacea. I've used it successfully in a number
of environments. At the same time, I've done benchmarking showing that
in certain circumstances there is a large CPU overhead with some
activities on some tables, specifically tablespace scans, mass data
changes (particularly mass updates), and reorgs without KEEPDICTIONARY.

Adrian Savory
ACSIS Ltd






John Arbogast

Re: DB2 Compression on RVA boxes
(in response to Leslie Pendlebury-Bowe)
If your processor doesn't support hardware
compression, call your electrical supplier and have
them start their reserve power generating capacity
before using software compression. Your data center
lights might dim like in a B-grade prison movie.

:)





Roger Miller

Re: DB2 Compression on RVA boxes
(in response to John Arbogast)
When we ran our first tests of REORG with KEEPDICTIONARY back in V3, we
found that the savings in handling the shorter files for input and
output more than made up for the cost of decompression for extracting
keys. The runs ranged from no difference to about 10% less CPU time.
Our runs are in the redbook on DB2 performance, GG24-4284.

Compressing everything is counterproductive. There is a cost in storage
for the compression dictionary. There are situations where the CPU
increase is too large, and others where the improvement in space is too
small. There are many situations where compression is just exactly
right: reducing the size of buffer pools or improving their efficiency,
reducing the cost of disk, and improving the speed and efficiency of
processing.

Roger Miller



Richard Humphris

Re: DB2 Compression on RVA boxes
(in response to Roger Miller)
Hi Jim,

I think there are a lot of ways of running DB2 benchmarks that can skew
the results and/or produce different CPU ratios. The primary way to
skew a test is by using a small amount of data that has already been
read and therefore resides in the buffer pool.

Even if a test is controlled to eliminate buffer pool effects (either
both benchmarks use pre-loaded data or neither does), other factors may
interfere with a benchmark: the load on DB2, whether a checkpoint
occurred during the benchmark, whether a buffer pool threshold was hit
during the benchmark, and so on.

Using a poor candidate for compression will skew the results too. A
poor candidate might be:
- a small table which is constantly referenced; such a table will
(almost always) reside in the buffer pool with or without compression.
- in this case, referenced rows will have to be decompressed constantly,
but with no I/O savings and with a CPU cost.
- benchmarking such tables can produce really skewed results if one of
the tables was already in the buffer pool prior to its benchmark.

A good candidate for compression (where CPU differences are minimal)
would be:
- a very large table (lots of rows) containing data in the rows which
can be compressed (i.e. the non-index portion).
- such a table, even if referenced constantly, would be considered
large if it could never reside in the buffer pool in its entirety;
therefore I/O will be needed to service prefetch, sequential, and
random getpage requests.

So I'm interested in hearing about what you had originally compressed
(# of rows, row size, index size (in row)) and the size of your buffer
pool, whether the table is or isn't loaded in the buffer pool, etc.,
when you ran your benchmarks.

Rich







Richard Humphris

Re: DB2 Compression on RVA boxes
(in response to Jim Lewandowski)
Hi Jim,

I'm just going to throw out a couple of thoughts. I don't know what
DB2's compression looks like or the internals of DB2, but I'll make
some statements and ask some questions that you (or someone else) may
be able to answer.

I think the following statement is true: when the Administration Guide
says "The corresponding index data is NOT compressed," it means that
the index columns in the data table will be compressed, but the index
itself will not be.

This means that SQL for which DB2 uses the index will be almost as
efficient as before. But if DB2 doesn't have an index to use (or
decides to use a tablespace scan anyway), then all 1,000 cylinders will
have to be decompressed for the SQL statement. This may account for a
significant waste of CPU. On the other hand, a great number of short
transactions using the index (and avoiding unnecessary decompression)
may keep the overall CPU increase to acceptable levels. It may be that
compressing a table makes creating secondary indexes (if they'll be
used) more important. Does this also mean that coaxing DB2 into using
indexes becomes more important with compressed tables too?

Another question: does DB2 treat uncompressed tables any differently
from compressed tables when deciding whether an index should be
ignored?

By the way, the worst thing that could happen is that your row sizes
are large (almost the size of a 4K page) and the rows' compressed
equivalent is greater than 2K. Compressing such a perverse table will
result in no space savings for the table but will still require
additional CPU to compress/decompress the rows. Note: IBM documented
this behaviour.

Rich


craig patton

Re: DB2 Compression on RVA boxes
(in response to Richard Humphris)
In response to the indexes: you are correct, data in the index is in
uncompressed format; ONLY the data in the table gets compressed.
Tablespace scans are normally MORE efficient with compression, as fewer
GETPAGE requests are needed, resulting in FEWER I/Os. In my experience,
the savings in I/O processing more than make up for the additional CPU
required for data compression/decompression. Keep in mind as well that
DB2 decompression is extremely fast and efficient. Compression is fast
too, but more involved than decompression.

As for different considerations for access via indexes when compressed
or not compressed: one of the factors in choosing an access path is the
number of data pages required for a tablespace scan. Compression
reduces the number of data pages required, so DB2 may choose a
tablespace scan more often than without compression. HOWEVER, I have
NOT seen this as a trend.

One other point: DB2 uses a compression dictionary (built from the data
during LOAD or REORG). This dictionary comprises the first 15 or 16
data pages in the tablespace VSAM file (each partition has its own
dictionary in a partitioned tablespace). The dictionary is loaded into
the bufferpool and remains resident, so NO I/O is required to use the
dictionary pages once loaded.

One rule that I follow in choosing DB2 compression is to make sure the
processor (such as the IBM CMOS processors) contains the compression
MICROCODE instructions. In this case, the published average CPU
overhead is .5% (yes, point 5 percent). I believe it is around 30%
overhead using SOFTWARE compression (on processors that do NOT have the
microcode in the hardware).

I am a STRONG supporter of using compression whenever possible and
viable, even for objects that take heavy inserts and updates.

HTH,
Craig Patton
DB2 DBA


>From: "Humphris,Richard P.(NXI)" <[login to unmask email]>
>Reply-To: DB2 Data Base Discussion List <[login to unmask email]>
>To: [login to unmask email]
>Subject: Re: DB2 Compression on RVA boxes
>Date: Tue, 14 Dec 1999 09:31:50 -0600
>
>Hi Jim,
>
>I'm just going to throw out a couple of thoughts. I don't know what DB2's
>compression looks like or the internals of DB2. But I'll make some
>statements and ask some questions that you (or someone else) may be able to
>answer.
>
>Is think the following statement is true:
>When the administrator guide says "The Corresponding index data is NOT
>compressed.". I think this means that the the index columns in the data
>table will be compressed; but the index table will not compressed.
>
>This means that:
>SQL code for which DB2 uses the index will be almost as efficient as
>before.
>But if DB2 doesn't have an index to use (or decides to use a tablespace
>scan anyway), then all 1,000 cylinders will have to be decompressed for the
>SQL statement. This may account for a significant waste of CPU. On the
>other hand, a great number of short transactions using the index (and
>avoiding unnecessary decompression) may keep the overall CPU increase to
>acceptable levels. It may be that compressing a table makes creating
>secondary indexes (if they'll be used) more important. Does this
>also mean that fooling DB2 into using indexes becomes more important with
>compressed tables too?
>
>Another question: does DB2 treat uncompressed tables any differently from
>compressed tables when trying to decide if an index should be ignored?
>
>Btw, the worst thing that could happen is that your row sizes are large
>(almost the size of a 4K page) and the rows' decompressed equivalent is
>greater than 2K. Compressing such a perverse table will result in no
>space savings for the table but will still require additional CPU usage to
>compress/decompress each row. Note: IBM documented this behaviour.
>
>Rich
> > -----Original Message-----
> > From: Jim Lewandowski [SMTP:[login to unmask email]
> > Sent: Monday, December 13, 1999 1:36 AM
> > To: [login to unmask email]
> > Subject: Re: DB2 Compression on RVA boxes
> >
> > Since the DBAs think I'm just a "DASD guy", I was never involved in the
> > benchmarks. I am ONLY able to look at it post-compression-decision.
> > Here is the pertinent info:
> >
> > LARGE table (about 1000 cyls - I don't know exactly as it is ptn'ed, I
> > believe)
> >
> > 90+% buffer hit (neither random nor seq. access is far away from the
> > other - i.e. both are high, as we have/had about 60,000 pages for this
> > compressed table).
> >
> > As the SQL started, a lot of the data pages were in the bufferpool. I
> > would have to look at NPAGES for all the data partitions to know what %
> > is sitting (60,000 bufferpool pages) in the bufferpool at any point in
> > time.
> >
> > From my memory, these queries were totally CPU bound (virtually no
> > physical I/O at all - i.e. 1 complete CP usage in the complex).
> >
> > Since, I am using MVS Omegamon, I can ONLY look (INSPECT) the thread TCB
> > CPU when it is a long-running query (semi-ugly join or high # of data
> > rows hit).
> >
> > When using inspect, I see 33% to 67% (depending on what else the SQL may
> > be doing) of the CPU right on the CMPSC X'B263' instruction. As I
> > clearly stated in older posts, it COULD be an issue of SRB
> > scheduling/sampling that always "happens" to be showing the thread TCB
> > PSW on the CMPSC instruction, but I tried different sampling rates to
> > see if it changed. I did not see any noticeable difference.
> >
> > I had also asked others who have STROBE or TMON to see what THEY see in
> > a semi-long-running enough SQL to get some numbers.
> >
> > Obviously, running a before and after with compress off/on would be
> > ideal but I was not allowed to view the benchmarking that allowed the
> > compression decision.
> >
> > Jim Lewandowski
> >
> >
> >




Richard Yevich

Re: DB2 Compression on RVA boxes
(in response to craig patton)
Richard,

First, DB2 compression is a "dictionary" style
algorithm, very similar to PKZIP -- compression
builds a dictionary, decompression is a lookup. It is
also row level, which means some rows do not get
compressed. It is not "page level compression".

Table spaces only are compressed, never index spaces.
Compression in the table is at a "dictionary match"
level, not column level - pure data only - so any
reference to index columns regarding what is or is
not compressed is meaningless. Understand that when I
say 'dictionary match' I am greatly simplifying the
tree structure used to implement this type of
algorithm.

An SQL tablespace scan WILL NOT CAUSE DECOMPRESSION
of 100% of the data. It will only cause decompression
of the rows SELECTED, and only if those rows were
stored compressed.
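
As a rough check on how much of a table is actually
stored compressed (a sketch -- the creator and table
names are hypothetical), RUNSTATS populates
PCTROWCOMP in the catalog:

  -- PCTROWCOMP = percentage of active rows stored
  -- in compressed format; rows that would not
  -- shrink are kept uncompressed and cost nothing
  -- to "decompress"
  SELECT NAME, PCTROWCOMP
    FROM SYSIBM.SYSTABLES
   WHERE CREATOR = 'MYUSER'
     AND NAME = 'MYTABLE';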

Also, the more rows that are compressed, the more the
costing of the SQL changes, and hence the access path
selection and methods may change as well; for
example, list prefetch may become more desirable.
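
One way to see whether compression changed the chosen
path (a sketch -- the query number is arbitrary, the
table and column names are hypothetical, and a
PLAN_TABLE must already exist):

  EXPLAIN PLAN SET QUERYNO = 101 FOR
    SELECT COL1 FROM MYUSER.MYTABLE
     WHERE COL2 = 'X';

  -- ACCESSTYPE 'R' = tablespace scan, 'I' = index;
  -- PREFETCH 'L' = list, 'S' = sequential prefetch
  SELECT QUERYNO, ACCESSTYPE, ACCESSNAME, PREFETCH
    FROM PLAN_TABLE
   WHERE QUERYNO = 101;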

As to your comment on the 4K-page versus 2K-row-size
issue -- that is one of the reasons there are 4K, 8K,
16K, and 32K page sizes to benefit from in V6.
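
In V6 the page size follows the bufferpool, so (a
sketch with hypothetical names) putting such a table
in an 8K pool lets a compressed row larger than 2K
still pack two to a page:

  -- Page size is implied by the bufferpool:
  -- BP8K0 gives 8K pages
  CREATE TABLESPACE BIGROWTS IN MYDB
         BUFFERPOOL BP8K0
         COMPRESS YES;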

Regards,
Richard Yevich

--- "Humphris,Richard P.(NXI)"
<[login to unmask email]> wrote:
> Hi Jim,
>
> I'm just going to throw out a couple of thoughts. I
> don't know what DB2's
> compression looks like or the internals of DB2. But
> I'll make some
> statements and ask some questions that you (or
> someone else) may be able to
> answer.
>
> Is think the following statement is true:
> When the administrator guide says "The Corresponding
> index data is NOT
> compressed.". I think this means that the the index
> columns in the data
> table will be compressed; but the index table will
> not compressed.
>
> This means that:
> Sql code that DB2 uses the index for, will be almost
> as efficient as before.
> But if DB2 doesn't have an index to use (or decides
> to use a tablespace
> scan anyway), then all 1,000 cylinders will have to
> be decompressed for the
> SQL statement. This may account for a significant
> waste of CPU. On the
> otherhand, a great number of short transactions,
> using the index (and
> avoiding unneccessary decompression) may keep the
> overall CPU increase to
> acceptable levels. It may be that compressing a
> table means creating
> secondary indexes (if they'll be used) may become
> more important. Does this
> also mean that fooling DB2 into using indexes
> becomes more important with
> compressed tables too?
>
> Another question: does DB2 treat decompressed
> tables any different from
> compressed tables when trying to decide if an index
> should be ignored?
>
> Btw, the worst thing that could happen is that your
> row sizes are large
> (almost the size of a 4k page) and the rows
> decompressed equivlent is
> greater than 2k. A compress of such a perverse
> table will result in no
> space savings for the table but with still require
> additional cpu usage to
> compress/uncompress the row. Note: IBM documented
> this behaviour.
>
> Rich
> > -----Original Message-----
> > From: Jim Lewandowski
> [SMTP:[login to unmask email]
> > Sent: Monday, December 13, 1999 1:36 AM
> > To: [login to unmask email]
> > Subject: Re: DB2 Compression on RVA boxes
> >
> > Since the DBAs think I'm just a "DASD guy", I was
> never involved in the
> > benchmarks. I am ONLY able to look at it
> post-compression-decision.
> > Here is the pertinent info:
> >
> > LARGE table (about 1000 cyls - I don't know
> exactly as it is ptn'ed, I
> > believe)
> >
> > 90+% buffer hit (neither randomr or seq. access is
> far away from the
> > other - i.e. both are high as we have/had about
> 60,000 pages for this
> > compressed) table.
> >
> > As, the SQL started, a lot of the data pages are
> in the bufferpool. I
> > would have to look at NPAGES for all the data
> partitions to know what %
> > is sitting (60,000 bufferpool pages) in the
> bufferpool at any point in
> > time.
> >
> > From my memory, these queries were totally CPU
> bound (virtually no
> > physical I/O at all - i.e. 1 complete CP usage in
> the complex).
> >
> > Since, I am using MVS Omegamon, I can ONLY look
> (INSPECT) the thread TCB
> > CPU when it is a long-running query (semi-ugly
> join or high # of data
> > rows hit).
> >
> > When using inspect, I see 33% to 67% (depending on
> what else the SQL may
> > be doing) of the CPU right on the CMPSC X'B263'
> instruction. As I
> > clearly stated in older posts, it COULD be an
> issue of SRB
> > scheduling/sampling that always "happens" to be
> showing the thread TCB
> > PSW on the CMPSC instruction, but I tried
> different sampling rates to
> > see if it changed. I did not see any noticable
> difference.
> >
> > I had also asked others who have STROBE or TMON to
> see what THEY see in
> > a semi-long-running enough SQL to get some
> numbers.
> >
> > Obviously, running a before and after with
> compress off/on would be
> > ideal but I was not allowed to view the
> benchmarking that allowed the
> > compression decision.
> >
> > Jim Lewandowski
> >
> >
> > To change your subscription options or to cancel
> your subscription visit
> > the DB2-L webpage at http://www.ryci.com/db2-l.
> The owners of the list can
> >
>
>
> To change your subscription options or to cancel
> your subscription visit the DB2-L webpage at
> http://www.ryci.com/db2-l. The owners of the list
> can
>
>
>



=====
Regards,
Richard
+====+====+====+====+
[login to unmask email]



Jim Lewandowski

Re: DB2 Compression on RVA boxes
(in response to Richard Humphris)
So, the question is: does your processor have the ability to execute
the CMPSC instruction directly?


VERY good question. I will have to check the CVT bits and see if
hardware compression is enabled.

Jim Lewandowski



Richard Humphris

Re: DB2 Compression on RVA boxes
(in response to Richard Yevich)
I just checked via IBMIN: the software PTF that fakes the CMPSC instruction
was implemented as a PTF to the MVS operating system (not to DB2).

Therefore, DB2 doesn't have to check the processor for the microcode; it
just issues the CMPSC (B263) instruction and trusts that it will be handled
via microcode or software. If the microcode exists, the instruction runs
fast. If it doesn't, a program check (or the equivalent) occurs, MVS traps
it, and software instructions simulate the B263 instruction. Therefore, DB2
(and Omegamon) won't know how the instruction was executed; but Omegamon
will measure how long the microcode and/or software took to handle the
compression.

So, the question is: does your processor have the ability to execute the
CMPSC instruction directly?

> -----Original Message-----
> From: Jim Lewandowski [SMTP:[login to unmask email]
> Sent: Monday, December 13, 1999 1:36 AM
> To: [login to unmask email]
> Subject: Re: DB2 Compression on RVA boxes
>
> <snip>