Unmasking The Masked Data by Emil Kotrc

Recently, we had a discussion with my colleagues about managing the column access in DB2 for z/OS via the column masks. One of my smart mates had a very good question - how do column masks work with joins? His idea was pretty simple: if column masks are evaluated during the output only, why not to generate a "helper" table with guessed values and then join this helper table with the original masked table? If the inner join returns a row, then we know the masked value. Let's look at this problem in more detail in the following short blog.

Note. This blog mostly references DB2 for z/OS, but the SQL and the examples are valid for DB2 LUW as well.

Column masks

First, let's start with column masks. They were introduced together with row permissions in DB2 10 for z/OS (respectively in DB2 10.1 for LUW) and if you look at the IBM documentation you will find the following definition of column masks: "A column mask is a database object that describes a specific column access control rule for a column. In the form of an SQL CASE expression, the rule specifies the condition under which a user, group, or role can receive the masked values that are returned for a column."

Moreover, you will find the following note: "The application of column masks affects the final output only; it does not impact the operations, such as predicates and ordering, in an SQL statement." So, it's confirmed - according to the previous sentence the column masks affect the output only.

Example

An example is worth a thousand words. You probably have your IBM EMP sample table or any equivalent test table. Let's use this EMP table for defining a column mask. Be aware that to be able to manage the column access you must be granted the SECADM or SYSADM authority depending on your SEPARATE_SECURITY ZPARM setting.

When you run a SELECT * FROM EMP you'll get the list of all employees with their salaries and bonuses. Now, if you want to mask a bonus (from the column BONUS) that is higher than a certain amount (for example 500), you can create a column mask as follows (check this example in the IBM reference):

create mask bonus_mask on emp
 for column bonus
 return 
   case 
     when (bonus > 500.00) then null
     else bonus
   end
 enable;

The case expression returns the bonus if it is equal or less than 500, or returns NULL otherwise, effectively hiding the high bonuses from the users. However, the column masks are not activated for your table, unless you do so explicitly via ALTER TABLE:

alter table emp activate column access control;

Since then on, when you do the SELECT LASTNAME,BONUS FROM EMP ORDER BY BONUS, LASTNAME, you get the following output (the rows where the column mask is applied have NULL bonus, the output is truncated):

LASTNAME   BONUS
---------- ------
...
WONG       500.00
YAMAMOTO   500.00
YOSHIMURA  500.00
BROWN        NULL
HENDERSON    NULL
JOHN         NULL
LUTZ         NULL
MARINO       NULL
...
LUCCHESI     NULL
HAAS         NULL
HEMMINGER    NULL

Unmasking the data

So far so good, everything seems ok and the bonus is nulled where you wanted. This is the place where the bad user comes to play. This bad user wants to know the bonuses of all employees regardless the column mask. To do that he could perform an inner join of EMP and a dummy table containing all possible bonuses. How to create such a dummy table? Easy enough, what about recursive SQL for example? If I am a bad user, I could use a query like this:

with hack(number) as (                           
 select 1 from sysibm.sysdummy1          
 union all                                  
 select number+1 from hack where number <= 30000)
select lastname, bonus, number as unmasked_bonus from emp, hack where bonus = number order by bonus, lastname;

This query contains a recursive common table expression that builds a table HACK having one column NUMBER with rows that range from 1 to 30000, but of course I could use a different range or step. The HACK table is then joined with the EMP table; and what's important is the JOIN condition, which takes the BONUS column we want to hack. The selected column then references the HACK table to get the unmasked value. The output is as follows:

LASTNAME   BONUS  UNMASKED_BONUS
---------- ------ --------------
...
WONG       500.00            500
YAMAMOTO   500.00            500
YOSHIMURA  500.00            500
BROWN        NULL            600
HENDERSON    NULL            600
JOHN         NULL            600
LUTZ         NULL            600
MARINO       NULL            600
...
LUCCHESI     NULL            900
HAAS         NULL           1000
HEMMINGER    NULL           1000

The bad user can clearly see the bonuses.

Final word

I can imagine that creating a HACK table for certain types of columns might be more challenging, but still there is a brute force possible. So, if a bad user can query the table with masked data, he could eventually figure out the secrets. Therefore, you must be careful when using column masks as they potentially cannot guarantee the security level you might desire.

What are your ideas on this? Are you using column masks? If so, please share your experience in the comments.

References

Column masks in Security reference

CREATE MASK syntax

Discussion on IDUG DB2-L forum

Recent Stories
Db2 for z/OS locking basics

A Beginners Guide to Locks, Latches, Claims and Drains in DB2 for z/OS by Steve Thomas

Fun with Date Ranges