Recently, we had a discussion with my colleagues about managing the column access in DB2 for z/OS via the column masks. One of my smart mates had a very good question - how do column masks work with joins? His idea was pretty simple: if column masks are evaluated during the output only, why not to generate a "helper" table with guessed values and then join this helper table with the original masked table? If the inner join returns a row, then we know the masked value. Let's look at this problem in more detail in the following short blog.
Note. This blog mostly references DB2 for z/OS, but the SQL and the examples are valid for DB2 LUW as well.
First, let's start with column masks. They were introduced together with row permissions in DB2 10 for z/OS (respectively in DB2 10.1 for LUW) and if you look at the IBM documentation you will find the following definition of column masks: "A column mask is a database object that describes a specific column access control rule for a column. In the form of an SQL CASE expression, the rule specifies the condition under which a user, group, or role can receive the masked values that are returned for a column."
Moreover, you will find the following note: "The application of column masks affects the final output only; it does not impact the operations, such as predicates and ordering, in an SQL statement." So, it's confirmed - according to the previous sentence the column masks affect the output only.
An example is worth a thousand words. You probably have your IBM EMP sample table or any equivalent test table. Let's use this EMP table for defining a column mask. Be aware that to be able to manage the column access you must be granted the SECADM or SYSADM authority depending on your SEPARATE_SECURITY ZPARM setting.
When you run a SELECT * FROM EMP you'll get the list of all employees with their salaries and bonuses. Now, if you want to mask a bonus (from the column BONUS) that is higher than a certain amount (for example 500), you can create a column mask as follows (check this example in the IBM reference):
create mask bonus_mask on emp for column bonus return case when (bonus > 500.00) then null else bonus end enable;
The case expression returns the bonus if it is equal or less than 500, or returns NULL otherwise, effectively hiding the high bonuses from the users. However, the column masks are not activated for your table, unless you do so explicitly via ALTER TABLE:
alter table emp activate column access control;
Since then on, when you do the SELECT LASTNAME,BONUS FROM EMP ORDER BY BONUS, LASTNAME, you get the following output (the rows where the column mask is applied have NULL bonus, the output is truncated):
LASTNAME BONUS ---------- ------ ... WONG 500.00 YAMAMOTO 500.00 YOSHIMURA 500.00 BROWN NULL HENDERSON NULL JOHN NULL LUTZ NULL MARINO NULL ... LUCCHESI NULL HAAS NULL HEMMINGER NULL
So far so good, everything seems ok and the bonus is nulled where you wanted. This is the place where the bad user comes to play. This bad user wants to know the bonuses of all employees regardless the column mask. To do that he could perform an inner join of EMP and a dummy table containing all possible bonuses. How to create such a dummy table? Easy enough, what about recursive SQL for example? If I am a bad user, I could use a query like this:
with hack(number) as ( select 1 from sysibm.sysdummy1 union all select number+1 from hack where number <= 30000) select lastname, bonus, number as unmasked_bonus from emp, hack where bonus = number order by bonus, lastname;
This query contains a recursive common table expression that builds a table HACK having one column NUMBER with rows that range from 1 to 30000, but of course I could use a different range or step. The HACK table is then joined with the EMP table; and what's important is the JOIN condition, which takes the BONUS column we want to hack. The selected column then references the HACK table to get the unmasked value. The output is as follows:
LASTNAME BONUS UNMASKED_BONUS ---------- ------ -------------- ... WONG 500.00 500 YAMAMOTO 500.00 500 YOSHIMURA 500.00 500 BROWN NULL 600 HENDERSON NULL 600 JOHN NULL 600 LUTZ NULL 600 MARINO NULL 600 ... LUCCHESI NULL 900 HAAS NULL 1000 HEMMINGER NULL 1000
The bad user can clearly see the bonuses.
I can imagine that creating a HACK table for certain types of columns might be more challenging, but still there is a brute force possible. So, if a bad user can query the table with masked data, he could eventually figure out the secrets. Therefore, you must be careful when using column masks as they potentially cannot guarantee the security level you might desire.
What are your ideas on this? Are you using column masks? If so, please share your experience in the comments.
Discussion on IDUG DB2-L forum