The Power and Fun of SQL – Let’s Have Some Fun with an April 2015 Online SQL Challenge

IDUG is dedicated to the support of not only database administrators, but also developers who use our databases. This means that we’re dedicated to sharing information about the many features of SQL and of DB2 in particular. Why is this important? Well, for some very basic reasons.

  • Delivering SQL-based solutions can happen quickly

Developers can generally construct and test SQL quickly. Therefore, if the SQL can incorporate significant programming features then the resulting application processing can also be developed and tested quickly.

  • SQL is portable

SQL runs on the server to which it is directed. Therefore, if more application logic can be pushed into an SQL statement, then that statement can be placed into a variety of application programs or embedded in various tools, on a variety of platforms. This gives an enterprise leverage in selecting platforms.

  • Advanced SQL can be a performance advantage

This performance advantage is realized when a process that requires multiple database calls can be consolidated into a single database call. Reducing the calls to the database, the network message traffic, and quantity of data sent by the database and processed by the application can result in a dramatic performance improvement of a data intensive process.

SQL is a Programming Language

Have read you the SQL manuals lately? There is one for DB2 for z/OS and two for DB2 for LUW. Many people use these reference as just that, references. However, I frequently encourage people to actually read the manuals. In the very least you should read the “SQL Queries” chapters (chapter 6 in the DB2 for LUW SQL Reference Volume 1, chapter 4 in the DB2 for z/OS SQL Reference). It is absolutely amazing what you can do with the SQL language in DB2:

  • Subqueries, joins, anti-joins
  • Complex built-in functions and user-defined functions
  • Online analytical functionality
  • Aggregation
  • Recursion, looping
  • XML processing and generation
  • JSON processing
  • SQL PL programming for stored procedures, functions, and triggers

Getting a good SQL education can be a bit of a challenge. There are some good classes out there and plenty of examples to be found on the internet. However, some of the best education possible is to just give it a try. This is actually really easy for developers to do as DB2 Express-C and IBM Data Studio can both be downloaded for free from the IBM web site, and the DB2 sample database can be created for immediate testing of various SQL capabilities.

The Dojo Concept

The concept of the SQL Dojo was brought to IDUG and the IDUG Content Committee by Content Committee member Kurt Struyf. Kurt had attended a local application developer’s user group meeting and was blown away by the comradery and support brought out by the Dojo exercise. As IDUG was coming up with more ideas to support application developers, Kurt had the brilliant idea to bring the dojo concept to SQL and IDUG. It was an instant success and we’re expanding on it every year.

The concept of a dojo is quite simple. A group of developers (and/or DBAs and Sysprogs) gather together and are individually and randomly assigned to teams. Then a single computer is used on a single display with a portable keyboard and mouse to present a series of programming challenges. Teams are chosen randomly and each team is given just a few minutes, to solve the programming challenge. If the team solves the programming challenge, they move on to the next one; if they don’t solve the challenge before their time runs out, the same challenge (including previous teams’ unfinished work) moves on to the next randomly selected team. Once a team has been selected their number immediately goes back into the pool, and so they could be selected many times during the dojo and must stay engaged. Now comes the real fun; not only does a participant get to meet and work with other people in their group, but participants are also welcome to shout suggestions to the team working on the challenge in an effort to help them out. This can result in a very active and funny discussion with a timer running out, multiple people shouting out suggestions, and the team racing to implement the solution and beat the clock. The result is a fun and cooperative event where people with a common interest meet others, network, work cooperatively, and learn something about application development. It’s great fun and a great learning opportunity.

Kurt brought the SQL Dojo concept to the IDUG North America Technical Conference in 2013. We called it “Fun with SQL” and it was an instant hit! We’ve been expanding on it ever since.

April 2015 “Fun with SQL” Challenge

This month on the Content Committee blog we’ll be doing an online version of the “Fun with SQL” dojo. Each week we’ll present an SQL challenge and request the community to solve it. People can respond to the challenge in the comments section of the blog post, or they can email Kurt directly at kurt.struyf@lone-star.be with a solution or idea and he will post it to the blog. You can post a possible answer, ask a question, and comment on something someone else has posted, or offer alternatives to something that’s been posted. Don’t be shy about posting your answer if someone else has already posted. At the end of the month, the content committee will take the names of the people that have participated and randomly select one to receive a reward of an IPad case, courtesy of IBM, and one to receive a $50 Amazon voucher! So, don’t be shy, just dive in and share your experience or inexperience and get a free and fun education!

Here is the first SQL challenging based upon the DB2 sample database:

Construct a query that gives in one line per employee the employee’s first name, last name, department name, salary, as well as the average and total salary of the department where the employee works.

Here is an ERD representing the DB2 sample database:

A portion of the DB2 sample database for reference

 “Fun with SQL”, the SQL Dojo at IDUG Technical Conferences

If you are planning on attending an IDUG Technical Conference (or are considering attending), then consider the “Fun with SQL” dojo at the upcoming IDUG Technical Conferences. We will be having the dojo at IDUG North America, IDUG EMEA, and IDUG Austalasia! As I have mentioned before, it is geared toward developers, but everyone is welcome, and any skill level is welcome! We’ll have subject matter experts on had to guide you in the experience and you’ll have some laughs while meeting likeminded people and learning something about SQL. We’ll also be introducing “Fun with SQL” to the IDUG booth with new interactive challenges, led by IDUG hall of fame speaker David Simpson and the IDUG Content Committee Team. Another way to meet people, have a good time, and learn something!

Are you an application developer, or know application developers that have avoided IDUG conferences because they are DBA oriented? Well, that has changed! Not only do we have the SQL Dojo but we’ve increased the developer focused content dramatically. Check it out, attend, and have yourself some “Fun with SQL”!!! 

14 Comments

Solution suggestion

April 8, 2015 08:52 AM by Lennart Jonsson

Sketch for a solution

select e.firstnme, lastname, d.deptno, d.deptname, e.salary

         , sum(salary) over (partition by d.deptno)

         , avg(salary) over (partition by d.deptno)

from employee e

join department d

     on e.workdept = d.deptno

 

Possible Solution

April 8, 2015 09:05 AM by Jay Reavill

SELECT X.FIRSTNAME, X.LASTNAME, X.WORKDEPT,

       X.SALARY, Y.AVGSALARY, Y.TOTSALARY

  FROM

(SELECT FIRSTNAME, LASTNAME, WORKDEPT, SALARY

   FROM EMPLOYEE

) X

INNER JOIN

(SELECT DEPTNO, AVG(SALARY) AS AVGSALARY, SUM(SALARY) AS TOTSALARY

   FROM EMPLOYEE   A

            , DEPARTMENT B

  WHERE A.WORKDEPT = B.DEPTNO

  GROUP BY DEPTNO

) Y

ON X.WORKDEPT = Y.DEPTNO

WITH UR;

Reading the book as recommended...

April 8, 2015 09:53 AM by Armando Arranja

SELECT FIRSTNME, LASTNAME, DEPTNAME, SALARY,            
       SUMSAL, AVGSAL                                   
FROM  EMP E,  DEPT D,                                   
TABLE(SELECT SUM(SALARY) AS SUMSAL,                     
             AVG(SALARY) AS AVGSAL                      
       FROM  EMP EMPC                                   
       WHERE EMPC.WORKDEPT = E.WORKDEPT) AS CALCSAL     
WHERE E.WORKDEPT = D.DEPTNO                             
;

 

 

Another Solution

April 8, 2015 01:56 PM by Joe Carroll

WITH TEMP AS
   (SELECT WORKDEPT
                  ,SUM(SALARY) AS TTL_SALARY                     
                  ,AVG(SALARY) AS AVG_SALARY                      
       FROM  EMP                                   
      GROUP BY WORKDEPT
   )
SELECT E.FIRSTNME
              ,E.LASTNAME
              ,D.DEPTNAME
              ,E.SALARY
              ,T.TTL_SALARY
              ,T.AVG_SALARY                                   
 FROM EMP E
 JOIN DEPT D
   ON E.WORKDEPT = D.DEPTNO 
 JOIN TEMP T 
   ON E.WORKDEPT = T.WORKDEPT
;

keep 'em coming!

April 8, 2015 03:22 PM by Daniel Luksetich

I'm diggin' it! Keep the SQL coming!

Solution

April 8, 2015 08:14 PM by Kal Sub

with sal(workdept, avg,tot) as (select workdept, dec(avg(salary),10,2), dec(sum(salary),10,2) from employee group by workdept) select empno, firstnme, lastname, a.workdept, salary, avg, tot from employee a left outer join sal b on a.workdept = b.workdept

Another one

April 9, 2015 12:16 AM by Lennart Jonsson

Similar to Armando's solution but with Ansi joins and Lateral instead of table:

select e.firstnme, lastname, d.deptname, e.salary, x.sum_salary, x.avg_salary
from employee e
join department d
    on e.workdept = d.deptno
cross join lateral ( select sum(salary), avg(salary)
                               from employee a
                               where a.workdept = e.workdept ) as x (sum_salary, avg_salary)

In case anyone else wondered how to prevent the code from using double linespacing it helps to write the code in for example vi and then copy it from there. Don't touch it once it is pasted :-)

 

An other approach

April 9, 2015 11:09 AM by Peter Vanroose

The "base" table (logically speaking) is EMPLOYEE, since we want to see one line of output per employee.
So we don't even need joins!
[tt]
SELECT firstnme,
       lastname,
       (some scalar fullselect returning the deptname) AS deptname,
       salary,
       (scalar fullselect for avg_salary),
       (scalar fullselect for total_salary)
FROM   employee e
[/tt]
This is not the most performant solution, of course, since some work is duplicated, especially in the last two subselects.
(But that's a later issue, typically using CTE's to do preparatory work, and let the subqueries use those intermediates.)

Now for the three scalar fullselects:
1) [tt]
        (SELECT deptname FROM department WHERE deptno = e.workdept)
    [/tt]
                   (correlated, of course; also the other two will be.)
2)  [tt]
        (SELECT avg(salary) FROM employee WHERE workdept = e.workdept)
     [/tt]
3)  [tt]
        (SELECT sum(salary) FROM employee WHERE workdept = e.workdept)
      [/tt/

All combined:
[tt]
SELECT firstnme,
       lastname,
       (SELECT deptname FROM department WHERE deptno = e.workdept) AS deptname,
       salary,
       (SELECT avg(salary) FROM employee WHERE workdept = e.workdept) AS avg_salary_for_dept,
       (SELECT sum(salary) FROM employee WHERE workdept = e.workdept) AS tot_salary_for_dept
FROM   employee e
[/tt]

--  Peter Vanroose
     ABIS Training & Consulting,
     Leuven, Belgium.

Performance comparisons ? And ease of maintenance ?

April 10, 2015 07:37 AM by Philip Nelson

It is always instructive to think about the relative performance of each of the options.  Both the explain plan and DB2's rewritten pseudo-SQL are valuable for this.

Also looking at the various solutions offered, some are much easier to understand, and therefore for others to maintain than others (and also more easily extensible as the business requirements change).

Personally I favour Lennart Jonsson's original solution, using OLAP expressions, for both performance and ease of understanding at the moment.   Perhaps someone has time to do some analysis on the various solutions offered to see if my "hunch" is correct.

Phil Nelson

 

Analyses considerations

April 10, 2015 09:16 AM by Lennart Jonsson

Good idea Philip. One problem though is the low cardinality of the tables involved. I adjusted card for employee and department according to:

db2 "update syscat.tables set card = 1000000*card where tabname in ('EMPLOYEE','DEPARTMENT')"

My original query then got a cost of 9.52249e+09, and the second query got a cost of 1.95213e+11. I added the plans for these two questions at:

http://dustbite.se/sql_dojo/plans.html

 

Additional thoughts

April 10, 2015 09:35 AM by Lennart Jonsson

Faking statistics probably requires a bit more thought than I ended up doing here :-)

Organisations are hierarchical...

April 12, 2015 06:58 PM by Robert Mala

I would think that averages and totals should include employees in sub-departments.  Organisations are normally structured as a hierarchy where org-units are made of other org-units, which are made of other org-units, and so on.   This is the case for SPIFFY which is made of PLANNING, INFORMATION CENTER, DEVELOPMENT CENTER, and SUPPORT SERVICES.   The average and total employee salary for SPIFFY would be all employees in A00 and below, not just those working directly to A00, and so on for other departments.

 

hierarchical solution

April 14, 2015 06:46 AM by Daniel Luksetich

Robert, go for it. Let's see your query!

A solution...

April 15, 2015 02:42 AM by Robert Mala

Thanks for the prompt Daniel.   A solution is to transitively close DEPARTMENT(DETPNO, ADMRDEPT) before aggregating.  For example, the inclusion of (D11,A00), (D21,A00) transitively closes (D01,A00), (D11,D01), (D21,D01).  Here’s the query:

 

with

  department_tc ( deptno, admrdept ) as (

    (

      select d.deptno

           , d.admrdept

        from department d

       union

      select deptno

           , deptno

        from department

    )

    union all

    select d.deptno

         , c.admrdept

      from department_tc c

         , department    d

     where c.admrdept <> c.deptno

       and c.deptno = d.admrdept

)

select d.admrdept

     , sum(e.salary) sum_salary

     , avg(e.salary) avg_salary

  from department_tc d

       left outer join employee e

         on e.workdept = d.deptno

 group

    by d.admrdept

;

Recent Stories
Selecting Appropriate Data Architecture Technologies

SQL Tuning for Developers – Part 2 Explain Output

Some of my favorite tunings tips and tricks for DB2 by Joe Geller