Wednesday, May 23, 2012

Partial Early Desk Cleaning - New in DB2

Remember those days when your parents told you to clean up your room or your desk? There are two strategies for cleaning up a room or a desk: do a little bit here and there, or wait till it is really crowded (and filthy...?). The latter is more of a "last resort" action, e.g., before visitors arrive or someone gets really, really angry. The former (today often called "pro-active") means that some time is taken daily or weekly to clean up parts of the room or the desk. In the best scenario, the desk always shines, there is rarely a need for a bigger clean-up, and it gives a good feeling (or it could mean you don't really have a real job...).

Now let's talk about DB2. Starting with version 10 it features PED and PEA. The first is Partial Early Distinct (PED) and means that the big clean-up task of removing duplicates from a result set is not left entirely to the end, but is started as early as possible ("here and there"). The big advantage is that it speeds up the query: smaller intermediate result sets are moved through the system and less memory is used for the sort heap (needed for eliminating duplicates), among other benefits. Partial Early Aggregation (PEA) works similarly and applies to GROUP BYs.
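To make the "here and there" idea a bit more concrete, here is a small Python sketch of the general technique. It is not how DB2 implements PED internally. The function names, the tiny FIFO cache, and the `cache_size` parameter are all made up for illustration. The early step only catches duplicates that happen to still be in the small cache, so a final, exact duplicate removal is still needed, but it has fewer rows to work on.

```python
from typing import Dict, Hashable, Iterable, Iterator, List


def partial_early_distinct(rows: Iterable[Hashable], cache_size: int = 2) -> Iterator[Hashable]:
    """Drop duplicates that hit a small, bounded cache (hypothetical helper).

    This is only a partial filter: once a value has been evicted from the
    cache, a later duplicate of it slips through and is left for the final step.
    """
    cache: Dict[Hashable, None] = {}  # dicts keep insertion order, so this acts as a FIFO
    for row in rows:
        if row in cache:
            continue  # duplicate removed early, it never reaches later operators
        if len(cache) >= cache_size:
            cache.pop(next(iter(cache)))  # evict the oldest cached value
        cache[row] = None
        yield row


def final_distinct(rows: Iterable[Hashable]) -> List[Hashable]:
    """The big clean-up at the end: exact duplicate removal (hypothetical helper)."""
    return sorted(set(rows))


rows = [1, 1, 2, 1, 3, 3, 2, 4, 5, 1, 1, 6, 2]
reduced = list(partial_early_distinct(rows, cache_size=2))
result = final_distinct(reduced)

print(len(rows), "rows in,", len(reduced), "rows after the early step")
print("distinct values:", result)
```

The final result is the same as running the exact duplicate removal on the full input; the early step only shrinks what has to be moved and sorted along the way. A partial early aggregation could be sketched the same way, with small per-group partial results that are merged at the end.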

More on "partial early" processing is explained in a paper on DB2 V10.1 query performance enhancements, published on the DB2 LUW Best Practices website.

More later, I promised my boss to clean up my desk...