Monday, July 28, 2014

What's Going On? - DB2 Workload Management: Identification of Activities

In an earlier blog post I had written about why Workload Management is needed. It's not just something for the database system or on the operating system level, it is really useful and done in "real life". But what is managed in the system, how are you able to identify activies in a DB2 database system? I am going to explain that today.

Before I dig deeper into the identification, first we need to clarify what is meant with "activity". It could be almost anything going on in the database system that is related to a single database and could be both user- and system-related tasks. The important distinction is that is on the database level, not for a DB2 instance. Identification of activities deals with three questions: WHO is doing WHAT on my database and WHERE is that data located?

The WHO can be answered by looking at the connection properties such as:
  • Who is the user and which group does the user belong to? 
  • Is the user operating in a special role?
  • From where is the user connecting, does the machine have a name, is it from a specific application?
In DB2 these properties can be checked and used for identification of an activity by defining a workload (CREATE WORKLOAD). The workload object deals with the WHO. The WHAT and WHERE aspect of identification are handled by work classes. They are defined as part of a so-called work class set (CREATE WORK CLASS SET). Each work class can be used to identify an activity based on the type of work it is performing on the database and its related costs (the WHAT part). The type of work could be any combination of LOADing data, defining, altering or dropping objects (DDL), or read or write operations as part of select, insert, update, and delete statements (DML). The cost is what has been estimated by the DB2 compiler/optimizer and is the expected overall execution cost (timeron cost) or the cardinality (how many rows are we expecting in the result). Note that the actual cost of an activity is dealt with as part of controlling and managing the activities which I plan to describe in a later article.

By specifying a "data tag" for a work class, it can be related to storage groups or tablespaces and their priority. This is how activity can be identified by from WHERE the data is processed.

Because multiple work classes in a work class set could identify the same activity, the individual work classes can be ordered/positioned within the set. That way a work class with several properties could pick a very specific activity whereas other activities would be mapped to more general work classes.

Using the concepts of WORKLOAD and WORK CLASS SET it is possible to identify an activity. They help to understand what is going on in the DB2 database system. It is the prerequisite for actively controlling and managing the activities in the system by assigning resources.