Tuesday, December 8, 2009

Let's phone, shop, and whatever.... - Testing new limits

Most activities these days produce data, whether in your private life and household or in administration, manufacturing, and services. Much of that data ends up in data warehouses, either directly or in some condensed, aggregated, distilled form. And while people were talking about 1 TB or 10 TB warehouses only a few years ago, scaling up to hundreds of terabytes or even petabytes (PB) is now discussed often.

One of the enhancements in DB2 9.7 addresses this trend. Up to version 9.5, distribution maps were limited to 4096 entries (4 K); now up to 32768 entries are possible. In a partitioned database the distribution key, i.e., the columns used to determine on which database partition a row ends up, is important because it determines how evenly the data is split across the partitions. The more evenly the data is distributed, the better balanced the performance typically is.

To assign a database partition, the distribution key is hashed to an entry in the distribution map, and that entry names the partition. The more entries in the map, the smaller the skew, because the entries can be divided more evenly among the partitions. With the larger distribution map in DB2 9.7, the skew remains small even for databases with a large number of database partitions.
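To get a feel for the numbers, here is a minimal sketch in Python. It is not how DB2 builds its maps internally; it simply assumes that map entries are handed out to partitions round-robin and that rows hash uniformly across the entries. The function name map_entry_skew and the partition count of 1000 are made up for illustration.

```python
def map_entry_skew(map_size: int, partitions: int) -> float:
    """Relative gap between the busiest and the least busy partition.

    Assumption (not taken from DB2 internals): map entries are assigned
    to partitions round-robin, and rows hash uniformly over the entries,
    so a partition's share of the data is proportional to its entry count.
    """
    counts = [0] * partitions
    for entry in range(map_size):
        counts[entry % partitions] += 1
    return (max(counts) - min(counts)) / min(counts)


# Compare the old and the new map size for an illustrative partition count.
for size in (4096, 32768):
    skew = map_entry_skew(size, 1000)
    print(f"{size} entries, 1000 partitions: skew {skew:.1%}")
```

Under these assumptions, 4096 entries over 1000 partitions leave some partitions with 5 entries and others with 4, a 25% gap, while 32768 entries give 33 versus 32, a gap of about 3%.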

How do you test it? Increase your calling, shopping, driving, consuming. This will not only kick-start the economy, but also grow the enterprise warehouse and make sure new limits are tested (and introduced)...