Thursday, August 22, 2013

Bananas, pants, and DB2 with BLU Acceleration

Have you ever fed a banana to a young child? It would have been banana puree, i.e., mashed bananas (there are even instructions for that!). When you think about it, after feeding the banana puree you have a happy child - and a fork, a spoon, a plate and maybe more to wash. As a grownup or older child you would just peel and eat the banana, no dishes to do. Isn't that smart?
Source: Wikipedia (Ed Yourdan / Themightyquill)

Now imagine that the bananas taste good and you eat a lot. Babies grow, you "grow". For the babies and children you have to buy new clothes because eventually, even with tricks like adjustable waist bands etc., the pants and shirts don't fit anymore. As a grownup you can react to "growing in the middle": keep a healthy diet, work out a bit, and keep your pants. No money to spend. Isn't that smart?

In database systems like DB2, not having to do the dishes translates to not having to access data, i.e., to avoiding I/O operations. DB2 with BLU Acceleration uses data skipping to access only those portions of the data where comparisons or further processing are necessary. No sweat, no dirty dishes. Even if the data needs to be processed, many operations can be done on the compressed, encoded version of it. Think of it as using a set of plates that remain in their original, shrink-wrapped packaging. That makes "clean-up" faster, a smart move. :)
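To make the data skipping idea a bit more tangible, here is a small Python sketch. It is not how DB2 implements it, just the general principle under simple assumptions: the values are cut into fixed-size blocks, each block remembers its minimum and maximum, and a scan skips every block whose range cannot match the predicate. The names Block, build_blocks, and scan_greater_than are made up for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """A chunk of column values plus a tiny min/max summary (hypothetical layout)."""
    values: List[int]
    lo: int
    hi: int


def build_blocks(values: List[int], block_size: int) -> List[Block]:
    """Split the column into blocks and record min/max for each block."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def scan_greater_than(blocks: List[Block], threshold: int) -> Tuple[List[int], int]:
    """Return matching values and the number of blocks that had to be read."""
    matches: List[int] = []
    blocks_read = 0
    for block in blocks:
        if block.hi <= threshold:
            # No value in this block can qualify: skip it, no "dishes" to do.
            continue
        blocks_read += 1
        matches.extend(v for v in block.values if v > threshold)
    return matches, blocks_read


# Illustrative data only: 1000 values in 10 blocks of 100 values each.
sales = list(range(1, 1001))
blocks = build_blocks(sales, 100)
matches, blocks_read = scan_greater_than(blocks, 950)
print(f"read {blocks_read} of {len(blocks)} blocks, found {len(matches)} matches")
```

With sorted or clustered data, as in this toy example, most blocks are never touched. With randomly scattered values the min/max ranges overlap more and fewer blocks can be skipped.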

What do you do when you have more data than fits in memory? A baby would need larger clothes, some buy bigger machines. DB2 BLU uses decades-old, but highly improved, technologies like a bufferpool and algorithms to keep the most needed data in memory (or is it in-memory?) and to bring it in from disk with advanced prefetching before it is needed. That would be like telling your extra pounds to stay at home, so that your existing pants fit well. Maybe you could even wear a smaller size or keep your current pants longer. That would be smart and would save many bucks. DB2 BLU goes even further with tailor-made pants, i.e., it lays out the data to make the best use of CPU registers and to improve performance even more. Tailor-made pants with expandable waist bands, "haute couture" for your data at no extra cost.
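For those who have never looked inside a bufferpool, here is a toy version in Python. It is only a sketch under stated assumptions: it uses a plain least-recently-used (LRU) eviction policy, which is a textbook stand-in and not the algorithm DB2 BLU actually uses, and it leaves out prefetching entirely. BufferPool and fake_disk_read are hypothetical names for this illustration.

```python
from collections import OrderedDict
from typing import Callable


class BufferPool:
    """Keeps at most `capacity` pages in memory, evicting the least recently used."""

    def __init__(self, capacity: int, read_page: Callable[[int], str]) -> None:
        self.capacity = capacity
        self.read_page = read_page      # called only when a page is not cached
        self.pages: "OrderedDict[int, str]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, page_id: int) -> str:
        if page_id in self.pages:
            # Page is already in memory: mark it as most recently used.
            self.hits += 1
            self.pages.move_to_end(page_id)
            return self.pages[page_id]
        # Page is not in memory: go to "disk" and cache the result.
        self.misses += 1
        data = self.read_page(page_id)
        self.pages[page_id] = data
        if len(self.pages) > self.capacity:
            # Over capacity: drop the least recently used page.
            self.pages.popitem(last=False)
        return data


def fake_disk_read(page_id: int) -> str:
    """Stand-in for an expensive disk read (assumption for this example)."""
    return f"page-{page_id}"


pool = BufferPool(capacity=2, read_page=fake_disk_read)
for page_id in [1, 2, 1, 3, 1, 2]:
    pool.get(page_id)
print(f"hits={pool.hits}, misses={pool.misses}, cached={list(pool.pages)}")
```

Page 1 is requested often and therefore stays in memory, while pages 2 and 3 take turns being evicted. That is the "keep the most needed data in memory" part; a smarter policy and prefetching would reduce the misses further.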

My older BLU-related posts can be found using the blu label.