Thursday, February 19, 2009

How many trees does a forest need? (DB2 pureXML Indexing)



The above question was the key topic in a public debate here in town. The background is the small forest next to us. Right now there are too many trees per acre and they are competing for sun light and space. The proposed solution by the administration was to cut some trees, so that those trees which are important for the forest, for animals and insects can grow and prosper. The envisioned result is a healthy forest.

Do you see any parallels to our database world? One of the more frequently asked questions about DB2 pureXML is how many XML indexes do make sense for a table. In the relational world, the number of possible indexes is limited by the number of combinations and permutations of columns. For XML it would be possible to store an entire database inside a single document. But how do you index the data? How do you determine which indexes are needed? Too few indexes could mean that queries do not perform. Too many indexes could mean that a hefty penality is paid during IUD operations for index maintenance and the system health is impacted. The index maintenance costs are known for the relational world, but little is known for XML indexes.

In the coming days and weeks I plan to address XML indexing questions. In the meantime, please let me know if you have indexing topics that should be part of this series.