Friday, June 19, 2009

Lessons from the field (again): Remember the trees!

In April I had already written about the nice surprises XML and XQuery have for developers. XQuery and XPath operate on the tree representation of a document and in most cases those trees have more than one branch. This is something to keep in mind, always!

This morning I was reminded about that article and that problem again when I was contacted with a question about two versions of seemingly the same query against the DB2 sample database:


xquery declare default element namespace "http://posample.org";
for $d in db2-fn:xmlcolumn("PURCHASEORDER.PORDER")
where $d/PurchaseOrder/item[quantity = 1]
return $d/PurchaseOrder/item/name



and

xquery declare default element namespace "http://posample.org";
db2-fn:xmlcolumn ("PURCHASEORDER.PORDER")/PurchaseOrder/item[quantity = 1]/name



In both cases we iterate over the PurchaseOrders and seem to select only items with a quantity of one. But the first query returns 4 records, the second query 3 records. Why?
This is because the element we are qualifying, i.e., PurchaseOrder, can have (and actually has!) multiple item elements, PurchaseOrder qualifies if there is at least one such item with a quantity of 1 (existential semantics). The second query directly filters at the item level. Hence the difference in the results.

The first query can be rewritten to return the same result as the second query:


xquery declare default element namespace "http://posample.org";
for $d in db2-fn:xmlcolumn("PURCHASEORDER.PORDER")/PurchaseOrder/item
where $d[quantity = 1]
return $d/name


What do we learn? Regardless of SQL or XQuery: Think green! Remember the trees! And enjoy pureXML...