Friday, February 6, 2009

DB2 jack-of-all-trades: Hybrid, native, bilingual, pureXML

Today I was pointed to the XML:DB FAQ again and asked whether DB2 is a hybrid system or an XML-enabled database. That FAQ distinguishes between a native XML database (DB), an XML-enabled DB, and a hybrid XML DB. So what is DB2?

The FAQ uses "hybrid" with a different meaning than IBM does when it calls DB2 a "hybrid database system". XML:DB defines a hybrid XML database as one that can be both native and XML-enabled. DB2 is called a hybrid system because it is both a (native) relational database system and a (native) XML database system. What does native mean? It means that the data, whether relational or XML, is stored and processed in its own data model, with that model's specific semantics. Relational data is stored in an optimized row format, relational operators work on the data, and the output is a result set. XML data is stored in its (native) hierarchical format, as optimized, easy-to-navigate trees on disk. The XQuery Data Model (XDM) is an inherent part of the storage structure, and sequences of nodes and atomic values are at the core of XML processing. Given this infrastructure, DB2 is a native XML database and a hybrid database (relational and XML), but not a hybrid XML database in the XML:DB sense.
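
To make the "hybrid" part a bit more concrete, here is a minimal sketch of a single table that holds relational columns and an XML column side by side. The table name, column names, and the sample document are made up for illustration; only the XML data type and XMLPARSE come from DB2 and SQL/XML.

```sql
-- Hypothetical table: two relational columns plus one native XML column.
CREATE TABLE customer (
  cid  INTEGER NOT NULL PRIMARY KEY,  -- relational, stored in row format
  name VARCHAR(64),                   -- relational
  info XML                            -- stored natively as a tree, not as text
);

-- XMLPARSE turns the string into an XML value before it is stored.
-- The document content is invented sample data.
INSERT INTO customer (cid, name, info)
VALUES (1, 'Alice',
        XMLPARSE(DOCUMENT
          '<customerinfo><name>Alice</name><city>Toronto</city></customerinfo>'));
```

The same row now carries data in both models, and each column is handled with the semantics of its own model.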

Furthermore, DB2 is bilingual: it understands both SQL statements and XQuery statements. If you write a regular "SELECT ... FROM ... WHERE ..." you are by default in the relational world, using SQL. Thanks to part 14 of the SQL standard (SQL/XML), XML is a "relational" data type with XML-specific functions and defined semantics. Users can embed XQuery expressions in SQL.
By putting the keyword "xquery" in front of a query, users can switch to XQuery mode and issue an XQuery statement directly. DB2 understands something like "xquery for $i in .... where ... return ...", so users coming from the XML and XQuery world do not need to learn SQL and can immediately start leveraging their experience.
BTW: Both SQL statements and XQuery statements end up in a single compiler and optimizer since everything is deeply integrated. It's similar to speaking two languages and having only one (!!!) brain.
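
The two entry points described above might look roughly like the following sketch. The table, its columns, and the element names are assumptions for illustration; XMLQUERY, XMLEXISTS, and db2-fn:xmlcolumn are the DB2 functions I am relying on.

```sql
-- Assumed (hypothetical) table:
--   CREATE TABLE customer (cid INTEGER NOT NULL PRIMARY KEY, info XML);

-- SQL as the outer language, with XQuery expressions embedded through
-- the SQL/XML functions XMLQUERY and XMLEXISTS.
SELECT cid,
       XMLQUERY('$d/customerinfo/name' PASSING info AS "d") AS custname
FROM customer
WHERE XMLEXISTS('$d/customerinfo[city = "Toronto"]' PASSING info AS "d");

-- XQuery as the outer language: the "xquery" keyword switches modes, and
-- db2-fn:xmlcolumn reads the XML column (the name is case-sensitive,
-- hence upper case).
xquery
for $c in db2-fn:xmlcolumn('CUSTOMER.INFO')/customerinfo
where $c/city = "Toronto"
return $c/name;
```

Both statements ask for the names of customers in one city; the first returns a relational result set with an XML column, the second returns a sequence of XML nodes.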

As shown, DB2 is a bilingual, hybrid database. To top it off, you can download and use it for free as DB2 Express-C.