XML documents like the following can be processed with DB2 by using its pureXML feature.
<a>
<b>first value</b>
<b>second value</b>
<c>oh, even a different element</c>
</a>
When a document is inserted into a column of type XML, it is parsed and transformed into the internal, "native" representation. An instance of the so-called XQuery Data Model (XDM) is created (see the Processing Model in the XQuery specification). This step is not part of XQuery processing itself: XQuery assumes that you (or your system) have already provided the XDM instances it then operates on. If you inserted the above document into DB2, you have an instance of the XDM and you can process the document using XPath or XQuery. That's fine and everybody is happy.
Now, humans have never rested and have always sought new challenges. When you learn a foreign language, you first learn how to speak and understand simple sentences. Eventually, you have to deal with more complex sentences that contain subordinate clauses, relative clauses, appositions, and whatever the stuff is named (think of subselects, common table expressions, case statements, etc.). Where was I? Ah, back to XML. Just as with sentences, people tend to make data more complicated and to seek new challenges. What they do is embed XML data into other XML data.
<a>
<b>
<e1><e2>embedded data 1</e2></e1>
</b>
<b>
<e1><e2>embedded data 2</e2></e1>
</b>
<c>oh, even a different element</c>
</a>
In the above example, the former text values like "first value" are replaced with XML fragments of their own. The entire document and the embedded parts can still be easily processed because all "tags" are element nodes in the XDM instance. XPath and XQuery can directly answer queries on e1 and e2, e.g., all instances of "e2" can quickly be found by searching for "//e2".
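To make the "//e2" lookup concrete outside of DB2, here is a minimal sketch that uses Python's standard xml.etree.ElementTree module as a stand-in. It is not DB2 and it does not build a real XDM instance; the variable names are made up for illustration. It only shows that the embedded e1 and e2 tags are parsed as ordinary elements and can be found with a path expression.

```python
import xml.etree.ElementTree as ET

# The document from above, with XML fragments embedded as real elements.
doc = """<a>
<b>
<e1><e2>embedded data 1</e2></e1>
</b>
<b>
<e1><e2>embedded data 2</e2></e1>
</b>
<c>oh, even a different element</c>
</a>"""

root = ET.fromstring(doc)

# ".//e2" is ElementTree's spelling of the XPath "//e2", relative to the root.
e2_values = [e.text for e in root.findall(".//e2")]
print(e2_values)  # ['embedded data 1', 'embedded data 2']
```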
However, the above way of embedding XML is not the only one: some organizations, standards, and data providers embed entire XML documents as text. This can be done by escaping the markup directly, i.e., by replacing < and > with "&lt;" and "&gt;". A different, but equivalent, way is to use CDATA sections (see the XML specification). Let's take a look at the following example:
<a>
<b>
<![CDATA[<e1><e2>embedded data 1</e2></e1>]]>
</b>
<b>
<![CDATA[<e1><e2>embedded data 2</e2></e1>]]>
</b>
<c>oh, even a different element</c>
</a>
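What a parser makes of such a document can be checked with a small sketch, again using Python's standard xml.etree.ElementTree module as a stand-in rather than DB2. The sample is shortened to a single b element, the whitespace around the CDATA section is dropped so the two variants can be compared directly, and the variable names are made up for illustration. It suggests two things: the escaped form and the CDATA form yield the same text, and neither contains any e2 element.

```python
import xml.etree.ElementTree as ET

# Embedded document wrapped in a CDATA section.
cdata_doc = """<a>
<b><![CDATA[<e1><e2>embedded data 1</e2></e1>]]></b>
</a>"""

# The same embedded document with < and > escaped as entities.
escaped_doc = """<a>
<b>&lt;e1&gt;&lt;e2&gt;embedded data 1&lt;/e2&gt;&lt;/e1&gt;</b>
</a>"""

cdata_root = ET.fromstring(cdata_doc)
escaped_root = ET.fromstring(escaped_doc)

# Neither variant contains e2 elements: the embedded markup is plain text.
cdata_e2 = cdata_root.findall(".//e2")
escaped_e2 = escaped_root.findall(".//e2")
print(cdata_e2, escaped_e2)  # [] []

# Both variants produce the same text content for <b>.
cdata_text = cdata_root.find("b").text
escaped_text = escaped_root.find("b").text
print(cdata_text == escaped_text)  # True
print(cdata_text)  # <e1><e2>embedded data 1</e2></e1>
```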
I plan to look at options on how to process the data in a future post.