Monday, January 24, 2011

Why XML columns have a length of zero

One interesting question when looking at the meta data of tables is why XML columns don't have a length.

db2 "create table thebigq(id int, name varchar(40), doc xml inline length 1000)"
DB20000I  The SQL command completed successfully.

hloeser@BR857D67:~/1Q11$ db2 describe table thebigq

                                Data type                     Column
Column name   schema    Data type name Length     Scale Nulls
------------- --------- -------------- ---------- ----- ------
ID            SYSIBM    INTEGER                 4     0 Yes  
NAME          SYSIBM    VARCHAR                40     0 Yes  
DOC           SYSIBM    XML                     0     0 Yes  

  3 record(s) selected.

Well, for an INTEGER type the size is well-known because of the bits allocated to store the value. For VARCHAR strings the specified length (in bytes) is what is shown. For XML columns, however, there is no maximum size that you can specify. This is in contrast to LOBs (BLOBs, CLOBs). There is also no maximum size of an XML document on disk, it is only determined by storage and possibly the document structure.

When pureXML was designed, there was a long discussion on what the length as shown above should be. Should we just show a really big number, like 2GB, or decide that it is -1 or -2?

Inserting a 2GB big XML document - this is the maximum that can be transferred into DB2 - can take up more storage space than 2 GB or less. Hence, it is not a good value. The options "-1" or other negative numbers usually have a special meaning and therefore were also not chosen. So it became zero (0) - XML values have the size they have. Something mystic...!?!