Monday, October 19, 2009

Hey, what's wrong with you?!?

Sometimes, you have to deal with stuff that is just wrong. Sometimes, it's not easy to even find out what's going on. And sometimes, there are small things that make a big difference...

In an earlier post I had described how an index over XML data can be used to clean up data. Today, I will focus on a neat feature that was added to DB2 9.5 Fix Pack 3 as well as to DB2 9.7. It is a stored procedure (SP) named XSR_GET_PARSING_DIAGNOSTICS. That stored procedure helps you to find out details on XML parsing and validation errors.

The SP has five input and two output parameter. The first is a BLOB with the XML document/fragment in question. The next two make up the XML schema's fully qualified name (schema.name) against which you want to validate or they can be NULL. The 4th parameter can be NULL again or should be the schemaLocation of the primary XML schema document. The last input parameter specifies whether to use schemaLocation information from inside the provided XML document (1) or not (0). If zero was specified and no schema name has been specified (parameters 2 and 3), then no validation is performed. This is exactly done (no validation) in our example:

call xsr_get_parsing_diagnostics(blob('<a>Hello<b>Another<c><c></a>'),NULL,NULL,NULL,0,?,?)

The last two parameters are the output parameters. The first is a placeholder for an output document (as VARCHAR), the 2nd the number of errors found in the XML document. The above call would produce an output like this:

  Value of output parameters
  --------------------------
  Parameter Name  : ERRORDIALOG
  Parameter Value : <ErrorLog>
  <XML_FatalError parser="XML4C">
    <errCode>202</errCode>
    <errDomain>http://apache.org/xml/messages/XMLErrors</errDomain>
    <errText>Expected end of tag 'c'</errText>
    <lineNum>1</lineNum>
    <colNum>27</colNum>
    <location>/a/b/c/c</location>
    <schemaType></schemaType>
    <tokenCount>1</tokenCount>
    <token1>c</token1>
  </XML_FatalError>
  <DB2_Error>
    <sqlstate>2200M</sqlstate>
    <sqlcode>-16129</sqlcode>
    <errText>
[IBM][CLI Driver][DB2/NT] SQL16129N  XML document expected end of tag "c".  SQLSTATE=2200M
    </errText>
  </DB2_Error>
</ErrorLog>


  Parameter Name  : ERRORCOUNT
  Parameter Value : 1

  Return Status = 0




The function is not just useful to deal with parsing or validation errors coming from your application data, but also if you have XML documents that are causing trouble in general.