Wednesday, August 25, 2010

To APPEND or not to APPEND: Insert and Import Performance

When you create a table, by default, DB2 is resource-conscious in terms of space. So-called FSCRs (free space control records) are used to keep track of free space in the pages. When you delete a row or a row has to move because of an update statement expanding the row size, the new gap in the data page is added as free space information to a FSCR. During insert or import operations the FSCRs are searched to fill up the pages.

The DB2 registry variable DB2MAXFSCRSEARCH controls how many of those space management records are searched when a row is added to a table. The default is 5 FSCRs, a value of "-1" means that in the worst case all records are searched for the best fit. However, searching for free space takes some time, even with efficient algorithms, and it is faster to not search for free space when adding data. This behavior can be turned on by executing ALTER TABLE with the APPEND ON clause. If large quantities of data are inserted (or imported), a performance gain can be realized by omitting the free space search and directly append the data to the table, i.e., to add it to the end of the table.

The drawback of the approach of using APPEND ON is that even when data gets deleted the free space is not reused, unless the table is REORGanized. A REORG is also required if the table is altered again to switch to APPEND OFF to update the free space information.

Of course, if bulk data insert needs to be performed, using LOAD is a good (or most often better) option. All the above applies to relational as well as to XML data.