Tuesday, May 14, 2013

Performance Tuning for XML-based OLTP Systems (Temenos T24)

From time to time my colleagues and I get contacted on how to set up and tune DB2 for use with Temenos T24. The latter is a core banking system and in use worldwide. And it uses XML as internal data storage format which makes it an ideal fit for DB2 pureXML (both on z/OS and on Linux, UNIX, and Windows). To make our answers easily findable, why not blog about it? Today, I will cover setting up tables for good performance.

A typical T24 table consists only of two columns, a RECID (as primary key) and an XMLRECORD. This is similar to other systems implemented based on XML data, both OLTP and more of OLAP/BI character.

Inlining of data

For these kind of tables it is a good idea to INLINE the XML column. Inlining is good for several reasons. Because XML data is regularly stored in a special XML Data Area (XDA), its access requires an indirection from the row data. The descriptor that is stored in the row is taken as input to look up the actual storage location via the XML Regions Index. Eliminating the descriptor means the XML data is directly fetched with the row as well as fewer entries in the XML Regions Index.


If there is test data, the function ADMIN_EST_INLINE_LENGTH can be used to find the typical inline length that should be set. The maximum inline length depends on the page size and hence the maximum row size. Both are documented with the just mentioned function. For T24 the recommendation is to use an inline length of 32000 bytes or even more in 32 kb pages.

Volatile Tables for Index Use

Another configuration option for tables is to declare them as VOLATILE. The background is explained on this page with "Relational index performance tips". Basically, the optimizer will prefer index-based access to the table even if the cardinality changes frequently.

Enable static compression 

Many systems benefit from data compression. The same goes for T24 on DB2. However, the question is whether to use static (table) compression or even the newer adaptive compression (page-level compression on top). Based on different benchmarks and experience with production-oriented testing the recommendation is to use static compression. The additional CPU overhead for a higher degree of space and IO savings impacts system throughput. When in doubt test it on your own. Another effect: Indexes on compressed tables will also be compressed by default.

Consider APPEND ON or DB2MAXFSCRSEARCH=1

With APPEND ON as an option to CREATE/ALTER TABLE, DB2 will insert "at the end" of the table without searching for free space elsewhere. It results in slightly higher space consumption, but benefits insert performance. An alternative is to set DB2MAXFSCRSEARCH to a low value, e.g. to 1. It determines on a global level of how many free space control records (FSCR) to search for placing a new record. Thus, all tables would be impacted, not just the one where APPEND ON is specified.

Separate tablespaces for table, index, and long data

It is always a good idea to have separate tablespaces for the different page types. They can be specified during CREATE TABLE.

PCTFREE 99 for hotspot tables

Depending on the application and usage type of T24 there can be hotspot tables. For these it could make sense to specify a high value for PCTFREE during CREATE/ALTER TABLE. This determines how much space is left free (read: unused) in a page, i.e., how many or how few records are stored in a single page. The fewer records - the extreme would be a single record - in a page, the less likely that page becomes a hotspot.

Consider table/range partitioning for bigger tables

Last, but not least, it is a good idea to apply range partitioning on bigger tables. This benefits performance in several ways: Old data can be rolled out (DETACHed) quickly, queries perform faster due to range elimination and rebalancing of work, and maintenance can also be done in parallel on the ranges.
The partitioning can either be done on the RECID (the primary key) or a new hidden column could be introduced to evenly split the data into ranges, depending on needs.

Summary

Tuning basic properties for XML-based tables is not (that) different from other tables. The above gives an introduction of should be considered when setting up Temenos T24 or similar XML-based systems on DB2. I didn't provide examples with the full syntax. If you need them, leave a comment... ;-)

Special thanks to Juergen Goetz for sharing his experience for use with other customers and this blog entry.

BTW: There are also few links for Temenos T24 on DB2 on my resources page. General DB2 system setup for HA for Temenos is discussed in the Redpaper "High Availability and Disaster Recovery for Temenos T24 with DB2 and AIX"

8 comments:

Junaid Khan said...

Hi,

Please tell us more about the partitioning technique that can applied to RECID column.

Regards,
Junaid

Henrik Loeser said...

Hi Junaid,

to partition on just RECID is difficult for several reasons (data format, not much insight into how they are assigned, unknown data growth, ...). It would need to be determined on a per-table-basis.

What can be done is to introduce a new column of type, e.g., smallint. It could have automatically assigned numbers from 1 - n (n is max partitions). Then use that column for automatically partition on that.

I could post more details in a separate article if there is interest.

Henrik

Rehman said...

Hi,

can we use HASH partitioning for such tables ? (which contain RECID)

Regards,
Rehman

Henrik Loeser said...

Hej Rehman,

yes, you can use HASH partitioning (database partitioning) with the T24 schema. However, it is not best practice. Using pureScale (shared disk clustering) is the recommended way.

You can contact me by email for details.

Regards,
Henrik

Junaid ur Rehman said...

Hi,

I was referring to hash partitioning of table in T24 Schema.
Since transaction tables of T24 can grow in size pretty quickly, and there is no 'KEY' that we can use to partition tables here (since tables only have two columns).

Regards,
Rehman

Henrik Loeser said...

You could add a hidden column with default values and then range partition based on that column. The column type could be something like int, smallint, date.
Because the column is hidden, it would not be returned automatically. The default values would make sure meaningful values are used.

Seems there is interest in the details. If I find time, I will post an article with details.

Henrik

Henrik Loeser said...

BTW: Here is the link where I discuss hidden columns: http://blog.4loeser.net/2012/07/implicitly-hidden-columns-in-db2.html.

What is missing is how to use it to add partitioning.

Henrik

Henrik Loeser said...

A follow-up is here: http://blog.4loeser.net/2014/01/using-hidden-columns-for-range.html

LinkWithin

Related Posts with Thumbnails