Wednesday, May 15, 2013

Optim Query Capture and Replay for system tests

One of the frequent questions during bootcamps is about what tools are available for performing tests, especially driving workloads against a DB2 database. In an older article I pointed you to the Workload Multiuser Driver (WMD) and to the TPoX-included workload driver, both free, open source, and from IBM-initiated projects. Today, I wanted to point you to a tool that is a spin-off of InfoSphere Guardium, the so-called InfoSphere Optim Query Capture and Replay (OQCR).

What the tool does is simple to describe: it captures (or logs) the workload, i.e., the traffic to one database, and is then able to replay it against a different system by sending the captured queries to that database. That way systems can be tested under realistic "stress" before they are put into production. Optim Query Capture and Replay can also manage workloads: statements or transactions can be removed from a workload or copied into others, users and schemas can be mapped, the speed at which workloads or queries are replayed can be changed, and much more.

The best way to learn more about Optim Query Capture and Replay is to either start with the OQCR Information Center or to glance over the documents available at the so-called information roadmap. The tool can also be used together with the Optim Test Data Management solution, which allows cloning of production databases for tests, including applying or handling data privacy rules (masking data).

Tuesday, May 14, 2013

Performance Tuning for XML-based OLTP Systems (Temenos T24)

From time to time my colleagues and I are asked how to set up and tune DB2 for use with Temenos T24. The latter is a core banking system in use worldwide, and it uses XML as its internal data storage format, which makes it an ideal fit for DB2 pureXML (both on z/OS and on Linux, UNIX, and Windows). To make our answers easy to find, why not blog about them? Today, I will cover setting up tables for good performance.

A typical T24 table consists of only two columns, a RECID (the primary key) and an XMLRECORD. This is similar to other systems built on XML data, both OLTP systems and those of a more OLAP/BI character.

Inlining of data

For this kind of table it is a good idea to INLINE the XML column. Inlining helps for several reasons. Because XML data is normally stored in a special XML Data Area (XDA), accessing it requires an indirection from the row data: a descriptor stored in the row is used to look up the actual storage location via the XML Regions Index. Inlining eliminates the descriptor, so the XML data is fetched directly with the row, and the XML Regions Index holds fewer entries.


If test data is available, the function ADMIN_EST_INLINE_LENGTH can be used to find a suitable inline length. The maximum inline length depends on the page size and hence the maximum row size; both are documented with the function just mentioned. For T24 the recommendation is to use an inline length of 32000 bytes or even more in 32 KB pages.
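As a sketch of what this could look like (schema, table, and tablespace names are hypothetical, not actual T24 names):

```sql
-- Hypothetical T24-style table in a 32 KB tablespace with the XML
-- column inlined; adjust INLINE LENGTH to your data and page size.
CREATE TABLE t24.customer (
  recid     VARCHAR(255) NOT NULL PRIMARY KEY,
  xmlrecord XML INLINE LENGTH 32000
) IN ts_data32k;

-- With test data loaded into a comparable table, estimate how many
-- documents would fit inline at that length:
SELECT COUNT(*)
FROM t24.customer_test
WHERE ADMIN_EST_INLINE_LENGTH(xmlrecord) BETWEEN 0 AND 32000;
```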

Volatile Tables for Index Use

Another configuration option for tables is to declare them as VOLATILE. The background is explained on the "Relational index performance tips" page in the documentation. Basically, the optimizer will prefer index-based access to the table even when its cardinality changes frequently.
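A minimal sketch (the table name is again hypothetical):

```sql
-- Declare the table as volatile so the optimizer favors index access
-- even when cardinality statistics are unreliable.
ALTER TABLE t24.customer VOLATILE CARDINALITY;
```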

Enable static compression 

Many systems benefit from data compression, and the same is true for T24 on DB2. The question, however, is whether to use static (table) compression or the newer adaptive compression (page-level compression on top). Based on various benchmarks and experience with production-oriented testing, the recommendation is static compression: the additional CPU overhead that adaptive compression spends on a higher degree of space and I/O savings impacts system throughput. When in doubt, test it on your own. A side effect worth noting: indexes on compressed tables are also compressed by default.
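A sketch of enabling static compression on an existing (hypothetical) table; the STATIC keyword assumes DB2 10.1 syntax:

```sql
-- Enable static row compression; a classic REORG with RESETDICTIONARY
-- (re)builds the compression dictionary for existing data.
ALTER TABLE t24.customer COMPRESS YES STATIC;
REORG TABLE t24.customer RESETDICTIONARY;
```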

Consider APPEND ON or DB2MAXFSCRSEARCH=1

With APPEND ON as an option to CREATE/ALTER TABLE, DB2 inserts "at the end" of the table without searching for free space elsewhere. This results in slightly higher space consumption, but benefits insert performance. An alternative is to set the registry variable DB2MAXFSCRSEARCH to a low value, e.g., 1. It determines, at the instance level, how many free space control records (FSCRs) are searched when placing a new record. Thus all tables would be impacted, not just the one where APPEND ON is specified.
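Both options side by side, with a made-up table name for illustration:

```sql
-- Per table: always insert at the end, skipping the free space search.
ALTER TABLE t24.fundstransfer APPEND ON;

-- Alternatively, instance-wide (set from the command line, takes effect
-- after an instance restart, and affects all tables):
--   db2set DB2MAXFSCRSEARCH=1
```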

Separate tablespaces for table, index, and long data

It is always a good idea to have separate tablespaces for the different page types. They can be specified during CREATE TABLE.
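The tablespace names below are hypothetical; the point is the three separate IN clauses for regular data, indexes, and long/XML data:

```sql
-- Data, indexes, and long/XML data each get their own tablespace.
CREATE TABLE t24.customer (
  recid     VARCHAR(255) NOT NULL PRIMARY KEY,
  xmlrecord XML
) IN ts_data INDEX IN ts_index LONG IN ts_long;
```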

PCTFREE 99 for hotspot tables

Depending on the application and usage type of T24 there can be hotspot tables. For these it could make sense to specify a high value for PCTFREE during CREATE/ALTER TABLE. This determines how much space is left free (read: unused) in a page, i.e., how many or how few records are stored in a single page. The fewer records in a page (the extreme would be a single record), the less likely that page becomes a hotspot.
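A sketch for a hypothetical hotspot table; note that PCTFREE is honored when pages are filled by LOAD or REORG:

```sql
-- Leave 99% of each page free so that only very few records share a page.
ALTER TABLE t24.locking PCTFREE 99;
REORG TABLE t24.locking;
```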

Consider table/range partitioning for bigger tables

Last, but not least, it is a good idea to apply range partitioning to bigger tables. This benefits performance in several ways: old data can be rolled out (DETACHed) quickly, queries perform faster due to range elimination and rebalancing of work, and maintenance can be done in parallel on the ranges.
The partitioning can either be done on the RECID (the primary key) or a new hidden column could be introduced to evenly split the data into ranges, depending on needs.
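A sketch of the RECID variant; the table name and range boundaries are made up for illustration:

```sql
-- Range partitioning on the primary key; the unique index is valid
-- because RECID is also the partitioning column.
CREATE TABLE t24.statement (
  recid     VARCHAR(255) NOT NULL PRIMARY KEY,
  xmlrecord XML
)
PARTITION BY RANGE (recid)
  (STARTING ('A') ENDING ('K') INCLUSIVE,
   STARTING ('L') ENDING ('Z') INCLUSIVE);
```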

Summary

Tuning basic properties for XML-based tables is not (that) different from other tables. The above gives an overview of what should be considered when setting up Temenos T24 or similar XML-based systems on DB2. I didn't provide examples with the full syntax. If you need them, leave a comment... ;-)

Special thanks to Juergen Goetz for sharing his experience for use with other customers and this blog entry.

BTW: There are also a few links for Temenos T24 on DB2 on my resources page. General DB2 system setup for HA for Temenos is discussed in the Redpaper "High Availability and Disaster Recovery for Temenos T24 with DB2 and AIX".

Friday, May 3, 2013

In-depth article on DB2 Statistical Views published on developerWorks

There is a new article on developerWorks: Get the most out of the DB2 optimizer: Leveraging statistical views to improve query execution performance. One of the many new features in DB2 10.1 that we mention during DB2 Bootcamps is support for automated runstats on statistical views. Often, the reaction is: "What are statistical views?"

There is an entire section about statistical views in the DB2 Information Center. Basically, a statistical view is a view defined over a complex query. Statistics are then collected on the view's result set, and they help the optimizer improve query planning. Essentially, those statistics help determine cardinalities for complex relationships between multiple tables. While DB2 usually has statistical information on each single table, it does not have it for the predicate-based combination (a.k.a. join) of multiple tables. That is where statistical views come to the rescue.
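To make the idea concrete, here is a sketch with hypothetical table and view names:

```sql
-- A view over a join of two tables, enabled as a statistical view:
CREATE VIEW db2user.sv_sales AS
  SELECT s.salesdate, c.region
  FROM   db2user.sales s
  JOIN   db2user.customers c ON s.custid = c.custid;

ALTER VIEW db2user.sv_sales ENABLE QUERY OPTIMIZATION;

-- Collect statistics on the view's result set (RUNSTATS command):
RUNSTATS ON TABLE db2user.sv_sales WITH DISTRIBUTION;
```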

You can read more about them in the developerWorks article and see how query plans and performance improve.