Wednesday, August 16, 2017

Combining Db2 and Cloud Object Storage

Db2 and Cloud Storage, happily combined
For a while now, Db2 has had built-in support for directly accessing Cloud Object Storage (COS) like Amazon S3 and IBM COS (Softlayer / IBM Bluemix IaaS). It allows you to perform backups (and restores :) as well as data load operations to and from the cloud. Phil Nelson wrote an excellent overview and introduction on the IDUG blog, with his examples focused on S3. Thus, in the following I am going to mostly cover Db2 with IBM Cloud Object Storage and fill in some additional details.


Overview

Db2 manages Cloud Object Storage locations and access to them at the instance level. First, the provider along with the necessary credentials is cataloged and an alias is assigned. Thereafter, the alias can be referred to at the database level by using a URI with the DB2REMOTE prefix:


backup db hltest to db2remote://hls3/henrik/backups;

load from db2remote://hlswift/henrik/testdata.csv of del insert into loadtest;
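
For completeness, a restore from the same remote location could look like the following sketch; the timestamp is made up and would come from your own backup history:

restore db hltest from db2remote://hls3/henrik/backups taken at 20170816120000;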

Update: The content below is here for historic reasons; see this newer post about Db2 and S3-compatible Cloud Object Storage.

Getting Started

To get started with Cloud Object Storage on a recent version of Db2, some requirements need to be met. Because the information about a cloud or remote storage location includes access credentials, Db2 requires the presence of a keystore. How to create a local keystore or configure a centralized keystore is described in the Db2 Knowledge Center. It is the same process as for native data encryption, so if you already have that up and running, you are good to go. Otherwise, the next step is to configure the Db2 instance settings keystore_location and keystore_type. In my case it was simply:


update dbm cfg using keystore_location /home/henrik/ks/hlks.p12 
update dbm cfg using keystore_type pkcs12

The keystore "hlks.p12" of type PKCS12 is located in the directory "ks". Thereafter, restart Db2 for the changes to become active.
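
If no keystore exists yet, a local PKCS12 keystore can be created with the GSKit tool that ships with Db2. This is a minimal sketch, assuming gsk8capicmd_64 is in the path; replace <password> with a strong password of your own:

# create the local keystore and stash its password
gsk8capicmd_64 -keydb -create -db /home/henrik/ks/hlks.p12 -pw <password> -type pkcs12 -stash
# restart the instance to pick up the configuration changes
db2stop
db2start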

An alias for a storage location is defined on the instance level using the command CATALOG STORAGE ACCESS ALIAS:

catalog storage access alias hlswift
vendor softlayer
server https://fra02.objectstorage.softlayer.net/auth/v1.0/
user 'IBMOS1234567:henrik' password '5abc123abc123abc1238791a3f905196af8da497b3c1a1add123abc';


The above creates a new alias "hlswift" pointing to a Frankfurt-based storage location. The credentials are encrypted and stored in the keystore configured above. Once the alias is defined, I can use it as shown in the backup example above.
The CATALOG command has (security) options to restrict who can use the alias. It is also possible to directly specify which container or storage bucket to use, as sketched below.
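
As a sketch of those options, an alias tied to one S3 bucket and restricted to a single database user might be cataloged as shown below. The endpoint follows Phil Nelson's S3 examples; the credentials, bucket name, and user are made up:

-- hypothetical S3 alias, bound to one bucket, usable only by user henrik
catalog storage access alias hls3
vendor S3
server s3.amazonaws.com
user 'my-access-key-id' password 'my-secret-access-key'
container 'henrik-backups'
dbuser henrik;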

Staging Area

As a temporary buffer for uploading or downloading files, Db2 uses a staging area. Its path can be set via the DB2_OBJECT_STORAGE_LOCAL_STAGING_PATH registry variable:


db2set DB2_OBJECT_STORAGE_LOCAL_STAGING_PATH=/tmp

The above sets the staging area to the "/tmp" directory. The default is in the instance directory under "sqllib/tmp/RemoteStorage.xxxx". The "xxxx" refers to the Db2 partition number. Note that the staging area needs to have enough capacity to hold backup images or load files.
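
To check the currently configured value, invoke db2set with just the variable name; empty output means the default staging path is in effect:

db2set DB2_OBJECT_STORAGE_LOCAL_STAGING_PATH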

Administration / Maintenance

The defined storage aliases can be listed with the LIST STORAGE ACCESS command. It returns the details for each defined storage location except the password. Here is the entry for the alias "hlswift" as defined above:
...
Node 3 entry:

ALIAS=hlswift
VENDOR=softlayer
SERVER=https://fra02.objectstorage.softlayer.net/auth/v1.0/
USERID=IBMOS1234567:henrik

CONTAINER=
OBJECT=
DBUSER=
DBGROUP=SYSADM

...



To remove an existing alias, use the UNCATALOG STORAGE ACCESS ALIAS command. Last but not least, there is a command to rotate the master key for the encrypted credentials, ROTATE MASTER KEY FOR STORAGE ACCESS. Both are sketched below.
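
Both commands are short. A sketch for the alias used in this post, assuming the default options for the key rotation:

-- drop the alias defined earlier
uncatalog storage access alias hlswift;
-- re-encrypt the stored credentials under a new master key
rotate master key for storage access;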

Conclusions

The above provides a rough introduction to combining Db2 with Cloud Object Storage (or the other way around?) with a focus on IBM COS. If you haven't already, start by reading Phil Nelson's introduction at the IDUG website. There are many details that I didn't cover today which warrant a follow-up blog post.

If you want to learn more about this topic or discuss the post with me, meet me at the upcoming IDUG EMEA and Db2 Aktuell conferences. If you have feedback, suggestions, or questions, please reach out to me on Twitter (@data_henrik) or LinkedIn.
