A catamaran and the car ferry in front of the Swiss Alps on Lake Constance (as seen from Friedrichshafen).
Henrik's thoughts on life in IT, data and information management, cloud computing, cognitive computing, covering IBM Db2, IBM Cloud, Watson, Amazon Web Services, Microsoft Azure and more.
Saturday, December 22, 2012
Tuesday, December 18, 2012
Tune in to DB2 FM... (Happy Holidays)
Ever heard of a radio station "DB2 FM", ever tried to listen to it? If you prefer streaming radio, why not start up the music service "db2fm" on your console...?
With that music in the background I plan to go offline for some time, enjoy the holiday season and Christmas with the family. Happy Holidays and see you next year...
Wednesday, December 12, 2012
Buffer pools, registers, working memory, and epilepsy
What does it mean for a high-powered database server (lots of CPU cores, plenty of memory, a fast and big disk system) when the bufferpools are tiny? What happens to a CPU with lots of cache when only a few registers can be used? What happens to the human brain when the so-called working memory is impacted by epilepsy-related medication or surgery (see my earlier posts related to epilepsy and IT)?
None of the three systems mentioned can utilize their full capabilities; throughput is limited because an important component is impaired. But what can be done to improve throughput?
The key is to change the strategy and deal with the shortcomings. In a database server like DB2 or Oracle that would mean improving selectivity to reduce the data to be processed, creating indexes to avoid accessing too much data, and enabling data compression so that less data is moved. For software running on a system with too few registers it would mean changing algorithms so that, e.g., fewer operands are handled.
How does someone cope with memory problems caused by epilepsy? Our son has occupational therapy to develop strategies for more efficient use of his working memory and also to train (build up) his capabilities. The Epilepsy Society has this summary on the memory issue and workarounds that can be used like sticky notes (external memory), task lists, etc.
Monday, December 10, 2012
DB2 fixpacks, support, APARs, and other information
About a week ago I asked how you typically find information about DB2 fixpacks. With the new DB2 10.1 Fix Pack 2 and Data Studio 3.2 just released (and me back from travel), here is how I usually try to stay in the loop.
A good starting point is the IBM Support Portal. It requires a so-called "IBM ID" to manage a profile. There you can define RSS news feeds or email subscriptions to many of the IBM products, including the Information Management offerings. Information you can subscribe to includes new or updated Technotes (example: the updated Technote on recommended fix packs for Data Server Client Packages), fixes (example: IC84157, crash recovery may fail if the mirror log...), product and packaging information (example: Mobile Database now included...), etc.
Once the new fixpack is available I usually first read the Fix pack Summary in the DB2 Information Center. It describes the high-level changes in the fixpack.
On the support pages you will also find an overview of the available fix packs for the different supported versions of DB2. When you click on one of the fixpacks, there are additional links leading to, e.g., the list of security vulnerabilities, HIPER and special attention APARs fixed in DB2 (here V10.1 FP2), or the list of fixes (fix list) for that release. By the way: HIPER stands for HIgh Impact or PERvasive, i.e., bugs with critical impact. An APAR (Authorized Program Analysis Report) is basically a formal description of a product defect. Customers usually open a PMR (Problem Management Report), which may lead to an APAR (or not).
Friday, December 7, 2012
DB2 10.1 Fixpack 2 is available
An overview of available fixpacks by DB2 version, including the fixpack 2 for DB2 10.1, can be found at this support page. A summary of the enhancements in the fixpack is, as usual, in the DB2 Information Center.
Wednesday, November 28, 2012
Quiz: Where is a good place to find information about DB2 fixpacks?
Nobody is perfect, nothing is complete. So is DB2. But where do you find information about fixpacks and fixes for DB2 for Linux, UNIX, and Windows? Where can you look up what features have been added as part of a fixpack?
Post your sources as comments or let me know in other ways.
Wednesday, November 21, 2012
Black Friday and DB2 recovery objects - new db2adutl feature
Black Friday and Cyber Monday are coming up, busy days for many database systems working in the background of brick and mortar and online shops. Will the infrastructure keep up with the traffic? Will response times be acceptable for shoppers and store personnel alike? How much revenue is made with which types of products? Many questions at the start of the holiday season.
There is an interesting feature in DB2, shipped in version 9.7 FP5 (see here for the db2adutl feature announcement) and in version 10.1, to manage high peak load on DB2 systems. db2adutl now allows you to UPLOAD recovery objects (backup images, archived logs) to TSM in a very simple way without breaking the log chain. Some examples of the new UPLOAD feature are shown here. The idea is to first archive to a fast disk instead of directly to TSM, so that during a peak like Black Friday or Cyber Monday log archiving can keep up with the rest of the system. Later on, the recovery objects can be moved from disk to TSM to allow TSM-based recovery if needed.
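As a rough sketch, moving the disk-archived recovery objects over to TSM later could look like the following. The database name, timestamp, and log sequence numbers are made up for illustration, and the exact option syntax may vary by fix pack, so please check the db2adutl documentation:
db2adutl upload logs between S0000100 and S0000120 db sample
db2adutl upload images taken at 20121123120000 db sample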
Happy Thanksgiving (for those in the USA) and interesting shopping deals for all of us...
BTW: As of DB2 10.1 archived logs can also be compressed, reducing what needs to be written out.
Labels:
administration,
availability,
best practices,
cash register,
data in action,
DB2,
IT,
retail
Friday, November 16, 2012
Where to find information about granted privileges in DB2
Yesterday, I received a question about where to find specific metadata in DB2. The customer wanted to find out which privileges had been granted within a database, and they were aware that db2look can produce this list. But where does this information come from?
Let's start with a glimpse at db2look. It is the "DB2 statistics and DDL extraction tool" and can be used to produce the DDL statements for the objects inside a database. There are also options specific to handling authorizations, i.e., the privileges (see the -x, -xdep, -xd, and -xddep options). All the statements that db2look produces are based on special data stored in the database, the so-called metadata. The tables that hold the metadata are called the System Catalog in DB2 and the Data Dictionary in Oracle. The system catalog can then be queried using regular SELECT statements because the information is provided in tables (users can stay within the data model, the relational model).
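For illustration, extracting the DDL together with the authorization (GRANT) statements for a database might look like this (the database and output file names are just examples):
db2look -d sample -e -x -o ddl_with_grants.sql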
DB2 offers the metadata in two different sets of views. The views in the SYSCAT schema basically have all the metadata. The views in the SYSSTAT schema have a subset of the data and have (some) updatable columns, so that object statistics can be changed (for good or for worse...). The views are built on top of the internal catalog tables, which are managed in the SYSIBM schema. It is advised to only use the SYSCAT and SYSSTAT views because the structure of the internal tables can change without warning, whereas the external catalog views are kept intact.
Now, where can the information about granted privileges be found? The DB2 Information Center has a so-called "Road map to catalog views", which is a listing of all the metadata offered in the SYSCAT and SYSSTAT views. All the views that end in *AUTH carry authorization information, i.e., data about privileges. To give some examples: in SYSCAT.INDEXAUTH you can find out who has CONTROL privilege on an index, SYSCAT.ROLEAUTH is useful to see who has ADMIN authority for granting a specific role, and finally, as a complex example, SYSCAT.TABAUTH holds all the table privileges like insert, update, delete, select, alter, control, and even more.
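As a quick sketch, listing who holds SELECT or CONTROL privilege on tables in a given schema could look like the following (the schema name is made up for illustration):
db2 => select grantee, granteetype, selectauth, controlauth from syscat.tabauth where tabschema = 'HLOESER'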
Administration tools for DB2 access that information; procedures and scripts can select data from these views, and so can applications and users. Which of these views are made available to PUBLIC is up to the administrators. Remember, it's an honor, not a privilege...
Comments or questions?
Labels:
administration,
benchmark,
catalog,
data studio,
DB2,
Information Center,
IT,
optim,
performance
Tuesday, November 13, 2012
Where are the soap bars...? - Database Appliances
Do you remember the days when there were soap bars in the hotel bathrooms? The days when you began washing your hands and then realized that the soap bar had to be unwrapped first? The room service would restock soap, clean the soap dish, and empty the waste basket. Now enter the age of appliances. Many hotels have switched over to "appliances", wall-mounted soap dispensers. They allow savings for the hotels due to less effort (service, restocking, ...) and are "greener", i.e., the dispensers cause significantly less waste.
The same transition is also happening for database servers. Instead of purchasing hardware (server, storage, network equipment, etc.) and software separately and assembling and testing them on-site in a lengthy process, appliances often are delivered ready-to-use (see IBM PureData System). The key to success is to have the right ingredients. Would you use a soap dispenser that says "good for hands, hair, and dishes"? You would expect to have cream soap, shampoo and conditioner, and other cleaning liquids in separate dispensers. The same is true for database appliances. That is the reason why IBM, unlike some of the competition ("supports both OLTP and OLAP"), is offering PureData System for Transactions, PureData System for Operational Analytics, and PureData System for Analytics. Different workloads and requirements need database appliances with different ingredients.
BTW: I still like a nicely shaped, slightly scented bar of soap. It makes the right gift...
Wednesday, October 24, 2012
INCLUDE columns in IUD operations?!
When I read a question about INCLUDE columns that I received, I first thought that indexes were meant. The reason: for performance reasons, to achieve index-only access, additional columns not contributing to the index key can be added to an index during CREATE INDEX. These columns are called "include columns". However, the question was about include columns for INSERT, UPDATE, and DELETE. So what are they?
In OLTP environments it is often necessary to retrieve data that was just modified, i.e., inserted, updated, or deleted. To save CPU cycles and improve response time, why not combine the SELECT with the other statement? How this can be done is documented as part of the IUD operations (see links above) in the so-called data-change-table-reference clause. With DB2 you can access the data before (OLD TABLE) or after (NEW TABLE) the change was applied.
db2 => create table ict(id int, s varchar(80))
DB20000I The SQL command completed successfully.
db2 => select * from ict
ID S
----------- --------------------------------------------------------------------
0 record(s) selected.
db2 => select id,s from new table(insert into ict values(1,'Hello World'))
ID S
----------- --------------------------------------------------------------------
1 Hello World
1 record(s) selected.
db2 => select id,s from old table(update ict set s='I am quiet' where id=1)
ID S
----------- --------------------------------------------------------------------
1 Hello World
1 record(s) selected.
db2 => select id,s from new table(update ict set s='Hello again' where id=1)
ID S
----------- ---------------------------------------------------------------------
1 Hello again
1 record(s) selected.
db2 => select id,s from old table(delete from ict where id=1)
ID S
----------- --------------------------------------------------------------------
1 Hello again
1 record(s) selected.
db2 => select * from ict
ID S
----------- ---------------------------------------------------------------------
0 record(s) selected.
So far, so good, but still no include columns. They are needed when you want to return more than what is in the rows you modify, i.e., when additional columns are needed in the result set.
db2 => select id, s, ts from new table(insert into ict(id, s) include (ts timestamp) values(1,'Hello World',current timestamp),(2,'Hello again',current timestamp))
ID S TS
----------- ------------------------------------------ --------------------------
1 Hello World 2012-10-24-15.08.06.762079
2 Hello again 2012-10-24-15.08.06.762079
2 record(s) selected.
Of course there are better things to pass around than the timestamp. It could be a sequence value reflecting the insert order, a function call, or whatever you come up with.
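As a sketch of the sequence idea (the sequence name is made up, and the column values shown would of course depend on the sequence state):
db2 => create sequence myseq
db2 => select id, s, ord from new table(insert into ict(id, s) include (ord bigint) values(3,'third entry',next value for myseq),(4,'fourth entry',next value for myseq))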
That's all I wanted to include into this article...
Tuesday, October 16, 2012
License Enforcement options in DB2
Yesterday I received an interesting question regarding licensed features in DB2 and how the license is enforced. Someone was able to create an MQT even though their DB2 edition does not include support for it. But why? Let me try to explain.
The DB2 Information Center has an overview of features by edition. DB2 utilizes a license management server to control what features can be used and for reporting and auditing of compliance. The db2licm tool is the interface to the server and is used to add licenses, list features, and much more. Traditionally, DB2 had a "soft stance" on license enforcement, i.e., unlicensed features could be used without the software complaining (only the lawyers would).
By customer request that eventually changed, so that compliance could be enforced at the company level (and lawyers and finance could focus on keeping the books clean). From my time as a DB2 developer I remember discussions on how to implement certain hard stops. In DB2 V9.5, hard enforcement of license compliance was offered for the first time, AFAIK. I found these two DB2 9.5 fixpack notes: FP2 on enforcement of pureXML and storage features (which I was involved in) and FP4 changes on enforcement of WLM, CPU, and memory usage.
When installing DB2, enforcement is set to SOFT by default, as it seems:
hloeser@ems:~$ db2licm -l show detail
Product name: "IBM Database Enterprise Developer Edition"
License type: "Developer"
Expiry date: "Permanent"
Product identifier: "db2dede"
Version information: "10.1"
Product name: "DB2 Advanced Enterprise Server Edition"
License type: "Developer"
Expiry date: "Permanent"
Product identifier: "db2aese"
Version information: "10.1"
Enforcement policy: "Soft Stop"
The behavior can be changed using the "-e" (enforcement) option of db2licm:
hloeser@ems:~$ db2licm -e db2aese HARD
LIC1412W A hard stop enforcement policy has been set. This enforcement
policy stops unlicensed requests.
Explanation:
You issued the db2licm command with the -e parameter, to update the
enforcement policy, and specified the value HARD. (For example, db2licm
-e db2ese HARD.) The value HARD specifies that unlicensed requests will
not be allowed.
User response:
As a mechanism for you to keep track of, and differentiate, the DB2
database products and features installed on your system, it is
recommended that you register the license key for each DB2 database
product and feature.
If you want unlicensed requests to be logged but not restricted, change
the enforcement policy to SOFT. For example, db2licm -e db2ese SOFT
LIC1411I Enforcement policy type updated successfully.
As I have AESE installed, I cannot really show you an example. But you could try to compress data or create an MQT on DB2 Express-C. Here is the output after switching the enforcement:
hloeser@ems:~$ db2licm -l
Product name: "IBM Database Enterprise Developer Edition"
License type: "Developer"
Expiry date: "Permanent"
Product identifier: "db2dede"
Version information: "10.1"
Product name: "DB2 Advanced Enterprise Server Edition"
License type: "Developer"
Expiry date: "Permanent"
Product identifier: "db2aese"
Version information: "10.1"
Enforcement policy: "Hard Stop"
As mentioned, SOFT has been the traditional way and is the default in DB2. However, to be compliant and to be on the safe side for audits, switching to HARD is a good idea.
BTW: You can generate a compliance report using db2licm:
hloeser@ems:~$ db2licm -g comp.txt
LIC1440I License compliance report generated successfully.
Anything troublesome in there...?
Labels:
administration,
best practices,
DB2,
Information Center,
IT,
license,
version 10,
version 9.5
Monday, October 15, 2012
Updated Redbook for DB2 10: High Availability and Disaster Recovery Options
Well, there is not much to say about this existing Redbook that has been updated to reflect DB2 10.1 for Linux, UNIX, and Windows and current technologies. The "High Availability and Disaster Recovery Options for DB2 for Linux, UNIX, and Windows" Redbook describes and explains technologies like IBM Tivoli TSA, PowerHA SystemMirror, Microsoft Windows Failover Cluster, WebSphere Q Replication or InfoSphere CDC.
With close to 600 pages it also requires your high availability...
Labels:
availability,
best practices,
database,
DB2,
IT,
pureScale,
redbook,
system management,
version 10
Friday, October 12, 2012
Answer to DB2 Quiz: Don't forget to close what was opened...
Earlier this week I posted a DB2 quiz showing an error message. It showed that creating a function failed with an "end of file reached" error. I received some feedback by email and the consensus was that the error somehow was caused by the earlier executed scripts. But what exactly is the cause?
In the quiz I had mentioned that we ran into the error during a proof of concept when porting an application from Oracle to DB2. This is a first hint. The DB2 Information Center has a document describing how to set up the DB2 environment for Oracle application enablement. In there, two ways of executing SQL statements are listed: one based on clpplus, the other using the DB2 Command Line Processor (CLP). For running statements with the DB2 CLP it is recommended to issue "SET SQLCOMPAT PLSQL" first. That switches the CLP to taking a forward slash ("/") on a new line as the PL/SQL statement termination character. Using "SET SQLCOMPAT DB2" switches back to regular operations.
The error "end of file reached" indicates that a statement is incomplete. This could mean a statement is missing a clause (usually a syntax error) or that the statement or file is missing something else. A statement terminator, maybe? What we tried next while debugging was to execute a simple CREATE TABLE:
>>> db2 -tf table2.sql
DB20000I The SQL command completed successfully.
Strangely, this works. Is it related to our function definition? We opened another window with a new DB2 connection, and the CREATE FUNCTION, when pasted directly into the CLP window, succeeded. And then it dawned on us...
As with other settings, including the database connection, the SQLCOMPAT mode is kept between invocations of the DB2 CLP. In the example in the quiz we only open the database connection once. In one of the scripts we executed, SET SQLCOMPAT PLSQL was issued, but it was never reset to the default with SET SQLCOMPAT DB2. Simple DDL like CREATE TABLE seems unaffected by this setting; however, creating more complex objects like the shown function gives an error. I asked the DB2 Information Development team (the one creating the documentation) to provide more details on the SQLCOMPAT switch.
The takeaway: when putting a switch into a script file, make sure switching back to the defaults is part of the same file. Sounds like best practices from coding days... ;-)
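In other words, a script that needs the PL/SQL mode should look roughly like this sketch (the file content and function name are made up for illustration):
-- create_func.sql: switch the CLP mode, create the object, switch back
SET SQLCOMPAT PLSQL;
CREATE OR REPLACE FUNCTION hello RETURN VARCHAR2 IS
BEGIN
  RETURN 'Hello';
END;
/
SET SQLCOMPAT DB2;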
BTW: I disabled captcha codes for comments. This should make it more interactive and lead to less emails. Have a great weekend.
Tuesday, October 9, 2012
DB2 Quiz: Why the error? (create function)
Last week, during a Proof of Concept (PoC) with an ISV, we tried to port and deploy database objects from Oracle to DB2. During some testing I ran into a strange error (at least it seemed strange at first), and only stepping back and getting some coffee helped to explain it. Can you guess what went wrong and why we ran into the error?
For the actual porting we used the IBM Data Movement Tool, but the DB2 Command Line Processor for some side testing. Here is what we did on the command line:
>>> db2 connect to mydb
Database Connection Information
Database server = DB2/LINUX 10.1.0
SQL authorization ID = HLOESER
Local database alias = MYDB
>>> db2 -tf script1.sql
DB20000I The SQL command completed successfully.
>>> db2 -tf scriptN.sql
DB20000I The SQL command completed successfully.
>>> db2 -tf func.sql
DB21007E End of file reached while reading the command.
>>> cat func.sql
CREATE OR REPLACE FUNCTION TAN (X DOUBLE)
RETURNS DOUBLE
LANGUAGE SQL
CONTAINS SQL
NO EXTERNAL ACTION
DETERMINISTIC
RETURN SIN(X)/COS(X);
>>> db2 -tf func.sql
DB21007E End of file reached while reading the command.
Basically we successfully tested several SQL scripts before we ran into the "end of file" error. There were no special hidden characters in the file. Opening another database connection in another window and executing the script there succeeded.
Do you know why? Any guesses? I will post the solution in a follow-up article.
IBM PureData follows PureFlex and PureApplication Systems
IBM PureData System for Transactions and IBM PureData System for Analytics are two new categories of IBM's expert integrated systems, the so-called PureSystems.
The PureData System for Transactions is built on the DB2 pureScale technology and hence provides scalability and operational continuity. Using DB2's SQL compatibility mode it is also an attractive offer for customers wishing to move away from Oracle database systems.
The PureData System for Analytics actually comes in two flavors: Analytics and Operational Analytics. The former is built on Netezza technology, the latter follows the tradition of InfoSphere Warehouse and the IBM Smart Analytics System.
The common benefit of all three new systems is that they simplify everything from procurement (a single HW/SW package) through getting ready for production (my colleagues claim hours) to maintenance and administration. They all integrate best practices (like those found here) gained from field experience with customers.
BTW: The new systems are not for you if you like to fiddle around with OS and DB2 configuration parameters and want to tune everything personally. That has already been done by IBM experts.
Wednesday, September 26, 2012
PureSystems meets Data: Register for October 9th event
Something cool is coming up on October 9th. The following video includes a registration link for the announcement event. I am not allowed to tell you what it is, but it is neither a new iPhone, electric car, next version of DB2, nor anything related to any elections coming up. It is related to expert knowledge, hardware and software integration, and your future in IT...
How the iPad changed our vacation
After moving to Germany in 2008, our family vacation this Summer was finally (see here why not sooner) in San Jose and San Francisco. The kids were able to use "their" playgrounds of the past, see their preschool, and take a look at the school our older son attended. What was interesting for all of us was watching how mobile technology, here the iPad, changes lives. Note that I am a frequent traveler myself (and sometimes could be one of the cases described below...), but the following is what stuck out to the rest of the family.
- While in the past breakfast rooms were dominated by big-screen TVs and the usual morning news, now many people eat and "smartphone" or "iPad" at the same time, only occasionally turning their heads towards the TV.
- While walking in San Francisco (not much walking in San Jose), you need to watch out for people walking and talking. When someone is yelling next to you, it is usually directed at the person on the other end of the phone. At first this was irritating to the kids.
- iPads are the new "pocket cameras". People are taking pictures using "big screens"...
- Beware of people holding an iPad in front of them and trying to navigate the streets. It was kind of funny and disturbing to see that lamp poles are not on the maps... (and this was before iOS 6 ;-).
- On crowded buses (e.g. line 30) we noticed people tracking their position on an iPad to decide when and where to get off.
Thursday, September 20, 2012
DB2 10.1 - the first fixpack is out
FP1 for DB2 10.1 for Linux, UNIX, and Windows is now available. Here are the two important links:
- The so-called Fix Pack Summary in the DB2 Information Center,
- and the overview DB2 Fix Packs by version, where you can find all available fixpacks from DB2 10.1 back to DB2 8.2.
Some of the new features and enhancements in DB2 10.1 had already been included in DB2 9.7 FP6 and needed to be ported over. An example is the support for the XML type in global variables and in compiled SQL functions.
Wednesday, September 19, 2012
DB2 Information Center: English, deutsch, español, italiano, etc.
Today I discovered an interesting feature of the DB2 Information Center that I will now use often. As you may know, the DB2 Information Center is available in many languages, and content is displayed according to the preferred language setting of your browser. The DB2 Information Center even has a page on how to set this up. However, when I work with customers or partners in a different country, or am not even using my own machine, switching languages requires some clicks.
So I was positively surprised and happy to find out that I can just append a "lang=xx" parameter (xx being the language code) to the base URL of the DB2 Information Center. Thus, by appending the parameter to the base URL, you get the Information Center in the respective language.
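For illustration, the pattern looks like this (the host and path below are placeholders, substitute your Information Center's actual base URL):

```text
http://<infocenter-host>/index.jsp?lang=en   (English)
http://<infocenter-host>/index.jsp?lang=de   (German)
http://<infocenter-host>/index.jsp?lang=es   (Spanish)
http://<infocenter-host>/index.jsp?lang=it   (Italian)
```

Bookmarking a couple of these makes switching languages a one-click affair.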
Now using a specific language is simpler which allows for faster cross-checking of wording between your language and the English version of the DB2 Information Center.
Saturday, September 15, 2012
DB2 on Android? Yes! IBM Mobile Database
Yesterday IBM GAed the IBM Mobile Database, a special database for Android devices. The IBM Mobile Database works with the solidDB Sync Server to bring your enterprise data from DB2 or Informix databases to a mobile device (and back, of course). Thus, the title should actually say DB2 data on Android...
The IBM Mobile Database is free of charge and supports an SQL API, making it well suited for building your own mobile database applications on top. These applications can work directly with data coming from a central enterprise database, with transaction support even during loss of connectivity.
Wednesday, September 12, 2012
San Francisco vacation, Audi cars, and DB2 replacing Oracle
Long title to an interesting journey...
I just returned from family vacation (and hence was offline) that we spent in San Jose and San Francisco (California). This included more than a dozen Cable Car rides, walking steep hills, and enjoying Fog City. Could you imagine going down the hills in SF on skis?
The marketing guys of Audi of course could. Speaking of Audi, which is part of the Volkswagen group: today they were featured in a press article because Audi has been replacing Oracle and HP with DB2 on Power. Switching over was not an uphill battle, but a smooth ride. And the system stability wasn't going downhill either...
Monday, September 10, 2012
New Redbook on pureScale: Delivering Continuity and Extreme Capacity with the IBM DB2 pureScale Feature
A new Redbook covering many aspects of using pureScale with DB2 10.1 was published last Friday. All the details are on the related website.
XML Devotee session with z/OS topics
A session of the so-called XML Devotee Community will be held this Wednesday, September 12th, with two topics: 1) Using COBOL with pureXML on DB2 for z/OS and 2) XML processing on z/OS. Details of the session, including phone numbers, speaker info, and more, are available at the community website.
Tuesday, August 21, 2012
Monday, August 20, 2012
Welcome to Switzerland! Choose your database territory wisely...
One of the many international aspects of living in the Lake Constance region is that you encounter Swiss, Austrian, and German people, and even those from Liechtenstein, on a daily basis. Language (mostly the dialect) is the most obvious differentiator in direct contact, as are the license plates on cars. Sometimes there are also small and funny struggles with technology which reveal country-specific preferences.
Depending on the weather, it often happens that you walk in downtown Friedrichshafen and suddenly receive a text message "Welcome to Switzerland!" because your mobile phone has started roaming with a Swiss mobile carrier instead of your German provider. Similarly, again depending on the weather, you can easily pick up a German provider along the Swiss shore or in Romanshorn while waiting for a ferry to Germany. Always make sure that your phone is on the right network, otherwise even receiving a call can get expensive.
Then there are also funny annoyances like Google Maps, which draws the Swiss-German border in a very liberal way. The German island of Reichenau is, according to Google, Swiss territory. Fortunately, the borderline is clear to everyone, and the countries cooperate and help each other across the lake.
When working with a database system like DB2, you also have to select the territory, code page, and collation for a database. You could have German, Swiss, and Austrian (though no Liechtensteinian) databases side by side in a single DB2 instance. One thing you cannot do is change the territory later, the way we can pick our home provider on the mobile phone. So it would continue to say "Welcome to Switzerland!"
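A sketch from the CLP of such side-by-side databases (the database names are made up; note that the code set and territory are fixed at CREATE DATABASE time):

```shell
db2 "CREATE DATABASE DEDB USING CODESET UTF-8 TERRITORY DE COLLATE USING SYSTEM"
db2 "CREATE DATABASE CHDB USING CODESET UTF-8 TERRITORY CH COLLATE USING SYSTEM"
db2 "CREATE DATABASE ATDB USING CODESET UTF-8 TERRITORY AT COLLATE USING SYSTEM"
```

To change the territory of an existing database, you would have to create a new database and move the data over, there is no ALTER for it.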
Monday, August 13, 2012
Two whales in the air: Zeppelins over Friedrichshafen
Yesterday I got the sudden chance to climb up a church tower in Friedrichshafen. It was also a rare opportunity to see two Zeppelins side by side (the photos were taken with my mobile phone). One had just departed with new passengers, the other was approaching the Zeppelin landing site.
What can also be seen close to the center is the runway of Friedrichshafen Airport, on the left of it the Zeppelin hangar and the Friedrichshafen Messe (expo & fair). The open space in the foreground on the right will be home to the new headquarters of ZF.
Tuesday, August 7, 2012
Partial early blog posting
Two of the many performance enhancements in DB2 10.1 are called PED (partial early distinct) and PEA (partial early aggregation). Work is done as early as possible, though only partially, to reduce the data volume at an early stage and speed up the overall query. Having said that, I am done. :)
P.S.: This may be only partial information on PED and PEA, but it helps you speed up query performance and is light in reading...
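P.P.S.: A hedged sketch of query shapes where such optimizations can kick in (table and column names are made up; whether the optimizer actually chooses PED or PEA depends on statistics and costing):

```sql
-- PED: duplicates in ORDERS.CUST_ID can be partially removed early,
--      before the join, so less data flows into the join
SELECT DISTINCT c.NAME
FROM CUSTOMERS c JOIN ORDERS o ON c.ID = o.CUST_ID;

-- PEA: partial sums per CUST_ID can be computed early and then
--      consolidated in a final aggregation step
SELECT o.CUST_ID, SUM(o.AMOUNT) AS TOTAL
FROM ORDERS o
GROUP BY o.CUST_ID;
```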
Wednesday, August 1, 2012
Vacation: Time to REBALANCE (and to look at DB2 10)
It is vacation time, time to recharge and to rebalance life. (BTW: Some years back companies talked about "work-life balance", now it is "work/life integration", towards always on.) When I thought about "rebalance", some new DB2 features came to mind. You see, I am still in work mode...
When changing storage in DB2, e.g., adding containers to a tablespace or removing them from it, the tablespace is rebalanced to keep it evenly striped. DB2 10 introduced the concept of storage groups, which allows grouping tablespaces with similar properties or "tasks" together. Using ALTER TABLESPACE it is possible to change the associated storage group. To move the data from the former to the new storage group, data is rebalanced in the background, i.e., asynchronously. To give greater control over those background tasks, DB2 10 adds SUSPEND and RESUME to the ALTER TABLESPACE ... REBALANCE statement. So you can decide when to take a break...
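A small sketch of those statements (the tablespace and storage group names are made up):

```sql
-- move a tablespace to a different storage group;
-- the data is rebalanced asynchronously in the background
ALTER TABLESPACE HOT_TS USING STOGROUP SSD_SG;

-- pause the background rebalance during peak hours ...
ALTER TABLESPACE HOT_TS REBALANCE SUSPEND;

-- ... and resume it later, e.g., overnight
ALTER TABLESPACE HOT_TS REBALANCE RESUME;
```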
As you might know, there is another rebalance operation in DB2, again storage-related. It is used in pureScale environments after changes to the shared (cluster) file system and rebalances storage utilization. The rebalance option is part of the db2cluster command, which is used to interact with Reliable Scalable Cluster Technology (RSCT) and the cluster file system GPFS.
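In a pureScale environment the call might look like this (the file system name is a placeholder; check db2cluster -help for the exact options of your release):

```shell
db2cluster -cfs -rebalance -filesystem db2fs1
```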
As both rebalance operations are recommended for "periods of low system activity", this could mean vacation time - bringing me back to what I am preparing for...
Friday, July 27, 2012
A box of ice cream?
Some years back IBM was advertising a very special box. In the commercial they showed what it could be good for. One thing is definitely missing: ice cream. Today is a hot day here in the South of Germany and I am looking forward to some ice cream after work and then some fun at the Kulturufer ("shore of culture" - a well-known tent festival at the shores of Lake Constance in Friedrichshafen with music, theater, comedy, magic, and much more).
But now back to the magic box:
Tuesday, July 17, 2012
Another year passed by: Energy consumption in a passive house
In the first year of living in the new house I read the energy meters more or less weekly; then it turned into a bi-weekly to monthly activity. This past year I looked at the meters mostly during the Winter months and almost forgot to read them out last weekend when another year had passed. Looking at the energy consumption, there are some surprising numbers:
Even though the past year was a leap year (and had a leap second!!!), heating consumption was exactly the same. I had hoped for a smaller number, but we had some very chilly weeks in February. Household consumption has also stayed nearly constant, with most consumers now in place. Most of it should come from cooking, laundry, and my home office with computer, monitor, phones, etc.
Our house does not have a smart meter where you can track consumption online and in real time. However, I talked with some owners of smart meters, and the bottom line is that after the first few weeks they barely make use of them. It is similar to reading our old-fashioned meters more frequently in the beginning to learn and look for exceptions, then leaning back once everything seems to run smoothly. Any similarities to a well-tuned database system and regular workloads...?
| 2012 | 2011 | 2010 | 2009 |
---|---|---|---|---|
Household (kWh) | 2597 | 2605 | 2473 | 2244 |
Heating/Ventilation/Water (kWh) | 2713 | 2713 | 2858 | 2782 |
Labels:
energy saving,
Life,
passive house,
Passivhaus,
power consumption,
smart metering
Friday, July 13, 2012
Big Data, Big Problems, Right Questions?
The term "Big Data" has been around for a while. Most of my time I have dealt with problems around traditional data (relational data), new data structures, non-traditional data, semi-structured data, in-memory data, on-disk data, hard-to-find data, lost data, object-oriented data, object-relational data, old data, new data, archive data, wrong data, corrupted data, THE right data, and all kinds of other data.
Almost daily I receive emails or letters where my name is misprinted, the address is incorrect, or the email was not even intended for me (long story). A few days ago I listened to a longer discussion about problems with various kinds of applications at several companies, all non-IBM-related. One key problem was that, due to failing components in an overly complex environment, data became outdated, went missing, or was incorrect. The consequences hit the "regular guys", ranging from having no or incorrect parts in an assembly line, over timesheets not being updated, to not being able to "badge in". When I talk with my students at the Cooperative State University (who all have jobs in industry), many of them have seen or had to deal with incorrect or outdated data and "information" based on it.
The issues remind me of one of the first lessons at school: "garbage in, garbage out". For Big Data, the amount of data is of a different dimension, but the same principles apply as for other data. It is important to know that there are, or could be, issues with the input data. How is that dealt with? Does it impact the algorithms and the result? Is the derived "information" really information? How is the data interpreted? Is it OK to base decisions on it, or is it taken as just one of many indicators? Do those dealing with information derived from Big Data know the source, and is everything fully disclosed? There are many more questions, many of them non-technical. One important question is what questions to ask of the data. Is it to "prove" some speculation and guesswork, or to really find out something new that may be validated by further work? And with that we are back to a problem we have faced for years with existing, "small" data.
Now to a small problem and some statistics:
Over the weekend we will have big fireworks here over Lake Constance. My kids want to watch them, for sure. I could point out that only 20% of the persons in my statistical sample will attend. My kids could respond that 70% will attend. They sampled people aged 8-80 years, I those aged 0-8 years...
Context counts, as does asking the right questions, regardless of small or big data.
Tuesday, July 10, 2012
Index something non-existing - functional indexes in DB2
After discussing hidden (non-visible) columns last week, indexing something non-existing fits the same theme. Let's shed some light on this seemingly mysterious topic...
XML documents and XML-based data modeling have been around for quite some years now. DB2 started its XML support with the XML extender in the late 1990ies (version 7), DB2 pureXML shipped initially with version 9.1. Advantages of using XML include flexibility of what is included in a document (or set of documents) and how sparse data can be represented. In relational NULL values or other "empty" values need to be stored even if data for a property (column) does not exist, in XML documents that particular property could be just left off, i.e., nothing is stored. However, the "nothing is stored", the non-existing data introduced a problem for searching efficiently in some use cases. How do you find all customers that do not have a certain property, e.g., have not received the marketing letter yet or do not own a car?
To address this kind of problem, DB2 10.1 now allows certain functional indexes over XML data (when does it come for relational data - any guesses...?). One of the functions allowed in such an index definition is fn:upper-case(). That way an index supports case insensitive searches over strings:
CREATE INDEX customers_email_idx ON customers(cinfo)
GENERATE KEYS USING XMLPATTERN '/customer/contact/email/fn:upper-case(.)'
AS SQL VARCHAR(80);
SELECT * FROM customers WHERE
XMLEXISTS('$CINFO/customer/contact/email[fn:upper-case(.)="HENRIK.LOESER AT GMAIL.COM"]');
The other supported function is fn:exists(), which allows indexing the existence or non-existence of an element or attribute. Thus, an index can be utilized to search even for something that is not (directly) stored in the database, i.e., implied information.
CREATE INDEX customers_db2vers_idx ON customers(cinfo)
GENERATE KEYS USING XMLPATTERN '/customer/sw/fn:exists(db2version)'
AS SQL VARCHAR(1);
SELECT * FROM customers WHERE
XMLEXISTS('$CINFO/customer/sw[not(fn:exists(db2version))]');
The above query would return all customers who don't have any DB2 version of software installed. Maybe they are reading this article and then will install DB2 soon...?!
Monday, July 9, 2012
Another quiz: How to index something non-existing?
To start off this week with the right attitude, here is another quiz:
Imagine that I would like to store address data and information about my contacts ("friends"?). I want to keep track of when I met them. In my database I want to efficiently query whom of my contacts I met on a specific date. From database theory I know that indexing the data would help. But I would also like to search for persons I never met. How could I support that with an index in DB2? How could the data be stored?
SQL statements will be in the solution... [Update: The solution has been posted.]
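Without giving the posted solution away: one conceivable direction, sketched here with made-up element names, is to store each contact as an XML document and index both the date values and their existence, in the spirit of the fn:exists() index described in the DB2 10.1 post above:
db2 => create table contacts(id int not null, cinfo xml)
db2 => create index met_idx on contacts(cinfo) generate keys using xmlpattern '/contact/metdate' as sql date
db2 => create index nevermet_idx on contacts(cinfo) generate keys using xmlpattern '/contact/fn:exists(metdate)' as sql varchar(1)
The first index would support the "whom did I meet on date X" queries, the second one searching for contacts without any metdate element at all.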
Wednesday, July 4, 2012
Implicitly hidden columns in DB2
On Monday I posted a small quiz about an SQL error reported by DB2. The comments contained guesses about the cause, including generated columns, triggers, or the table actually being a view. However, the reason for the processing problem is a different one, and it was also mentioned in the comments: the presence of an implicitly hidden column.
The concept of implicitly hidden columns was first introduced in DB2 9.5 for the purpose of easier and faster optimistic locking. A column defined as IMPLICITLY HIDDEN is not returned as part of a SELECT statement if it is not explicitly referenced. In DB2 9.5 and DB2 9.7 this feature was limited to a ROW CHANGE TIMESTAMP column to implement optimistic locking. With DB2 10.1 this has changed, as you can see in the CREATE TABLE statement and on a special page with an introduction to the concept of hidden columns. Now you can hide all kinds of columns.
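A minimal sketch of that original optimistic-locking use case may help; the table and the saved host variable are hypothetical, not taken from the quiz:
db2 => create table accounts(id int not null primary key, balance decimal(10,2), rct timestamp not null generated always for each row on update as row change timestamp implicitly hidden)
An application first reads a row together with its hidden timestamp (SELECT id, balance, rct FROM accounts WHERE id = 1) and later updates the row only if nobody changed it in between:
UPDATE accounts SET balance = 100.00
  WHERE id = 1 AND ROW CHANGE TIMESTAMP FOR accounts = :saved_rct
If the timestamp no longer matches, zero rows are updated and the application knows it has to re-read, without holding locks in the meantime.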
Here is how I defined the table that I used in the quiz:
db2 => create table ih(id int unique not null, s varchar(30), s2 varchar(30) not null implicitly hidden)
DB20000I The SQL command completed successfully.
As you can see, the third column is hidden. Then I performed an insert which resulted in the reported error:
db2 => insert into ih values(1,'a')
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL0407N Assignment of a NULL value to a NOT NULL column "TBSPACEID=2,
TABLEID=257, COLNO=2" is not allowed. SQLSTATE=23502
What is interesting to note is the following:
- It is best practice, and strongly advised, to define a default value for hidden columns so that you do not run into such errors.
- In computer science many enumerations start with zero, not one. The column numbers, table identifiers, and tablespace identifiers in DB2 are no exception. Hence the column with COLNO=2 is the THIRD column, which could mislead you. The description for error SQL0407N actually provides an SQL query that helps resolve the mystery.
- Similar issues are possible with the so-called data movement utilities like IMPORT and LOAD, as your input data may or may not contain values for the hidden column.
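To illustrate the first bullet, here is a hedged variant of the quiz table with a default value for the hidden column (the default 'n/a' is just an example):
db2 => create table ih2(id int unique not null, s varchar(30), s2 varchar(30) not null with default 'n/a' implicitly hidden)
db2 => insert into ih2 values(1,'a')
The insert now succeeds because s2 silently receives its default, while SELECT * still returns only ID and S; the hidden column shows up only when referenced explicitly, as in SELECT id, s, s2 FROM ih2.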
db2 => describe select * from ih
Column Information
Number of columns: 2
SQL type Type length Column name Name length
-------------------- ----------- ------------------------------ -----------
496 INTEGER 4 ID 2
449 VARCHAR 30 S 1
Performing a DESCRIBE TABLE shows all three columns.
db2 => describe table ih
Data type Column
Column name schema Data type name Length Scale Nulls
------------------------------- --------- ------------------- ---------- ----- ------
ID SYSIBM INTEGER 4 0 No
S SYSIBM VARCHAR 30 0 Yes
S2 SYSIBM VARCHAR 30 0 No
3 record(s) selected.
Finally, if you are sick of all the hidden stuff, you can act as a magician and let the columns reappear again:
db2 => alter table ih alter s2 set not hidden
DB20000I The SQL command completed successfully.
db2 => describe select * from ih
Column Information
Number of columns: 3
SQL type Type length Column name Name length
-------------------- ----------- ------------------------------ -----------
496 INTEGER 4 ID 2
449 VARCHAR 30 S 1
448 VARCHAR 30 S2 2
Now I will disappear by altering my state, coffee is waiting...
Monday, July 2, 2012
A small DB2 SQL quiz - why the SQL error?
I have been playing with some new DB2 10.1 features and thought that a small SQL quiz at the beginning of the week is appropriate. Consider the following SQL statements and the related output:
db2 => insert into ih values(1,'a')
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL0407N Assignment of a NULL value to a NOT NULL column "TBSPACEID=2,
TABLEID=257, COLNO=2" is not allowed. SQLSTATE=23502
db2 => select * from ih
ID S
----------- ------------------------------
0 record(s) selected.
db2 => describe select * from ih
Column Information
Number of columns: 2
SQL type Type length Column name Name length
-------------------- ----------- ------------------------------ -----------
496 INTEGER 4 ID 2
449 VARCHAR 30 S 1
How was the table defined or why is it giving an error? Any guesses? Post your answer and explanation as comment. I will post the solution in a couple days. [Edit: done and linked]
Thursday, June 28, 2012
Want to learn about DB2 security? New best practices paper
A new DB2 Best Practices paper has been published on developerWorks: DB2 best practices: A practical guide to restrictive databases. Did you know that instead of CREATE DATABASE foo you could also say CREATE DATABASE foo RESTRICTIVE? But what are the differences? The restrictive database has only very few privileges granted to the group PUBLIC and thereby starts off with increased security.
In the new paper the authors compare how non-restrictive and restrictive databases differ in terms of privileges, with a focus on common use cases and administrative groups. It helps to understand how to make use of restrictive databases. While in a regular database the typical problem is revoking privileges to secure the database, in a restrictive database the task is granting the needed privileges so that users can "work".
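A small, hedged example of how working with a restrictive database could start (the user name is hypothetical):
db2 => create database foo restrictive
db2 => connect to foo
db2 => grant connect, createtab on database to user henrik
In a non-restrictive database many such privileges are implicitly granted to PUBLIC; in the restrictive case each one has to be handed out deliberately, which is the kind of task the paper covers.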
Monday, June 25, 2012
How to communicate within your family (kids, significant other, DB2)...
Healthy families talk with each other. But (usually) I speak differently with my wife than with my kids. There is no manual on how to address everyone individually or how to get heard by all of them. Fortunately, there is a guide for the DB2 family of products. And it has been updated recently.
When I give talks about the DB2 family I always mention that we strive to coordinate new features, learn from each other (yes, even mistakes), and try to speak the same language. Time travel queries (temporal data), row and column access control, the XML data type, binary XML, etc. are some of the more visible recent coordinated endeavors. Now the so-called SQL Reference for Cross-Platform Development has been updated to reflect new DB2 features. That page is very useful for some other reasons:
- It offers links to older versions of the SQL Reference for Cross-Platform Development.
- It has links to a lot of documentation for DB2 z/OS, DB2 for Linux, UNIX, and Windows, and for DB2 i.
- All versions combined, it has 10 years of best practices of application development.
Wednesday, June 20, 2012
NoSQL, SPARQL, RDF, Graph Store, and more buzzwords - all new in DB2 10
One of the smaller headlines during the DB2 10 announcement was the "NoSQL Graph Store". In the "What is new" section of the DB2 Information Center it is listed as "RDF application development support has been added". However, if you work in/with the Semantic Web, it is an important addition as it provides some robust and performant infrastructure. In the following, I will point you to some resources around SPARQL and RDF stores in DB2.
- A tutorial on RDF application development with DB2 is on developerWorks.
- The SPARQL query language as defined by the W3C.
- Wikipedia has an overview of the Resource Description Framework (RDF) with many links.
- Most of the RDF/SPARQL-related content in the DB2 Information Center is in the application development section. "RDF application development for DB2 data servers" is the starting point.
As the data can be represented in XML, a question about support for graph stores is actually a question that pops up from time to time in DB2 pureXML discussions. For those searching for a combination of the buzzwords and DB2, the search engines should now point you to the right pages... ;-)
Friday, June 15, 2012
It is "venerdi" in "it" - a Friday idea
It is Friday and close to afternoon. What to do next? I type "Friday" into the search field of the DB2 Information Center and in the result list I notice the function DAYNAME. In the description I notice that I can provide a locale or language code as parameter.
Let's give it a try, it is Friday afternoon:
db2 => values(dayname(current date))
1
----------------------------------------------------------------------------------------------------
Friday
1 record(s) selected.
db2 => values(dayname(current date, 'de'))
1
----------------------------------------------------------------------------------------------------
Freitag
1 record(s) selected.
db2 => values(dayname(current date,'fr'))
1
----------------------------------------------------------------------------------------------------
vendredi
1 record(s) selected.
db2 => values(dayname(current date,'es'))
1
----------------------------------------------------------------------------------------------------
viernes
1 record(s) selected.
db2 => values(dayname(current date,'in'))
1
----------------------------------------------------------------------------------------------------
Jumat
1 record(s) selected.
db2 => values(dayname(current date,'it'))
1
----------------------------------------------------------------------------------------------------
venerdì
1 record(s) selected.
If your language or country is not listed, try it yourself. Have a nice weekend.
Wednesday, June 13, 2012
DB2 pureXML and bufferpools - revisited
Some years ago I wrote that XML data is buffered for performance reasons and looked at the XDA object. Now I found out that I didn't answer the "buffered or not" question entirely: the question of which bufferpools are utilized, depending on inlined or regular XML storage, remained open. So let me answer it.
CREATE TABLE myX (id INT, x1 XML INLINE LENGTH 500, x2 XML) IN myT1 LONG IN myT2;
Considering the above CREATE TABLE statement, data for table myX would be stored in two tablespaces, myT1 and myT2, and use the associated bufferpool(s). If the internal representation of the XML data for column x1 would be up to 500 bytes, the row data consisting of an integer value for id, the XML data for x1, and a descriptor pointing to the XML data of x2 would be stored in the DAT object and hence in tablespace myT1. When accessed, the data would go to the bufferpool associated with myT1. If the XML data for x1 would be larger than the 500 bytes, the row would hold an integer and two descriptors (one each for x1 and x2).
The descriptors would point to entries in data pages in the XDA object which is stored in tablespace myT2. All XML data for x2 and the data too long to be inlined for x1 would be stored in myT2. When accessed, the bufferpool associated with tablespace myT2 would hold the pages.
In short again: XML data is buffered and it goes through the bufferpool associated with the tablespace it is stored in. This is the same as with regular relational data or index pages.
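To check where XML pages actually land, the monitoring functions can help; here is a sketch, assuming DB2 9.7 or later where the MON_GET_BUFFERPOOL table function and its XDA counters are available:
db2 => select varchar(bp_name,20) as bp_name, pool_xda_l_reads, pool_xda_p_reads from table(mon_get_bufferpool(null,-2)) as t
Logical and physical XDA reads per bufferpool show whether the XML data of a table is served from the pool associated with its (long) tablespace, as described above.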