Tuesday, September 2, 2014

New DB2 Cancun Release (Version 10.5 Fixpack 4) offers many enhancements

The fixpack 4 for DB2 10.5 is available since end of last week. Because it has an unusually long list of product enhancements and new features the fixpack even has the codename or nickname "Cancun Release". For those of you not too familiar with Northern American vacation culture, Cancún is a favorite vacation/tourist destination in Mexico, located at the Carribean Sea. So "Cancun Release" may suggest relaxation, recreation, and a dream come true because of ease of use, simplification and major performance enhancements for the DB2 in-memory database feature (BLU Acceleration), the broadened pureScale support, and other nice to haves.

A good start for approaching the new DB2 Cancun release is the fixpack summary in the Knowledge Center. It lists new features by category, my personal highlights are:
  • For the in-memory database support (referred to as "column-organized tables" and known as "BLU Acceleration" some bigger items include so-called shadow table to improve analytic queries in an OLTP environment, lifting of several DDL restrictions, and major performance improvement by adding CHAR and VARCHAR columns to the synopsis table. An in-memory database can be made highly available with the HADR feature.
  • DB2 pureScale clusters can be deployed in virtualized environments (VMware ESXi, KVM), on low-cost solutions without the RDMA requirement, and geographically dispersed cluster (2 data centers) can be implemented on AIX, Red Hat, SuSE with just RoCE as requirement.
  • As part of the SQL compatibility DB2 now supports string length definitions by characters, not just by bytes as before.
  • Installation of DB2 in so-called thin server instances.
  • A SECADM can enforce encryption of backups.
  • db2audit can be used to transfer audit records to syslog for simpler analyzation with, e.g., Splunk.
  • db2look has been improved to generate the CREATE DATABASE statement and export the configuration (see my earlier blog article on that db2look improvement in DB2 10.1).
  • Official support for POWER8.
I plan to blog about some of the new functionality over the next weeks. Until then you can take a look at the new items yourself. Fixpacks can be downloaded from this IBM support website. If you have an IBM Bluemix account or plan to create one, you can use the improved DB2 as part of the Bluemix Analytics Warehouse service. Check out my earlier post about how to set it up and connect to it using a local DB2CLP.

Last but not least: What is your favorite vacation destinations? Suggest new codenames as comment and don't forget new DB2 features you want to see...

Monday, September 1, 2014

What a plot: DB2, R, and Bluemix help with vacation weather

Last week I reported on how I set up a in-memory DB2 database on IBM Bluemix and loaded some historic weather data. Over the last couple days I used some spare time to play with the Cloud-based analytic capabilities that are provided as part of the Softlayer/Bluemix/DB2 combination. Most of the time went into learning (some basics of) R, an environment for statistical computing and graphics. As an example I wanted to find out what temperatures to expect for a possible September vacation on the German island of Norderney.

For my small exercise I used data provided by the German Meteorological Service "Deutscher Wetterdienst". It allows to freely download and use (under some conditions) data from several decades of weather observation. I uploaded the data to DB2/Bluemix as described in my previous post.
Bluemix: Change DB2 column name and type
While playing with the data I noticed that the column names required escaping of quotes and the observation dates were stored as integer values (yyyymmdd). In a second upload I simplified the column names and adapted the column data type using the DB2 load wizard (see picture). Thereafter I was set for my experiments with R.

The DB2 Cloud environment provides several examples for programming in R, a special function library "bluR" to easily connect R with DB2-based data, and it features the RStudio to develop, test, and execute code in R. Within RStudio it is possible to execute several demos to learn more about analytics, graphing, and data processing. For the DB2 in-memory database API for R there is a demo as well. You can invoke it using the "demo(blur)" command:

DB2 API demo for R in RStudio
The demo shows how to connect to DB2, execute a query and use the fetched data for analytic processing in R. Last week I already tweeted about how I tested escaping of quote characters (use C style, not SQL style):

The data set which I uploaded to DB2 has daily minimum and maximum temperatures (and lots of other meteorological) for about 70 years. I used a SQL query and then the ggplot2 library to create a graphic. It shows the band for the minimum temperatures for each September day as well as the band for the maximum daily temperatures.
DB2 with R: Historic September temperatures
The code for this graphic is pretty simple (and I started last week looking at R and DB2) and available from my Github account:
1:  ########### R script to analyze historic weather data for min/max values  
2:  ## Written by Henrik Loeser  
3:  ## Connection handle con to BLU for Cloud data warehouse is provided already  
4:  ## For plotting, we are using ggplot2 package  
5:  ##   
6:  library(ggplot2)  
7:  library(bluR)  
9:  ## initialize DB2 connection and environment  
10:  con <- bluConnect("BLUDB","","")  
11:  bluAnalyticsInit(con)  
13:  ## query DB2 weather data and fetch min/max values of min/max values  
14:  ## (lower/upper boundary each)   
15:  query<-paste('select max(lufttemperatur_maximum) as maxmax,min(lufttemperatur_minimum) as minmin,min(lufttemperatur_maximum) as minmax,max(lufttemperatur_minimum) as maxmin,tag from (select lufttemperatur_maximum, lufttemperatur_minimum, day(mdatum) as tag from blu01023.klima where month(mdatum)=9) group by tag order by tag asc')   
16:  df <- bluQuery(query,as.is=F)  
18:  ## Some plotting needs to be done  
19:  jpeg(type='cairo',"tempe.jpg",width=800,height=600)   
20:  ggplot(df, aes(x = TAG))+ylab("Temperature")+xlab("Day")+          
21:     geom_ribbon(aes(ymin = MINMIN, ymax=MAXMIN), fill='blue')+  
22:     geom_ribbon(aes(ymin = MAXMAX, ymax=MINMAX), fill='red')+  
23:     geom_ribbon(aes(ymin = MAXMIN, ymax=MINMAX), fill='white')+  
24:     geom_line(aes(y = MINMIN), colour = 'black') +  
25:     geom_line(aes(y = MAXMIN), colour = 'black') +  
26:     geom_line(aes(y = MINMAX), colour = 'black') +  
27:     geom_line(aes(y = MAXMAX), colour = 'black')   
29:  sink('/dev/null')   
31:  bluClose(con)  
32:  ## connection is closed, we are done  

Pretty cool (my opinion)! I am already turning into a data scientist. And you can test this yourself on IBM Bluemix with the Analytics Warehouse service (DB2 in-memory database feature).

Monday, August 25, 2014

Setting up and using a DB2 in-memory database on IBM Bluemix

Last Friday I was on the way back from some customer visits. While traveling in a German highspeed train I used the Wifi service, connected to IBM Bluemix and created a DB2 in-memory database. Let me show you how I set it up, what you can do with it and how I am connecting to the cloud-based database from my laptop.

Unbound DB2 service on Bluemix
The first thing to know is that on Bluemix the DB2 in-memory database service is called IBM Analytics Warehouse. To create a database, you select "Add service" and leave it unbound if you want, i.e., it is not directly associated with any Bluemix application. That is ok because at this time we are only interested in the database. Once the service is added and the database itself created, you can lauch the administration console.

The console supports several administration and development tasks as show in the picture. It includes loading data, to develop analytic scripts in R, to execute queries and link the data with Microsoft Excel for processing in a spreadsheet, and it has a section to connect external tools or applications to the database.
Administration/development task in DB2 BLU console on Bluemix
One of the offered task is very interesting and I twittered about it on Friday, too:

You can set up replication from a Cloudant JSON database to DB2, so that the data stream is directly fed in for in-memory analyses. I didn't test it so far, but plan to do so with one of my other Bluemix projects.

A task that I used is to (up)load data. For this I took some historic weather data (planning ahead for a vacation location), let the load wizard extract the metadata to create a suitable data, and ran some queries.

Uploading data to DB2 on Bluemix

Specify new DB2 table and column names

For executing (simple) selects there is a "Run Query" dialogue. It allows to choose a table and columns and then generates a basic query skeleton. I looked into whether a specific German island had warm nights, i.e., a daily minimum temperature of over 20 degrees Celsius. Only 14 days out of several decades and thousands of data points qualified.

Last but not least, I connected my local DB2 installation and tools to the Bluemix/Softlayer-based instance. The "CATALOG TCPIP NODE" is needed t make the remote server and communication port known. Then the database is added. If you already have a database with the same name cataloged on the local system, it will give an error message as shown below. You can work around it by specifying an alias. So instead of calling the database BLUDB, I used BLUDB2. The final step was to connect to DB2 with BLU Acceleration in the cloud. And surprise, it uses a fixpack version that officially is not available yet for download...

DB:  => catalog tcpip node bluemix remote 50.97.xx.xxx server 50000
DB20000I  The CATALOG TCPIP NODE command completed successfully.
DB21056W  Directory changes may not be effective until the directory cache is
DB:  => catalog db bludb at node bluemix
SQL1005N  The database alias "bludb" already exists in either the local
database directory or system database directory.
DB:  => catalog db bludb as bludb2 at node bluemix
DB20000I  The CATALOG DATABASE command completed successfully.
DB21056W  Directory changes may not be effective until the directory cache is
DB:  => connect to bludb2 user blu01xxx
Enter current password for blu01xxx:

   Database Connection Information

 Database server        = DB2/LINUXX8664 10.5.4
 SQL authorization ID   = BLU01xxx
 Local database alias   = BLUDB2

I will plan to develop a simple application using the DB2 in-memory database (BLU Acceleration / Analytics Warehouse) and then write about it. Until then read more about IBM Bluemix in my other related blog entries.


Related Posts with Thumbnails