Thursday, November 20, 2014

Useful DB2 administrative functions and views

Did you know that there are about 80 (eight-zero) administrative views in the SYSIBMADM schema in DB2 that are ready for use? I have used several of them and looked into the documentation for others, but 80 is quite a lot. (Almost) all of them are documented in the DB2 Knowledge Center in the "Built-in routines and views" section.

The routines live in the SYSPROC schema, while the administrative views can be found in the SYSIBMADM schema. With that insight it is easy to construct a simple query that lists all available views:

SELECT viewname from syscat.views where viewschema='SYSIBMADM'
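
If you are also curious about the routines, a companion query against SYSCAT.ROUTINES should list them (here restricted to the SYSPROC schema mentioned above):

SELECT routinename, routinetype FROM syscat.routines WHERE routineschema='SYSPROC'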

Depending on your version and fixpack level of DB2 the result will vary. Speaking of fixpack level, do you know how to find out what your system is running by using SQL? The view ENV_INST_INFO may help in that case because it returns instance-related information such as the instance name, the DB2 version, fixpack, and build level:

SELECT * FROM SYSIBMADM.ENV_INST_INFO

Are you connected to, e.g., an Advanced Workgroup Server Edition (AWSE) of DB2 or an Enterprise Server Edition (ESE)? Find out by querying the product information using the view ENV_PROD_INFO. It returns the installed product, the kind of active licenses, and more:

SELECT * FROM SYSIBMADM.ENV_PROD_INFO

Next in the list of useful views with system information is ENV_SYS_INFO. It can be utilized to find out more about the operating system, the type of hardware, installed CPU and memory, etc.:

SELECT * from SYSIBMADM.ENV_SYS_INFO

Last, but not least, in my list of views with basic system information are DBMCFG and DBCFG. As their names imply, these views return the current instance (database manager / dbm) and the current database (db) configuration. That makes it easy to find out, e.g., whether the self-tuning memory manager (STMM) is active or where diagnostic logs are stored.
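
For example, two queries along these lines should do the trick (the parameter names 'self_tuning_mem' and 'diagpath' are what I expect on a current DB2 version; double-check them on your system):

SELECT name, value FROM SYSIBMADM.DBCFG WHERE name = 'self_tuning_mem'

SELECT name, value FROM SYSIBMADM.DBMCFG WHERE name = 'diagpath'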

That's it for today, I am back to playing with more of those views (and routines)...

Monday, November 17, 2014

A quick look at dashDB and a happy SQuirreL

dashDB slogan on its website
This morning I took some time to take a look at dashDB, a new IBM DWaaS (Data Warehouse as a Service) offering. When you go to the dashDB website, you are offered two choices: Use the dashDB service available on IBM Bluemix or use a Cloudant account to add a warehouse to your JSON database. Let me give you a brief overview of what you can do with dashDB and how I connected a local (open source) SQuirreL SQL client to my new dashDB database.


Cloudant Warehousing (dashDB)
dashDB is a cloud-based analytics database ("analytics in a dash") with roots in Netezza and DB2 with BLU Acceleration. Data is stored in table (rows and columns) format. It is ready to connect all kinds of analytic tools, local or cloud-based, and is already set up for geo-spatial data analysis (instructions on how to use the ESRI ArcGIS Desktop are provided). The best part is that your regular SQL database and analytic tools continue to work; see below for details.

dashDB: schema discovery
I started my journey by logging into my existing Cloudant account. There, on the dashboard menu is a new item "Warehousing". When clicking on the "New Warehouse" button, you can select the Cloudant databases that you want to import into the warehouse. Because multiple databases can be associated with a Cloudant account or a Bluemix Cloudant service, this step lets you pick the data of choice. After the source data is chosen, the dashDB database is created and so-called schema discovery turns the JSON documents into rows of tables. Thereafter, the data is ready to have analytics applied. That is the time to launch the dashDB control center, another so-called "dashboard".
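
To illustrate what schema discovery produces (my own simplified example, not actual dashDB output): imagine Cloudant documents such as {"city": "Boston", "tmax": 20}. After discovery they show up as rows of a relational table that can be queried with plain SQL; the table and column names below are hypothetical:

-- hypothetical table created by schema discovery from JSON documents like the one above
SELECT city, tmax FROM weatherdb.weather WHERE tmax > 15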

The welcome screen shows some of the analytic options, e.g., the database is ready to be used with either Cognos, SPSS, InfoSphere DataStage, R scripts, or all of them and more:
Analytics for dashDB: Cognos, SPSS, DataStage, R
SQuirreL SQL client - dashDB connected
Because some time ago I already tested and blogged about a predecessor of dashDB (see here: how to set it up and how to use R), I was more interested in trying out a JDBC-based client with my new cloud-based data warehouse. Included as part of the dashboard are several sections that help you with the application setup. So it was easy for me to obtain the JDBC URL and to configure it, together with the listed userid/password, in my local SQuirreL SQL client (it will work in IBM Data Studio and the Optim tools, too). As you can see from the screenshot, the database connection from my laptop to the cloud-based dashDB succeeded. Ready for some SQL.
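
A simple catalog query makes a good first smoke test in SQuirreL; dashDB is DB2-based, so the usual SYSCAT views should be available (a minimal sketch; your tables show up under whatever schema dashDB assigned to your account):

SELECT tabschema, tabname FROM syscat.tables WHERE type = 'T' ORDER BY tabschema, tabname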

My lessons learned from testing database queries on the converted data (JSON to relational) will be part of another blog entry. Stay tuned...

Friday, October 24, 2014

Apply DB2 skills to Hadoop by using Big SQL on Bluemix

IBM Analytics for Hadoop
One of the many services offered on the Platform-as-a-Service (PaaS) IBM Bluemix is "IBM Analytics for Hadoop", basically InfoSphere BigInsights as a cloud service. Because it is a Big Data service and offers SQL capabilities, I was eager to test it. Here is a first report on how I got some queries running against data in my Hadoop file system in the cloud.

Software for download
After creating the Analytics for Hadoop service (which I left unbound) I launched the dashboard. The first thing I noticed was a link to download software. It brought up a long list of drivers, Eclipse plugins, API packages, and so forth. If you want to explore the full power of BigInsights / Hadoop, you don't need to search for all the relevant and compatible software; it is just a click away. I passed on that offer and instead created new directories in HDFS. I read in the BigInsights tutorials that the directories are needed to hold my data and to create workbooks, so-called BigSheets, to transform the data from files into something processable.

Creating the directories is done by clicking on the appropriate icon on top of the directory tree structure, picking the right parent directory, and specifying a name. Been there, done that. I then used another icon to invoke the GUI for file upload. A few minutes later I had uploaded two files with historic weather data (see my older blog entries) and was in the BigSheets section of the dashboard, ready to create a workbook. I named it "weather" and also chose a file. Because the input file is not a true CSV file but uses a semicolon to separate values, I had to apply the "Character Delimited Data with Text Qualifier" reader to it. I specified the semicolon as delimiter and also checked "header included", i.e., the first row contains the column names.

New BigInsights / Hadoop directory and uploaded files
After creating the workbook I clicked on the option "Create Table" and then experienced a "lesson learned":



So I created another directory, moved the files in there, deleted the workbook and wanted to create it again. However, I got an error message. Another lesson learned:



Ok, purge the deleted workbook, then recreate it. Done. I had read that I can share the data of a workbook for use by Big SQL and Hive by creating a table based on the workbook. In my test I created "sheets.weather" as a first table. I also created a second workbook based on my first, removed some columns, and created a second table "sheets.myweather" (not too creative in naming stuff...). Anyway, the main purpose of this exercise was to import data and to get ready for SQL queries.

Next in my journey was to actually use what is called Big SQL, running SQL queries against data in my Hadoop file system. The Analytics for Hadoop service on Bluemix offers a basic SQL web console for this. As an alternative you can also issue queries from applications via JDBC (see software download above) or use other tools. As I was short on time and didn't want to install any software, I decided to use the provided SQL console. When executing SQL statements you have the choice of "Big SQL" and "Big SQL V1". I picked the new "Big SQL" because it is based on the DB2 query compiler and runtime infrastructure. That comes in handy when you are unsure about available tables, column names, etc., because the familiar DB2 system catalog is available. As you can see from the first screenshot below, I started by querying the Big SQL (DB2) system catalog to retrieve my available table names. The second screenshot shows a simple SQL query against the uploaded weather data.
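
The queries were along these lines (a sketch rather than the exact statements; the table names match what was created in BigSheets above):

-- list the tables created from the BigSheets workbooks
SELECT tabschema, tabname FROM syscat.tables WHERE tabschema = 'SHEETS'

-- a first look at the uploaded weather data
SELECT * FROM sheets.myweather FETCH FIRST 10 ROWS ONLY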

Overall, given the short amount of time I had, it was a nice experience. Being able to apply DB2 skills to data in a Hadoop file system lowered the barrier to entry. Given that the data is already uploaded, I am sure I will try out other features of InfoSphere BigInsights / IBM Analytics for Hadoop, too. Have you tested it or Bluemix in general? You can sign up for a free trial (no credit card required) at http://bluemix.net.
DB2 catalog tables for Hadoop

Basic SQL query against Hadoop data


Monday, October 13, 2014

Node-RED: Simple "phoney" JSON entries in Cloudant

Node-RED Starter on IBM Bluemix
Want to automatically store data about who called you on which phone number? Last Friday I tried exactly that on IBM Bluemix and was amazed at how simple it is: no real programming involved. All I needed was the Node-RED starter boilerplate (icon is on the right) on Bluemix and a new API from my mobile service provider.

The Node-RED boilerplate automatically creates a Node.js runtime environment on Bluemix and installs the Node-RED tool into it. In addition, a Cloudant JSON database is created. Once everything was deployed I opened the Node-RED tool in a Web browser. It offers a basic set of input and output methods, processing nodes, and the ability to connect them in a flow graph. One of the input nodes is a listener for http requests, which makes it possible to react to Web service calls. I placed such an http input node on the worksheet and labeled it "phone" (see screenshot).
Node-RED tool on Bluemix

How did I obtain information about callers and the numbers they called? Well, I tapped my friends at a security agency. Actually, I used a new free service offering from one of my phone service providers. It is called "sipgate.io" and allows you to configure a URL/Web address that is accessed whenever someone calls one of the account's phone numbers (an account can have multiple numbers). An http POST request is sent to the configured URL with the caller's phone number and the called number included as payload. In my Node-RED application the "phone" node answers this request.

What I needed now was the data processing flow of the Web services request. On Friday I already tweeted the entire flow:



On the left we see the "phone" node as http input. Connected to it is the "ok" node, which sends an http response back, telling the phone company's Web service that we received the information. The other connected node is a "json" processor, which translates the payload (who called which number) into a meaningful JSON object. That object is then passed on to the "calls" node, a Cloudant output node. All that was needed there was to select the Cloudant service on Bluemix and to configure the database name.
Cloudant Output Node, Node-RED on Bluemix
The magic itself was pressing the "Deploy" button on top of my Node-RED worksheet. It created the Node.js code for my Bluemix app. I was a little bit nervous about testing the app because I hadn't coded anything, just clicked. For my test I took the phone in my home office and dialed my mobile phone number. The result? A nice and shiny new "phoney" JSON document in Cloudant:
"Phoney" record in Cloudant
Almost too easy. Great stuff, but unfortunately no code to share... :)
