Showing posts with label database.

Wednesday, March 12, 2025

Db2 for your data lakehouse

Db2 as engine and data source for the data lakehouse
A while ago, I was working with IBM watsonx.data to prepare a presentation on data lakehouse solutions. When talking about the (query) engines for a data lakehouse, the focus is typically on Presto and Spark. But did you know that Db2 can be used both as data source AND as query engine in watsonx.data (see screenshot)? Let's take a look...

Monday, February 24, 2025

Db2 catalog: Updated documentation links to database metadata

New Db2 metadata created
In 2017 I wrote the post "Db2 Catalog - A guided tour", a title which I also use for some of my talks at user group meetings and conferences. It contains some now outdated links to the Db2 documentation. Time for an update, so that I can find that important information more quickly again. And maybe you can, too...
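To give you a taste, here is a minimal sketch of a catalog query against the standard SYSCAT views; the filter on system schemas is just one sensible choice.

-- list user tables and their columns, newest tables first
SELECT t.tabschema, t.tabname, c.colname, c.typename
FROM syscat.tables t JOIN syscat.columns c
     ON c.tabschema = t.tabschema AND c.tabname = t.tabname
WHERE t.type = 'T' AND t.tabschema NOT LIKE 'SYS%'
ORDER BY t.create_time DESC, c.colno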

Friday, January 17, 2025

Once again: Db2 External Tables

Db2 log files and data on IBM COS
Today, I once again share some notes about Db2 External Tables. Recently, I set up Db2 12.1 Community Edition and had some time to "play" with external tables, multi-tenancy, and catalog tables. As this blog serves as my external memory :), here is what I need to remember...
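For context, a minimal sketch of what such a definition can look like — the table columns, endpoint, bucket, and credentials below are made-up placeholders:

-- external table over a CSV file in an S3-compatible bucket
CREATE EXTERNAL TABLE ext_sales(id INT, amount DECIMAL(10,2))
USING (DATAOBJECT 'sales.csv'
       S3('s3.eu-de.cloud-object-storage.appdomain.cloud',
          '<access_key>', '<secret_key>', '<bucket>')
       FORMAT TEXT DELIMITER ',')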

Friday, August 9, 2024

Revisited: Db2 backup to S3 / Object Storage

Db2 backup in an S3 bucket
One of the popular posts I have written is on Db2 and S3-compatible Cloud Object Storage, a follow-up to an older article on that same topic. Because things change and I recently set up my new test environment, I just tested backing up a Db2 database to an S3-compatible bucket in my IBM Cloud storage service. Here are my notes.
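The overall flow: catalog a storage access alias for the bucket once, then reference it in the BACKUP command via the DB2REMOTE:// scheme. A sketch with placeholders — the endpoint, keys, and names are assumptions for illustration:

-- register the S3-compatible endpoint and bucket (CLP)
CATALOG STORAGE ACCESS ALIAS mycos VENDOR S3
    SERVER s3.eu-de.cloud-object-storage.appdomain.cloud
    USER <access_key> PASSWORD <secret_key>
    CONTAINER <bucket>

-- back up the database into that bucket
BACKUP DATABASE mydb TO DB2REMOTE://mycos//backups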

Thursday, May 2, 2024

Registration for IDUG EMEA 2024 Db2 Tech Conference is now open

IDUG, the International Db2 User Group, just announced that registration for its EMEA conference in Valencia, Spain, this Fall is now open. You can find the details about the currently available conference packages on the IDUG EMEA 2024 Db2 Tech Conference page. Right now, an early bird rate as well as special rates for first-time attendees and virtual participants are available. Moreover, you can also register for the Db2 for z/OS and the Db2 for LUW Fundamentals courses.

Tuesday, May 26, 2020

New performance and security feature in Db2: Authentication cache

New security feature in Db2
The recent release of Db2 11.5.3 (Db2 V11.5 Mod Pack 3) includes a small gem that impacts both performance and security: a cache for user ID and password-based authentication. That feature is great when your system uses LDAP / Active Directory for authentication because Db2 can cache lookup results. By default, the feature is not active, and you need to configure it. In this post, I am going to discuss how to enable and configure it.

Monday, May 18, 2020

Some advanced SQL to analyze COVID-19 data

Learn to write SQL
All of us are impacted by COVID-19. Looking at daily case numbers, the basic reproduction number, or mortality rates has become routine. But what is behind those numbers? How are they computed, and how does SQL help? In this post, I want to discuss how some SQL analytics clauses help to dig through the data. When I teach database systems, I always try to show how the material applies to life. Here, SQL may not directly help to cure the disease, but it is essential to compute and understand the statistics.
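To give a flavor, here is a minimal sketch of one such clause: a window function computing a 7-day moving average of new cases. The table and column names are made up for illustration.

SELECT country, report_date, new_cases,
       AVG(new_cases) OVER (PARTITION BY country ORDER BY report_date
                            ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
           AS avg_7day
FROM covid_cases
ORDER BY country, report_date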

Monday, April 20, 2020

All the best combined: Cloud, Db2, Python, Serverless and Security

Add a cloud service ID as Db2 user
Earlier this year, I wrote about how to use an API key or access token to connect to Db2 (on Cloud). Today, I am going to show you how to set up a service ID (technical user) on IBM Cloud, assign it a Db2 user ID (see screenshot) and grant database privileges. Thereafter, I share Python code for connecting to Db2 using the service ID with an API key.

All this helps to reduce the set of privileges held by a user or service and hence increases cloud security.
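The database side of this boils down to ordinary GRANT statements. A minimal sketch, assuming the service ID was mapped to the (made-up) Db2 user ibmid_svc:

-- let the service ID connect and read a single table
GRANT CONNECT ON DATABASE TO USER ibmid_svc;
GRANT SELECT ON mytable TO USER ibmid_svc;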

Monday, March 23, 2020

Obtaining device metadata for FIDO / FIDO2 security devices

Zoo of FIDO2 security keys
Last year, I started my journey to passwordless logins. I bought my first FIDO2 USB security key. Then, I added the key as an alternative to time-based one-time passwords (TOTP) for my online accounts where possible. Over time, I got more FIDO devices and also added passwordless support to an IBM Cloud tutorial on end-to-end security. What I learned in the process is that device metadata sometimes matters. So today I am going to write about what it is and where / how to obtain it.

Monday, January 27, 2020

25th meeting of German Db2 User Group

Celebrating the 25th DeDUG meeting
Last Friday, the 25th meeting of the German Db2 User Group was held, this time hosted by Fiducia & GAD IT AG in Karlsruhe. As usual, it was both a great networking and an informative event. I had the opportunity to catch up with many Db2 users - customers, partners, and IBMers. Moreover, similar to a couple of previous meetings, I had the after-lunch talk, this time featuring Db2 on Cloud security topics.

Tuesday, November 26, 2019

ETL in the cloud using SQL Query and Db2

From COS to Db2 using SQL
The SQL Query service on IBM Cloud lets you process data stored on Cloud Object Storage (COS) by writing SQL queries. So far, results were either shown in the console and / or written back to files on COS. Thus, I was happy to notice a new feature: query results can now be written back to Db2 tables (on cloud). From my tests, both Db2 on Cloud and Db2 Warehouse on Cloud are supported.
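The rough shape of such a statement is sketched below, with the COS source in the service's documented cos:// URI style. The bucket, schema, table, and especially the form of the Db2 target after INTO are placeholders to be checked against the SQL Query documentation.

SELECT customer, SUM(amount) AS total
FROM cos://eu-de/<bucket>/sales.csv STORED AS CSV
GROUP BY customer
INTO <db2-instance-crn>/<schema>.<table>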

Thursday, November 7, 2019

Control your database encryption keys for Db2 on Cloud

Db2 master key managed by IBM Cloud Key Protect
Since Db2 10.5, I have blogged a couple of times about the native database encryption built into Db2. Today, I want to show you how easy it is to take control of the database encryption keys for Db2 on Cloud. All Db2 databases on IBM Cloud are encrypted by default, but with a system encryption key. You can increase data security even further by using your own encryption key (BYOK - bring your own key). Want to know how? Read on...

Thursday, August 1, 2019

Use a Delivery Pipeline to rotate credentials

Job in Delivery Pipeline to rotate keys
In my recent posts, I touched on updating credentials for solutions deployed on IBM Cloud Functions or using Cloud Foundry on IBM Cloud. Today, I am showing you how to rotate API keys and passwords for a containerized solution on the IBM Cloud Kubernetes Service that makes use of a delivery pipeline (DevOps). I am going to use the app discussed in the tutorial on how to apply end-to-end security to a cloud application.

Wednesday, July 17, 2019

Rotating service credentials for IBM Cloud Functions

Keep your service keys secret
If you have followed some of my work, you know that I use IBM Cloud Functions, i.e., a serverless approach, for many projects. The tutorials with a database-driven (Db2-backed) Slackbot and the GitHub traffic analytics are such examples. In this blog post, I want to detail some of the security-related aspects. This includes how to share service credentials (think of a database username and password) with a cloud function and how to rotate the credentials.

Wednesday, April 24, 2019

Updated tutorial: Database-driven chatbot

If you want to build a chatbot that gets its content from a database, there is good news. The existing tutorial “Build a database-driven Slackbot” was just updated to adapt to the latest features of IBM Watson Assistant. First, define a skill that reaches out to a database service like Db2. Thereafter, use the built-in integrations to easily tie the assistant in with Slack or Facebook Messenger, embed the chatbot into your own application, or use the WordPress plugin.

Architecture of database-driven chatbot

Friday, September 14, 2018

Tutorial on how to apply end to end security to a cloud application

Before you head out to the weekend, I wanted to point you to a new cloud security tutorial. If you read this at the beginning of your week: what a great start... ;-)

Did you ever wonder how different security services work together to secure a cloud application? In the new tutorial we use
  • IBM Cloud Activity Tracker to log all security-related events. This includes logging in to the account, provisioning or deleting services, working with encryption keys and more.
  • IBM Cloud Key Protect to manage encryption keys. For the tutorial, we generate a root key for envelope encryption of stored files. You could also import your own root key (bring your own key, BYOK). We use the root key to create encrypted buckets in the IBM Cloud Object Storage service.
  • IBM Cloud Object Storage (COS) service to produce expiring links to individual files. The links can be shared with others and expire after the set amount of time, so that the file cannot be accessed thereafter.
  • IBM Cloud App ID as a wrapper around (enterprise and social) Identity Providers to manage authentication and authorization through a single interface. The App ID service can be directly integrated with Kubernetes Ingress.
  • IBM Cloud Container Registry as a private image registry from which we deploy the application as a container into a Kubernetes cluster (IBM Cloud Kubernetes Service). The container registry includes a Vulnerability Advisor that scans for and assesses container vulnerabilities and then recommends fixes.
Best of all, the code for the security tutorial is shared on GitHub in this repository. If you are in a hurry, it even allows you to deploy the full Node.js in Docker application and its services at the press of a button via a toolchain.

If you have feedback, suggestions, or questions about this post, please reach out to me on Twitter (@data_henrik) or LinkedIn.

Friday, April 27, 2018

Db2: CTE and CONNECT BY - two kinds of recursion

Writing recursive SQL (based on M.C.Escher by Henrik Loeser)
Recently, I gave a talk on SQL Recursion. One of the cool features in Db2 that I (re-)discovered for that presentation is that there are two different forms of syntax for recursive queries. One is the Common Table Expression (CTE), another the CONNECT BY syntax. Let me introduce you to the two and show you a sample query in both styles.

If you had to write recursive queries, e.g., to create a bill of materials or to find dependencies in data, you would typically use a SELECT statement with a common table expression (CTE). CTEs are easily spotted because they start with the keyword WITH. Some form of temporary table is defined, then the initial data (seed) is specified, followed by UNION ALL and the SELECT for the recursive part. A regular fullselect is then applied to the result of that computation.

The CONNECT BY provides another syntax for hierarchical and recursive queries. It was first introduced by Oracle and made its way into Db2 as part of the SQL compatibility features. It lets you write a, more or less, regular SELECT statement followed by a START WITH clause to define the initial data (seed) and a CONNECT BY clause to specify the recursion. There are additional keywords to access data from the prior step, the root ancestor, or the (string) path from the root to a node. It is quite powerful and often shorter than a CTE. The only drawback is that you have to enable it via the DB2_COMPATIBILITY_VECTOR registry variable.

The following two queries return the same result. It is the organizational structure of the departments in the SAMPLE database. The first query uses CONNECT BY and is directly taken out of the Db2 documentation.


SELECT LEVEL, CAST(SPACE((LEVEL - 1) * 4) || '/' || DEPTNAME
       AS VARCHAR(40)) AS DEPTNAME
FROM DEPARTMENT
     START WITH DEPTNO = 'A00'
     CONNECT BY NOCYCLE PRIOR DEPTNO = ADMRDEPT

The second query is my version of the above with a CTE. As you can see, it is longer. Readability depends on what you are used to.

WITH tdep(level, deptname, deptno) as (
    SELECT 1, CAST( DEPTNAME AS VARCHAR(40)) AS DEPTNAME, deptno
    FROM department 
    WHERE DEPTNO = 'A00'
    UNION ALL
    SELECT t.LEVEL+1, CAST(SPACE(t.LEVEL  * 4) || '/' || d.DEPTNAME
       AS VARCHAR(40)) AS DEPTNAME, d.deptno
    FROM DEPARTMENT d, tdep t
    WHERE d.admrdept=t.deptno and d.deptno<>'A00')
SELECT level, deptname
FROM tdep 


If you knew only one style of recursive query, it is time to try out the other. Both queries above work with the SAMPLE database. If you have feedback, suggestions, or questions about this post, please reach out to me on Twitter (@data_henrik) or LinkedIn.

Friday, April 13, 2018

IBM Cloud and Watson security questionnaires

If you are reading my blog regularly, you are aware of the entries on cloud security. Today, I want to point you to some newer reading material at the Cloud Security Alliance. They now have up-to-date self-assessment security questionnaires for the IBM Cloud.

The three available security documents cover IBM Cloud Infrastructure (IaaS), IBM Cloud Platform (PaaS), and the IBM Watson Services with their cognitive / AI capabilities. Using the provided material, you can learn about application security, audits, compliance, how business continuity and operational resilience are handled, procedures for change control, data and data center security, IAM (identity and access management), and much more. The answers include many links to references, documentation, and other valuable information. For that alone, it is worth going over the security questionnaires for IBM Cloud.

If you have feedback, suggestions, or questions about this post, please reach out to me on Twitter (@data_henrik) or LinkedIn.


Tuesday, February 6, 2018

Chatbots: Some tricks with slots in IBM Watson Conversation

As you might remember, I have been using the IBM Watson Conversation service and Db2. My goal was to write a database-driven Slackbot, a Slack app that serves as a chat interface to data stored in Db2. I will write more about that entire Slackbot soon, but today I want to share some chatbot tricks I learned: how to gather input data, perform checks, and clean up the processing environment.

Slots

With my chatbot interface to Db2, I want to both query the database and insert new records. Thus, I need to collect input data of various kinds. The Conversation service has a neat feature named input slots that simplifies that process. Within a dialog node (a logical step within the chat flow), I can specify a list of items the Conversation service should check for. I can tell it in which variable to save that input and what question to ask if that data has not been provided yet. Optional slots, i.e., optional data, can be enabled, too.

Thursday, November 30, 2017

IBM Cloud: Some fun with Python and Cloud Foundry Logging


IBM Cloud: Turn Log Data into Donut
Last month, after receiving user questions, I blogged about how to decipher Cloud Foundry log entries. Today, I want to point you to a small Cloud Foundry Python app I wrote. It helps to better understand Python and Cloud Foundry logging. You can also use it to test the IBM Cloud Log Analysis service, which provides an easy-to-use interface to logs generated by applications running in the IBM Cloud. In the premium plans, external log events can also be fed into the service for consolidated storage and analysis.

As usual, the code for my app is available on GitHub: https://github.com/data-henrik/application-log-analysis/. Once deployed to IBM Cloud, the app can be used to send messages at a chosen log level back to the server. The server-side log level, i.e., the threshold for processed log messages, can also be set. The app produces diagnostic output on "stdout" and "stderr". The two are treated differently by Cloud Foundry. Here is a screenshot of the logging app:
Test app for Cloud Foundry logging
The produced log entries can also be used to try out the IBM Cloud Log Analysis service. Diagnostic logs are automatically forwarded to the Log Search feature of that service. The messages are fed into Elasticsearch and can be analyzed using Kibana. I wrote some search queries (one is shown below) and then built visualizations, like the "Donut" shown above, based on those queries. I will write more about that in a future blog post.
Search Query for Elasticsearch / IBM Cloud Log Analysis

An official tutorial using that app and Log Analysis is available in the IBM Cloud docs.

If you have feedback, suggestions, or questions about this post, please reach out to me on Twitter (@data_henrik) or LinkedIn.