Thursday, May 24, 2018

How to pack serverless Python actions

Serverless access to Db2 and GitHub
For my tutorial on automated data retrieval and analytics, I use IBM Cloud Functions to automatically fetch GitHub traffic statistics once a day. It is implemented as a serverless Python action. Because some Python packages are needed, the question was how to pack and create the action. In this blog, I share some of my experiences.

Friday, April 27, 2018

Db2: CTE and CONNECT BY - two kinds of recursion

Writing recursive SQL (based on M.C.Escher by Henrik Loeser)
Writing recursive SQL
Recently, I gave a talk on SQL Recursion. One of the cool features in Db2 that I (re-)discovered for that presentation is that there are two different forms of syntax for recursive queries. One is the Common Table Expression (CTE), another the CONNECT BY syntax. Let me introduce you to the two and show you a sample query in both styles.

If you had to write recursive queries, e.g., to create a bill of material or to find dependencies in data, you would typically use a SELECT statement with a common table expression (CTE). CTEs are easily spotted by starting with the keyword WITH. Some form of temporary table is defined, then the initial data (seed) is specified followed by UNION ALL and the SELECT for the recursive part. A regular fullselect is then applied to the result of that computation.

The CONNECT BY provides another syntax for hiearachical and recursive queries. It was first introduced by Oracle and made its way into Db2 as part of the SQL compatibility features. It allows to write a, more or less, regular SELECT statement followed by a START WITH clause to define the initial data (see) and a CONNECT BY clause to specify the recursion. There are some additional keywords to access data from the prior step, the root ancestor or the (string) path from the root to a node. It is quite powerful and often shorter than a CTE. The only drawback is that you have to enable it in the Db2 compatibility vector.

The following two queries return the same result. It is the organizational structure of the departments in the SAMPLE database. The first query uses CONNECT BY and is directly taken out of the Db2 documentation.


SELECT LEVEL, CAST(SPACE((LEVEL - 1) * 4) || '/' || DEPTNAME
       AS VARCHAR(40)) AS DEPTNAME
FROM DEPARTMENT
     START WITH DEPTNO = 'A00'
     CONNECT BY NOCYCLE PRIOR DEPTNO = ADMRDEPT

The second query is my version of the above with a CTE. As you can see, it is longer. Readability depends on what you are used to.

WITH tdep(level, deptname, deptno) as (
    SELECT 1, CAST( DEPTNAME AS VARCHAR(40)) AS DEPTNAME, deptno
    FROM department 
    WHERE DEPTNO = 'A00'
    UNION ALL
    SELECT t.LEVEL+1, CAST(SPACE(t.LEVEL  * 4) || '/' || d.DEPTNAME
       AS VARCHAR(40)) AS DEPTNAME, d.deptno
    FROM DEPARTMENT d, tdep t
    WHERE d.admrdept=t.deptno and d.deptno<>'A00')
SELECT level, deptname
FROM tdep 


If you only knew one style of recursive query then it is time to try out the other. Both queries above work with the SAMPLE database. If you have feedback, suggestions, or questions about this post, please reach out to me on Twitter (@data_henrik) or LinkedIn.

Tuesday, April 24, 2018

Automated, regular database jobs with IBM Cloud Functions (and Db2)

IBM Cloud Functions and Db2
Yesterday, I blogged about the latest tutorial I wrote. The tutorial discusses how to combine serverless and Cloud Foundry for data retrieval and analytics. That scenario came up when I looked into regularly downloading GitHub traffic statistics for improved usage insights. What I needed was a mechanism to execute small Python script on a daily or weekly basis. After looking into some possible solutions, IBM Cloud Functions was the clear winner. In this blog, I am going to discuss how simple it is to implement some regular, automated activities, such as maintenance jobs for a cloud database.

LinkWithin

Related Posts with Thumbnails