Monday, July 26, 2021

How to connect from Python to Db2


One of the recurring questions I have seen is "how to connect from Python to Db2". I have blogged about Python and Db2 a couple times before. Recently, the question has been popping up again more frequently. From my view, the increased security with mandatory SSL/TLS encryption and the use of Python-based Jupyter notebooks for data science and AI / ML projects are the drivers behind that increase. Moreover, there is Db2 on-premises and the Db2 on Cloud / Db2 Warehouse on Cloud plus container-based offerings. Today, I am trying to sort this out and answer that question. Again.... :)

Db2 and Python drivers

Db2 does not offer one, but four drivers (clients) for Python. The Db2 documentation page "Python, SQLAlchemy, and Django Framework application development for IBM Database servers" provides a good overview about the four drivers:

Note that there are additional Python database interfaces which make use of existing JDBC or ODBC drivers which can be used to connect to Db2. You can use SQLAlchemy (ibm_db_sa) with the popular Flask framework. To use Db2 with pandas utilize ibm_db_dbi. All of the above Db2 drivers are available on GitHub and are based on the CLI (Call Level Interface / ODBC).

Db2 connections

Typical connection information is made up of the Db2 server (hostname), the port, the database name and username / password information. If nothing else is specified, most drivers assume that the connection is not encrypted. Thus, to connect over an encrypted connection more parameters are needed. They depend on the Db2 version, the type of Db2 product and some more. Let's start easy.

Newer Db2 versions simplified the use of SSL/TLS because certificates are now part of the package. A typical connection string would then look like this:

conn_str='database=MYDB;hostname=db2host.example.com;port=50001;protocol=tcpip;\
uid=db2inst1;pwd=secret;security=SSL'
ibm_db_conn = ibm_db.connect(conn_str,'','')

An important parameter is "security=SSL" to tell the driver to use encryption for the data in transit. In a blog last year I looked into how to connect to Db2 on Cloud with an IBM Cloud IAM API key instead of username / password. It uses a (different) so-called security plugin and hence needs extra parameters. See the blog post for detailed sample code.

If your organisation issues their own certificates or if you want to enhance security by, e.g, hostname validation then additional parameters are required. As I mentioned, the Python drivers are based on CLI / ODBC. Thus, take a look at the (many) CLI configuration keywords and how to specify them.

UPDATE 2022-05-12:

When using Db2 with SQL Alchemy, I pass an URI similar to 

ibm_db_sa://user:password@hostname:port/database?Security=SSL 

to get the connection established.

Conclusions

Network communication and security is evolving, so are Db2 and its clients. Thus, sometimes there are new options or parameters added which later are simplified. In most cases, a connection string like the one shown above okay. Just adding "security=SSL;" and changing the port number usually is enough to move from unencrypted to secured network traffic.

If you have feedback, suggestions, or questions about this post, please reach out to me on Twitter (@data_henrik) or LinkedIn.