Thursday, May 24, 2018

How to pack serverless Python actions

Serverless access to Db2 and GitHub
For my tutorial on automated data retrieval and analytics, I use IBM Cloud Functions to automatically fetch GitHub traffic statistics once a day. It is implemented as a serverless Python action. Because some Python packages are needed, the question was how to package and create the action. In this blog post, I share some of my experiences.

Overview

IBM Cloud Functions offers three options to provide the code when creating a Python action.
  1. Provide a single file with the code.
  2. Provide a zip archive that has the source code and required modules.
  3. Provide a zip archive with the virtualenv of the code and the environment with installed modules.
An entirely different approach would be to build a Docker image with the Python code.

Option 1 typically works for smaller functions or projects that only use the built-in libraries. Depending on the chosen base environment, a long and useful list of Python libraries is already provided. Option 2 seems doable when only a few libraries are missing, and option 3 would be the last resort.
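
To make option 1 concrete, here is a minimal sketch of a single-file Python action. It relies only on the convention that the runtime calls a function named main with a dictionary of parameters and expects a dictionary back. The "name" parameter and the greeting are hypothetical and only serve as illustration; this is not the code from my tutorial.

```python
def main(params):
    """Entry point called by the Cloud Functions runtime with the action parameters."""
    # "name" is a hypothetical parameter, used here only for illustration.
    name = params.get("name", "world")
    # An action returns a JSON-serializable dictionary as its result.
    return {"greeting": "Hello, " + name + "!"}
```

Because the file uses nothing beyond the built-in libraries, it can be deployed as is, without any packaging step.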

Packaging Python actions

Naturally, I started by looking into the easiest option first. The newer Jessie-based Python 3 environment features the IBM Db2 driver, support for Cloud Object Storage, Cloudant, Watson services, data science and analytics, and cryptography. However, it was missing the module I planned to use for simplified access to the GitHub API.

Thus, I started to compare creating a zip archive with either the required files or the virtual environment. The virtualenv is easy to create and to automate (with some experience), but I wanted to optimize the archive size. With only a few files needed for my function, I downloaded the required library files and put them in a directory with my own code. Thereafter, I created the zip archive and was able to deploy the Python action.
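
The packing step described above could be scripted roughly as follows. This is a sketch under assumptions, not the exact procedure I used: build_action_zip and its arguments are hypothetical names, and the directory is assumed to already contain the action code plus the downloaded library files. The one hard requirement it checks is that zipped Python actions need their entry file to be named __main__.py at the top level of the archive.

```python
import os
import zipfile


def build_action_zip(source_dir, archive_path):
    """Zip everything under source_dir, with paths stored relative to it."""
    # For zipped Python actions, the entry file has to be named __main__.py.
    if not os.path.isfile(os.path.join(source_dir, "__main__.py")):
        raise FileNotFoundError("source_dir must contain __main__.py")

    archive_abs = os.path.abspath(archive_path)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, dirs, files in os.walk(source_dir):
            # Skip bytecode caches to keep the archive small.
            dirs[:] = [d for d in dirs if d != "__pycache__"]
            for filename in files:
                full_path = os.path.join(root, filename)
                # Do not add the archive to itself if it lives in source_dir.
                if os.path.abspath(full_path) == archive_abs:
                    continue
                # Store the file relative to source_dir so that __main__.py
                # ends up at the top level of the archive.
                archive.write(full_path, os.path.relpath(full_path, source_dir))
    return archive_path
```

Calling build_action_zip("action", "action.zip") would produce an archive that can then be passed to the CLI when creating the action. Keeping only the files the function really imports is what keeps the archive small compared to a full virtualenv.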

Conclusions

If you write cloud functions that interact with IBM Cloud services, often no extra libraries are needed. They are available as part of the latest runtime environments. If a function depends on additional modules, downloading and packing them into a zip archive is usually straightforward and a good choice. For more complex dependencies, using a virtual environment provides the necessary support to succeed with Python actions.

If you have feedback, suggestions, or questions about this post, please reach out to me on Twitter (@data_henrik) or LinkedIn.