How to Build an ETL App for HCL Domino Data in Python with CData



Create ETL applications and real-time data pipelines for HCL Domino data in Python with petl.

The rich ecosystem of Python modules lets you get to work quickly and integrate your systems more effectively. With the CData Python Connector for HCL Domino and the petl framework, you can build HCL Domino-connected applications and pipelines for extracting, transforming, and loading HCL Domino data. This article shows how to connect to HCL Domino with the CData Python Connector and use petl and pandas to extract, transform, and load HCL Domino data.

With built-in, optimized data processing, the CData Python Connector offers unmatched performance for interacting with live HCL Domino data in Python. When you issue complex SQL queries against HCL Domino, the driver pushes supported SQL operations, such as filters and aggregations, directly to HCL Domino and uses its embedded SQL engine to process unsupported operations (often SQL functions and JOIN operations) client-side.

Connecting to HCL Domino Data

Connecting to HCL Domino data looks just like connecting to any relational data source. Create a connection string using the required connection properties. For this article, you will pass the connection string as a parameter to the connector's connect function.

Connecting to Domino

To connect to Domino data, set the following properties:

  • URL: The host name or IP of the server hosting the Domino database. Include the port of the server hosting the Domino database. For example: http://sampleserver:1234/
  • DatabaseScope: The name of a scope in the Domino Web UI. The driver exposes forms and views for the schema governed by the specified scope. In the Domino Admin UI, select the Scopes menu in the sidebar. Set this property to the name of an existing scope.

Authenticating with Domino

Domino supports authenticating via login credentials or an Azure Active Directory OAuth application:

Login Credentials

To authenticate with login credentials, set the following properties:

  • AuthScheme: Set this to "OAuthPassword"
  • User: The username of the authenticating Domino user
  • Password: The password associated with the authenticating Domino user

The driver uses the login credentials to automatically perform an OAuth token exchange.
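As a minimal sketch of how the properties above come together, the snippet below assembles a semicolon-delimited connection string from a dictionary. The helper name build_connection_string and all values are placeholders invented for illustration; only the property names come from the lists above. It assumes no value contains a semicolon.

```python
def build_connection_string(props):
    """Join property/value pairs into 'Key=Value;' form (hypothetical helper)."""
    return "".join(f"{key}={value};" for key, value in props.items())


# Placeholder values; the property names come from the lists above.
login_props = {
    "URL": "http://sampleserver:1234/",
    "DatabaseScope": "my_scope",
    "AuthScheme": "OAuthPassword",
    "User": "my_domino_user",
    "Password": "my_domino_password",
}

conn_str = build_connection_string(login_props)
print(conn_str)
```

Keeping the properties in a dictionary makes it easier to load secrets such as Password from the environment instead of hard-coding them.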

AzureAD

This authentication method uses Azure Active Directory as an IdP to obtain a JWT token. You need to create a custom OAuth application in Azure Active Directory and configure it as an IdP. To do so, follow the instructions in the Help documentation. Then set the following properties:

  • AuthScheme: Set this to "AzureAD"
  • InitiateOAuth: Set this to GETANDREFRESH. You can use InitiateOAuth to avoid repeating the OAuth exchange and manually setting the OAuthAccessToken.
  • OAuthClientId: The Client ID obtained when setting up the custom OAuth application.
  • OAuthClientSecret: The Client secret obtained when setting up the custom OAuth application.
  • CallbackURL: The redirect URI defined when you registered your app. For example: https://localhost:33333
  • AzureTenant: The Microsoft Online tenant being used to access data. Supply either a value in the form companyname.microsoft.com or the tenant ID.

    The tenant ID is the same as the directory ID shown in the Azure Portal's Azure Active Directory > Properties page.
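Because the AzureAD scheme needs several properties at once, it can help to check that none is missing before attempting a connection. The following is only a sketch: the missing_properties helper and the sample values are hypothetical, and the required names are simply those listed above.

```python
# Property names listed above for the AzureAD scheme.
REQUIRED_AZURE_AD = (
    "AuthScheme",
    "InitiateOAuth",
    "OAuthClientId",
    "OAuthClientSecret",
    "CallbackURL",
    "AzureTenant",
)


def missing_properties(props, required=REQUIRED_AZURE_AD):
    """Return the required property names that are absent or empty."""
    return [name for name in required if not props.get(name)]


# Placeholder values for illustration only.
azure_props = {
    "AuthScheme": "AzureAD",
    "InitiateOAuth": "GETANDREFRESH",
    "OAuthClientId": "my_client_id",
    "OAuthClientSecret": "my_client_secret",
    "CallbackURL": "https://localhost:33333",
    "AzureTenant": "",  # left empty on purpose to show the check
}

print(missing_properties(azure_props))
```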

After installing the CData HCL Domino Connector, follow the procedure below to install the other required modules and start accessing HCL Domino through Python objects.

Install Required Modules

Use the pip utility to install the required modules and frameworks:

pip install petl
pip install pandas

Build an ETL App for HCL Domino Data in Python

Once the required modules and frameworks are installed, we are ready to build our ETL app. Code snippets follow, but the full source code is available at the end of the article.

First, be sure to import the modules (including the CData Connector) with the following:

import petl as etl
import pandas as pd
import cdata.domino as mod

You can now connect with a connection string. Use the connect function for the CData HCL Domino Connector to create a connection for working with HCL Domino data.

cnxn = mod.connect("Server=https://domino.corp.com;AuthScheme=OAuthPassword;User=my_domino_user;Password=my_domino_password;")

Create a SQL Statement to Query HCL Domino

Use SQL to create a statement for querying HCL Domino. In this article, we read data from the ByName entity.

sql = "SELECT Name, Address FROM ByName WHERE City = 'Miami'"
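If the filter value comes from user input, binding it as a parameter is generally safer than writing it into the SQL text. The sketch below illustrates the idea with Python's built-in sqlite3 module standing in for the Domino connection, so it runs without a server. The sample rows are invented, and whether the CData connector accepts the same "?" placeholder style is an assumption to verify in its documentation.

```python
import sqlite3

# sqlite3 stands in for the Domino connection so the sketch runs anywhere.
demo_cnxn = sqlite3.connect(":memory:")
demo_cnxn.execute("CREATE TABLE ByName (Name TEXT, Address TEXT, City TEXT)")
demo_cnxn.executemany(
    "INSERT INTO ByName VALUES (?, ?, ?)",
    [
        ("Ana", "7 Palm Ave", "Miami"),
        ("Ben", "9 Oak St", "Boston"),
        ("Cy", "3 Bay Rd", "Miami"),
    ],
)

# The city is bound as a parameter rather than written into the SQL text.
param_sql = "SELECT Name, Address FROM ByName WHERE City = ?"
miami_rows = demo_cnxn.execute(param_sql, ("Miami",)).fetchall()
print(miami_rows)
```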

Extract, Transform, and Load the HCL Domino Data

With the connection and the query in place, we can use petl to extract, transform, and load the HCL Domino data. In this example, we extract HCL Domino data, sort the data by the Address column, and load the data into a CSV file.

Loading HCL Domino Data into a CSV File

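# Extract: run the query through the open connection
table1 = etl.fromdb(cnxn, sql)

# Transform: sort the rows by the Address column
table2 = etl.sort(table1, 'Address')

# Load: write the sorted rows to a CSV file
etl.tocsv(table2, 'byname_data.csv')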
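To try the same extract, sort, and load steps without a Domino server or petl installed, the sketch below uses only the Python standard library, with sqlite3 standing in for the connection. It is not how petl works internally. It only mirrors the three steps, and the sample rows are invented.

```python
import csv
import io
import sqlite3

# Stand-in data source with the same columns the article queries.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE ByName (Name TEXT, Address TEXT, City TEXT)")
demo.executemany(
    "INSERT INTO ByName VALUES (?, ?, ?)",
    [
        ("Ana", "7 Palm Ave", "Miami"),
        ("Ben", "9 Oak St", "Boston"),
        ("Cy", "3 Bay Rd", "Miami"),
    ],
)

# Extract: run the query and keep the column names as a header row.
cur = demo.execute("SELECT Name, Address FROM ByName WHERE City = 'Miami'")
header = [col[0] for col in cur.description]
rows = cur.fetchall()

# Transform: sort the rows by the Address column (plain string order).
addr = header.index("Address")
rows.sort(key=lambda row: row[addr])

# Load: write CSV text; an in-memory buffer is used here instead of a file.
buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
csv_text = buf.getvalue()
print(csv_text)
```

Swapping the buffer for open('byname_data.csv', 'w', newline='') would write the same content to disk.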

With the CData Python Connector for HCL Domino, you can work with HCL Domino data just like you would with any database, including direct access to data in ETL packages like petl.

Free Trial & More Information

Download a free, 30-day trial of the CData Python Connector for HCL Domino to start building Python apps and scripts with connectivity to HCL Domino data. Reach out to our Support Team if you have any questions.



Full Source Code


import petl as etl
import pandas as pd
import cdata.domino as mod

cnxn = mod.connect("Server=https://domino.corp.com;AuthScheme=OAuthPassword;User=my_domino_user;Password=my_domino_password;")

sql = "SELECT Name, Address FROM ByName WHERE City = 'Miami'"

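# Extract: run the query through the open connection
table1 = etl.fromdb(cnxn, sql)

# Transform: sort the rows by the Address column
table2 = etl.sort(table1, 'Address')

# Load: write the sorted rows to a CSV file
etl.tocsv(table2, 'byname_data.csv')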
