Load Outlook Data to a Database Using Embulk

Jerod Johnson
Jerod Johnson
Director, Technology Evangelism
Use CData JDBC drivers with the open source ETL/ELT tool Embulk to load Outlook data to a database.

Embulk is an open source bulk data loader. When paired with the CData API Driver for JDBC, Embulk easily loads data from Outlook to any supported destination. In this article, we explain how to use the CData API Driver for JDBC in Embulk to load Outlook data to a MySQL dtabase.

With built-in optimized data processing, the CData JDBC Driver offers unmatched performance for interacting with live Outlook data. When you issue complex SQL queries to Outlook, the driver pushes supported SQL operations, like filters and aggregations, directly to Outlook and utilizes the embedded SQL engine to process unsupported operations client-side (often SQL functions and JOIN operations).

Configure a JDBC Connection to Outlook Data

Before creating a bulk load job in Embulk, note the installation location for the JAR file for the JDBC Driver (typically C:\Program Files\CData\CData API Driver for JDBC\lib).

Embulk supports JDBC connectivity, so you can easily connect to Outlook and execute SQL queries. Before creating a bulk load job, create a JDBC URL for authenticating with Outlook.

Using OAuth Authentication

Microsoft Graph API uses OAuth 2.0 for authentication. You must register an application in the Microsoft Azure Portal to obtain OAuth credentials (Client ID and Client Secret).

Obtaining OAuth Credentials

  1. Log in to the Azure Portal.
  2. Navigate to Azure Active Directory > App registrations.
  3. Click New registration to create a new application.
  4. Enter an application name and select the appropriate account types.
  5. Set the Redirect URI to your application's callback URL (e.g., http://localhost:33333 for desktop apps).
  6. Click Register to create the application.
  7. On the application overview page, copy the Application (client) ID - this is your OAuthClientId.
  8. Navigate to Certificates & secrets and create a new client secret.
  9. Copy the client secret value - this is your OAuthClientSecret.
  10. Navigate to API permissions and add the required Microsoft Graph API permissions:
    • Mail.Read - For accessing email messages
    • Contacts.Read - For accessing contacts
    • Calendars.Read - For accessing calendar events
    • Tasks.Read - For accessing To Do tasks
    • offline_access - For obtaining refresh tokens
  11. Click Grant admin consent to grant these permissions.

Connecting with OAuth

After setting the following connection properties, you are ready to connect:

  • AuthScheme: Set this to OAuth.
  • InitiateOAuth: Set this to GETANDREFRESH. The CData API Profile for Outlook will automatically walk through the OAuth process in order to obtain the access token.
  • OAuthClientId: Set this to the Application (client) ID from Azure Portal.
  • OAuthClientSecret: Set this to the client secret value from Azure Portal.
  • TenantId: Set this to your Azure AD tenant identifier (GUID or domain name like 'contoso.onmicrosoft.com').
  • CallbackURL: Set this to the Redirect URI you specified in your app registration (e.g., http://localhost:33333 for desktop apps).

Example connection string

Profile=C:\profiles\Outlook.apip;AuthScheme=OAuth;InitiateOAuth=GETANDREFRESH;OAuthClientId=your_client_id;OAuthClientSecret=your_client_secret;TenantId=your_tenant_id;CallbackUrl=http://localhost:33333;

Built-in Connection String Designer

For assistance in constructing the JDBC URL, use the connection string designer built into the Outlook JDBC Driver. Either double-click the JAR file or execute the jar file from the command-line.

java -jar cdata.jdbc.api.jar

Fill in the connection properties and copy the connection string to the clipboard.

Below is a typical JDBC connection string for Outlook:

jdbc:api:Profile=C:\profiles\Outlook.apip;AuthScheme=OAuth;InitiateOAuth=GETANDREFRESH;OAuthClientId=your_client_id;OAuthClientSecret=your_client_secret;TenantId=your_tenant_id;CallbackUrl=http://localhost:33333;

Load Outlook Data in Embulk

After installing the CData JDBC Driver and creating a JDBC connection string, install the required Embulk plugins.

Install Embulk Input & Output Plugins

  1. Install the JDBC Input Plugin in Embulk.
    https://github.com/embulk/embulk-input-jdbc/tree/master/embulk-input-jdbc
  2. embulk gem install embulk-input-jdbc
    
  3. In this article, we use MySQL as the destination database. You can also choose SQL Server, PostgreSQL, or Google BigQuery as the destination using the output Plugins.
    https://github.com/embulk/embulk-output-jdbc/tree/master/embulk-output-mysql
    embulk gem install embulk-output-mysql
    

With the input and output plugins installed, we are ready to load Outlook data into MySQL using Embulk.

Create a Job to Load Outlook Data

Start by creating a config file in Embulk, using a name like api-mysql.yml.

  1. For the input plugin options, use the CData API Driver for JDBC, including the path to the driver JAR file, the driver class (e.g. cdata.jdbc.api.APIDriver), and the JDBC URL from above
  2. For the output plugin options, use the values and credentials for the MySQL database

Sample Config File (api-mysql.yml)

in:
	type: jdbc
	driver_path: C:\Program Files\CData[product_name] 20xx\lib\cdata.jdbc.api.jar
	driver_class: cdata.jdbc.api.APIDriver
	url: jdbc:api:Profile=C:\profiles\Outlook.apip;AuthScheme=OAuth;InitiateOAuth=GETANDREFRESH;OAuthClientId=your_client_id;OAuthClientSecret=your_client_secret;TenantId=your_tenant_id;CallbackUrl=http://localhost:33333;
	table: "CalendarGroupCalendars"
out: 
	type: mysql
	host: localhost
	database: DatabaseName
	user: UserId
	password: UserPassword
	table: "CalendarGroupCalendars"
	mode: insert

After creating the file, run the Embulk job.

embulk run api-mysql.yml

After running the the Embulk job, find the Salesforce data in the MySQL table.

Load Filtered Outlook Data

In addition to loading data directly from a table, you can use a custom SQL query to have more granular control of the data loaded. You can also perform increment loads by setting a last updated column in a SQL WHERE clause in the query field.

in:
	type: jdbc
	driver_path: C:\Program Files\CData[product_name] 20xx\lib\cdata.jdbc.api.jar
	driver_class: cdata.jdbc.api.APIDriver
	url: jdbc:api:Profile=C:\profiles\Outlook.apip;AuthScheme=OAuth;InitiateOAuth=GETANDREFRESH;OAuthClientId=your_client_id;OAuthClientSecret=your_client_secret;TenantId=your_tenant_id;CallbackUrl=http://localhost:33333;
	query: "SELECT ,  FROM CalendarGroupCalendars WHERE [RecordId] = 1"
out: 
	type: mysql
	host: localhost
	database: DatabaseName
	user: UserId
	password: UserPassword
	table: "CalendarGroupCalendars"
	mode: insert

More Information & Free Trial

By using CData API Driver for JDBC as a connector, Embulk can integrate Outlook data into your data load jobs. And with drivers for more than 200+ other enterprise sources, you can integrate any enterprise SaaS, big data, or NoSQL source as well. Download a 30-day free trial and get started today.

Ready to get started?

Connect to live data from Outlook with the API Driver

Connect to Outlook