Build IBM Cloud Object Storage-Connected ETL Processes in Google Data Fusion

Upload the CData JDBC Driver for IBM Cloud Object Storage to Google Data Fusion

Upload the CData JDBC Driver for IBM Cloud Object Storage to your Google Data Fusion instance to work with live IBM Cloud Object Storage data. Because Google Data Fusion restricts how JDBC driver files can be named, create a copy of the JAR file (or rename it) so that the file name matches the format driver-version.jar. For example: cdataibmcloudobjectstorage-2020.jar
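
The copy step can also be scripted. The following is a minimal sketch, not part of the driver: the class name DriverJarCopy, its methods, and the assumed location of the original JAR are all hypothetical. It only builds a name in the driver-version.jar format and copies the file under that name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: copies the driver JAR under a driver-version.jar name.
class DriverJarCopy {

    // Builds a file name in the driver-version.jar format.
    static String targetName(String driver, String version) {
        return driver + "-" + version + ".jar";
    }

    // Copies the original JAR into the same directory under the new name
    // and returns the path of the copy.
    static Path copyWithVersion(Path originalJar, String driver, String version)
            throws IOException {
        Path target = originalJar.resolveSibling(targetName(driver, version));
        return Files.copy(originalJar, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Assumed location of the downloaded driver; adjust to your installation.
        Path original = Paths.get("cdata.jdbc.ibmcloudobjectstorage.jar");
        Path copy = copyWithVersion(original, "cdataibmcloudobjectstorage", "2020");
        System.out.println("Created " + copy);
    }
}
```

Keeping the original JAR and uploading the copy leaves the installed driver untouched.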

Open your Google Data Fusion instance

Click the add button to add an entity and upload a driver

On the "Upload driver" tab, drag or browse to the renamed JAR file.

On the "Driver configuration" tab:

Name: Create a name for the driver (cdata.jdbc.ibmcloudobjectstorage) and make note of the name
Class name: Set the JDBC class name (cdata.jdbc.ibmcloudobjectstorage.IBMCloudObjectStorageDriver)
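
Before uploading, it can help to confirm locally that the class name resolves against the JAR. This is a hedged sketch: DriverClassCheck is a hypothetical helper, and it assumes you run it with the renamed JAR on the classpath.

```java
// Hypothetical pre-flight check; run it with the driver JAR on the classpath.
class DriverClassCheck {

    // Returns true if the named class can be loaded from the current classpath.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class is missing or could not be linked or initialized.
            return false;
        }
    }

    public static void main(String[] args) {
        String className = "cdata.jdbc.ibmcloudobjectstorage.IBMCloudObjectStorageDriver";
        System.out.println(className + " loadable: " + isLoadable(className));
    }
}
```

If the check prints false, the class name entered on the "Driver configuration" tab would likely fail in the same way, so compare the spelling against the driver documentation.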

Click "Finish"

Connect to IBM Cloud Object Storage Data in Google Data Fusion

With the JDBC Driver uploaded, you are ready to work with live IBM Cloud Object Storage data in Google Data Fusion Pipelines.

Navigate to the Pipeline Studio to create a new Pipeline

From the "Source" options, click "Database" to add a source for the JDBC Driver

Click "Properties" on the Database source to edit the properties

NOTE: To use the JDBC Driver in Google Data Fusion, you will need a license (full or trial) and a Runtime Key (RTK). For more information on obtaining this license (or a trial), contact our sales team.

Set the Label
Set Reference Name to a value for any future references (e.g., cdata-ibmcloudobjectstorage)
Set Plugin Type to "jdbc"
Set Connection String to the JDBC URL for IBM Cloud Object Storage. For example:

jdbc:ibmcloudobjectstorage:RTK=5246...;ApiKey=myApiKey;CloudObjectStorageCRN=MyInstanceCRN;Region=myRegion;OAuthClientId=MyOAuthClientId;OAuthClientSecret=myOAuthClientSecret;
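
A connection string of this shape can be assembled from individual properties. The sketch below is illustrative only: JdbcUrlSketch and the CDATA_RTK environment variable are hypothetical names, and the values are the placeholders from the example above. It assumes that values contain no semicolons.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that joins connection properties into a JDBC URL.
class JdbcUrlSketch {
    static final String PREFIX = "jdbc:ibmcloudobjectstorage:";

    // Appends each non-empty property as Name=Value; in insertion order.
    static String build(Map<String, String> props) {
        StringBuilder url = new StringBuilder(PREFIX);
        for (Map.Entry<String, String> e : props.entrySet()) {
            String value = e.getValue();
            if (value == null || value.isEmpty()) {
                continue; // skip properties that were not provided
            }
            url.append(e.getKey()).append('=').append(value).append(';');
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        // Assumption: the runtime key is read from a hypothetical environment variable.
        props.put("RTK", System.getenv("CDATA_RTK"));
        props.put("ApiKey", "myApiKey");
        props.put("CloudObjectStorageCRN", "MyInstanceCRN");
        props.put("Region", "myRegion");
        // The resulting URL contains secrets; avoid writing it to shared logs.
        System.out.println(build(props));
    }
}
```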

Register a New Instance of Cloud Object Storage

If you do not already have Cloud Object Storage in your IBM Cloud account, follow the procedure below to create an instance of Cloud Object Storage in your account:
1. Log in to your IBM Cloud account.
2. Navigate to the Cloud Object Storage page, choose a name for your instance, and click Create. You will be redirected to the instance of Cloud Object Storage you just created.

Connecting using OAuth Authentication

There are certain connection properties you need to set before you can connect. You can obtain these as follows:

API Key

To connect with IBM Cloud Object Storage, you need an API Key. You can obtain this as follows:
1. Log in to your IBM Cloud account.
2. Navigate to the Platform API Keys page.
3. In the middle-right corner, click "Create an IBM Cloud API Key" to create a new API Key.
4. In the pop-up window, specify the API Key name and click "Create". Make a note of the API Key, because you cannot retrieve it from the dashboard later.

Cloud Object Storage CRN

If you have multiple accounts, you will need to specify the CloudObjectStorageCRN explicitly. To find the appropriate value, you can:
- Query the Services view. This will list your IBM Cloud Object Storage instances along with the CRN for each.
- Locate the CRN directly in IBM Cloud. To do so, navigate to your IBM Cloud Dashboard. In the Resource List, under Storage, select your Cloud Object Storage resource to get its CRN.

Connecting to Data

You can now set the following to connect to data:
- InitiateOAuth: Set this to GETANDREFRESH. You can use InitiateOAuth to avoid repeating the OAuth exchange and manually setting the OAuthAccessToken.
- ApiKey: Set this to the API key you noted during setup.
- CloudObjectStorageCRN (Optional): Set this to the Cloud Object Storage CRN you want to work with. While the connector attempts to retrieve this automatically, specifying it explicitly is recommended if you have more than one Cloud Object Storage account.
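
As a quick sanity check before pasting a connection string into the pipeline, the properties listed above can be verified programmatically. This is a rough sketch under stated assumptions: ConnectionPropertyCheck is a hypothetical helper, it treats InitiateOAuth and ApiKey as expected and CloudObjectStorageCRN as optional, and it compares property names case-insensitively, which may not match the driver's own rules.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical check for the properties named in this section.
class ConnectionPropertyCheck {
    static final String PREFIX = "jdbc:ibmcloudobjectstorage:";
    // CloudObjectStorageCRN is optional, so it is not listed here.
    static final String[] EXPECTED = {"InitiateOAuth", "ApiKey"};

    // Returns the expected property names that are absent or have no value.
    static List<String> missing(String jdbcUrl) {
        if (!jdbcUrl.startsWith(PREFIX)) {
            throw new IllegalArgumentException("URL must start with " + PREFIX);
        }
        Set<String> present = new HashSet<>();
        for (String pair : jdbcUrl.substring(PREFIX.length()).split(";")) {
            int eq = pair.indexOf('=');
            // Count a property only if it has both a name and a non-empty value.
            if (eq > 0 && eq < pair.length() - 1) {
                present.add(pair.substring(0, eq).trim().toLowerCase(Locale.ROOT));
            }
        }
        List<String> missing = new ArrayList<>();
        for (String name : EXPECTED) {
            if (!present.contains(name.toLowerCase(Locale.ROOT))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String url = PREFIX + "ApiKey=myApiKey;";
        System.out.println("Missing: " + missing(url));
    }
}
```
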
When you connect, the connector completes the OAuth process:
1. Extracts the access token and uses it to authenticate requests.
2. Saves OAuth values in OAuthSettingsLocation so that they persist across connections.

Built-in Connection String Designer

For assistance in constructing the JDBC URL, use the connection string designer built into the IBM Cloud Object Storage JDBC Driver. Either double-click the JAR file or run it from the command line:
java -jar cdata.jdbc.ibmcloudobjectstorage.jar
Fill in the connection properties and copy the connection string to the clipboard.
Set Import Query to a SQL query that will extract the data you want from IBM Cloud Object Storage, for example:
SELECT * FROM Objects
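
To try an import query outside Google Data Fusion first, a plain JDBC client is usually enough. The sketch below uses only the standard java.sql API. The class ImportQueryPreview and the IBMCOS_JDBC_URL environment variable are hypothetical, and it assumes the driver JAR is on the classpath so that DriverManager can locate it. The same helper could be pointed at the Services view mentioned earlier.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical preview tool for checking an import query locally.
class ImportQueryPreview {

    // Runs the query and returns the number of rows read (capped at maxRows),
    // or -1 if the connection or the query fails, for example when the
    // driver JAR is not on the classpath.
    static int countRows(String jdbcUrl, String sql, int maxRows) {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement()) {
            stmt.setMaxRows(maxRows);
            try (ResultSet rs = stmt.executeQuery(sql)) {
                int rows = 0;
                while (rs.next()) {
                    rows++;
                }
                return rows;
            }
        } catch (SQLException e) {
            // Print only the SQLState: the exception message may echo the URL,
            // which contains secrets.
            System.err.println("Query failed (SQLState " + e.getSQLState() + ")");
            return -1;
        }
    }

    public static void main(String[] args) {
        // Assumption: the full JDBC URL is supplied through an environment variable.
        String url = System.getenv("IBMCOS_JDBC_URL");
        if (url == null) {
            System.err.println("Set IBMCOS_JDBC_URL first");
            return;
        }
        System.out.println("Rows read: " + countRows(url, "SELECT * FROM Objects", 10));
    }
}
```

Capping the row count keeps the preview cheap; the pipeline itself runs the full import query.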

From the "Sink" tab, click to add a destination sink (we use Google BigQuery in this example)

Click "Properties" on the BigQuery sink to edit the properties

Set the Label
Set Reference Name to a value like ibmcloudobjectstorage-bigquery
Set Project ID to a specific Google BigQuery Project ID (or leave as the default, "auto-detect")
Set Dataset to a specific Google BigQuery dataset
Set Table to the name of the table you wish to insert IBM Cloud Object Storage data into

With the Source and Sink configured, you are ready to pipe IBM Cloud Object Storage data into Google BigQuery. Save and deploy the pipeline. When you run the pipeline, Google Data Fusion will request live data from IBM Cloud Object Storage and import it into Google BigQuery.

While this is a simple pipeline, you can create more complex IBM Cloud Object Storage pipelines with transforms, analytics, conditions, and more. Download a free, 30-day trial of the CData JDBC Driver for IBM Cloud Object Storage and start working with your live IBM Cloud Object Storage data in Google Data Fusion today.
