Build IBM Cloud Object Storage-Connected ETL Processes in Google Data Fusion



Load the CData JDBC Driver into Google Data Fusion and create ETL processes with access to live IBM Cloud Object Storage data.

Google Data Fusion allows users to perform self-service data integration to consolidate disparate data. Uploading the CData JDBC Driver for IBM Cloud Object Storage enables users to access live IBM Cloud Object Storage data from within their Google Data Fusion pipelines. While the CData JDBC Driver enables piping IBM Cloud Object Storage data to any data source natively supported in Google Data Fusion, this article walks through piping data from IBM Cloud Object Storage to Google BigQuery.

Upload the CData JDBC Driver for IBM Cloud Object Storage to Google Data Fusion

Upload the CData JDBC Driver for IBM Cloud Object Storage to your Google Data Fusion instance to work with live IBM Cloud Object Storage data. Because Google Data Fusion restricts how JDBC driver files may be named, copy or rename the JAR file to match the format driver-version.jar. For example: cdataibmcloudobjectstorage-2020.jar

  1. Open your Google Data Fusion instance
  2. Click the add (+) icon to add an entity and upload a driver
  3. On the "Upload driver" tab, drag or browse to the renamed JAR file.
  4. On the "Driver configuration" tab:
    • Name: Create a name for the driver (cdata.jdbc.ibmcloudobjectstorage) and make note of the name
    • Class name: Set the JDBC class name (cdata.jdbc.ibmcloudobjectstorage.IBMCloudObjectStorageDriver)
  5. Click "Finish"
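The renaming step above can also be scripted. The following is a minimal sketch, assuming only the driver-version.jar format described in this article. The class name DriverJarRenamer, its method names, and the command-line arguments are hypothetical and are not part of the CData driver or Google Data Fusion.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: copies a driver JAR to a "driver-version.jar" file name.
class DriverJarRenamer {

    // Builds the file name in the driver-version.jar format, e.g.
    // targetName("cdataibmcloudobjectstorage", "2020").
    static String targetName(String driverName, String version) {
        if (driverName == null || driverName.isEmpty()
                || version == null || version.isEmpty()) {
            throw new IllegalArgumentException("driver name and version are required");
        }
        return driverName + "-" + version + ".jar";
    }

    // Copies the original JAR next to itself under the new name and
    // returns the path of the copy. The original file is left untouched.
    static Path copyWithVersionedName(Path sourceJar, String driverName, String version)
            throws IOException {
        Path target = sourceJar.resolveSibling(targetName(driverName, version));
        return Files.copy(sourceJar, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.out.println("usage: DriverJarRenamer <path-to-jar> <driver-name> <version>");
            return;
        }
        Path copy = copyWithVersionedName(Paths.get(args[0]), args[1], args[2]);
        System.out.println("Upload this file to Data Fusion: " + copy);
    }
}
```

Copying rather than renaming keeps the original JAR available under its shipped name, which the connection string designer command later in this article expects.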

Connect to IBM Cloud Object Storage Data in Google Data Fusion

With the JDBC Driver uploaded, you are ready to work with live IBM Cloud Object Storage data in Google Data Fusion Pipelines.

  1. Navigate to the Pipeline Studio to create a new Pipeline
  2. From the "Source" options, click "Database" to add a source for the JDBC Driver
  3. Click "Properties" on the Database source to edit the properties

    NOTE: To use the JDBC Driver in Google Data Fusion, you will need a license (full or trial) and a Runtime Key (RTK). For more information on obtaining this license (or a trial), contact our sales team.

    • Set the Label
    • Set Reference Name to a value for any future references (e.g., cdata-ibmcloudobjectstorage)
    • Set Plugin Type to "jdbc"
    • Set Connection String to the JDBC URL for IBM Cloud Object Storage. For example:

      jdbc:ibmcloudobjectstorage:RTK=5246...;ApiKey=myApiKey;CloudObjectStorageCRN=MyInstanceCRN;Region=myRegion;OAuthClientId=MyOAuthClientId;OAuthClientSecret=myOAuthClientSecret;

      Register a New Instance of Cloud Object Storage

      If you do not already have Cloud Object Storage in your IBM Cloud account, follow the procedure below to create an instance of Cloud Object Storage in your account:

      1. Log in to your IBM Cloud account.
      2. Navigate to the Cloud Object Storage page, choose a name for your instance, and click Create. You will be redirected to the instance of Cloud Object Storage you just created.

      Connecting using OAuth Authentication

      There are certain connection properties you need to set before you can connect. You can obtain these as follows:

      API Key

      To connect with IBM Cloud Object Storage, you need an API Key. You can obtain this as follows:

      1. Log in to your IBM Cloud account.
      2. Navigate to the Platform API Keys page.
      3. In the middle-right corner, click "Create an IBM Cloud API Key" to create a new API Key.
      4. In the pop-up window, specify the API Key name and click "Create". Note the API Key now, because you cannot retrieve it from the dashboard later.

      Cloud Object Storage CRN

      If you have multiple accounts, you will need to specify the CloudObjectStorageCRN explicitly. To find the appropriate value, you can:

      • Query the Services view. This will list your IBM Cloud Object Storage instances along with the CRN for each.
      • Locate the CRN directly in IBM Cloud. To do so, navigate to your IBM Cloud Dashboard. In the Resource List, under Storage, select your Cloud Object Storage resource to get its CRN.

      Connecting to Data

      You can now set the following to connect to data:

      • InitiateOAuth: Set this to GETANDREFRESH. You can use InitiateOAuth to avoid repeating the OAuth exchange and manually setting the OAuthAccessToken.
      • ApiKey: Set this to your API key which was noted during setup.
      • CloudObjectStorageCRN (Optional): Set this to the Cloud Object Storage CRN you want to work with. While the connector attempts to retrieve this automatically, specifying it explicitly is recommended if you have more than one Cloud Object Storage account.

      When you connect, the connector completes the OAuth process for you:

      1. Extracts the access token and authenticates requests.
      2. Saves OAuth values in OAuthSettingsLocation to be persisted across connections.

      Built-in Connection String Designer

      For assistance in constructing the JDBC URL, use the connection string designer built into the IBM Cloud Object Storage JDBC Driver. Either double-click the JAR file or execute the JAR file from the command line:

      java -jar cdata.jdbc.ibmcloudobjectstorage.jar

      Fill in the connection properties and copy the connection string to the clipboard.

    • Set Import Query to a SQL query that will extract the data you want from IBM Cloud Object Storage, e.g.:
      SELECT * FROM Objects
  4. From the "Sink" tab, click to add a destination sink (we use Google BigQuery in this example)
  5. Click "Properties" on the BigQuery sink to edit the properties
    • Set the Label
    • Set Reference Name to a value like ibmcloudobjectstorage-bigquery
    • Set Project ID to a specific Google BigQuery Project ID (or leave as the default, "auto-detect")
    • Set Dataset to a specific Google BigQuery dataset
    • Set Table to the name of the table you wish to insert IBM Cloud Object Storage data into

With the Source and Sink configured, you are ready to pipe IBM Cloud Object Storage data into Google BigQuery. Save and deploy the pipeline. When you run the pipeline, Google Data Fusion will request live data from IBM Cloud Object Storage and import it into Google BigQuery.
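Before deploying, it can help to check the connection string and import query outside Google Data Fusion. The sketch below assembles a JDBC URL from the connection properties named in this article and, when run with --connect, previews a few rows of the import query through the standard java.sql API. It assumes the driver JAR is on the classpath and registers itself with DriverManager. The class name CosJdbcUrlSketch, its method names, the --connect flag, and the environment variable names (CDATA_RTK, IBM_COS_API_KEY, IBM_COS_CRN) are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for checking a connection string locally.
class CosJdbcUrlSketch {

    // Joins properties into "jdbc:ibmcloudobjectstorage:Key=Value;..." form.
    // Properties with a null or empty value are skipped, which suits optional
    // settings such as CloudObjectStorageCRN. Values are not escaped, so this
    // sketch assumes they contain no ';' characters.
    static String buildUrl(Map<String, String> props) {
        StringBuilder url = new StringBuilder("jdbc:ibmcloudobjectstorage:");
        for (Map.Entry<String, String> e : props.entrySet()) {
            if (e.getValue() == null || e.getValue().isEmpty()) {
                continue;
            }
            url.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        return url.toString();
    }

    // Runs the query and prints at most `limit` rows; returns the number printed.
    // Requires the driver JAR on the classpath at run time.
    static int previewRows(String url, String query, int limit) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(query)) {
            ResultSetMetaData meta = rs.getMetaData();
            int rows = 0;
            while (rows < limit && rs.next()) {
                StringBuilder line = new StringBuilder();
                for (int i = 1; i <= meta.getColumnCount(); i++) {
                    if (i > 1) {
                        line.append(" | ");
                    }
                    line.append(meta.getColumnLabel(i)).append('=').append(rs.getString(i));
                }
                System.out.println(line);
                rows++;
            }
            return rows;
        }
    }

    public static void main(String[] args) throws SQLException {
        // LinkedHashMap keeps the properties in insertion order.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("RTK", System.getenv("CDATA_RTK"));
        props.put("InitiateOAuth", "GETANDREFRESH");
        props.put("ApiKey", System.getenv("IBM_COS_API_KEY"));
        props.put("CloudObjectStorageCRN", System.getenv("IBM_COS_CRN"));
        String url = buildUrl(props);

        if (args.length > 0 && "--connect".equals(args[0])) {
            int rows = previewRows(url, "SELECT * FROM Objects", 5);
            System.out.println("Previewed " + rows + " row(s)");
        } else {
            // The URL is not printed because it contains credentials.
            System.out.println("URL built; pass --connect to run the import query.");
        }
    }
}
```

If the preview returns the rows you expect, the same URL and query can be pasted into the Connection String and Import Query properties of the Database source.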

While this is a simple pipeline, you can create more complex IBM Cloud Object Storage pipelines with transforms, analytics, conditions, and more. Download a free, 30-day trial of the CData JDBC Driver for IBM Cloud Object Storage and start working with your live IBM Cloud Object Storage data in Google Data Fusion today.