Connect to Lakebase Data in RapidMiner

Jerod Johnson
Senior Technology Evangelist
Integrate Lakebase data with standard components and data source configuration wizards in RapidMiner Studio.

This article shows how to integrate the CData JDBC Driver for Lakebase into your processes in RapidMiner and use it to transfer Lakebase data to a RapidMiner process.

Connect to Lakebase in RapidMiner as a JDBC Data Source

You can follow the procedure below to establish a JDBC connection to Lakebase:

  1. Add a new database driver for Lakebase: Click Connections -> Manage Database Drivers.
  2. In the resulting wizard, click the Add button and enter a name for the driver.
  3. Enter the prefix for the JDBC URL:
    jdbc:lakebase:
    
  4. Enter the path to the cdata.jdbc.lakebase.jar file, located in the lib subfolder of the installation directory.
  5. Enter the driver class:
    cdata.jdbc.lakebase.LakebaseDriver
    
  6. Create a new Lakebase connection: Click Connections -> Manage Database Connections.
  7. Enter a name for your connection.
  8. For Database System, select the Lakebase driver you configured previously.
  9. Enter your connection string in the Host box. To connect to Databricks Lakebase, start by setting the following properties:
    • DatabricksInstance: The Databricks instance or server hostname, provided in the format instance-abcdef12-3456-7890-abcd-abcdef123456.database.cloud.databricks.com.
    • Server: The host name or IP address of the server hosting the Lakebase database.
    • Port (optional): The port of the server hosting the Lakebase database, set to 5432 by default.
    • Database (optional): The database to connect to after authenticating to the Lakebase Server, set to the authenticating user's default database by default.

    OAuth Client Authentication

    To authenticate using OAuth client credentials, you need to configure an OAuth client in your service principal. In short, you need to do the following:

    1. Create and configure a new service principal
    2. Assign permissions to the service principal
    3. Create an OAuth secret for the service principal

    For more information, refer to the Setting Up OAuthClient Authentication section in the Help documentation.

    OAuth PKCE Authentication

    To authenticate using the OAuth code type with PKCE (Proof Key for Code Exchange), set the following properties:

    • AuthScheme: OAuthPKCE.
    • User: The authenticating user's user ID.

    For more information, refer to the Help documentation.

    Built-in Connection String Designer

    For assistance in constructing the JDBC URL, use the connection string designer built into the Lakebase JDBC Driver. Either double-click the JAR file or run it from the command line:

    java -jar cdata.jdbc.lakebase.jar
    

    Fill in the connection properties and copy the connection string to the clipboard.

    A typical connection string is below:

    DatabricksInstance=lakebase;Server=127.0.0.1;Port=5432;Database=my_database;InitiateOAuth=GETANDREFRESH;
    
  10. Enter your username and password if necessary.
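
The URL prefix from step 3 and the connection string from step 9 together form the full JDBC URL. The following is a minimal Java sketch of that assembly, assuming properties are joined as name=value pairs separated by semicolons, as in the sample connection string above. The class name LakebaseJdbcUrl and its build method are hypothetical helpers written for this illustration, not part of the CData driver, and the sketch does not escape values that themselves contain a semicolon.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the CData driver.
final class LakebaseJdbcUrl {
    // URL prefix entered in step 3 of the procedure.
    static final String PREFIX = "jdbc:lakebase:";

    // Appends each property as name=value; in insertion order,
    // keeping the trailing semicolon used by the sample connection string.
    static String build(Map<String, String> properties) {
        StringBuilder url = new StringBuilder(PREFIX);
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            url.append(entry.getKey()).append('=').append(entry.getValue()).append(';');
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Placeholder values copied from the sample connection string.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("DatabricksInstance", "lakebase");
        props.put("Server", "127.0.0.1");
        props.put("Port", "5432");
        props.put("Database", "my_database");
        props.put("InitiateOAuth", "GETANDREFRESH");
        System.out.println(build(props));
    }
}
```

In RapidMiner itself the prefix and the connection string are entered in separate fields, so a helper like this is only useful when the same URL is needed outside the wizard.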

You can now use your Lakebase connection with the various RapidMiner operators in your process. To retrieve Lakebase data, drag the Retrieve operator from the Operators view into the process. With the Retrieve operator selected, click the folder icon next to the "repository entry" parameter in the Parameters view to choose which table to retrieve. In the resulting Repository Browser, expand your connection node and select the desired example set.

Finally, wire the output of the Retrieve operator to a result port, and run the process to see the Lakebase data.
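
If the process returns no data, it can help to check the driver and connection string outside RapidMiner. The following Java sketch is one way to do that, assuming cdata.jdbc.lakebase.jar is on the classpath and that the driver exposes tables through the standard JDBC metadata interface. The class LakebaseSmokeCheck and its methods are hypothetical names for this illustration, and the default URL reuses the placeholder values from the sample connection string.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the CData driver.
final class LakebaseSmokeCheck {
    // Returns true if the named class can be loaded from the classpath.
    static boolean driverAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // Lists table names reported by standard JDBC metadata.
    // Passing null for the type filter returns every table type the driver reports.
    static List<String> tableNames(Connection conn) throws SQLException {
        List<String> names = new ArrayList<>();
        try (ResultSet rs = conn.getMetaData().getTables(null, null, "%", null)) {
            while (rs.next()) {
                names.add(rs.getString("TABLE_NAME"));
            }
        }
        return names;
    }

    public static void main(String[] args) throws SQLException {
        // Driver class entered in step 5 of the procedure.
        String driver = "cdata.jdbc.lakebase.LakebaseDriver";
        if (!driverAvailable(driver)) {
            System.err.println("Driver class not found; add cdata.jdbc.lakebase.jar to the classpath.");
            return;
        }
        // Placeholder URL; pass your own as the first argument.
        String url = args.length > 0 ? args[0]
            : "jdbc:lakebase:DatabricksInstance=lakebase;Server=127.0.0.1;Port=5432;Database=my_database;InitiateOAuth=GETANDREFRESH;";
        try (Connection conn = DriverManager.getConnection(url)) {
            tableNames(conn).forEach(System.out::println);
        }
    }
}
```

The table names printed here should correspond to the entries that appear under the connection node in the Repository Browser, although RapidMiner may present them differently.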
