Replicate Multiple Lakebase Accounts via the CData Sync CLI

Jerod Johnson
Director, Technology Evangelism
Replicate multiple Lakebase accounts to one or many databases.

CData Sync for Lakebase is a stand-alone application that supports a variety of replication scenarios, such as replicating sandbox and production instances into your database. Both Sync for Windows and Sync for Java include a command-line interface (CLI) that makes it easy to manage multiple Lakebase connections. This article shows how to use the CLI to replicate multiple Lakebase accounts.

Configure Lakebase Connections

You can save connection and email notification settings in an XML configuration file. To replicate multiple Lakebase accounts, use a separate configuration file for each account. Below is an example configuration to replicate Lakebase to SQLite:

Windows

<?xml version="1.0" encoding="UTF-8" ?>
<CDataSync>
  <DatabaseType>SQLite</DatabaseType>
  <DatabaseProvider>System.Data.SQLite</DatabaseProvider>
  <ConnectionString>DatabricksInstance=lakebase;Server=127.0.0.1;Port=5432;Database=my_database;InitiateOAuth=GETANDREFRESH;</ConnectionString>
  <ReplicateAll>False</ReplicateAll>
  <NotificationUserName></NotificationUserName>
  <DatabaseConnectionString>Data Source=C:\my.db</DatabaseConnectionString>
  <TaskSchedulerStartTime>09:51</TaskSchedulerStartTime>
  <TaskSchedulerInterval>Never</TaskSchedulerInterval>
</CDataSync>

Java

<?xml version="1.0" encoding="UTF-8" ?>
<CDataSync>
  <DatabaseType>SQLite</DatabaseType>
  <DatabaseProvider>org.sqlite.JDBC</DatabaseProvider>
  <ConnectionString>DatabricksInstance=lakebase;Server=127.0.0.1;Port=5432;Database=my_database;InitiateOAuth=GETANDREFRESH;</ConnectionString>
  <ReplicateAll>False</ReplicateAll>
  <NotificationUserName></NotificationUserName>
  <DatabaseConnectionString>Data Source=C:\my.db</DatabaseConnectionString>
</CDataSync>

To connect to Databricks Lakebase, start by setting the following properties:
  • DatabricksInstance: The Databricks instance or server hostname, provided in the format instance-abcdef12-3456-7890-abcd-abcdef123456.database.cloud.databricks.com.
  • Server: The host name or IP address of the server hosting the Lakebase database.
  • Port (optional): The port of the server hosting the Lakebase database. The default is 5432.
  • Database (optional): The database to connect to after authenticating to the Lakebase server. The default is the authenticating user's default database.
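
Because each account needs its own configuration file, it can help to generate the files from one template instead of editing copies by hand. The following Python sketch is one possible way to do that, not part of CData Sync itself. It assumes the element names shown in the example configurations above; the account names, server host names, and SQLite paths are hypothetical placeholders, and the output file names simply follow the naming pattern used in this article.

```python
import xml.etree.ElementTree as ET


def connection_string(props):
    """Join properties into the 'Key=Value;' form used in the example configs."""
    return "".join(f"{key}={value};" for key, value in props.items())


def render_config(props, sqlite_path, provider="org.sqlite.JDBC"):
    """Return the text of a CDataSync configuration file for one account."""
    root = ET.Element("CDataSync")
    settings = [
        ("DatabaseType", "SQLite"),
        ("DatabaseProvider", provider),
        ("ConnectionString", connection_string(props)),
        ("ReplicateAll", "False"),
        ("NotificationUserName", ""),
        ("DatabaseConnectionString", f"Data Source={sqlite_path}"),
    ]
    for tag, text in settings:
        ET.SubElement(root, tag).text = text
    ET.indent(root, space="  ")  # pretty-printing; requires Python 3.9+
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8" ?>\n' + body + "\n"


# Hypothetical accounts: the server names and database paths are placeholders.
ACCOUNTS = {
    "Production": (
        {
            "DatabricksInstance": "lakebase",
            "Server": "prod.example.com",
            "Port": "5432",
            "Database": "my_database",
            "InitiateOAuth": "GETANDREFRESH",
        },
        r"C:\production.db",
    ),
    "Sandbox": (
        {
            "DatabricksInstance": "lakebase",
            "Server": "sandbox.example.com",
            "Port": "5432",
            "Database": "my_database",
            "InitiateOAuth": "GETANDREFRESH",
        },
        r"C:\sandbox.db",
    ),
}

# Map each output file name to its generated configuration text.
configs = {
    f"My{name}LakebaseConfig.xml": render_config(props, db_path)
    for name, (props, db_path) in ACCOUNTS.items()
}

for filename, xml_text in configs.items():
    print(filename)
    print(xml_text)
```

To target Sync for Windows, pass provider="System.Data.SQLite" as in the Windows example. Building the document with an XML library rather than string concatenation means characters such as & or < in a property value are escaped automatically. The sketch only prints the result; writing each entry of configs to disk is left to the caller.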

OAuth Client Authentication

To authenticate using OAuth client credentials, you need to configure an OAuth client in your service principal. In short, do the following:

  1. Create and configure a new service principal
  2. Assign permissions to the service principal
  3. Create an OAuth secret for the service principal

For more information, refer to the Setting Up OAuthClient Authentication section in the Help documentation.

OAuth PKCE Authentication

To authenticate using the OAuth code type with PKCE (Proof Key for Code Exchange), set the following properties:

  • AuthScheme: OAuthPKCE.
  • User: The authenticating user's user ID.

For more information, refer to the Help documentation.

Configure Queries for Each Lakebase Instance

Sync enables you to control replication with standard SQL. The REPLICATE statement is a high-level command that caches and maintains a table in your database. You can define any SELECT query supported by the Lakebase API. The statement below caches and incrementally updates a table of Lakebase data:

REPLICATE Orders;

You can specify a file containing the replication queries you want to use to update a particular database. Separate replication statements with semicolons. The following options are useful if you are replicating multiple Lakebase accounts into the same database:

You can use a different table prefix in the REPLICATE SELECT statement:

REPLICATE PROD_Orders SELECT * FROM Orders;

Alternatively, you can use a different schema:

REPLICATE PROD.Orders SELECT * FROM Orders;
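
When several accounts share the same set of tables, the per-account query files differ only in the prefix or schema. As a rough illustration, the Python sketch below generates the text of a query file in either style. The function name, the SANDBOX prefix, and the Customers table are hypothetical; only the REPLICATE syntax is taken from the statements above.

```python
def replicate_statements(tables, prefix=None, schema=None):
    """Build semicolon-separated REPLICATE statements for one account.

    Pass either a table prefix or a schema to keep accounts apart when they
    share a destination database. With neither, plain REPLICATE is used.
    """
    if prefix and schema:
        raise ValueError("choose either a prefix or a schema, not both")
    statements = []
    for table in tables:
        if schema:
            statements.append(f"REPLICATE {schema}.{table} SELECT * FROM {table}")
        elif prefix:
            statements.append(f"REPLICATE {prefix}_{table} SELECT * FROM {table}")
        else:
            statements.append(f"REPLICATE {table}")
    if not statements:
        return ""
    # Statements in a query file are separated with semicolons.
    return ";\n".join(statements) + ";\n"


# Hypothetical table list shared by both accounts.
TABLES = ["Orders", "Customers"]

production_sql = replicate_statements(TABLES, prefix="PROD")
sandbox_sql = replicate_statements(TABLES, prefix="SANDBOX")

print(production_sql)
print(sandbox_sql)
```

Each returned string can be saved as the query file for the matching account, for example MyProductionLakebaseSync.sql.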

Run Sync

After you have configured the connection strings and replication queries, you can run Sync with the following command-line options:

Windows

LakebaseSync.exe -g MyProductionLakebaseConfig.xml -f MyProductionLakebaseSync.sql

Java

java -Xbootclasspath/p:c:\sqlitejdbc.jar -jar LakebaseSync.jar -g MyProductionLakebaseConfig.xml -f MyProductionLakebaseSync.sql
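
To replicate several accounts in one pass, you can wrap the commands above in a small driver script. The Python sketch below is one way to do this under a few assumptions: each account has a configuration file and a query file named after the pattern used in this article, the Sync executable or JAR is in the working directory, and the Sandbox account name is hypothetical. It uses only the -g and -f options shown above, and it defaults to a dry run that prints the commands without executing them.

```python
import subprocess


def sync_command(account, platform="windows"):
    """Return the Sync command line for one account as an argument list."""
    config = f"My{account}LakebaseConfig.xml"
    queries = f"My{account}LakebaseSync.sql"
    if platform == "windows":
        return ["LakebaseSync.exe", "-g", config, "-f", queries]
    # Java invocation copied from the example above.
    return [
        "java",
        r"-Xbootclasspath/p:c:\sqlitejdbc.jar",
        "-jar",
        "LakebaseSync.jar",
        "-g",
        config,
        "-f",
        queries,
    ]


def run_all(accounts, platform="windows", dry_run=True):
    """Run Sync once per account, stopping at the first failure."""
    commands = []
    for account in accounts:
        command = sync_command(account, platform)
        commands.append(command)
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(command, check=True)
    return commands


# "Sandbox" is a hypothetical second account.
planned = run_all(["Production", "Sandbox"], dry_run=True)
```

Call run_all with dry_run=False to execute the commands. Running the accounts one after another keeps the output of each run separate and makes a failed account easy to identify.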

Ready to get started?

Learn more or sign up for a free trial:

CData Sync