Replicate Multiple Hugging Face Accounts via the CData Sync CLI

Jerod Johnson
Director, Technology Evangelism
Replicate multiple Hugging Face accounts to one or many databases.

CData Sync for Hugging Face is a stand-alone application that provides solutions for a variety of replication scenarios such as replicating sandbox and production instances into your database. Both Sync for Windows and Sync for Java include a command-line interface (CLI) that makes it easy to manage multiple Hugging Face connections. In this article we show how to use the CLI to replicate multiple Hugging Face accounts.

Configure Hugging Face Connections

You can save connection and email notification settings in an XML configuration file. To replicate multiple Hugging Face accounts, use multiple configuration files. Below is an example configuration to replicate Hugging Face to SQLite:

Windows

<?xml version="1.0" encoding="UTF-8" ?>
<CDataSync>
  <DatabaseType>SQLite</DatabaseType>
  <DatabaseProvider>System.Data.SQLite</DatabaseProvider>
  <ConnectionString>Profile=C:\profiles\HuggingFace.apip;ProfileSettings='APIKey=hf_xxxxxxxxxxxxxxxxxxxx';</ConnectionString>
  <ReplicateAll>False</ReplicateAll>
  <NotificationUserName></NotificationUserName>
  <DatabaseConnectionString>Data Source=C:\my.db</DatabaseConnectionString>
  <TaskSchedulerStartTime>09:51</TaskSchedulerStartTime>
  <TaskSchedulerInterval>Never</TaskSchedulerInterval>
</CDataSync>

Java

<?xml version="1.0" encoding="UTF-8" ?>
<CDataSync>
  <DatabaseType>SQLite</DatabaseType>
  <DatabaseProvider>org.sqlite.JDBC</DatabaseProvider>
  <ConnectionString>Profile=C:\profiles\HuggingFace.apip;ProfileSettings='APIKey=hf_xxxxxxxxxxxxxxxxxxxx';</ConnectionString>
  <ReplicateAll>False</ReplicateAll>
  <NotificationUserName></NotificationUserName>
  <DatabaseConnectionString>Data Source=C:\my.db</DatabaseConnectionString>
</CDataSync>
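
With many accounts, writing each configuration file by hand becomes repetitive. The Python sketch below generates one configuration per account, reusing the element names from the Windows example above. It is not part of CData Sync: the helper name build_config, the account labels, and the placeholder keys are all hypothetical.

```python
import xml.etree.ElementTree as ET


def build_config(api_key, database_path,
                 profile_path=r"C:\profiles\HuggingFace.apip"):
    """Return a Sync configuration as an XML string (hypothetical helper)."""
    root = ET.Element("CDataSync")
    # Element names and values mirror the Windows example in this article.
    settings = [
        ("DatabaseType", "SQLite"),
        ("DatabaseProvider", "System.Data.SQLite"),
        ("ConnectionString",
         f"Profile={profile_path};ProfileSettings='APIKey={api_key}';"),
        ("ReplicateAll", "False"),
        ("DatabaseConnectionString", f"Data Source={database_path}"),
    ]
    for name, value in settings:
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder account labels and keys; substitute real values.
    accounts = {
        "production": "hf_xxxxxxxxxxxxxxxxxxxx",
        "sandbox": "hf_yyyyyyyyyyyyyyyyyyyy",
    }
    for label, key in accounts.items():
        print(f"--- {label} ---")
        print(build_config(key, f"C:\\{label}.db"))
```

Each returned string can be saved to its own file and passed to the CLI. Pointing every account at a separate database file, as this sketch does, is one option; a shared database is also possible if the replicated tables are given distinct names.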

Hugging Face Hub uses token-based authentication to control access to its API. The API provides access to machine learning models, datasets, spaces, papers, and other resources on the Hugging Face Hub platform.

Using API Key Authentication

To authenticate to Hugging Face Hub, provide an API key (access token). To obtain your access token:

  1. Log in to your Hugging Face account at https://huggingface.co
  2. Navigate to Settings > Access Tokens
  3. Click "New token" to create a new access token
  4. Select the appropriate permissions (read or write)
  5. Copy the token value

After obtaining your access token, set the following connection properties:

  • AuthScheme: Set this to APIKey.
  • APIKey: Set this to your Hugging Face access token.

Example connection string

Profile=C:\profiles\HuggingFace.apip;ProfileSettings='APIKey=hf_xxxxxxxxxxxxxxxxxxxx';

Configure Queries for Each Hugging Face Instance

Sync enables you to control replication with standard SQL. The REPLICATE statement is a high-level command that caches and maintains a table in your database. You can define any SELECT query supported by the Hugging Face API. The statement below caches and incrementally updates a table of Hugging Face data:

REPLICATE Collections;

You can specify a file containing the replication queries you want to use to update a particular database. Separate replication statements with semicolons. The following options are useful if you are replicating multiple Hugging Face accounts into the same database:

You can use a different table prefix in the REPLICATE SELECT statement:

REPLICATE PROD_Collections SELECT * FROM Collections;

Alternatively, you can use a different schema:

REPLICATE PROD.Collections SELECT * FROM Collections;
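
When several accounts share one database, the per-account query files differ only in the prefix. The Python sketch below builds those statements. The helper name replicate_statements and the SANDBOX prefix are hypothetical; Collections is the only table name taken from this article.

```python
def replicate_statements(prefix, tables):
    """Build one prefixed REPLICATE statement per table, each ending in a semicolon."""
    return "\n".join(
        f"REPLICATE {prefix}_{table} SELECT * FROM {table};" for table in tables
    )


if __name__ == "__main__":
    # One block of statements per account; save each to its own .sql file.
    for prefix in ("PROD", "SANDBOX"):
        print(replicate_statements(prefix, ["Collections"]))
```

Deriving the statements from a single table list keeps the accounts in step, so a table added for one account is replicated for all of them.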

Run Sync

After you have configured the connection strings and replication queries, you can run Sync with the following command-line options:

Windows

APISync.exe -g MyProductionAPIConfig.xml -f MyProductionAPISync.sql

Java

java -Xbootclasspath/p:c:\sqlitejdbc.jar -jar APISync.jar -g MyProductionAPIConfig.xml -f MyProductionAPISync.sql
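
To replicate several accounts in one pass, the Windows command above can be invoked once per configuration file. The Python sketch below assembles those invocations. It assumes APISync.exe is on the PATH and that a failed run returns a nonzero exit code; the sandbox file names are hypothetical placeholders.

```python
import subprocess


def build_command(config_file, query_file, executable="APISync.exe"):
    """Return the argument list for one Sync run, using the -g and -f options."""
    return [executable, "-g", config_file, "-f", query_file]


def run_all(jobs):
    """Run each (config, query) pair in order; raise if a run exits with a nonzero code."""
    for config_file, query_file in jobs:
        subprocess.run(build_command(config_file, query_file), check=True)


if __name__ == "__main__":
    jobs = [
        ("MyProductionAPIConfig.xml", "MyProductionAPISync.sql"),
        # Hypothetical second account.
        ("MySandboxAPIConfig.xml", "MySandboxAPISync.sql"),
    ]
    # Print the commands for review; call run_all(jobs) to execute them.
    for config_file, query_file in jobs:
        print(" ".join(build_command(config_file, query_file)))
```

Running the accounts sequentially rather than in parallel is a deliberate choice here, since it avoids concurrent writes when more than one account targets the same SQLite file.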
