Replicate Multiple Hugging Face Accounts via the CData Sync CLI
CData Sync for Hugging Face is a stand-alone application that handles a variety of replication scenarios, such as replicating sandbox and production instances into your database. Both Sync for Windows and Sync for Java include a command-line interface (CLI) that makes it easy to manage multiple Hugging Face connections. This article shows how to use the CLI to replicate multiple Hugging Face accounts.
Configure Hugging Face Connections
You can save connection and email notification settings in an XML configuration file. To replicate multiple Hugging Face accounts, use multiple configuration files. Below is an example configuration to replicate Hugging Face to SQLite:
Windows
<?xml version="1.0" encoding="UTF-8" ?>
<CDataSync>
  <DatabaseType>SQLite</DatabaseType>
  <DatabaseProvider>System.Data.SQLite</DatabaseProvider>
  <ConnectionString>Profile=C:\profiles\HuggingFace.apip;ProfileSettings='APIKey=hf_xxxxxxxxxxxxxxxxxxxx';</ConnectionString>
  <ReplicateAll>False</ReplicateAll>
  <NotificationUserName></NotificationUserName>
  <DatabaseConnectionString>Data Source=C:\my.db</DatabaseConnectionString>
  <TaskSchedulerStartTime>09:51</TaskSchedulerStartTime>
  <TaskSchedulerInterval>Never</TaskSchedulerInterval>
</CDataSync>
Java
<?xml version="1.0" encoding="UTF-8" ?>
<CDataSync>
  <DatabaseType>SQLite</DatabaseType>
  <DatabaseProvider>org.sqlite.JDBC</DatabaseProvider>
  <ConnectionString>Profile=C:\profiles\HuggingFace.apip;ProfileSettings='APIKey=hf_xxxxxxxxxxxxxxxxxxxx';</ConnectionString>
  <ReplicateAll>False</ReplicateAll>
  <NotificationUserName></NotificationUserName>
  <DatabaseConnectionString>Data Source=C:\my.db</DatabaseConnectionString>
</CDataSync>
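Because each account needs its own configuration file, it can be convenient to generate the files from one template rather than editing them by hand. The following is a minimal Python sketch of that idea, modeled on the Java example above. The account names, the second token placeholder, and the file name MySandboxAPIConfig.xml are hypothetical; only the element names and values shown in the examples above come from this article.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8" ?>\n'


def build_config(connection_string, database_path, provider="org.sqlite.JDBC"):
    """Return one CDataSync configuration document as a string."""
    root = ET.Element("CDataSync")
    settings = [
        ("DatabaseType", "SQLite"),
        ("DatabaseProvider", provider),
        ("ConnectionString", connection_string),
        ("ReplicateAll", "False"),
        ("NotificationUserName", ""),
        ("DatabaseConnectionString", f"Data Source={database_path}"),
    ]
    for tag, value in settings:
        ET.SubElement(root, tag).text = value
    return XML_DECLARATION + ET.tostring(root, encoding="unicode")


def write_configs(configs, directory):
    """Write each generated document to <directory>/<filename>."""
    for filename, text in configs.items():
        Path(directory, filename).write_text(text, encoding="utf-8")


# Hypothetical account names mapped to placeholder tokens.
accounts = {
    "Production": "hf_xxxxxxxxxxxxxxxxxxxx",
    "Sandbox": "hf_yyyyyyyyyyyyyyyyyyyy",
}

configs = {}
for name, token in accounts.items():
    connection = (
        f"Profile=C:\\profiles\\HuggingFace.apip;ProfileSettings='APIKey={token}';"
    )
    # Both accounts target the same SQLite file here; adjust as needed.
    configs[f"My{name}APIConfig.xml"] = build_config(connection, "C:\\my.db")
```

Calling write_configs(configs, ".") would then save one file per account in the current directory. For Sync for Windows, pass provider="System.Data.SQLite" and add the task scheduler elements if you use them.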
Hugging Face Hub uses token-based authentication to control access to its API. The API provides access to machine learning models, datasets, spaces, papers, and other resources on the Hugging Face Hub platform.
Using API Key Authentication
To authenticate to Hugging Face Hub, you need to provide an API key (access token). To obtain your access token:
- Log in to your Hugging Face account at https://huggingface.co
- Navigate to Settings > Access Tokens
- Click "New token" to create a new access token
- Select the appropriate permissions (read or write)
- Copy the token value
After obtaining your access token, set the following connection properties:
- AuthScheme: Set this to APIKey.
- APIKey: Set this to your Hugging Face access token.
Example connection string
Profile=C:\profiles\HuggingFace.apip;ProfileSettings='APIKey=hf_xxxxxxxxxxxxxxxxxxxx';
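If you prefer not to paste the token into scripts by hand, the connection string can be assembled at run time. The sketch below is one possible approach, not part of Sync itself: it reads the token from an environment variable (the name HF_TOKEN is an assumption for this example) and rejects values that would break the quoting used in the connection string above.

```python
import os


def build_connection_string(profile_path, api_key):
    """Assemble a connection string in the format shown in this article."""
    # A quote or semicolon inside the token would corrupt the
    # ProfileSettings='...' section, so refuse such values.
    if "'" in api_key or ";" in api_key:
        raise ValueError("API key must not contain quotes or semicolons")
    return f"Profile={profile_path};ProfileSettings='APIKey={api_key}';"


# HF_TOKEN is a hypothetical variable name; fall back to the placeholder.
token = os.environ.get("HF_TOKEN", "hf_xxxxxxxxxxxxxxxxxxxx")
connection_string = build_connection_string(r"C:\profiles\HuggingFace.apip", token)
```

The resulting value can be placed in the ConnectionString element of the configuration file for the corresponding account.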
Configure Queries for Each Hugging Face Instance
Sync enables you to control replication with standard SQL. The REPLICATE statement is a high-level command that caches and maintains a table in your database. You can define any SELECT query supported by the Hugging Face API. The statement below caches and incrementally updates a table of Hugging Face data:
REPLICATE Collections;
You can specify a file containing the replication queries you want to use to update a particular database. Separate replication statements with semicolons. The following options are useful if you are replicating multiple Hugging Face accounts into the same database:
You can use a different table prefix in the REPLICATE SELECT statement:
REPLICATE PROD_Collections SELECT * FROM Collections;
Alternatively, you can use a different schema:
REPLICATE PROD.Collections SELECT * FROM Collections;
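When several accounts share one database, the per-account query files differ only in the prefix or schema. A small helper can produce them from a list of table names. This is an illustrative Python sketch; the function name and its parameters are hypothetical, and Collections is the only table name taken from this article.

```python
def build_replication_script(tables, prefix=None, schema=None):
    """Return semicolon-separated REPLICATE statements for the given tables."""
    if prefix and schema:
        raise ValueError("use either a prefix or a schema, not both")
    statements = []
    for table in tables:
        if schema:
            target = f"{schema}.{table}"
        elif prefix:
            target = f"{prefix}_{table}"
        else:
            target = None
        if target is None:
            # No renaming: replicate into a table with the same name.
            statements.append(f"REPLICATE {table}")
        else:
            statements.append(f"REPLICATE {target} SELECT * FROM {table}")
    return ";\n".join(statements) + ";" if statements else ""


production_sql = build_replication_script(["Collections"], prefix="PROD")
print(production_sql)
```

Saving the returned text to a file such as MyProductionAPISync.sql gives you the query file for that account; repeat with a different prefix or schema for each additional account.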
Run Sync
After you have configured the connection strings and replication queries, you can run Sync with the following command-line options:
Windows
APISync.exe -g MyProductionAPIConfig.xml -f MyProductionAPISync.sql
Java
java -Xbootclasspath/p:c:\sqlitejdbc.jar -jar APISync.jar -g MyProductionAPIConfig.xml -f MyProductionAPISync.sql
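To replicate several accounts in one pass, you can loop over the configuration and query file pairs and invoke the CLI once per account. The following Python sketch assumes the Windows executable and the -g and -f options shown above. The sandbox file names are hypothetical, and whether Sync signals failures through its exit code is an assumption you should verify before relying on the returned values.

```python
import subprocess


def build_sync_command(config_file, query_file, executable="APISync.exe"):
    """Return the argument list for one Sync run."""
    return [executable, "-g", config_file, "-f", query_file]


def run_all(jobs, dry_run=True):
    """Run Sync once per account; return a mapping of account to exit code."""
    results = {}
    for name, (config_file, query_file) in jobs.items():
        command = build_sync_command(config_file, query_file)
        if dry_run:
            # Only show what would be executed.
            print(" ".join(command))
            results[name] = None
        else:
            results[name] = subprocess.run(command, check=False).returncode
    return results


jobs = {
    "production": ("MyProductionAPIConfig.xml", "MyProductionAPISync.sql"),
    # Hypothetical second account.
    "sandbox": ("MySandboxAPIConfig.xml", "MySandboxAPISync.sql"),
}

run_all(jobs)  # dry run: prints the commands without executing them
```

Pass dry_run=False to execute the commands. For Sync for Java, the same loop applies if build_sync_command is changed to return the java invocation shown above.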