Integrate Hugging Face with External Services using SnapLogic
SnapLogic is an integration platform-as-a-service (iPaaS) that allows users to create data integration flows with no code. When paired with the CData JDBC Drivers, users get access to live data from more than 250 SaaS, Big Data, and NoSQL sources, including Hugging Face, in their SnapLogic workflows.
With built-in optimized data processing, the CData JDBC Driver offers unmatched performance for interacting with live Hugging Face data. When platforms issue complex SQL queries to Hugging Face, the driver pushes supported SQL operations, like filters and aggregations, directly to Hugging Face and utilizes the embedded SQL engine to process unsupported operations client-side (often SQL functions and JOIN operations). Its built-in dynamic metadata querying lets you work with Hugging Face data using native data types.
Connect to Hugging Face in SnapLogic
To connect to Hugging Face data in SnapLogic, download and install the CData Hugging Face JDBC Driver. Follow the installation dialog. When the installation is complete, the JAR file can be found in the installation directory (C:/Program Files/CData/CData API Driver for JDBC/lib by default).
Upload the Hugging Face JDBC Driver
After installation, upload the JDBC JAR file to a location in SnapLogic (for example, projects/Jerod Johnson) from the Manager tab.
Configure the Connection
Once the JDBC Driver is uploaded, we can create the connection to Hugging Face.
- Navigate to the Designer tab
- Expand "JDBC" from Snaps and drag a "Generic JDBC - Select" snap onto the designer
- Click Add Account (or select an existing one) and click "Continue"
- In the next form, configure the JDBC connection properties:
- Under JDBC JARs, add the JAR file we previously uploaded
- Set JDBC Driver Class to cdata.jdbc.api.APIDriver
- Set JDBC URL to a JDBC connection string for the Hugging Face JDBC Driver, for example:
jdbc:api:Profile=C:\profiles\HuggingFace.apip;ProfileSettings='APIKey=hf_xxxxxxxxxxxxxxxxxxxx';RTK=XXXXXX;
NOTE: RTK is a trial or full key. Contact our Support team for more information.
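As a rough illustration of how the JDBC URL above is put together, the following Java sketch joins key=value pairs into the same `jdbc:api:...;` shape. The class name, the `buildUrl` helper, and the `HF_TOKEN` environment variable are hypothetical and are not part of the CData driver or SnapLogic. The sketch performs no escaping and simply reproduces the quoting used in the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; not part of the CData driver or SnapLogic.
class HuggingFaceJdbcUrl {

    // Joins properties into the form jdbc:api:Key=Value;Key=Value;
    // No escaping is applied; values are assumed to be already quoted as needed.
    static String buildUrl(Map<String, String> props) {
        StringBuilder url = new StringBuilder("jdbc:api:");
        for (Map.Entry<String, String> e : props.entrySet()) {
            url.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // HF_TOKEN is an assumed environment variable name; the fallback is the
        // placeholder token from the article, not a working credential.
        String token = System.getenv().getOrDefault("HF_TOKEN", "hf_xxxxxxxxxxxxxxxxxxxx");

        // LinkedHashMap keeps the properties in insertion order.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("Profile", "C:\\profiles\\HuggingFace.apip");
        props.put("ProfileSettings", "'APIKey=" + token + "'");
        props.put("RTK", "XXXXXX"); // placeholder trial or full key

        System.out.println(buildUrl(props));
    }
}
```

Reading the token from the environment keeps it out of saved scripts; the resulting string can then be pasted into the JDBC URL field of the account form.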
Built-In Connection String Designer
For assistance in constructing the JDBC URL, use the connection string designer built into the Hugging Face JDBC Driver. Either double-click the JAR file or run it from the command line:
java -jar cdata.jdbc.api.jar
Fill in the connection properties and copy the connection string to the clipboard.
The Hugging Face Hub uses token-based authentication to control access to its API. The API provides access to machine learning models, datasets, spaces, papers, and other resources on the Hugging Face Hub platform.
Using API Key Authentication
To authenticate to the Hugging Face Hub, you need to provide an API Key (Access Token). To obtain your access token:
- Log in to your Hugging Face account at https://huggingface.co
- Navigate to Settings > Access Tokens
- Click "New token" to create a new access token
- Select the appropriate permissions (read or write)
- Copy the token value
After obtaining your access token, set the following connection properties:
- AuthScheme: Set this to APIKey.
- APIKey: Set this to your Hugging Face access token.
Example connection string
Profile=C:\profiles\HuggingFace.apip;ProfileSettings='APIKey=hf_xxxxxxxxxxxxxxxxxxxx';
- After entering the connection properties, click "Validate" and "Apply"
Read Hugging Face Data
In the form that opens after validating and applying the connection, configure your query.
- Set Schema name to "API"
- Set Table name to a table for Hugging Face using the schema name, for example: "API"."Collections" (use the drop-down to see the full list of available tables)
- Add Output fields for each item you wish to work with from the table
Save the Generic JDBC - Select snap.
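The schema, table, and output-field settings above correspond roughly to a SQL SELECT statement. The Java sketch below shows one way such a statement could be assembled and, optionally, run through plain JDBC outside SnapLogic. The exact SQL that the snap generates is an assumption, the column names `Id` and `Title` are hypothetical, and `printRows` only works if the CData JAR is on the classpath and a valid JDBC URL is passed as the first argument.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch; the SQL the snap actually issues may differ.
class SelectSketch {

    // Wraps an identifier in double quotes, doubling any embedded quotes.
    static String quote(String identifier) {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    // Builds SELECT <fields> FROM "schema"."table"; an empty field list selects all columns.
    static String buildSelect(String schema, String table, List<String> fields) {
        String cols = fields.isEmpty()
                ? "*"
                : fields.stream().map(SelectSketch::quote).collect(Collectors.joining(", "));
        return "SELECT " + cols + " FROM " + quote(schema) + "." + quote(table);
    }

    // Runs the query with standard JDBC calls and prints up to maxRows rows.
    static void printRows(String jdbcUrl, String sql, int maxRows) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement()) {
            stmt.setMaxRows(maxRows);
            try (ResultSet rs = stmt.executeQuery(sql)) {
                int columnCount = rs.getMetaData().getColumnCount();
                while (rs.next()) {
                    StringBuilder row = new StringBuilder();
                    for (int i = 1; i <= columnCount; i++) {
                        if (i > 1) row.append(" | ");
                        row.append(rs.getString(i));
                    }
                    System.out.println(row);
                }
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        // "Id" and "Title" are assumed column names used only for illustration.
        String sql = buildSelect("API", "Collections", List.of("Id", "Title"));
        System.out.println(sql);
        // Only connect when a JDBC URL is supplied as the first argument.
        if (args.length > 0) {
            printRows(args[0], sql, 10);
        }
    }
}
```

Running the class without arguments only prints the generated statement, which makes it easy to compare against the fields chosen in the snap before any connection is attempted.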
With the connection and query configured, click the end of the snap to preview the data.
Once you confirm the results are what you expect, you can add additional snaps to funnel your Hugging Face data to another endpoint.
Piping Hugging Face Data to External Services
For this article, we will load data into a Google Spreadsheet. You can use any of the supported snaps, or even use a Generic JDBC snap with another CData JDBC Driver, to move data into an external service.
- Start by dropping a "Worksheet Writer" snap onto the end of the "Generic JDBC - Select" snap.
- Add an account to connect to Google Sheets
- Configure the Worksheet Writer snap to write your Hugging Face data to a Google Spreadsheet
You can now execute the fully configured pipeline to extract data from Hugging Face and push it into a Google Spreadsheet.
More Information & Free Trial
Using the CData API Driver for JDBC, you can create a pipeline in SnapLogic for integrating Hugging Face data with external services. For more information about connecting to Hugging Face, check out our CData API Driver for JDBC page. Download a free, 30-day trial of the CData API Driver for JDBC and get started today.