Access Live Hugging Face Data in TIBCO Data Virtualization
TIBCO Data Virtualization (TDV) is an enterprise data virtualization solution that orchestrates access to multiple and varied data sources. When paired with the CData API Driver for Tibco DV, you get federated access to live Hugging Face data directly within TIBCO Data Virtualization. This article explains how to deploy an adapter and create a new data source based on Hugging Face.
With built-in optimized data processing, the CData TIBCO DV Adapter is designed for high-performance interaction with live Hugging Face data. When you issue complex SQL queries to Hugging Face, the adapter pushes supported SQL operations, like filters and aggregations, directly to Hugging Face. Its built-in dynamic metadata querying allows you to work with and analyze Hugging Face data using native data types.
Deploy the Hugging Face TIBCO DV Adapter
In a console, navigate to the bin folder in the TDV Server installation directory. If a version of the adapter is already installed, undeploy it first.
.\server_util.bat -server localhost -user admin -password ******** -undeploy -version 1 -name API
Extract the CData TIBCO DV Adapter to a local folder and deploy the JAR file (tdv.api.jar) to the server from the extract location.
.\server_util.bat -server localhost -user admin -password ******** -deploy -package /PATH/TO/tdv.api.jar
You may need to restart the server to ensure the new JAR file is loaded properly. To do so, run the composite.bat script located at: C:\Program Files\TIBCO\TDV Server <version>\bin. Note that you must reauthenticate to TDV Studio after restarting the server.
Sample Restart Call
.\composite.bat monitor restart
Once you deploy the adapter, you can create a new data source in TDV Studio for Hugging Face.
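The undeploy, deploy, and restart calls above can be assembled in a small script so they are reviewed together before being run. The following is a minimal sketch, not part of the adapter: the function name build_adapter_commands and its parameters are hypothetical, and it uses only the flags shown in the commands above. It prints the commands and does not execute them.

```python
def build_adapter_commands(jar_path, server="localhost", user="admin",
                           password="********", adapter_name="API", version="1"):
    """Return the undeploy, deploy, and restart commands as argument lists.

    Only the flags that appear in the article's commands are used. The
    function name and parameters are illustrative, not a TDV API.
    """
    # Arguments shared by both server_util.bat calls.
    common = [r".\server_util.bat", "-server", server,
              "-user", user, "-password", password]
    # Remove a previously installed version of the adapter.
    undeploy = common + ["-undeploy", "-version", version, "-name", adapter_name]
    # Deploy the extracted JAR file.
    deploy = common + ["-deploy", "-package", str(jar_path)]
    # Restart the server so the new JAR file is loaded.
    restart = [r".\composite.bat", "monitor", "restart"]
    return [undeploy, deploy, restart]


if __name__ == "__main__":
    # Print each command for review; nothing is executed here.
    for command in build_adapter_commands("/PATH/TO/tdv.api.jar"):
        print(" ".join(command))
```

Keeping each command as a list of arguments avoids quoting problems when a path contains spaces, such as a JAR file stored under C:\Program Files.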
Create a Hugging Face Data Source in TDV Studio
With the CData API Driver for Tibco DV, you can easily create a data source for Hugging Face and introspect the data source to add resources to TDV.
Create the Data Source
- Right-click on the folder you wish to add the data source to and select New -> New Data Source
- Scroll until you find the adapter (e.g. Hugging Face) and click Next
- Name the data source (e.g. CData Hugging Face Source)
- Fill in the required connection properties, as described under Using API Key Authentication below
- Click Create & Close.
Hugging Face Hub uses token-based authentication to enable access to its API. The API provides access to machine learning models, datasets, spaces, papers, and other resources on the Hugging Face Hub platform.
Using API Key Authentication
To authenticate to Hugging Face Hub, you need to provide an API key (access token). To obtain your access token:
- Log in to your Hugging Face account at https://huggingface.co
- Navigate to Settings > Access Tokens
- Click "New token" to create a new access token
- Select the appropriate permissions (read or write)
- Copy the token value
After obtaining your access token, set the following connection properties:
- AuthScheme: Set this to APIKey.
- APIKey: Set this to your Hugging Face access token.
Example connection string
Profile=C:\profiles\HuggingFace.apip;ProfileSettings='APIKey=hf_xxxxxxxxxxxxxxxxxxxx';
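A connection string in the format above can be generated rather than typed by hand, which helps avoid stray quotes or semicolons around the token. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the string layout simply mirrors the example above, and the whoami-v2 URL is assumed to be the Hub endpoint that reports which account a token belongs to. The request is only prepared, never sent.

```python
import urllib.request


def build_connection_string(profile_path, api_key):
    """Format Profile and ProfileSettings the way the example above does."""
    # A quote or semicolon inside the key would break the string's structure.
    if not api_key or "'" in api_key or ";" in api_key:
        raise ValueError("API key is empty or contains a quote or semicolon")
    return f"Profile={profile_path};ProfileSettings='APIKey={api_key}';"


def mask_token(api_key, visible=4):
    """Hide all but the first few characters so the token can be logged."""
    return api_key[:visible] + "*" * max(len(api_key) - visible, 0)


def build_whoami_request(api_key):
    """Prepare, but do not send, a request for checking the token.

    Assumption: https://huggingface.co/api/whoami-v2 accepts a Bearer token.
    """
    return urllib.request.Request(
        "https://huggingface.co/api/whoami-v2",
        headers={"Authorization": f"Bearer {api_key}"},
    )


if __name__ == "__main__":
    token = "hf_xxxxxxxxxxxxxxxxxxxx"  # placeholder, as in the example above
    print(build_connection_string(r"C:\profiles\HuggingFace.apip", token))
    print(mask_token(token))
```

To check a real token before entering it in TDV Studio, the prepared request could be passed to urllib.request.urlopen. A successful response suggests the token is valid, while an HTTP 401 error suggests it is not.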
Introspect the Data Source
Once the data source is created, you can introspect the data source by right-clicking and selecting Open. In the dashboard, click Add/Remove Resources and select the Tables, Views, and Stored Procedures to include as part of the data source. Click Next and Finish to add the selected Hugging Face tables, views, and stored procedures as resources.
After creating and introspecting the data source, you are ready to work with Hugging Face data in TIBCO Data Virtualization just like you would any other relational data source. You can create views, query using SQL, publish the data source, and more.