Access Live ScrapingBee Data in TIBCO Data Virtualization

Jerod Johnson
Director, Technology Evangelism

Use the CData API Driver for Tibco DV to create a ScrapingBee data source in TIBCO Data Virtualization Studio and gain access to live ScrapingBee data from your TDV Server.

TIBCO Data Virtualization (TDV) is an enterprise data virtualization solution that orchestrates access to multiple and varied data sources. When paired with the CData API Driver for Tibco DV, you get federated access to live ScrapingBee data directly within TIBCO Data Virtualization. This article explains how to deploy an adapter and create a new data source based on ScrapingBee.

With built-in optimized data processing, the CData TIBCO DV Adapter offers unmatched performance for interacting with live ScrapingBee data. When you issue complex SQL queries to ScrapingBee, the adapter pushes supported SQL operations, like filters and aggregations, directly to ScrapingBee. Its built-in dynamic metadata querying allows you to work with and analyze ScrapingBee data using native data types.

Deploy the ScrapingBee TIBCO DV Adapter

In a console, navigate to the bin folder in the TDV Server installation directory. If a version of the adapter is already installed, undeploy it first.
```
.\server_util.bat -server localhost -user admin -password ******** -undeploy -version 1 -name API
```
Extract the CData TIBCO DV Adapter to a local folder and deploy the JAR file (tdv.api.jar) to the server from the extract location.
```
.\server_util.bat -server localhost -user admin -password ******** -deploy -package /PATH/TO/tdv.api.jar
```
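
The undeploy and deploy calls above differ only in their trailing arguments, so they can be assembled by a small helper if you script the deployment. The following is a minimal sketch, not part of the CData or TIBCO tooling: the function names `server_util_command` and `redact` are hypothetical, and only the flags shown in the commands above are used.

```python
from typing import List


def server_util_command(action: str, *, password: str,
                        server: str = "localhost", user: str = "admin",
                        name: str = "API", version: str = "1",
                        package: str = "") -> List[str]:
    """Build the server_util argument list for an undeploy or deploy call.

    Hypothetical helper: it only mirrors the flags shown in the article.
    """
    base = [r".\server_util.bat", "-server", server,
            "-user", user, "-password", password]
    if action == "undeploy":
        return base + ["-undeploy", "-version", version, "-name", name]
    if action == "deploy":
        if not package:
            raise ValueError("deploy requires the path to the adapter JAR")
        return base + ["-deploy", "-package", package]
    raise ValueError(f"unsupported action: {action!r}")


def redact(cmd: List[str]) -> str:
    """Render a command for logging with the password masked."""
    shown = list(cmd)
    if "-password" in shown:
        shown[shown.index("-password") + 1] = "********"
    return " ".join(shown)


undeploy = server_util_command("undeploy", password="secret")
deploy = server_util_command("deploy", password="secret",
                             package="/PATH/TO/tdv.api.jar")
print(redact(undeploy))
print(redact(deploy))
```

Keeping the arguments as a list rather than a single string avoids quoting problems when the JAR path contains spaces, and the `redact` step keeps the password out of any log output.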

You may need to restart the server to ensure the new JAR file is loaded properly. To do so, run the composite.bat script located at C:\Program Files\TIBCO\TDV Server <version>\bin. You must reauthenticate to TDV Studio after restarting the server.

Sample Restart Call
```
.\composite.bat monitor restart
```

Once you deploy the adapter, you can create a new data source in TDV Studio for ScrapingBee.

Create a ScrapingBee Data Source in TDV Studio

With the CData API Driver for Tibco DV, you can easily create a data source for ScrapingBee and introspect the data source to add resources to TDV.

Create the Data Source

Right-click the folder you wish to add the data source to and select New -> New Data Source.
Scroll until you find the adapter (e.g., ScrapingBee) and click Next.
Name the data source (e.g., CData ScrapingBee Source).
Fill in the required connection properties.

Using API Key Authentication

ScrapingBee uses API key authentication. To obtain an API key:

Sign in to your ScrapingBee account at https://app.scrapingbee.com.
Navigate to the Dashboard and locate your API key in the top section.
Copy the API key for use in the connection string.

After obtaining your API key, set the following connection properties:

AuthScheme: Set this to APIKey.

ProfileSettings

APIKey: Set this to your ScrapingBee API key.

Example Connection String

Profile=C:\profiles\ScrapingBee.apip;AuthScheme=APIKey;ProfileSettings="APIKey=your_api_key";
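
If you generate the connection string from a script rather than typing it by hand, the layout above can be reproduced as follows. This is a minimal sketch: the function name `build_connection_string` is hypothetical, and the rejection of quotes and semicolons in the key is a cautious assumption about what would break the string, not a documented driver rule.

```python
def build_connection_string(profile_path: str, api_key: str) -> str:
    """Assemble a connection string in the layout shown in the article.

    Hypothetical helper; only Profile, AuthScheme, and ProfileSettings
    from the example above are emitted.
    """
    if not api_key:
        raise ValueError("api_key must not be empty")
    # Assumption: a quote or semicolon inside the key would end the
    # quoted ProfileSettings value early, so refuse such input.
    if '"' in api_key or ";" in api_key:
        raise ValueError("api_key must not contain '\"' or ';'")
    return (
        f"Profile={profile_path};"
        "AuthScheme=APIKey;"
        f'ProfileSettings="APIKey={api_key}";'
    )


conn = build_connection_string(r"C:\profiles\ScrapingBee.apip", "your_api_key")
print(conn)
```

In practice the key would come from a secret store or environment variable rather than a literal in the script.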

Connecting to ScrapingBee

Once the authentication is configured, you can connect to ScrapingBee and query data from any of the available tables. All tables require at least one input parameter (such as a search query or product ID) to retrieve data.
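
Because every table needs at least one input parameter, a query without a filter will not return data. The sketch below shows one way to enforce that rule when building queries programmatically. The table name `SearchResults` and the column `Query` are placeholders invented for illustration, not actual ScrapingBee table or column names, and the `?` markers assume a client that supports positional parameters, as JDBC does.

```python
import re
from typing import Dict, List, Tuple

# Plain SQL identifiers only; anything else is rejected rather than quoted.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_select(table: str, inputs: Dict[str, object]) -> Tuple[str, List[object]]:
    """Return (sql, params) for a SELECT that always carries a WHERE clause.

    Hypothetical helper: it refuses to build a query with no input
    parameters, mirroring the requirement described above.
    """
    if not inputs:
        raise ValueError("at least one input parameter is required")
    for ident in (table, *inputs):
        if not _IDENT.match(ident):
            raise ValueError(f"invalid identifier: {ident!r}")
    where = " AND ".join(f"{column} = ?" for column in inputs)
    return f"SELECT * FROM {table} WHERE {where}", list(inputs.values())


# Placeholder table and column names, for illustration only.
sql, params = build_select("SearchResults", {"Query": "data virtualization"})
print(sql)
print(params)
```

Passing the values separately from the SQL text keeps user-supplied search terms out of the statement itself.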

Filling in Connection Information (Salesforce is shown.)

Click Create & Close.

Introspect the Data Source

Once the data source is created, you can introspect the data source by right-clicking and selecting Open. In the dashboard, click Add/Remove Resources and select the Tables, Views, and Stored Procedures to include as part of the data source. Click Next and Finish to add the selected ScrapingBee tables, views, and stored procedures as resources.

Introspecting the Data Source (Salesforce is shown.)

After creating and introspecting the data source, you are ready to work with ScrapingBee data in TIBCO Data Virtualization just like you would any other relational data source. You can create views, query using SQL, publish the data source, and more.
