Access Live Spark Data in TIBCO Data Virtualization

Use the CData TIBCO DV Adapter for Spark to create a Spark data source in TIBCO Data Virtualization Studio and gain access to live Spark data from your TDV Server.

TIBCO Data Virtualization (TDV) is an enterprise data virtualization solution that orchestrates access to multiple and varied data sources. When paired with the CData TIBCO DV Adapter for Spark, you get federated access to live Spark data directly within TIBCO Data Virtualization. This article walks through deploying the adapter and creating a new data source based on Spark.

With built-in optimized data processing, the CData TIBCO DV Adapter is designed for high-performance interaction with live Spark data. When you issue complex SQL queries to Spark, the adapter pushes supported SQL operations, like filters and aggregations, directly to Spark, so that work is done at the source rather than in TDV. Its built-in dynamic metadata querying lets you work with and analyze Spark data using native data types.

Deploy the Spark TIBCO DV Adapter

In a console, navigate to the bin folder in the TDV Server installation directory. If a version of the adapter is already installed, undeploy it first.
```
.\server_util.bat -server localhost -user admin -password ******** -undeploy -version 1 -name SparkSQL
```
Extract the CData TIBCO DV Adapter to a local folder and deploy the JAR file (tdv.sparksql.jar) to the server from the extract location.
```
.\server_util.bat -server localhost -user admin -password ******** -deploy -package /PATH/TO/tdv.sparksql.jar
```
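If you redeploy the adapter often, the two calls above can be scripted. The following is a minimal sketch in Python, not part of the adapter itself: the function names `server_util_commands` and `redeploy` are hypothetical, and the only `server_util.bat` flags used are the ones shown in the commands above. It assumes a Windows host, since it runs the `.bat` script through the command shell.

```python
import subprocess


def server_util_commands(jar_path, password, server="localhost", user="admin",
                         adapter_name="SparkSQL", version="1",
                         undeploy_first=True):
    """Build the server_util.bat argument lists shown in this article.

    Only flags that appear in the article are used; the function name and
    parameters are illustrative, not part of any TIBCO or CData API.
    """
    base = [r".\server_util.bat", "-server", server,
            "-user", user, "-password", password]
    commands = []
    if undeploy_first:
        # Remove the currently installed adapter before deploying the new JAR.
        commands.append(base + ["-undeploy", "-version", version,
                                "-name", adapter_name])
    commands.append(base + ["-deploy", "-package", str(jar_path)])
    return commands


def redeploy(bin_dir, jar_path, password, **kwargs):
    """Run the commands from the TDV Server bin folder (Windows only)."""
    for cmd in server_util_commands(jar_path, password, **kwargs):
        # shell=True lets cmd.exe resolve and execute the .bat file;
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, cwd=bin_dir, check=True, shell=True)
```

Calling `redeploy(bin_dir, jar_path, password)` with your own bin folder, JAR location, and admin password runs the undeploy step followed by the deploy step. Pass `undeploy_first=False` on a server that has no earlier version of the adapter installed.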

You may need to restart the server to ensure the new JAR file is loaded properly. To do so, run the composite.bat script located at C:\Program Files\TIBCO\TDV Server <version>\bin. Note that you must reauthenticate to TDV Studio after restarting the server.

Sample Restart Call

```
.\composite.bat monitor restart
```
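Because the restart takes some time, a script that continues with further steps may want to wait until the server accepts connections again. The sketch below polls a TCP port using only the Python standard library. The helper name `wait_for_port` is hypothetical, and the port you pass is an assumption about your installation: 9400 is commonly the TDV Server base port, but check your own configuration.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=2.0):
    """Return True once host:port accepts a TCP connection, else False.

    Polls until `timeout` seconds have elapsed. This only shows that
    something is listening on the port, not that the server has finished
    loading the adapter.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the port is open; close it at once.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait, then try again.
            time.sleep(interval)
    return False
```

For example, `wait_for_port("localhost", 9400)` (port assumed) returns `True` as soon as the port is reachable, or `False` after two minutes.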

Once you deploy the adapter, you can create a new data source in TDV Studio for Spark.

Create a Spark Data Source in TDV Studio

With the CData TIBCO DV Adapter for Spark, you can easily create a data source for Spark and introspect the data source to add resources to TDV.

Create the Data Source

Right-click on the folder you wish to add the data source to and select New -> New Data Source.
Scroll until you find the adapter (e.g. Spark) and click Next.
Name the data source (e.g. CData Spark Source).
Fill in the required connection properties: set the Server, Database, User, and Password connection properties to connect to SparkSQL.
Click Create & Close.
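Before filling in the form, it can help to check that you have a value for each of the four properties named above. The following Python sketch does that. The helper names and sample values are hypothetical, and the semicolon-delimited string it builds is a common connection-string convention shown only for illustration, since TDV Studio takes these values through its form.

```python
# The four properties this article lists as required.
REQUIRED_PROPERTIES = ("Server", "Database", "User", "Password")


def missing_properties(props):
    """Return the required property names that are absent or blank."""
    missing = []
    for name in REQUIRED_PROPERTIES:
        value = props.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


def to_connection_string(props):
    """Join the required properties as 'Key=Value;' pairs (illustrative)."""
    missing = missing_properties(props)
    if missing:
        raise ValueError("Missing connection properties: " + ", ".join(missing))
    return "".join(f"{name}={props[name]};" for name in REQUIRED_PROPERTIES)
```

Given placeholder values such as `{"Server": "127.0.0.1", "Database": "default", "User": "spark_user", "Password": ""}`, `missing_properties` reports `["Password"]`, which is the field you would still need to supply in the dialog.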

Introspect the Data Source

Once the data source is created, you can introspect the data source by right-clicking and selecting Open. In the dashboard, click Add/Remove Resources and select the Tables, Views, and Stored Procedures to include as part of the data source. Click Next and Finish to add the selected Spark tables, views, and stored procedures as resources.


After creating and introspecting the data source, you are ready to work with Spark data in TIBCO Data Virtualization just like you would any other relational data source. You can create views, query using SQL, publish the data source, and more.
