The Spark ODBC Driver lets you connect to Apache Spark directly from any application that supports ODBC connectivity.

The Driver maps SQL to Spark SQL, enabling direct standard SQL-92 access to Apache Spark.

How to create an RPA flow for Spark Data in UiPath Studio



Use the Spark ODBC Driver to create workflows that access real-time Spark data without any coding.

UiPath is a Robotic Process Automation (RPA) platform with rich features and an easy-to-use UI that enables non-developers to create process automation. By using UiPath Studio, you can build an RPA program just like drawing a diagram. With the CData ODBC Driver for Apache Spark, users can embed Spark data in the workflow.

This article walks through using the Spark ODBC Driver in UiPath Studio to create an RPA program that accesses Spark data.

Configure the Connection to Spark

If you have not already, first specify connection properties in an ODBC DSN (data source name). This is the last step of the driver installation. You can use the Microsoft ODBC Data Source Administrator to create and configure ODBC DSNs.

Set the Server, Database, User, and Password connection properties to connect to SparkSQL.
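For readers who want to see how these properties fit together outside the DSN dialog, here is a minimal Python sketch that assembles an ODBC-style connection string. The property names (Server, Database, User, Password) and the DSN name (CData SparkSQL Source) come from this article; the helper name build_connection_string and all values shown are hypothetical placeholders, and the brace-quoting rule follows the general ODBC convention rather than anything specific to this driver.

```python
def build_connection_string(dsn, **properties):
    """Build an ODBC-style string: DSN=...;Key=Value;... (illustrative helper)."""
    parts = [f"DSN={dsn}"]
    for key, value in properties.items():
        text = str(value)
        # General ODBC convention: wrap values containing ';' or braces in {...},
        # doubling any closing brace inside the value.
        if any(ch in text for ch in ";{}"):
            text = "{" + text.replace("}", "}}") + "}"
        parts.append(f"{key}={text}")
    return ";".join(parts)


# All values below are placeholders, not real connection details.
conn_str = build_connection_string(
    "CData SparkSQL Source",
    Server="127.0.0.1",
    Database="default",
    User="spark_user",
    Password="p;ss",
)
print(conn_str)
```

Properties stored in the DSN itself do not need to be repeated; a string such as this one would typically only override or supplement what the DSN already defines.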

Connect UiPath Studio to Spark Data

Now you are ready to use the Spark ODBC DSN in UiPath Studio with the following steps.

  1. From the Start page, click Blank to create a New Project.
  2. Click Manage Packages then search for and install UiPath.Database.Activities.
  3. Navigate to the Activities panel and drop a Flowchart (Workflow -> Flowchart -> Flowchart) onto the process.
  4. Drop a database Connect activity (App Integration -> Database -> Connect) after the Start activity.
  5. Double-click the Connect activity and configure the Connection.
    1. Click the Connection Wizard
    2. Select "Microsoft ODBC Data Source"
    3. In Connection Properties, select your DSN (CData SparkSQL Source) and click OK
  6. To store the connection for later activities, create a variable and bind it to the DatabaseConnection property under Output in the Properties section.

Create an Execute Query Activity

With the connection configured, we are ready to query Spark data in our RPA.

  1. From the Activities navigation, select Execute Query and drop it on the Flowchart.
  2. Double-click the Execute Query activity and set the properties as follows:
    • ExistingDbConnection: Your Connection variable
    • Sql: A SELECT statement, such as SELECT City, Balance FROM Customers
    • DataTable: Create and use a variable with the Type System.Data.DataTable
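
As a rough code analogue of the Execute Query activity, the sketch below runs the same SELECT statement through a Python DB-API connection and returns the column names and rows, which is approximately what the DataTable variable holds. This is an illustration under stated assumptions, not how UiPath works internally: an in-memory SQLite database stands in for the Spark DSN so the example runs anywhere, the sample rows are made up, and the function name execute_query is hypothetical. With the pyodbc package installed, a connection opened against the DSN could be passed in instead.

```python
import sqlite3


def execute_query(connection, sql):
    """Run a SELECT and return (column_names, rows), similar to a DataTable."""
    cursor = connection.cursor()
    cursor.execute(sql)
    columns = [description[0] for description in cursor.description]
    rows = [tuple(row) for row in cursor.fetchall()]
    cursor.close()
    return columns, rows


# Stand-in for the Spark DSN: an in-memory SQLite table with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (City TEXT, Balance REAL)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [("Raleigh", 1200.5), ("Durham", 300.0)],
)

columns, rows = execute_query(conn, "SELECT City, Balance FROM Customers")
print(columns)
for row in rows:
    print(row)
conn.close()
```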

Create a Write CSV Activity

With the Connection and Execute Query activities configured, we are ready to add a Write CSV activity to the Flowchart to replicate the Spark data.

  1. From the Activities navigation, select Write CSV and drop it after the Execute Query activity.
  2. Double-click the Write CSV activity and set the properties as follows:
    • FilePath: Set to a file (new or existing) on disk (e.g., C:\UiPath[id]-data.csv)
    • DataTable: Set to the DataTable variable you created earlier
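
To make the Write CSV step concrete, here is a small Python sketch that writes a header row followed by the data rows to a file, using only the standard csv module. It is an approximation of what the activity produces, not its implementation: the function name write_csv, the sample rows, and the output file name are hypothetical, and the file is written to the system temporary directory rather than the path configured in the activity.

```python
import csv
import os
import tempfile


def write_csv(file_path, columns, rows):
    """Write a header row plus data rows; return the number of data rows."""
    # newline="" lets the csv module control line endings itself.
    with open(file_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(columns)
        writer.writerows(rows)
    return len(rows)


# Hypothetical output location and made-up sample data.
out_path = os.path.join(tempfile.gettempdir(), "spark-data-example.csv")
written = write_csv(
    out_path,
    ["City", "Balance"],
    [("Raleigh", 1200.5), ("Durham", 300.0)],
)
print(f"Wrote {written} rows to {out_path}")
```

Opening the file in "w" mode replaces any existing content, which matches the "new or existing" file wording in the FilePath property above.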

Connect the Activities and Run the Flowchart

If they are not already connected, connect each Activity that you created to complete the RPA project for extracting Spark data and exporting it to CSV.

Click Run to extract Spark data and create a CSV file.

In this article, we used the CData ODBC Driver for Apache Spark to create an automation flow that accesses Spark data in UiPath Studio. Download a free, 30-day trial of the ODBC Driver and start working with live Spark data in UiPath Studio today!