Apache Spark ODBC Driver

The Spark ODBC Driver lets you connect to Apache Spark directly from any application that supports ODBC connectivity.

The Driver maps SQL to Spark SQL, enabling direct standard SQL-92 access to Apache Spark.

Build Charts with Spark Data in Clear Analytics



Create dynamic charts and perform analytics based on Spark data in Clear Analytics.

The CData ODBC Driver for Spark enables access to live data from Spark under the ODBC standard, allowing you to work with Spark data in a wide variety of BI, reporting, and ETL tools, or to query it directly with familiar SQL. This article shows how to use Clear Analytics, a Microsoft Excel Add-In, to connect to Spark as an ODBC source and create queries, tables, and charts (including PivotTables) based on Spark data.

Connect to Spark Data


Configure the ODBC Data Source Name

If you have not already done so, provide values for the required connection properties in the data source name (DSN). You can use the built-in Microsoft ODBC Data Source Administrator to configure the DSN. Configuring the DSN is also the last step of the driver installation. See the "Getting Started" chapter in the help documentation for a guide to using the Microsoft ODBC Data Source Administrator to create and configure a DSN.

Set the Server, Database, User, and Password connection properties to connect to SparkSQL.

When you configure the DSN, you may also want to set the Max Rows connection property. This will limit the number of rows returned, which is especially helpful for improving performance when designing reports and visualizations.
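The same properties can also be supplied in an ODBC connection string when testing the DSN from a script. The following is a minimal sketch, assuming the third-party pyodbc package is installed. The DSN name "CData Spark Source", the server, database, and credential values are hypothetical, and the key MaxRows is an assumed spelling of the Max Rows property; check the driver help for the exact key.

```python
def build_connection_string(dsn, **properties):
    """Join a DSN name and optional property overrides into one ODBC string."""
    parts = [f"DSN={dsn}"]
    for key, value in properties.items():
        text = str(value)
        # Values containing ODBC special characters are wrapped in braces.
        if any(ch in text for ch in ";{}"):
            text = "{" + text.replace("}", "}}") + "}"
        parts.append(f"{key}={text}")
    return ";".join(parts)


def open_connection(connection_string):
    """Open the connection with pyodbc (third-party, imported lazily)."""
    import pyodbc  # requires: pip install pyodbc
    return pyodbc.connect(connection_string)


# Hypothetical values, for illustration only.
conn_str = build_connection_string(
    "CData Spark Source",
    Server="127.0.0.1",
    Database="default",
    User="admin",
    Password="secret",
    MaxRows=1000,  # assumed key for the Max Rows property
)
```

Calling open_connection(conn_str) would then return a pyodbc connection if the DSN exists on the machine. Keeping the row limit low while designing a report and raising it afterwards is one way to apply the performance advice above.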

Configure the Data Source in Clear Analytics

  1. Open Excel and navigate to the CLEAR ANALYTICS ribbon. Once there, open the Data Manager.
  2. Select Database as the data source.
  3. In the Set Connection section, click the option to create a new database.
  4. Select Microsoft ODBC Data Source as the data source and click OK.
  5. Select the DSN you already configured from the drop-down menu.
  6. Back on the Set Connection section, select Standard (ANSI ODBC) Query Builder as the SQL Builder Provider and click Next.
  7. Select the Schema/Owner and choose the domains (tables) that you wish to use in Clear Analytics.
  8. Prepare your data objects as needed by customizing the display names and descriptions of the tables and columns.
  9. For the vast majority of the CData ODBC Drivers, you will not set a key date for your domains.
  10. In the Domain Relations section, add any relational information between tables.
  11. In the Domain Tree section, create groups for your data and add the available items to the groups.
  12. Review the summary of your data and click Finish.
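Before choosing a Schema/Owner and domains in the wizard, it can help to see which schemas and tables the DSN exposes. The sketch below is one way to do that, assuming a pyodbc cursor; cursor.tables() wraps the ODBC catalog call and yields rows with table_schem and table_name attributes. The helper name tables_by_schema is hypothetical.

```python
from collections import defaultdict


def tables_by_schema(cursor):
    """Group table names by schema, using the ODBC catalog via cursor.tables().

    `cursor` is expected to behave like a pyodbc cursor.
    """
    grouped = defaultdict(list)
    for row in cursor.tables():
        # Some drivers report no schema; file those tables under "".
        grouped[row.table_schem or ""].append(row.table_name)
    return {schema: sorted(names) for schema, names in grouped.items()}
```

With pyodbc installed, this could be called as tables_by_schema(pyodbc.connect("DSN=CData Spark Source").cursor()), where the DSN name is again a placeholder for the one you configured.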

Create a Chart with Spark Data

You are now ready to create a chart with Spark data.

Create a New Query

  1. Click Repository in the CLEAR ANALYTICS ribbon.
  2. Create a new query.
  3. Select the columns you wish to retrieve.
  4. Set the aggregation type for your data (use the blank entry if you do not wish to aggregate the data).
  5. Set filters and formulas by dragging columns to the lower window.
  6. Name your query and click Save.
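The choices made in the query builder (columns, aggregation type, filters) correspond roughly to a standard SQL SELECT statement. The sketch below shows that correspondence under stated assumptions: the table and column names are hypothetical, the helper build_query is illustrative, and the SQL it produces is not necessarily the statement Clear Analytics generates.

```python
def build_query(table, columns, aggregates=None, filters=None):
    """Compose a SQL-92 style SELECT from columns, aggregates, and filters.

    aggregates maps a column name to a function name such as "SUM".
    filters is a list of (column, operator) pairs; the values are bound
    separately as ? parameters. Identifiers are not quoted or validated.
    """
    aggregates = aggregates or {}
    filters = filters or []
    select_items = []
    group_by = []
    for col in columns:
        func = aggregates.get(col)
        if func:
            select_items.append(f"{func}({col}) AS {func.lower()}_{col}")
        else:
            # Non-aggregated columns become grouping columns.
            select_items.append(col)
            group_by.append(col)
    sql = f"SELECT {', '.join(select_items)} FROM {table}"
    if filters:
        sql += " WHERE " + " AND ".join(f"{col} {op} ?" for col, op in filters)
    if aggregates and group_by:
        sql += " GROUP BY " + ", ".join(group_by)
    return sql


# Hypothetical table and columns, for illustration only.
sql = build_query(
    "Customers",
    ["Country", "Balance"],
    aggregates={"Balance": "SUM"},
    filters=[("Balance", ">")],
)
```

Leaving a column out of aggregates mirrors the blank aggregation entry in step 4: the column is returned as-is and, when other columns are aggregated, is used for grouping.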

Build a Chart Based on a Query Report

With the query created, you are now ready to execute a report and display a chart.

  1. Click Report Explorer in the CLEAR ANALYTICS ribbon.
  2. In the Report Explorer pane, click the 'New Report' icon in the toolbar.
  3. Select the query you just created.
  4. Name the report and click 'Save and Execute'.
  5. Click the Results tab within the Report Explorer.
  6. Expand your report and drag the chart to the Excel spreadsheet.
  7. In the resulting PivotChart window, drag the fields (columns) to the Filters, Legends, Axis (Categories), and Values windows.
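To make the roles of the Axis (Categories), Legends, and Values areas concrete, the sketch below reproduces the grouping a PivotChart performs, using plain Python. It assumes the Values field is summed, and the sample rows and field names are made up for illustration.

```python
from collections import defaultdict


def pivot(rows, axis, legend, value):
    """Sum `value` for each (axis, legend) pair of field values."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[axis]][row[legend]] += row[value]
    # Convert the nested defaultdicts to plain dicts for easy inspection.
    return {a: dict(series) for a, series in table.items()}


# Hypothetical query results, for illustration only.
rows = [
    {"Country": "US", "Year": 2023, "Balance": 100.0},
    {"Country": "US", "Year": 2023, "Balance": 50.0},
    {"Country": "US", "Year": 2024, "Balance": 75.0},
    {"Country": "DE", "Year": 2024, "Balance": 20.0},
]
chart_data = pivot(rows, axis="Country", legend="Year", value="Balance")
```

Each outer key is a category on the axis, and each inner key is a legend series whose summed value would be plotted for that category.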

With a new data source in Clear Analytics established and a chart created, you are ready to begin analysis of Spark data. With the ODBC Driver for Spark and Clear Analytics, you can perform self-service analytics in Excel with live data, directly from Spark.