Create Datasets from Spark in Domo Workbench and Build Visualizations of Spark Data in Domo

Use the CData ODBC Driver for Spark to create datasets from Spark data in Domo Workbench and then build visualizations in the Domo service.

Domo helps you manage, analyze, and share data across your entire organization, enabling decision makers to identify and act on strategic opportunities. Domo Workbench provides a secure, client-side solution for uploading your on-premise data to Domo. The CData ODBC Driver for Spark links Domo Workbench to operational Spark data. You can build datasets from Spark data using standard SQL queries in Workbench and then create real-time visualizations of Spark data in the Domo service.

The CData ODBC Drivers are built for high-performance interaction with live Spark data in Domo, thanks to optimized data processing in the driver. When you issue complex SQL queries from Domo to Spark, the driver pushes supported SQL operations, such as filters and aggregations, directly to Spark and uses its embedded SQL engine to process unsupported operations (often SQL functions and JOIN operations) client-side. With built-in dynamic metadata querying, you can visualize and analyze Spark data using native Domo data types.
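To make the difference between pushed-down and client-side processing concrete, here is a minimal sketch. It does not show the driver's internals: an in-memory SQLite database stands in for Spark, and the Customers columns, the sample rows, and the function names are all hypothetical. It only illustrates that both strategies produce the same answer while transferring different numbers of rows.

```python
import sqlite3
from collections import Counter


def make_source():
    """Build an in-memory SQLite table that stands in for a remote Spark table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Customers (Id INTEGER, Country TEXT, Balance REAL)")
    conn.executemany(
        "INSERT INTO Customers VALUES (?, ?, ?)",
        [
            (1, "US", 120.0),
            (2, "US", 40.0),
            (3, "DE", 300.0),
            (4, "FR", 75.0),
            (5, "DE", 10.0),
        ],
    )
    return conn


def pushed_down(conn, min_balance):
    """Filter and aggregate at the source; only grouped rows are transferred."""
    rows = conn.execute(
        "SELECT Country, COUNT(*) FROM Customers "
        "WHERE Balance >= ? GROUP BY Country",
        (min_balance,),
    ).fetchall()
    return dict(rows), len(rows)


def client_side(conn, min_balance):
    """Transfer every row, then filter and aggregate locally."""
    rows = conn.execute("SELECT Country, Balance FROM Customers").fetchall()
    counts = Counter(country for country, balance in rows if balance >= min_balance)
    return dict(counts), len(rows)
```

Both functions return the per-country counts together with the number of rows that crossed the connection, so the two strategies can be compared on the same input.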

Connect to Spark as an ODBC Data Source

If you have not already done so, first specify connection properties in an ODBC DSN (data source name). Configuring the DSN is the last step of the driver installation. You can use the Microsoft ODBC Data Source Administrator to create and configure ODBC DSNs.

Set the Server, Database, User, and Password connection properties to connect to SparkSQL.

When you configure the DSN, you may also want to set the Max Rows connection property. This will limit the number of rows returned, which is especially helpful for improving performance when designing reports and visualizations.
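If you want to verify the DSN outside of Workbench, a short script can help. The sketch below assembles an ODBC connection string from a DSN name plus optional property overrides. The DSN name is the one used later in this article; the property spelling MaxRows is an assumption (check the driver's help documentation for the exact name), and the helper functions are hypothetical. The connect helper relies on the third-party pyodbc package and is only needed when you actually open a connection.

```python
def format_value(value):
    """Wrap values containing ';' in braces, doubling any closing brace (ODBC convention)."""
    text = str(value)
    if ";" in text:
        return "{" + text.replace("}", "}}") + "}"
    return text


def build_connection_string(dsn, overrides=None):
    """Return an ODBC connection string such as 'DSN=name;Key=value'."""
    parts = ["DSN=" + dsn]
    for key, value in (overrides or {}).items():
        parts.append(key + "=" + format_value(value))
    return ";".join(parts)


def connect(connection_string):
    """Open a connection through the ODBC driver manager (requires pyodbc)."""
    import pyodbc  # third-party package; imported lazily so the helpers above work without it

    return pyodbc.connect(connection_string)


# "MaxRows" is an assumed spelling of the Max Rows connection property.
DESIGN_TIME = build_connection_string("CData SparkSQL Sys", {"MaxRows": 1000})
```

Properties already stored in the DSN (Server, Database, User, Password) do not need to be repeated; overrides passed this way apply to that connection only.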

After creating a DSN, you will need to create a dataset for Spark in Domo Workbench using the Spark DSN and build a visualization in the Domo service based on the dataset.

Build a Dataset for Spark Data

You can follow the steps below to build a dataset based on a table in Spark in Domo Workbench using the CData ODBC Driver for Spark.

  1. Open Domo Workbench and, if you have not already, add your Domo service server to Workbench. In the Accounts submenu, click Add New, type in the server address (e.g., domain.domo.com), and click through the wizard to authenticate.
  2. In the DataSet Jobs submenu, click Add New.
  3. Name the dataset job (e.g., ODBC Spark Customers), select ODBC Connection Provider as the transport method, and click through the wizard.
  4. In the newly created DataSet Job, navigate to Source and click to configure the settings.
  5. Select System DSN for the Connection Type.
  6. Select the previously configured DSN (CData SparkSQL Sys) for the System DSN.
  7. Click to validate the configuration.
  8. Below the settings, set the Query to a SQL query: SELECT * FROM Customers
     NOTE: Because you connect to Spark data through an ODBC driver, you only need to know SQL to get your data; you do not need to know Spark-specific APIs or protocols.
  9. Click Preview.
  10. Check over the generated schema, add any transformations, then save and run the dataset job.
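Before saving the job, it can be useful to check the query from step 8 outside of Workbench. The following sketch runs a query on any Python DB-API connection and returns the column names with the first few rows, roughly what the Workbench preview displays. The preview and demo_connection helpers are hypothetical, and the SQLite stand-in (with made-up columns and rows) exists only so the example runs without the driver installed.

```python
import sqlite3


def preview(conn, query, limit=10):
    """Run a query and return (column names, first `limit` rows)."""
    cursor = conn.cursor()
    try:
        cursor.execute(query)
        columns = [col[0] for col in cursor.description]
        rows = [tuple(row) for row in cursor.fetchmany(limit)]
    finally:
        cursor.close()
    return columns, rows


def demo_connection():
    """In-memory stand-in for the Spark DSN; table contents are invented."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Customers (Id INTEGER, Name TEXT)")
    conn.executemany(
        "INSERT INTO Customers VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace"), (3, "Edsger")],
    )
    return conn


columns, rows = preview(demo_connection(), "SELECT * FROM Customers", limit=2)
```

With the driver and the third-party pyodbc package installed, a connection opened with pyodbc.connect("DSN=CData SparkSQL Sys") could be passed to preview in place of the stand-in.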

Once the dataset job has run, the dataset is accessible from the Domo service, where you can build visualizations, reports, and more based on Spark data.

Create Data Visualizations

With the DataSet Job saved and run in Domo Workbench, you are ready to build visualizations of the Spark data in the Domo service.

  1. Navigate to the Data Center.
  2. In the data warehouse, select the ODBC data source and drill down to your new dataset.
  3. With the dataset selected, choose to create a visualization.
  4. In the new card:
    • Drag a Dimension to the X Value.
    • Drag a Measure to the Y Value.
    • Choose a Visualization.
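Conceptually, placing a dimension on the X value and a measure on the Y value groups the rows by the dimension and aggregates the measure within each group. The sketch below illustrates that idea with made-up rows; the column names, the summarize helper, and the choice of SUM as the aggregation are assumptions, and the card itself may apply a different aggregation.

```python
from collections import defaultdict


def summarize(rows, dimension, measure):
    """Sum `measure` for each distinct value of `dimension`, sorted by dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return sorted(totals.items())


# Invented sample rows shaped like the result of SELECT * FROM Customers.
sample = [
    {"Country": "US", "Balance": 120.0},
    {"Country": "US", "Balance": 40.0},
    {"Country": "DE", "Balance": 300.0},
    {"Country": "FR", "Balance": 75.0},
    {"Country": "DE", "Balance": 10.0},
]

points = summarize(sample, "Country", "Balance")
```

Each (dimension value, total) pair in points corresponds to one bar or data point in a chart of the measure by the dimension.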

With the CData ODBC Driver for Spark, you can build custom datasets based on Spark data using only SQL in Domo Workbench and then build and share visualizations and reports through the Domo service.