Build Dashboards with Databricks Data in DBxtra
Create dynamic dashboards and perform analytics based on Databricks data in DBxtra.
The CData ODBC Driver for Databricks enables access to live data from Databricks under the ODBC standard, allowing you to work with Databricks data in a wide variety of BI, reporting, and ETL tools, or directly with familiar SQL queries. This article shows how to connect to Databricks data as a generic ODBC Data Provider and create charts, reports, and dashboards based on Databricks data in DBxtra.
About Databricks Data Integration
Accessing and integrating live data from Databricks has never been easier with CData. Customers rely on CData connectivity to:
- Access all versions of Databricks from Runtime Versions 9.1 - 13.X to both the Pro and Classic Databricks SQL versions.
- Leave Databricks in their preferred environment thanks to compatibility with any hosting solution.
- Securely authenticate in a variety of ways, including personal access token, Azure Service Principal, and Azure AD.
- Upload data to Databricks using Databricks File System, Azure Blob Storage, and AWS S3 Storage.
While many customers use CData's solutions to migrate data from different systems into their Databricks data lakehouse, several customers use our live connectivity solutions to federate connectivity between their databases and Databricks. These customers use SQL Server Linked Servers or PolyBase to get live access to Databricks from within their existing RDBMS.
Read more about common Databricks use-cases and how CData's solutions help solve data problems in our blog: What is Databricks Used For? 6 Use Cases.
Getting Started
Connect to Databricks Data
- If you have not already done so, provide values for the required connection properties in the data source name (DSN). You can configure the DSN using the built-in Microsoft ODBC Data Source Administrator. This is also the last step of the driver installation. See the "Getting Started" chapter in the Help documentation for a guide to using the Microsoft ODBC Data Source Administrator to create and configure a DSN.
To connect to a Databricks cluster, set the properties as described below.
Note: You can find the required values in your Databricks instance by navigating to Clusters, selecting the desired cluster, and opening the JDBC/ODBC tab under Advanced Options.
- Server: Set to the Server Hostname of your Databricks cluster.
- HTTPPath: Set to the HTTP Path of your Databricks cluster.
- Token: Set to your personal access token (this value can be obtained by navigating to the User Settings page of your Databricks instance and selecting the Access Tokens tab).
When you configure the DSN, you may also want to set the Max Rows connection property. This will limit the number of rows returned, which is especially helpful for improving performance when designing reports and visualizations.
- Open the DBxtra application, click Project in the New menu, and name the Project.
- Select ODBC Connection as the Data Connection Type.
- Click the browse option for the Data Source.
- In the Data Link Properties window, select Microsoft OLE DB Provider for ODBC Drivers on the Provider tab.
- On the Connection tab, select the Data Source Name and the initial catalog to use (CData).
- Name the Connection and select the appropriate User Groups.
- Double-click the Connection from within the Project to connect to the data.
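Before wiring the connection into DBxtra, it can help to confirm the same connection properties from a script. The following is a minimal sketch, not the driver's documented procedure: it assembles a DSN-less ODBC connection string from the Server, HTTPPath, and Token properties described above. The driver name `CData ODBC Driver for Databricks` and the `MaxRows` spelling of the Max Rows property are assumptions, so check both against the names shown in the ODBC Data Source Administrator and the Help documentation. The `connect` helper relies on the third-party `pyodbc` package.

```python
def build_connection_string(server, http_path, token, max_rows=None):
    """Assemble a DSN-less ODBC connection string from the cluster properties."""
    parts = [
        # Assumed driver name; use the name listed in the ODBC Data Source Administrator.
        ("Driver", "{CData ODBC Driver for Databricks}"),
        ("Server", server),        # Server Hostname of the cluster
        ("HTTPPath", http_path),   # HTTP Path of the cluster
        ("Token", token),          # personal access token
    ]
    if max_rows is not None:
        # Assumed property spelling for the "Max Rows" setting.
        parts.append(("MaxRows", str(max_rows)))
    return ";".join(f"{key}={value}" for key, value in parts) + ";"


def connect(connection_string):
    """Open an ODBC connection; requires the third-party pyodbc package."""
    import pyodbc  # imported lazily so the helper above works without it

    return pyodbc.connect(connection_string, autocommit=True)
```

A call such as `connect(build_connection_string(server, http_path, token, max_rows=1000))` should fail quickly if any of the three values is wrong, which is easier to diagnose in a script than in the Data Link Properties window. If you already configured a DSN, passing `"DSN=<your DSN name>;"` to `connect` exercises the same settings DBxtra will use.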
Create a Dashboard with Databricks Data
You are now ready to create a dashboard with Databricks data.
- Right-click Report Objects under the Project and select New Report Object.
- In the new Report Object, click the link to create the Query.
- In the Select Data Connection window, select the newly created data connection.
- On the Query tab, expand the connection objects and select the Tables, Views, and specific columns you wish to include in your dashboard. You can specify search requirements and even create complex queries which include JOINs and aggregations.
- On the Dashboard tab, select the visualizations and features for your dashboard. Assign the data values from the query to the appropriate fields for the dashboard items (Values, Series, etc.).
With a new Dashboard created, you are ready to begin analysis of Databricks data. Thanks to the ODBC Driver for Databricks, you can refresh the Dashboard and immediately see any changes made at the source. In the same way, you can create and view Reports with live, up-to-date Databricks data.
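The kind of query built on the Query tab, with a JOIN and aggregations, can also be prototyped in a script before it goes into a Report Object. The sketch below is illustrative only: the `Customers` and `Orders` tables and their columns are hypothetical, and an in-memory SQLite database stands in for Databricks so the example runs anywhere. The `fetch_rows` helper uses only standard Python DB-API cursor calls, so it should work unchanged with a `pyodbc` connection to your Databricks DSN.

```python
import sqlite3

# Hypothetical tables and columns; substitute the objects exposed by your connection.
DASHBOARD_QUERY = """
SELECT c.Country AS Country,
       COUNT(o.Id) AS OrderCount,
       SUM(o.Amount) AS TotalAmount
FROM Customers c
JOIN Orders o ON o.CustomerId = c.Id
GROUP BY c.Country
ORDER BY TotalAmount DESC
"""


def fetch_rows(conn, sql):
    """Run a query on any DB-API connection and return rows as dictionaries."""
    cur = conn.cursor()
    try:
        cur.execute(sql)
        columns = [col[0] for col in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]
    finally:
        cur.close()


def demo_connection():
    """Build an in-memory SQLite stand-in with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Country TEXT);
        CREATE TABLE Orders (Id INTEGER PRIMARY KEY, CustomerId INTEGER, Amount REAL);
        INSERT INTO Customers VALUES (1, 'US'), (2, 'DE');
        INSERT INTO Orders VALUES (1, 1, 10.0), (2, 1, 15.0), (3, 2, 40.0);
        """
    )
    return conn


if __name__ == "__main__":
    for row in fetch_rows(demo_connection(), DASHBOARD_QUERY):
        print(row)
```

Each returned dictionary maps naturally onto the dashboard fields: the grouping column (`Country` here) suits a Series, and the aggregated columns suit Values.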