Excel Spreadsheet Automation with the QUERY Formula
Pull data, automate spreadsheets, and more with the QUERY formula.
The CData Excel Add-In for Databricks provides formulas that can edit, save, and delete Databricks data. The three steps below show how to automate the following task: search Databricks data for a user-specified value, then organize the results into an Excel spreadsheet.
The syntax of the CDATAQUERY formula is the following:
=CDATAQUERY(Query, [Connection], [Parameters], [ResultLocation])
The example in this article uses three of these inputs:
- Query: The declaration of the Databricks data records you want to retrieve or the modifications to be made, written in standard SQL.
- Connection: Either the connection name, such as DatabricksConnection1, or a connection string. The connection string consists of the properties required to connect to Databricks data, separated by semicolons.
To connect to a Databricks cluster, set the properties as described below.
Note: You can find the required values in your Databricks instance by navigating to Clusters, selecting the desired cluster, and opening the JDBC/ODBC tab under Advanced Options.
- Server: Set to the Server Hostname of your Databricks cluster.
- HTTPPath: Set to the HTTP Path of your Databricks cluster.
- Token: Set to your personal access token (this value can be obtained by navigating to the User Settings page of your Databricks instance and selecting the Access Tokens tab).
- ResultLocation: The cell that the output of results should start from.
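To make the Connection input concrete, the sketch below assembles a semicolon-separated connection string from the three properties listed above. It is written in Python purely for illustration and is not part of the add-in: the helper name `build_connection_string` and the placeholder values are assumptions, and `Provider=Databricks` is taken from the formula shown later in this article.

```python
def build_connection_string(properties):
    """Join name/value pairs as Name=Value, separated by semicolons."""
    return ";".join(f"{name}={value}" for name, value in properties.items())


# Placeholder values (hypothetical); substitute the values from your own cluster.
properties = {
    "Server": "<server-hostname>",
    "HTTPPath": "<http-path>",
    "Token": "<personal-access-token>",
    "Provider": "Databricks",
}

connection_string = build_connection_string(properties)
print(connection_string)
```

The Excel formula builds the same kind of string with the & operator, reading each value from a cell instead of a dictionary.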
Pass Spreadsheet Cells as Inputs to the Query
The procedure below results in a spreadsheet that organizes all the formula inputs in the first column.
- Define cells for the formula inputs. In addition to the connection inputs, add an input that holds the filter criterion used to search Databricks data, such as Country. In this example, B1 holds the Server, B2 the HTTPPath, B3 the Token, and B4 the Country value; B5 is passed as the last argument, the result location.
- In another cell, write the formula, referencing the input cells defined above. Single quotes enclose values, such as addresses, that may contain spaces.
=CDATAQUERY("SELECT * FROM Customers WHERE Country = '"&B4&"'","Server="&B1&";HTTPPath="&B2&";Token="&B3&";Provider=Databricks",B5)
- Change the filter value to change the data that is returned.
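The string concatenation in the formula above can be easier to reason about outside Excel. The Python sketch below mirrors how the query text is assembled from a filter value, and it shows one caveat: a value that itself contains a single quote would end the SQL literal early. Doubling the quote is the standard SQL escape. Whether the add-in requires this is an assumption here, and `build_query` is a hypothetical helper, not part of the add-in.

```python
def build_query(country):
    """Mirror the Excel concatenation: wrap the filter value in single quotes."""
    # Standard SQL escapes a single quote inside a literal by doubling it.
    escaped = country.replace("'", "''")
    return "SELECT * FROM Customers WHERE Country = '" + escaped + "'"


print(build_query("United States"))
print(build_query("Cote d'Ivoire"))
```

In Excel, the same doubling could be applied to the filter cell with SUBSTITUTE(cell,"'","''") before it is concatenated into the query.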