Extend Google Sheets with Databricks Data



Make calls to the API Server from Google Apps Script.

Interact with Databricks data from Google Sheets through macros, custom functions, and add-ons. The CData API Server enables connectivity to Databricks data from cloud-based and mobile applications like Google Sheets. The API Server is a lightweight Web application that produces OData services for Databricks.

Google Apps Script can consume these OData services in the JSON format. This article shows how to create a simple add-on that populates a Google Spreadsheet with Customers data and, as you make changes, executes updates to Databricks data.

About Databricks Data Integration

Accessing and integrating live data from Databricks has never been easier with CData. Customers rely on CData connectivity to:

  • Access all versions of Databricks from Runtime Versions 9.1 - 13.X to both the Pro and Classic Databricks SQL versions.
  • Leave Databricks in their preferred environment thanks to compatibility with any hosting solution.
  • Securely authenticate in a variety of ways, including personal access token, Azure Service Principal, and Azure AD.
  • Upload data to Databricks using Databricks File System, Azure Blob Storage, and AWS S3 Storage.

While many customers use CData's solutions to migrate data from different systems into their Databricks data lakehouse, several customers use our live connectivity solutions to federate connectivity between their databases and Databricks. These customers use SQL Server Linked Servers or PolyBase to get live access to Databricks from within their existing RDBMS.

Read more about common Databricks use-cases and how CData's solutions help solve data problems in our blog: What is Databricks Used For? 6 Use Cases.


Getting Started


Set Up the API Server

If you have not already done so, download the CData API Server. Once you have installed the API Server, follow the steps below to begin producing secure Databricks OData services:

Connect to Databricks

To work with Databricks data from Google Sheets, we start by creating and configuring a Databricks connection. Follow the steps below to configure the API Server to connect to Databricks data:

  1. First, navigate to the Connections page.
  2. Click Add Connection and then search for and select the Databricks connection.
  3. Enter the necessary authentication properties to connect to Databricks.

    To connect to a Databricks cluster, set the properties as described below.

    Note: The needed values can be found in your Databricks instance by navigating to Clusters, selecting the desired cluster, and selecting the JDBC/ODBC tab under Advanced Options.

    • Server: Set to the Server Hostname of your Databricks cluster.
    • HTTPPath: Set to the HTTP Path of your Databricks cluster.
    • Token: Set to your personal access token (this value can be obtained by navigating to the User Settings page of your Databricks instance and selecting the Access Tokens tab).
  4. After configuring the connection, click Save & Test to confirm a successful connection.

Configure API Server Users

Next, create a user to access your Databricks data through the API Server. You can add and configure users on the Users page. Follow the steps below to configure and create a user:

  1. On the Users page, click Add User to open the Add User dialog.
  2. Next, set the Role, Username, and Privileges properties and then click Add User.
  3. An Authtoken is then generated for the user. You can find the Authtoken and other information for each user on the Users page.
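
The scripts later in this article send the username and Authtoken to the API Server using HTTP Basic authentication. As a minimal sketch, the header value can be built with a small helper. The name basicAuthHeader is hypothetical, and the Buffer fallback is only an assumption that lets the snippet run outside Apps Script (for example, in Node.js).

```javascript
// Hypothetical helper (not part of the API Server or Apps Script):
// builds the value of the Authorization header for HTTP Basic authentication.
function basicAuthHeader(user, authtoken) {
  var credentials = user + ":" + authtoken;
  // Inside Apps Script, the Utilities service provides Base64 encoding.
  if (typeof Utilities !== "undefined") {
    return "Basic " + Utilities.base64Encode(credentials);
  }
  // Fallback so the sketch can also be tried in Node.js.
  return "Basic " + Buffer.from(credentials, "utf8").toString("base64");
}
```

In a script, the result would be passed as headers: {"Authorization": basicAuthHeader("MyUser", "MyAuthtoken")}, where MyUser and MyAuthtoken are placeholders for your own values.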

Creating API Endpoints for Databricks

Having created a user, you are ready to create API endpoints for the Databricks tables:

  1. First, navigate to the API page and then click Add Table.
  2. Select the connection you wish to access and click Next.
  3. With the connection selected, create endpoints by selecting each table and then clicking Confirm.

Gather the OData URL

Having configured a connection to Databricks data, created a user, and added resources to the API Server, you now have an easily accessible REST API based on the OData protocol for those resources. From the API page in API Server, you can view and copy the API Endpoints for the API.
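
Once you have copied an endpoint URL, OData system query options such as $select, $filter, and $top can be appended to narrow the results. The following is a minimal sketch of one way to assemble such a URL. The helper name buildODataUrl, the placeholder host MyUrl, and the filter value 'US' are assumptions for illustration only.

```javascript
// Hypothetical helper: appends OData system query options to an endpoint URL.
// "options" maps option names without the "$" prefix (select, filter, top,
// orderby) to their values.
function buildODataUrl(endpoint, options) {
  var parts = [];
  for (var name in options) {
    if (Object.prototype.hasOwnProperty.call(options, name)) {
      // Percent-encode each value so spaces and commas are URL-safe.
      parts.push("$" + name + "=" + encodeURIComponent(options[name]));
    }
  }
  return parts.length ? endpoint + "?" + parts.join("&") : endpoint;
}

// Placeholder endpoint and column names matching the example in this article.
var url = buildODataUrl("https://MyUrl/api.rsc/Customers", {
  select: "Id,City,CompanyName,Country",
  filter: "Country eq 'US'",
  top: 10
});
```

Keeping the options in an object makes it easier to change the selected columns or the filter without editing a long URL string by hand.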

Retrieve Databricks Data

Open the Script Editor from your spreadsheet by clicking Tools -> Script Editor (in newer versions of Google Sheets, Extensions -> Apps Script). In the Script Editor, add the following function to populate a spreadsheet with the results of an OData query:

function retrieve() {
  // Request only the columns shown in the sheet ($select is an OData query option).
  var url = "https://MyUrl/api.rsc/Customers?$select=Id,City,CompanyName,Country";
  var response = UrlFetchApp.fetch(url, {
    headers: {"Authorization": "Basic " + Utilities.base64Encode("MyUser:MyAuthtoken")}
  });
  var json = response.getContentText();
  var sheet = SpreadsheetApp.getActiveSheet();
  var a1 = sheet.getRange("A1");
  var customers = JSON.parse(json).value;

  // Write the header row.
  var cols = [["Id", "City", "CompanyName", "Country"]];
  sheet.getRange(1, 1, 1, 4).setValues(cols);

  // Write one row per customer. "row" is an offset from A1,
  // so 1 places the first record directly below the header.
  var row = 1;
  for (var i in customers) {
    for (var j in customers[i]) {
      switch (j) {
        case "Id": a1.offset(row, 0).setValue(customers[i][j]); break;
        case "City": a1.offset(row, 1).setValue(customers[i][j]); break;
        case "CompanyName": a1.offset(row, 2).setValue(customers[i][j]); break;
        case "Country": a1.offset(row, 3).setValue(customers[i][j]); break;
      }
    }
    row++;
  }
}
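
The retrieve function above writes each cell with a separate setValue call, which can be slow for larger result sets. As an alternative sketch, not part of the original script, the records can first be converted into a two-dimensional array and written with a single setValues call. The helper name toRows and the sample records are hypothetical.

```javascript
// Hypothetical helper: turns the "value" array of an OData JSON response into
// a 2D array (header row first) that Range.setValues can write in one call.
function toRows(records, columns) {
  var rows = [columns.slice()];
  records.forEach(function (record) {
    rows.push(columns.map(function (name) {
      // Missing or null fields become empty cells.
      return record[name] == null ? "" : record[name];
    }));
  });
  return rows;
}

// Sample data standing in for JSON.parse(json).value.
var sample = [
  { Id: 1, City: "Berlin", CompanyName: "Acme", Country: "Germany" },
  { Id: 2, City: "Lyon", CompanyName: "Globex" }
];
var rows = toRows(sample, ["Id", "City", "CompanyName", "Country"]);
// In Apps Script, the whole block could then be written with:
// sheet.getRange(1, 1, rows.length, rows[0].length).setValues(rows);
```

Because every row has the same number of columns, the array always matches the dimensions of the target range, which setValues requires.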

Follow the steps below to add an installable trigger to populate the spreadsheet when opened:

  1. Click Resources -> Current Project's Triggers -> Add a New Trigger.
  2. Select retrieve in the Run menu.
  3. Select From Spreadsheet.
  4. Select On open.

After closing the dialog, you are prompted to allow access to the application.
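
The same trigger can also be created from code instead of through the dialog. The sketch below uses the Apps Script ScriptApp service and assumes it is run once from the Script Editor. The function name installOpenTrigger is hypothetical.

```javascript
// Sketch: create the "On open" installable trigger for retrieve from code.
// Returns true if a trigger was created, false if one already existed.
function installOpenTrigger() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  // Avoid duplicates: skip creation if a trigger for retrieve already exists.
  var existing = ScriptApp.getProjectTriggers().filter(function (t) {
    return t.getHandlerFunction() === "retrieve";
  });
  if (existing.length > 0) {
    return false;
  }
  ScriptApp.newTrigger("retrieve").forSpreadsheet(ss).onOpen().create();
  return true;
}
```

As with the dialog, you are asked to authorize the script the first time the function runs.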

Post Changes to Databricks Data

Add the following function to post changes to cells back to the API Server:

function buildReq(e) {
  var sheet = SpreadsheetApp.getActiveSheet();
  var changes = e.range;
  // The key is in column 1 of the edited row; the column name is in row 1.
  var id = sheet.getRange(changes.getRow(), 1).getValue();
  var col = sheet.getRange(1, changes.getColumn()).getValue();
  var url = "http://MyServer/api.rsc/Customers(" + id + ")";
  // Build the body as an object so that JSON.stringify escapes quotes in the value.
  var body = {"@odata.type": "CDataAPI.Customers"};
  body[col] = String(changes.getValue());
  var putdata = JSON.stringify(body);
  UrlFetchApp.fetch(url, {
    method: "put",
    contentType: "application/json",
    payload: putdata,
    headers: {"Authorization": "Basic " + Utilities.base64Encode("MyUser:MyAuthtoken")}
  });
}
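
As written, buildReq sends a request for every edit, including edits to the header row or to the Id column, and it assumes a numeric key. The following sketch shows two small helpers that could guard against those cases. Both names (isSyncableEdit and entityUrl) are hypothetical, and whether your Customers key is numeric or a string depends on your own table.

```javascript
// Hypothetical helper: true only for edits that should be sent to the server,
// that is, a data row (not the header in row 1) and a non-key column
// (not Id in column 1) within the populated columns.
function isSyncableEdit(row, column, columnCount) {
  return row > 1 && column > 1 && column <= columnCount;
}

// Hypothetical helper: builds the URL of a single entity. Numeric keys are
// used as-is; string keys are wrapped in single quotes with embedded quotes
// doubled, following OData key syntax.
function entityUrl(endpoint, id) {
  var key = typeof id === "number"
    ? String(id)
    : "'" + String(id).replace(/'/g, "''") + "'";
  return endpoint + "(" + encodeURIComponent(key) + ")";
}
```

Inside buildReq, a check such as if (!isSyncableEdit(changes.getRow(), changes.getColumn(), 4)) return; placed before the request would skip edits that do not map to an updatable field.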

Follow the steps below to add the update trigger:

  1. Click Resources -> Current Project's Triggers.
  2. Select buildReq in the Run menu.
  3. Select From Spreadsheet.
  4. Select On edit.

You can test the script by clicking Publish -> Test as Add-On. Select the version, installation type, and spreadsheet to create a test configuration. You can then select and run the test configuration.

As you make changes to cells, the API Server executes updates to Databricks data.
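
If an update is rejected, UrlFetchApp.fetch throws by default and the error only appears in the execution log. As a hedged sketch, the request can instead be sent with muteHttpExceptions so that the status code can be checked and a short notice shown in the spreadsheet. The function name sendUpdate is hypothetical, and it is meant as a possible replacement for the direct UrlFetchApp.fetch call in buildReq.

```javascript
// Sketch: send a request and report failures in the spreadsheet.
// Returns true on a 2xx response, false otherwise.
function sendUpdate(url, options) {
  // Copy the options and ask UrlFetchApp not to throw on HTTP error codes.
  var opts = {};
  for (var k in options) {
    opts[k] = options[k];
  }
  opts.muteHttpExceptions = true;

  var response = UrlFetchApp.fetch(url, opts);
  var code = response.getResponseCode();
  if (code >= 200 && code < 300) {
    return true;
  }
  // Show a short notice to the user and keep the response body in the log.
  SpreadsheetApp.getActiveSpreadsheet().toast("Update failed (HTTP " + code + ")");
  Logger.log(response.getContentText());
  return false;
}
```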
