Query Databricks Data from Node.js

The API Server exposes web services that provide connectivity to your data. Use the OData endpoint of the CData API Server to execute CRUD queries against Databricks data from Node.js.

The CData API Server, when paired with the ADO.NET Provider for Databricks, exposes Databricks data (or data from any of 200+ other ADO.NET Providers) as an OData endpoint, which can be queried from Node.js using simple HTTP requests. This article shows how to use the API Server to request JSON-formatted Databricks data in Node.js.

Set Up the API Server

Follow the steps below to begin producing secure Databricks OData services:

Deploy

The API Server runs on your own server. On Windows, you can deploy using the stand-alone server or IIS. On a Java servlet container, drop in the API Server WAR file. See the help documentation for more information and how-tos.

The API Server is also easy to deploy on Microsoft Azure, Amazon EC2, and Heroku.

Connect to Databricks

After you deploy the API Server and the ADO.NET Provider for Databricks, open the API Server administration console, click Settings -> Connections, and add a new connection with the authentication values and other connection properties required to connect to Databricks.

To connect to a Databricks cluster, set the properties as described below.

Note: You can find the required values in your Databricks instance by navigating to Clusters, selecting the desired cluster, and opening the JDBC/ODBC tab under Advanced Options.

  • Server: Set to the Server Hostname of your Databricks cluster.
  • HTTPPath: Set to the HTTP Path of your Databricks cluster.
  • Token: Set to your personal access token (this value can be obtained by navigating to the User Settings page of your Databricks instance and selecting the Access Tokens tab).

You can then choose the Databricks entities you want the API Server to expose by clicking Settings -> Resources.

Authorize API Server Users

After determining the OData services you want to produce, authorize users by clicking Settings -> Users. The API Server uses authtoken-based authentication and supports the major authentication schemes. Access can also be restricted based on IP address; all IP addresses except the local machine are restricted by default. You can authenticate as well as encrypt connections with SSL.
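As an illustration of how a Node.js client might present these credentials, the sketch below builds an HTTP Basic Authorization header from a user name and authtoken. This is the same header Node.js computes when an `auth: 'user:password'` option is passed to an HTTP request. The function name `basicAuthHeader` is hypothetical, and MyUser and MyAuthtoken are placeholders for the values configured under Settings -> Users.

```javascript
// Build an HTTP Basic Authorization header value from a user name and authtoken.
// Both arguments are placeholders; substitute the user and authtoken
// configured in the API Server administration console.
function basicAuthHeader(user, authtoken) {
  var credentials = Buffer.from(user + ':' + authtoken, 'utf8').toString('base64');
  return 'Basic ' + credentials;
}

// Headers object that could be passed in the options of an HTTP request.
var headers = { Authorization: basicAuthHeader('MyUser', 'MyAuthtoken') };
console.log(headers.Authorization);
```

Because Basic credentials are only base64-encoded, not encrypted, this scheme is best combined with an SSL-enabled connection.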

Consume Databricks OData Feeds from Node.js

OData feeds are easy to work with in Node.js. You can use the HTTP client in Node.js to request JSON-formatted data from the API Server's OData endpoint. The response arrives in chunks; after assembling the chunks into the complete response body, call the JSON.parse() function to parse it into records.
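The parsing step can be sketched as follows. This is a minimal illustration that assumes the records are wrapped in a top-level "value" array, as in OData v4 JSON responses, and falls back to the "d.results" wrapper used by older OData versions. The helper name `extractRecords` and the sample body are hypothetical; check the actual response shape returned by your API Server.

```javascript
// Parse an OData JSON response body and return its records as an array.
// Assumption: records sit in a top-level "value" array (OData v4 style);
// a "d.results" wrapper (older OData versions) is handled as a fallback.
function extractRecords(body) {
  var payload = JSON.parse(body); // throws SyntaxError on malformed JSON
  if (payload && Array.isArray(payload.value)) {
    return payload.value;
  }
  if (payload && payload.d && Array.isArray(payload.d.results)) {
    return payload.d.results;
  }
  return []; // no recognizable record wrapper
}

// Hypothetical response body, for illustration only.
var sample = '{"value":[{"Id":"1","Country":"US"},{"Id":"2","Country":"US"}]}';
var records = extractRecords(sample);
console.log(records.length);
```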

The code below makes an authenticated request for Customers data. The example URL applies a simple filter that returns only records whose Country column has the value US.

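var http = require('http');

// Request Customers records whose Country column equals 'US'.
http.get({
  protocol: "http:",
  hostname: "MyServer.com",
  port: MyPort, // placeholder: replace with the port number of your API Server
  path: "/api.rsc/Customers?$filter=" + encodeURIComponent("Country eq 'US'"),
  auth: 'MyUser:MyAuthtoken' // sent as HTTP Basic credentials
}, function(res) {
  var body = '';
  res.setEncoding('utf8'); // decode chunks as text so multi-byte characters are not split
  res.on('data', function(chunk) {
    body += chunk; // accumulate the response body
  });
  res.on('end', function() {
    console.log(body);
    var jsonData = JSON.parse(body); // parsed OData response
  });
}).on('error', function(e) {
  console.log("Error: ", e);
});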
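The request path above is assembled by hand. As a hedged sketch, a small helper can build it from a resource name and a set of OData query options. The function name `buildODataPath` and the column names are hypothetical, and the `$select` and `$top` options are standard OData query options that are assumed, not confirmed by this article, to be supported by your endpoint.

```javascript
// Build the path and query string for an OData request under /api.rsc/.
// "options" may carry filter (string), select (array of column names),
// and top (number); all are optional.
function buildODataPath(resource, options) {
  options = options || {};
  var params = [];
  if (options.filter) {
    params.push('$filter=' + encodeURIComponent(options.filter));
  }
  if (options.select && options.select.length) {
    params.push('$select=' + encodeURIComponent(options.select.join(',')));
  }
  if (options.top) {
    params.push('$top=' + encodeURIComponent(String(options.top)));
  }
  var base = '/api.rsc/' + encodeURIComponent(resource);
  return params.length ? base + '?' + params.join('&') : base;
}

// Same filter as the request above, limited to ten records.
var customersPath = buildODataPath('Customers', { filter: "Country eq 'US'", top: 10 });
console.log(customersPath);
```

The returned string can be used as the `path` value in the request options. Encoding each option value separately keeps spaces and other reserved characters in filter expressions from breaking the URL.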