Query Databricks Data as a MySQL Database in Node.js

Execute MySQL queries against Databricks data from Node.js.

You can use CData Connect Cloud to query Databricks data through a MySQL interface. Follow the procedure below to create a virtual database for Databricks in Connect Cloud and start querying using Node.js.

CData Connect Cloud provides a pure MySQL, cloud-to-cloud interface for Databricks, so you can query live Databricks data in Node.js without replicating the data to a natively supported database. As you query data in Node.js, CData Connect Cloud pushes all supported SQL operations (filters, JOINs, etc.) directly to Databricks, so the processing runs server-side and results return quickly.

Create a Virtual MySQL Database for Databricks Data

CData Connect Cloud uses a straightforward, point-and-click interface to connect to data sources and generate APIs.

  1. Log in to Connect Cloud and click Databases.
  2. Select "Databricks" from Available Data Sources.
  3. Enter the necessary authentication properties to connect to Databricks.

    To connect to a Databricks cluster, set the properties as described below.

    Note: You can find the required values in your Databricks instance by navigating to Clusters, selecting the desired cluster, and opening the JDBC/ODBC tab under Advanced Options.

    • Server: Set to the Server Hostname of your Databricks cluster.
    • HTTPPath: Set to the HTTP Path of your Databricks cluster.
    • Token: Set to your personal access token. You can obtain this value on the User Settings page of your Databricks instance, under the Access Tokens tab.
  4. Click Test Database.
  5. Click Privileges -> Add and add the new user (or an existing user) with the appropriate permissions.

With the virtual database created, you are ready to connect to Databricks data from any MySQL client.

Query Databricks from Node.js

The following example shows how to define a connection and execute queries against Databricks data with the MySQL module for Node.js (the mysql package). You will need the following information:

  • Host name (or address) and port: The address of your Connect Cloud instance (myinstance.cdatacloud.net) and the port (3306)
  • Username and password: The username and password of a user you authorized on Connect Cloud
  • Database name: The database you configured for Databricks (databricksdb)
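The values listed above do not have to be hard-coded in your script. As a minimal sketch, the helper below assembles them from environment variables into the options object that the MySQL module expects. The function name buildConnectionConfig and the CONNECT_* variable names are assumptions made for illustration only; they are not part of Connect Cloud or the MySQL module.

```javascript
const fs = require('fs');

// Hypothetical helper: builds connection options from environment variables.
// The CONNECT_* names are illustrative; use whatever naming fits your setup.
function buildConnectionConfig(env) {
  const required = ['CONNECT_HOST', 'CONNECT_USER', 'CONNECT_PASSWORD', 'CONNECT_DATABASE'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error('Missing connection settings: ' + missing.join(', '));
  }

  const config = {
    host: env.CONNECT_HOST,
    port: Number(env.CONNECT_PORT || 3306), // default MySQL port used in this article
    user: env.CONNECT_USER,
    password: env.CONNECT_PASSWORD,
    database: env.CONNECT_DATABASE,
  };

  // Only read a CA certificate when a path is supplied.
  if (env.CONNECT_CA_FILE) {
    config.ssl = { ca: fs.readFileSync(env.CONNECT_CA_FILE) };
  }
  return config;
}
```

The returned object could then be passed to mysql.createConnection(buildConnectionConfig(process.env)) in place of literal values.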

Connect to Databricks data and start executing queries with the code below:

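const mysql = require('mysql');
const fs    = require('fs');

// Connection settings for the Connect Cloud MySQL endpoint
const connection = mysql.createConnection({
  host     : 'myinstance.cdatacloud.net',
  database : 'databricksdb',
  port     : 3306,
  user     : 'admin',
  password : 'password',
  ssl      : {
    // CA certificate used to verify the server's TLS certificate
    ca : fs.readFileSync('C:/certs/myCA.pem')
  }
});

connection.connect();

// Run a query and print the returned rows
connection.query('SELECT * FROM Customers', function (err, rows, fields) {
  if (err) throw err;
  console.log(rows);
});

// Queued behind the query above, so the connection closes after it completes
connection.end();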
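
If you prefer Promises to callbacks, the query call above can be wrapped. The following is a minimal sketch, not part of the MySQL module: queryAsync is a hypothetical helper that accepts any object exposing query(sql, values, callback), which is the callback form the MySQL module's connection provides.

```javascript
// Hypothetical helper: wraps a callback-style query in a Promise.
// `connection` is any object exposing query(sql, values, callback),
// such as a connection created with the mysql module.
function queryAsync(connection, sql, values) {
  return new Promise((resolve, reject) => {
    connection.query(sql, values || [], (err, rows) => {
      if (err) {
        reject(err);   // surface query errors to the caller
      } else {
        resolve(rows); // rows returned by the query
      }
    });
  });
}
```

With the connection defined earlier, a call might look like queryAsync(connection, 'SELECT * FROM Customers WHERE Country = ?', ['US']).then(console.log), where the Country column is an assumed example and the ? placeholder is filled in by the MySQL module with the escaped value.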