Query HDFS Data as a MySQL Database in Node.js

Execute MySQL queries against HDFS data from Node.js.

You can use CData Connect Cloud to query HDFS data through a MySQL interface. Follow the procedure below to create a virtual database for HDFS in Connect Cloud and start querying using Node.js.

CData Connect Cloud provides a pure MySQL, cloud-to-cloud interface for HDFS, so you can query live HDFS data in Node.js without replicating the data to a natively supported database. As you query data in Node.js, CData Connect Cloud pushes all supported SQL operations (filters, JOINs, etc.) directly to HDFS, leveraging server-side processing to return HDFS data quickly.

Create a Virtual MySQL Database for HDFS Data

CData Connect Cloud uses a straightforward, point-and-click interface to connect to data sources and generate APIs.

  1. Log in to Connect Cloud and click Databases.
  2. Select "HDFS" from Available Data Sources.
  3. Enter the necessary authentication properties to connect to HDFS.

    In order to authenticate, set the following connection properties:

    • Host: Set this value to the host of your HDFS installation.
    • Port: Set this value to the port of your HDFS installation. Default port: 50070 (the NameNode web port in Hadoop 2.x; Hadoop 3.x installations typically use 9870).
  4. Click Test Database.
  5. Click Privileges -> Add and add the new user (or an existing user) with the appropriate permissions.

With the virtual database created, you are ready to connect to HDFS data from any MySQL client.

Query HDFS from Node.js

The following example shows how to define a connection and execute queries against HDFS with the mysql module for Node.js. You will need the following information:

  • Host name (or address) and port: The address of your Connect Cloud instance (myinstance.cdatacloud.net) and the port (3306)
  • Username and password: The username and password of a user you authorized on Connect Cloud
  • Database name: The database you configured for HDFS (hdfsdb)
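
The values listed above do not have to be hardcoded in the script. As a minimal sketch, assuming the settings are supplied through environment variables, a small helper could assemble the connection options. The function name buildConnectionOptions and the CONNECT_* variable names are hypothetical and chosen only for illustration; they are not part of Connect Cloud or the mysql module.

```javascript
var fs = require('fs');

// Hypothetical helper: builds an options object for mysql.createConnection()
// from environment variables. The CONNECT_* names are assumptions made for
// this sketch, not names defined by Connect Cloud.
function buildConnectionOptions(env) {
  var required = ['CONNECT_HOST', 'CONNECT_USER', 'CONNECT_PASSWORD', 'CONNECT_DATABASE'];
  var missing = required.filter(function (name) { return !env[name]; });
  if (missing.length > 0) {
    throw new Error('Missing environment variables: ' + missing.join(', '));
  }

  var options = {
    host     : env.CONNECT_HOST,
    port     : Number(env.CONNECT_PORT || 3306), // fall back to the MySQL default port
    user     : env.CONNECT_USER,
    password : env.CONNECT_PASSWORD,
    database : env.CONNECT_DATABASE
  };

  // Optional: path to a CA certificate file used to verify the server
  if (env.CONNECT_CA_FILE) {
    options.ssl = { ca : fs.readFileSync(env.CONNECT_CA_FILE) };
  }
  return options;
}
```

The result can then be passed to the mysql module, for example mysql.createConnection(buildConnectionOptions(process.env)), which keeps the password out of source control.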

Connect to HDFS data and start executing queries with the code below:

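var mysql      = require('mysql');
var fs         = require('fs');

// Connection settings for the Connect Cloud instance and the virtual HDFS database
var connection = mysql.createConnection({
  host     : 'myinstance.cdatacloud.net',
  database : 'hdfsdb',
  port     : 3306,
  user     : 'admin',
  password : 'password',
  ssl      : {
    // CA certificate used to verify the server's TLS certificate
    ca : fs.readFileSync('C:/certs/myCA.pem')
  }
});

connection.connect(function(err) {
  // Surface connection failures (bad host, credentials, or certificate)
  if (err) throw err;
});

connection.query('SELECT * FROM Files', function(err, rows, fields) {
  if (err) throw err;
  console.log(rows);
});

// Close the connection once the queued query has completed
connection.end();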
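
The callback style above works, but throwing inside a callback ends the process on any error. As an alternative sketch, the query call can be wrapped in a Promise so that errors can be handled with catch and the connection is always closed. The helper names queryAsync and queryAndClose are hypothetical; the only mysql module calls they rely on are connection.query(sql, values, callback) and connection.end().

```javascript
// Hypothetical helper: wraps the callback-style connection.query() in a Promise.
// `values` is optional and fills any `?` placeholders in the SQL string.
function queryAsync(connection, sql, values) {
  return new Promise(function (resolve, reject) {
    connection.query(sql, values, function (err, rows) {
      if (err) {
        reject(err);
      } else {
        resolve(rows);
      }
    });
  });
}

// Hypothetical helper: runs a single query and closes the connection
// afterwards, whether the query succeeded or failed.
async function queryAndClose(connection, sql, values) {
  try {
    return await queryAsync(connection, sql, values);
  } finally {
    connection.end();
  }
}
```

With a connection created as in the example above, usage might look like queryAndClose(connection, 'SELECT * FROM Files').then(console.log).catch(console.error), which logs the rows on success and the error object on failure.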