Query HDFS Data as a MySQL Database in Node.js

Execute MySQL queries against HDFS data from Node.js.

You can use the CData Cloud Hub to query HDFS data through a MySQL interface. Follow the procedure below to create a virtual database for HDFS in the Cloud Hub and start querying using Node.js.

The CData Cloud Hub provides a pure MySQL, cloud-to-cloud interface for HDFS, so you can query live HDFS data in Node.js without replicating the data to a natively supported database. As you query data in Node.js, the Cloud Hub pushes all supported SQL operations (filters, JOINs, etc.) directly to HDFS, using server-side processing to return HDFS data quickly.

Create a Virtual MySQL Database for HDFS Data

CData Cloud Hub uses a straightforward, point-and-click interface to connect to data sources and generate APIs.

  1. Log in to Cloud Hub and click Databases.
  2. Select "HDFS" from Available Data Sources.
  3. Enter the necessary authentication properties to connect to HDFS.

    To authenticate, set the following connection properties:

    • Host: Set this value to the host of your HDFS installation.
    • Port: Set this value to the port of your HDFS installation. The default port is 50070.
  4. Click Test Database.
  5. Click Privileges -> Add and add the new user (or an existing user) with the appropriate permissions.

With the virtual database created, you are ready to connect to HDFS data from any MySQL client.

Query HDFS from Node.js

The following example shows how to define a connection and run queries against HDFS data with the mysql module for Node.js (installed with npm install mysql). You will need the following information:

  • Host name, or address, and port: The address of your instance of the Cloud Hub (myinstance.cdatacloud.net) and the port (3306)
  • Username and password: The username and password of a user you authorized on the Cloud Hub
  • Database name: The database you configured for HDFS (hdfsdb)
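
Hard-coding these values is fine for a first test, but credentials are usually kept out of source files. The following is a minimal sketch of one way to assemble the connection options from environment variables. The function name buildConnectionConfig and the CLOUDHUB_* variable names are assumptions made for illustration and are not part of the Cloud Hub or the mysql module.

```javascript
const fs = require('fs');

// Hypothetical helper: builds the options object for mysql.createConnection()
// from environment-style settings. The CLOUDHUB_* names are illustrative only.
function buildConnectionConfig(env) {
  const config = {
    host: env.CLOUDHUB_HOST || 'myinstance.cdatacloud.net',
    port: Number(env.CLOUDHUB_PORT || 3306),
    database: env.CLOUDHUB_DATABASE || 'hdfsdb',
    user: env.CLOUDHUB_USER,
    password: env.CLOUDHUB_PASSWORD
  };

  // Fail early with a clear message instead of a late authentication error.
  if (!config.user || !config.password) {
    throw new Error('CLOUDHUB_USER and CLOUDHUB_PASSWORD must be set');
  }

  // Only read the CA certificate when a path is supplied.
  if (env.CLOUDHUB_CA_FILE) {
    config.ssl = { ca: fs.readFileSync(env.CLOUDHUB_CA_FILE) };
  }
  return config;
}

// Usage (requires the mysql module):
//   const mysql = require('mysql');
//   const connection = mysql.createConnection(buildConnectionConfig(process.env));
```

The helper takes the environment as a parameter rather than reading process.env directly, which makes it easy to call with a plain object when testing.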

Connect to HDFS data and start executing queries with the code below:

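// Load the mysql client module and fs (used to read the CA certificate)
const mysql = require('mysql');
const fs    = require('fs');

// Connection settings for the Cloud Hub virtual database
const connection = mysql.createConnection({
  host     : 'myinstance.cdatacloud.net',
  database : 'hdfsdb',
  port     : 3306,
  user     : 'admin',
  password : 'password',
  ssl      : {
    ca : fs.readFileSync('C:/certs/myCA.pem')
  }
});

// Open the connection; a connection error is reported to the query callback below
connection.connect();

// Retrieve all rows from the Files table and print them
connection.query('SELECT * FROM Files', function(err, rows, fields) {
  if (err) throw err;
  console.log(rows);
});

// Close the connection once the queued query has completed
connection.end();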
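
The callback-based example above can also be wrapped in a Promise for use with async/await. The sketch below is one possible approach: queryAsync and listFiles are hypothetical helper names, and the query assumes the Cloud Hub accepts a LIMIT clause the way a regular MySQL server does. The ? placeholder is filled in by the mysql module from the values array, which avoids building SQL strings by hand.

```javascript
// Hypothetical helper: wraps the callback-style connection.query(sql, values, cb)
// in a Promise so callers can use async/await.
function queryAsync(connection, sql, values) {
  return new Promise(function (resolve, reject) {
    connection.query(sql, values || [], function (err, rows) {
      if (err) return reject(err);
      resolve(rows);
    });
  });
}

// Hypothetical helper: returns at most `limit` rows from the Files table.
function listFiles(connection, limit) {
  return queryAsync(connection, 'SELECT * FROM Files LIMIT ?', [limit]);
}

// Usage sketch (assumes `connection` was created with mysql.createConnection):
//   listFiles(connection, 10)
//     .then(function (rows) { console.log(rows); })
//     .catch(function (err) { console.error(err.message); })
//     .finally(function () { connection.end(); });
```

Because the helpers only rely on the query method of the object they are given, they can be exercised with a stub connection before pointing them at a live Cloud Hub instance.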