Connect to HDFS Data in JRuby

Create a simple JRuby app with access to live HDFS data.

JRuby is a high-performance, stable, fully threaded Java implementation of the Ruby programming language. The CData JDBC Driver for HDFS makes it easy to integrate connectivity to live HDFS data in JRuby. This article shows how to create a simple JRuby app that connects to HDFS data, executes a query, and displays the results.

Configure a JDBC Connection to HDFS Data

Before creating the app, note the installation location for the JAR file for the JDBC Driver (typically C:\Program Files\CData\CData JDBC Driver for HDFS\lib).

JRuby natively supports JDBC, so you can connect to HDFS and execute SQL queries without extra gems. Initialize the JDBC connection with the getConnection method of the java.sql.DriverManager class.

To connect to HDFS, set the following connection properties:

  • Host: Set this value to the host of your HDFS installation.
  • Port: Set this value to the port of your HDFS installation. The default port is 50070.

Built-in Connection String Designer

For assistance in constructing the JDBC URL, use the connection string designer built into the HDFS JDBC Driver. Either double-click the JAR file or run it from the command line:

java -jar cdata.jdbc.hdfs.jar

Fill in the connection properties and copy the connection string to the clipboard.

Below is a typical JDBC connection string for HDFS:

jdbc:hdfs:Host=sandbox-hdp.hortonworks.com;Port=50070;Path=/user/root;User=root;
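
If the app needs to assemble the connection string at run time rather than paste it from the designer, a small helper can join the properties. The following is a minimal sketch and not part of the driver: the method name build_hdfs_jdbc_url is hypothetical, and it assumes only the Key=Value; layout visible in the sample string above.

```ruby
# Hypothetical helper (not part of the CData driver): joins connection
# properties into the "jdbc:hdfs:Key=Value;..." form shown in the sample string.
def build_hdfs_jdbc_url(properties)
  pairs = properties.map do |key, value|
    # Reject values containing ';' or '=' rather than guess at an escaping rule.
    raise ArgumentError, "unsupported character in #{key}" if value.to_s =~ /[;=]/
    "#{key}=#{value};"
  end
  "jdbc:hdfs:#{pairs.join}"
end

# Property names and values are taken from the sample connection string.
url = build_hdfs_jdbc_url(
  "Host" => "sandbox-hdp.hortonworks.com",
  "Port" => 50070,
  "Path" => "/user/root",
  "User" => "root"
)
puts url
```

The resulting string can be passed to java.sql.DriverManager.getConnection in the same way as a string copied from the designer.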

Create a JRuby App with Connectivity to HDFS Data

Create a new Ruby file (for example: HDFSSelect.rb) and open it in a text editor. Copy the following code into your file:

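require 'java'
require 'rubygems'
# Adjust this path to match the location of the driver JAR on your machine.
require 'C:/Program Files/CData/CData JDBC Driver for HDFS 2018/lib/cdata.jdbc.hdfs.jar'

url = "jdbc:hdfs:Host=sandbox-hdp.hortonworks.com;Port=50070;Path=/user/root;User=root;"
conn = java.sql.DriverManager.getConnection(url)
begin
  stmt = conn.createStatement
  rs = stmt.executeQuery("SELECT FileId, ChildrenNum FROM Files")
  # Print each row as "FileId ChildrenNum"; interpolation tolerates NULL values.
  while rs.next
    puts "#{rs.getString(1)} #{rs.getString(2)}"
  end
ensure
  # Always close the connection, even if the query fails.
  conn.close
end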

With the file completed, you are ready to display your HDFS data with JRuby. To do so, simply run your file from the command line:

jruby -S HDFSSelect.rb
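
Printing with getString is enough for a demo, but an application usually wants the rows as Ruby data. The sketch below assumes the driver's result set behaves like a standard java.sql.ResultSet and collects each row into a Hash keyed by column label, using getMetaData, getColumnCount, getColumnLabel, and getObject. The method name rows_from is hypothetical.

```ruby
# Hypothetical helper: converts a JDBC ResultSet into an array of hashes.
# It relies only on standard java.sql.ResultSet / ResultSetMetaData methods.
def rows_from(result_set)
  meta = result_set.getMetaData
  # JDBC column indexes are 1-based.
  labels = (1..meta.getColumnCount).map { |i| meta.getColumnLabel(i) }
  rows = []
  while result_set.next
    row = {}
    labels.each_with_index { |label, i| row[label] = result_set.getObject(i + 1) }
    rows << row
  end
  rows
end
```

In HDFSSelect.rb, a call such as rows_from(rs).each { |row| puts row.inspect } could stand in for the while loop. The connection should still be closed afterwards.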

Because the driver accepts SQL-92 queries, you can incorporate live HDFS data into your own JRuby applications using the same JDBC calls. Download a free trial today!