Publish Reports with HDFS Data in Crystal Reports

Use the Report Wizard to design a report based on up-to-date HDFS data.

Crystal Reports has several options for offloading data processing to the remote data source, which enables real-time reporting. The CData ODBC Driver for HDFS brings this capability to Crystal Reports. This article shows how to create a report on HDFS data that refreshes each time you run it.

Connect to HDFS Data

If you have not already done so, specify connection properties in an ODBC DSN (data source name). This is the last step of the driver installation. You can use the Microsoft ODBC Data Source Administrator to create and configure ODBC DSNs.

To authenticate, set the following connection properties:

  • Host: Set this value to the host of your HDFS installation.
  • Port: Set this value to the port of your HDFS installation. Default port: 50070

When you configure the DSN, you may also want to set the Max Rows connection property. This will limit the number of rows returned, which is especially helpful for improving performance when designing reports and visualizations.
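As a rough illustration of how these properties fit together, the sketch below assembles a DSN-less ODBC connection string in Python. The `build_connection_string` helper, the example host name, and the `MaxRows` spelling of the Max Rows property are assumptions made for illustration; check the driver documentation for the exact property names.

```python
def build_connection_string(host, port=50070, max_rows=None,
                            driver="CData ODBC Driver for HDFS"):
    """Assemble a DSN-less ODBC connection string (hypothetical helper)."""
    # ODBC connection strings are semicolon-separated key=value pairs.
    # Braces around the driver name protect the spaces it contains.
    parts = [f"Driver={{{driver}}}", f"Host={host}", f"Port={port}"]
    if max_rows is not None:
        # Assumed connection-string spelling of the Max Rows property.
        parts.append(f"MaxRows={max_rows}")
    return ";".join(parts) + ";"


# hdfs.example.com is a placeholder host, not a real installation.
conn_str = build_connection_string("hdfs.example.com", max_rows=1000)
print(conn_str)
```

If the pyodbc package and the driver are installed, a string like this could be passed to `pyodbc.connect(conn_str)` to test the settings outside Crystal Reports.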

You can then follow the procedure below to use the Report Wizard to create the HDFS connection.

  1. In a new report, click Create New Connection -> ODBC.

  2. In the resulting wizard, click Select Data Source and select the DSN in the Data Source Name menu.

Design a Report

After adding an ODBC connection to HDFS, you can use the Report Wizard to add HDFS data to your report.

  1. Click File -> New -> Standard Report.
  2. Expand the ODBC node under Create New Connection and double-click Make a New Connection.
  3. Configure the data source by selecting the tables and fields needed in the report. This example uses the FileId and ChildrenNum columns from the Files table.

You can then configure grouping, sorting, and summaries. For example, this article groups on FileId and summarizes on ChildrenNum. The following section uses this grouping and summary to create a chart.
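To make the grouping and summary concrete, here is a minimal Python sketch of what "group on FileId and summarize on ChildrenNum" computes. The sample rows are invented for illustration, and a sum is assumed as the summary operation; Crystal Reports offers other summary types as well.

```python
from collections import defaultdict

# Hypothetical rows shaped like the Files table: (FileId, ChildrenNum).
rows = [(101, 3), (101, 2), (202, 0), (303, 5), (303, 1)]


def summarize(rows):
    """Group rows on FileId and sum ChildrenNum within each group."""
    totals = defaultdict(int)
    for file_id, children_num in rows:
        totals[file_id] += children_num
    # Sort by FileId so the groups appear in a stable order.
    return dict(sorted(totals.items()))


summary = summarize(rows)
print(summary)
```

Each key in the result corresponds to one group in the report, and each value to the summary shown for that group.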

Create a Chart

After you select a column to group by, the Standard Report Creation Wizard presents the option to create a chart. Follow the steps below to create a chart that aggregates the values in the ChildrenNum column for each FileId.

  1. In the Standard Report Creation Wizard, select the Bar Chart option and select the column you grouped by, FileId in this example, in the On Change Of menu.
  2. In the Show Summary menu, select the summary you created.
  3. Select filters and a template, as needed, to finish the wizard.

Preview the finished report to view the chart, populated with your data. If you want to filter out null values, use a record selection formula.

Working with Remote Data

To ensure that you see updates to volatile data, click File and clear the "Save Data with Report" option. As you interact with the report, for example by drilling down to hidden details, Crystal Reports executes SQL queries to retrieve the data needed to display the report. To reload data you have already retrieved, refresh or rerun the report.

You can offload processing onto the driver by hiding the Details section and enabling server-side grouping. To do this, you must have selected a column to group on in the report creation wizard.

  1. Click File -> Report Options and select the Perform Grouping On Server option.
  2. Click Report -> Section Expert and select the Details section of your report. Select the Hide (Drill-Down OK) option.

When you preview your report with the details hidden, Crystal Reports executes a GROUP BY query. When you double-click a column in the chart to drill down to details, Crystal Reports executes a SELECT query with a WHERE clause, which reduces load times by retrieving only the rows needed.
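The two query shapes can be sketched with Python's built-in sqlite3 module standing in for the driver. The in-memory Files table, its sample rows, and the exact SQL text are assumptions for illustration only; the statements Crystal Reports actually generates may differ.

```python
import sqlite3

# In-memory stand-in for the Files table exposed by the driver.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Files (FileId INTEGER, ChildrenNum INTEGER)")
conn.executemany(
    "INSERT INTO Files VALUES (?, ?)",
    [(101, 3), (101, 2), (202, 0), (303, 5), (303, 1)],
)

# Summary view: the grouping is pushed to the data source,
# so only one row per group comes back.
grouped = conn.execute(
    "SELECT FileId, SUM(ChildrenNum) FROM Files "
    "GROUP BY FileId ORDER BY FileId"
).fetchall()

# Drill-down: only the detail rows for the selected group are fetched.
details = conn.execute(
    "SELECT FileId, ChildrenNum FROM Files "
    "WHERE FileId = ? ORDER BY ChildrenNum",
    (101,),
).fetchall()

print(grouped)
print(details)
conn.close()
```

The summary query returns three rows for five source rows, and the drill-down returns only the two rows belonging to the chosen group, which is why hiding details and grouping on the server can reduce the amount of data transferred.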