Create Data Visualizations in Cognos BI with Databricks Data



Access Databricks data as an ODBC data source in Cognos Business Intelligence and create data visualizations in Cognos Report Studio.

You can use the CData ODBC Driver for Databricks to integrate Databricks data with the drag-and-drop style of Cognos Report Studio. This article describes a graphical approach to creating data visualizations, with no SQL required, and shows how to execute any SQL query supported by Databricks.

Configure and Publish the Data Source

If you have not already, first specify connection properties in an ODBC DSN (data source name). This is the last step of the driver installation. You can use the Microsoft ODBC Data Source Administrator to create and configure ODBC DSNs.

To connect to a Databricks cluster, set the properties as described below.

Note: You can find the required values in your Databricks instance by navigating to Clusters, selecting the desired cluster, and opening the JDBC/ODBC tab under Advanced Options.

  • Server: Set to the Server Hostname of your Databricks cluster.
  • HTTPPath: Set to the HTTP Path of your Databricks cluster.
  • Token: Set to your personal access token (this value can be obtained by navigating to the User Settings page of your Databricks instance and selecting the Access Tokens tab).

When you configure the DSN, you may also want to set the Max Rows connection property. This will limit the number of rows returned, which is especially helpful for improving performance when designing reports and visualizations.
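The properties above can also be pictured as a single ODBC connection string, which is handy when you want to check the same settings outside of Cognos. The following is a minimal sketch, not the driver's documented API: the helper names are hypothetical, the Server, HTTPPath, and Token values are placeholders, and the MaxRows key is an assumed connection-string spelling of the Max Rows property. Values that contain a semicolon or a brace are wrapped in braces, following the usual ODBC convention.

```python
def quote_odbc_value(value):
    """Wrap a value in braces when it contains ';' or a brace (common ODBC convention)."""
    text = str(value)
    if any(ch in text for ch in ";{}"):
        # A closing brace inside a braced value is doubled.
        return "{" + text.replace("}", "}}") + "}"
    return text


def build_connection_string(server, http_path, token, max_rows=None, dsn=None):
    """Assemble the article's connection properties into one key=value string."""
    props = {}
    if dsn is not None:
        props["DSN"] = dsn
    props["Server"] = server
    props["HTTPPath"] = http_path
    props["Token"] = token
    if max_rows is not None:
        props["MaxRows"] = max_rows  # assumed key for the Max Rows property
    return ";".join(f"{key}={quote_odbc_value(val)}" for key, val in props.items()) + ";"


# All values below are placeholders, not real credentials.
example = build_connection_string(
    server="example.cloud.databricks.com",
    http_path="sql/protocolv1/o/0/0000-000000-example",
    token="dapi-placeholder",
    max_rows=1000,
    dsn="CData Databricks Sys",
)
print(example)
```

A string like this could be handed to an ODBC client (for example, pyodbc.connect) on a machine where the driver and DSN are installed; when the DSN already stores the properties, the DSN entry alone is usually enough.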

If you are running Cognos on a 64-bit machine and want to modify the DSN or create other Databricks DSNs, you must use a system DSN and configure it in the 32-bit ODBC Data Source Administrator. You can open it with the following command:

C:\Windows\sysWOW64\odbcad32.exe
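On 64-bit Windows, the two ODBC Data Source Administrators share the file name odbcad32.exe and differ only by folder: SysWOW64 holds the 32-bit tool and System32 holds the 64-bit one. The small sketch below makes that mapping explicit. The function name and the default Windows directory are assumptions for illustration.

```python
import ntpath


def odbc_admin_path(driver_bits, windows_dir=r"C:\Windows"):
    """Return the ODBC Data Source Administrator path for a 64-bit Windows machine."""
    if driver_bits == 32:
        folder = "SysWOW64"  # 32-bit administrator
    elif driver_bits == 64:
        folder = "System32"  # 64-bit administrator
    else:
        raise ValueError("driver_bits must be 32 or 64")
    # ntpath builds Windows-style paths regardless of the current platform.
    return ntpath.join(windows_dir, folder, "odbcad32.exe")


print(odbc_admin_path(32))
```

On Windows, the returned path could be passed to subprocess.run to launch the matching administrator.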

After creating a DSN, you can then publish the data source:

  1. Open Cognos Administration and click Data Source Connections to add a new data source.
  2. Select the ODBC option and enter the DSN, CData Databricks Sys, and a user-friendly name.

  3. Click Retrieve Objects and choose the CData Databricks database object.

Add Data Visualizations to a Report

You can now create reports on Databricks data in Cognos Report Studio by dragging and dropping table columns from the Source Explorer onto report objects. The sections below show how to create a simple report with a chart that shows up-to-date data.

As you build the report, Cognos Report Studio generates SQL queries and relies on the driver to execute them. The driver converts the queries into requests to the Databricks API. Because the queries run against live Databricks data, what the driver can execute depends on the capabilities of the underlying API.

Create a Chart Based on an Aggregate

You can populate almost any report object in Cognos with Databricks data by simply dragging and dropping columns from the Source Explorer onto the dimensions of the object. The column in the Series dimension of the chart is automatically grouped.

Additionally, Cognos sets a logical default aggregate function for the measure dimension based on the data type. For this example, override the default by clicking the CompanyName column in the Data Items tab and setting the Aggregate Function property to Not Applicable. The Rollup Aggregate Function property must be set to Automatic.

Convert a Query Object to SQL

When you know the query you need, or if you want to adjust the generated query, convert a query object into an SQL statement. After a query has been converted to SQL, the UI controls are not available for the query object. Follow the procedure below to populate a chart with user-defined SQL.

Cognos will rely on the driver to execute the user-defined query. Using the driver's SQL engine ensures that queries will always return up-to-date results, as there is no cached copy of the data.

  1. Hover over the Query Explorer and click the Queries folder to display the query objects in your report.
  2. If you want to edit the autogenerated query, click the button in the Generated SQL property for the query object. In the resulting dialog, click Convert.

    If you want to enter a new SQL statement, drop an SQL object in-line with the query object.

  3. Modify the properties for the SQL object: Select the Databricks data source in the SQL properties and set the SQL Syntax property to Native.
  4. Click the button in the SQL property and enter the SQL query in the resulting dialog. This example uses the query below:

    SELECT City, CompanyName FROM Customers WHERE Country = 'US'
  5. Modify the properties for the query object: Set the Processing property to "Limited Local". This value is required to convert a query object to SQL.
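Before pasting a query into the SQL dialog, it can help to check the shape of its result. The sketch below runs the example query against an in-memory SQLite table as a stand-in; it does not connect to Databricks or use the CData driver. The table layout and the sample rows are invented for illustration only.

```python
import sqlite3

# The query from the procedure above.
QUERY = "SELECT City, CompanyName FROM Customers WHERE Country = 'US'"

# Hypothetical sample rows: (CompanyName, City, Country).
SAMPLE_CUSTOMERS = [
    ("Example Co A", "Portland", "US"),
    ("Example Co B", "Portland", "US"),
    ("Example Co C", "Seattle", "US"),
    ("Example Co D", "Berlin", "Germany"),
]


def preview_query(sql=QUERY):
    """Run the query against a throwaway in-memory table and return the rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Customers (CompanyName TEXT, City TEXT, Country TEXT)"
        )
        conn.executemany("INSERT INTO Customers VALUES (?, ?, ?)", SAMPLE_CUSTOMERS)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


rows = preview_query()
for city, company in rows:
    print(city, company)
```

Each returned row carries a City and a CompanyName, which are the two columns that appear as data items once the query object is converted.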

Fill a Chart with the Results of a Query

You can now access the results of the SQL query as objects in the Data Items tab. Follow the procedure below to create a chart with the results; for example, the CompanyName for each City from the Customers table.

  1. Return to the page by hovering over the Page Explorer and then clicking the page object.
  2. Drag a pie chart from the toolbox onto the workspace.
  3. In the properties for the chart, set the Query property to the name of the query you created above.
  4. Click the Data Items tab and drag columns onto the chart. In this example, drag the City column to the Series (pie slices) box and the CompanyName column to the Default Measure box.
  5. Modify the default properties for the Default Measure (the CompanyName values): In the Aggregate Function box, select the Total option.
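The grouping that the chart applies can be pictured as one slice per City, sized by an aggregate over that city's CompanyName values. Because CompanyName is text, the sketch below counts companies per city as a stand-in for the measure; how Cognos itself evaluates the Total aggregate on a text column is not reproduced here. The rows and the function name are hypothetical.

```python
from collections import Counter

# Hypothetical (City, CompanyName) rows, shaped like the query results.
rows = [
    ("Portland", "Example Co A"),
    ("Portland", "Example Co B"),
    ("Seattle", "Example Co C"),
]


def slice_sizes(rows):
    """Group rows by city; return per-city counts and percentage shares."""
    counts = Counter(city for city, _company in rows)
    total = sum(counts.values())
    if total == 0:
        return {}, {}
    shares = {city: round(100 * n / total, 1) for city, n in counts.items()}
    return dict(counts), shares


counts, shares = slice_sizes(rows)
print(counts)
print(shares)
```

Each key corresponds to one pie slice, and the percentage is the share of the pie that the slice would occupy under this counting assumption.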

Run the report to populate the chart with the results of the query.
