Create Reports from Hive Data in Google Data Studio

Use the CData Cloud Hub to create a virtual MySQL database for Hive data and build custom reports in Google Data Studio.

Google Data Studio allows you to create branded reports with data visualizations to share with your clients. When paired with the CData Cloud Hub, you get instant, cloud-to-cloud access to Hive data for visualizations, dashboards, and more. This article shows how to create a virtual database for Hive and build reports from Hive data in Google Data Studio.

The CData Cloud Hub provides a pure cloud-to-cloud interface for Hive, so you can build reports from live Hive data in Google Data Studio without replicating the data. As you build visualizations, Google Data Studio generates queries to gather data. The CData Cloud Hub pushes all supported query operations (filters, JOINs, etc.) directly to Hive, so the processing happens server-side and Hive data returns quickly.

This article requires a CData Cloud Hub instance and the CData Cloud Hub Connector for Google Data Studio. Get more information on the CData Cloud Hub and sign up for a free trial at https://www.cdata.com/cloudhub.


Connect to Hive from the Cloud Hub

CData Cloud Hub uses a straightforward, point-and-click interface to connect to data sources and generate APIs.

  1. Log into Cloud Hub and click Databases.
  2. Select "Hive" from Available Data Sources.
  3. Enter the necessary authentication properties to connect to Hive: set the Server, Port, TransportMode, and AuthScheme connection properties.
  4. Click Test Database.
  5. Click Privileges -> Add and add the new user (or an existing user) with the appropriate permissions.
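
The four connection properties named in step 3 can be written down and sanity-checked before you fill in the form. The sketch below is illustrative only: the host name, port 10000 (the usual HiveServer2 default), and the values BINARY and Plain are assumptions rather than values from this article, and missing_properties is a hypothetical helper, not part of the Cloud Hub.

```python
# Hypothetical pre-flight check for the Hive connection form.
# The property names come from step 3; everything else is an assumption.
REQUIRED_PROPERTIES = ("Server", "Port", "TransportMode", "AuthScheme")


def missing_properties(props):
    """Return the required property names that are absent or blank."""
    missing = []
    for name in REQUIRED_PROPERTIES:
        value = props.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Example values are placeholders; substitute your own HiveServer2 details.
hive_props = {
    "Server": "hive.example.com",   # assumed host name
    "Port": 10000,                  # assumed HiveServer2 default port
    "TransportMode": "BINARY",      # assumed value
    "AuthScheme": "Plain",          # assumed value
}

if __name__ == "__main__":
    gaps = missing_properties(hive_props)
    print("Ready to test the database" if not gaps else f"Missing: {gaps}")
```

If the list comes back empty, every property the form asks for has a value; whether those values are correct is only confirmed by Test Database in step 4.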

With the virtual database created, you are ready to connect to Hive data from Google Data Studio.

Visualize Live Hive Data in Google Data Studio

The steps below outline connecting to the CData Cloud Hub from Google Data Studio to create a new Hive data source and build a simple visualization from the data.

  1. Log in to Google Data Studio, click Data Sources, create a new data source, and choose the CData Cloud Hub Connector.
  2. Authorize the Connector to connect to an external service (your Cloud Hub instance).
  3. Use your instance name (myinstance in myinstance.cdatacloud.net), username, and password to connect to your Cloud Hub instance.
    • Username: myinstance/username
    • Password: your Cloud Hub password
  4. Select a Database (e.g. ApacheHive1) and click Next.
  5. Select a Table (e.g. Customers) and click Next.
  6. Click Connect.
  7. If needed, modify columns, click Create Report, and add the data source to the report.
  8. Select a visualization style and add it to the report.
  9. Select Dimensions and Measures to customize your visualization.
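
Conceptually, a dimension paired with a measure corresponds to a grouped aggregate in SQL. The sketch below shows that mapping; it is not the query Google Data Studio actually generates, which this article does not show. The aggregate_query helper and the Country and Balance columns are hypothetical, while the Customers table is the example from step 5.

```python
def aggregate_query(table, dimension, measure, func="SUM"):
    """Build a GROUP BY query for one dimension and one aggregated measure."""
    allowed = {"SUM", "AVG", "MIN", "MAX", "COUNT"}
    func = func.upper()
    if func not in allowed:
        raise ValueError(f"unsupported aggregate: {func}")

    def quote(identifier):
        # MySQL-style backtick quoting; double any embedded backticks.
        return "`" + identifier.replace("`", "``") + "`"

    return (
        f"SELECT {quote(dimension)}, {func}({quote(measure)}) "
        f"FROM {quote(table)} GROUP BY {quote(dimension)}"
    )


if __name__ == "__main__":
    # Country and Balance are hypothetical column names.
    print(aggregate_query("Customers", "Country", "Balance"))
```

Filters and aggregates of this kind are examples of operations the Cloud Hub can pass to Hive when Hive supports them.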

Optional: Connect with the MySQL Connector

If you need to work with data from a custom SQL query, you can use the MySQL Connector. Connect using the server information for your Cloud Hub instance (server address, port, username, and password).
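
One way to check a custom query before entering it in the MySQL Connector is to run it from any MySQL client. The sketch below uses the third-party PyMySQL package and rests on several assumptions: the server address follows the myinstance.cdatacloud.net pattern mentioned earlier, the port is the MySQL default of 3306 (use the port listed for your instance), and no extra TLS settings are needed. The helper names are hypothetical; ApacheHive1 and Customers are the example names from the steps above.

```python
def cloud_hub_mysql_settings(instance, user, password, database, port=3306):
    """Assemble connection settings for a MySQL client.

    The host pattern and default port are assumptions; confirm both
    against the server information for your Cloud Hub instance.
    """
    return {
        "host": f"{instance}.cdatacloud.net",
        "port": port,
        "user": user,
        "password": password,
        "database": database,
    }


def run_custom_query(settings, sql):
    """Run one query and return all rows (requires: pip install pymysql)."""
    import pymysql  # imported lazily so the settings helper works without it

    connection = pymysql.connect(**settings)
    try:
        with connection.cursor() as cursor:
            cursor.execute(sql)
            return cursor.fetchall()
    finally:
        connection.close()


# Example usage (placeholders, not real credentials):
# settings = cloud_hub_mysql_settings(
#     "myinstance", "username", "password", "ApacheHive1")
# rows = run_custom_query(settings, "SELECT * FROM Customers LIMIT 10")
```

Once the query returns the rows you expect, the same server address, port, username, and password go into the MySQL Connector in Google Data Studio.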

SQL Access to Hive Data from Cloud Applications

You now have a direct, cloud-to-cloud connection to live Hive data from your Google Data Studio report. You can create more data sources and new visualizations, build reports, and more, all without replicating Hive data.

Try the CData Cloud Hub and get SQL data access to 100+ SaaS, Big Data, and NoSQL sources directly from your cloud applications.