Create Reports from Spark Data in Google Data Studio

Use the CData Cloud Hub to create a virtual MySQL database for Spark data and build custom reports in Google Data Studio.

Google Data Studio allows you to create branded reports with data visualizations to share with your clients. When paired with the CData Cloud Hub, you get instant, cloud-to-cloud access to Spark data for visualizations, dashboards, and more. This article shows how to create a virtual database for Spark and build reports from Spark data in Google Data Studio.

The CData Cloud Hub provides a pure cloud-to-cloud interface for Spark, so you can build reports from live Spark data in Google Data Studio without replicating the data. As you build visualizations, Google Data Studio generates queries to gather data. The CData Cloud Hub pushes all supported query operations (filters, JOINs, etc.) directly to Spark, so the processing happens server-side and only the results are returned.
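
To make the idea of pushing a filter down to the source concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the data source, and the Orders table and its columns are hypothetical. It is not how the Cloud Hub is implemented; it only shows that a filter sent inside the query returns fewer rows than fetching everything and filtering afterwards.

```python
import sqlite3

# In-memory SQLite database standing in for the remote data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Id INTEGER, Region TEXT, Total REAL)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(1, "East", 10.0), (2, "West", 20.0), (3, "East", 30.0)],
)

# Without pushdown: every row crosses the wire, then the client filters.
all_rows = conn.execute(
    "SELECT Id, Region, Total FROM Orders ORDER BY Id"
).fetchall()
client_filtered = [row for row in all_rows if row[1] == "East"]

# With pushdown: the filter is part of the query, so the source
# returns only the matching rows.
pushed = conn.execute(
    "SELECT Id, Region, Total FROM Orders WHERE Region = ? ORDER BY Id",
    ("East",),
).fetchall()

print(len(all_rows), "rows fetched without pushdown")
print(len(pushed), "rows fetched with pushdown")
```

Both approaches produce the same result set; the difference is how many rows have to be transferred before the filter is applied.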

This article requires a CData Cloud Hub instance and the CData Cloud Hub Connector for Google Data Studio. Get more information on the CData Cloud Hub and sign up for a free trial at https://www.cdata.com/cloudhub.


Connect to Spark from the Cloud Hub

CData Cloud Hub uses a straightforward, point-and-click interface to connect to data sources and generate APIs.

  1. Log in to Cloud Hub and click Databases.
  2. Select "Spark" from Available Data Sources.
  3. Enter the necessary authentication properties to connect to Spark.

    Set the Server, Database, User, and Password connection properties to connect to SparkSQL.

  4. Click Test Database.
  5. Click Privileges -> Add and add the new user (or an existing user) with the appropriate permissions.

With the virtual database created, you are ready to connect to Spark data from Google Data Studio.

Visualize Live Spark Data in Google Data Studio

The steps below outline connecting to the CData Cloud Hub from Google Data Studio to create a new Spark data source and build a simple visualization from the data.

  1. Log in to Google Data Studio, click Data Sources, create a new data source, and choose the CData Cloud Hub Connector.
  2. Authorize the Connector to connect to an external service (your Cloud Hub instance).
  3. Use your instance name (myinstance in myinstance.cdatacloud.net), username, and password to connect to your Cloud Hub instance.
    • Username: myinstance/username
    • Password: your Cloud Hub password
  4. Select a Database (e.g. SparkSQL1) and click Next.
  5. Select a Table (e.g. Customers) and click Next.
  6. Click Connect.
  7. If needed, modify columns, click Create Report, and add the data source to the report.
  8. Select a visualization style and add it to the report.
  9. Select Dimensions and Measures to customize your visualization.
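
The article does not show the queries Google Data Studio generates, but a chart with one dimension and one summed measure roughly corresponds to a grouped aggregate. The sketch below illustrates that shape using Python's built-in sqlite3 module as a stand-in for the virtual database. The Customers table name comes from step 5; the Country and Revenue columns and the aggregate helper are hypothetical.

```python
import sqlite3


def aggregate(conn, table, dimension, measure):
    """Sum `measure` per distinct `dimension` value (hypothetical helper).

    Identifiers are interpolated into the SQL text, so pass only
    trusted table and column names.
    """
    sql = (
        f'SELECT "{dimension}", SUM("{measure}") '
        f'FROM "{table}" GROUP BY "{dimension}" ORDER BY "{dimension}"'
    )
    return conn.execute(sql).fetchall()


# In-memory stand-in for the virtual database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Name TEXT, Country TEXT, Revenue REAL)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [("A", "US", 100.0), ("B", "US", 50.0), ("C", "DE", 70.0)],
)

# Dimension = Country, measure = Revenue (both hypothetical columns).
rows = aggregate(conn, "Customers", "Country", "Revenue")
for country, total in rows:
    print(country, total)
```

Each returned row is one category in the chart: the dimension value supplies the label and the aggregated measure supplies the plotted value.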

Optional: Connect with the MySQL Connector

If you need to work with data from a custom SQL query, you can use the MySQL Connector. Connect using the server information for your Cloud Hub instance (server address, port, username, and password).
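
Because the virtual database speaks the MySQL protocol, the same server information can be tried from any MySQL client before you enter it in Data Studio. The following is a minimal sketch under several assumptions: the CloudHubLogin class and run_custom_query function are hypothetical, port 3306 is only the common MySQL default, the username is passed through exactly as you supply it, and the third-party PyMySQL driver is used for the connection. Your instance may also require TLS settings that are not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudHubLogin:
    """Server information for a Cloud Hub instance (illustrative container)."""

    instance: str     # "myinstance" in myinstance.cdatacloud.net
    user: str
    password: str
    database: str     # virtual database name, e.g. "SparkSQL1"
    port: int = 3306  # ASSUMPTION: MySQL default; confirm for your instance

    @property
    def host(self) -> str:
        return f"{self.instance}.cdatacloud.net"

    def mysql_kwargs(self) -> dict:
        """Keyword arguments in the shape pymysql.connect accepts."""
        return {
            "host": self.host,
            "port": self.port,
            "user": self.user,
            "password": self.password,
            "database": self.database,
        }


def run_custom_query(login: CloudHubLogin, sql: str):
    """Run a custom SQL query and return all rows."""
    import pymysql  # third-party driver, imported only when actually connecting

    conn = pymysql.connect(**login.mysql_kwargs())
    try:
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetchall()
    finally:
        conn.close()


login = CloudHubLogin("myinstance", "username", "your-password", "SparkSQL1")
print(login.host)
# Against a live instance:
# rows = run_custom_query(login, "SELECT * FROM Customers LIMIT 10")
```

If the query returns the rows you expect here, the same server address, port, username, and password should work in the MySQL Connector, where you can paste the custom query.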

SQL Access to Spark Data from Cloud Applications

You now have a direct, cloud-to-cloud connection to live Spark data from your Google Data Studio report. You can create more data sources, add new visualizations, and build further reports, all without replicating Spark data.

Try the CData Cloud Hub and get SQL data access to 100+ SaaS, Big Data, and NoSQL sources directly from your cloud applications.