Create Reports from Spark Data in Google Data Studio

Use the CData Cloud Hub to create a virtual MySQL Database for Spark data and create custom reports in Google Data Studio.

Google Data Studio allows you to create branded reports with data visualizations to share with your clients. When paired with the CData Cloud Hub, you get instant, cloud-to-cloud access to Spark data for visualizations, dashboards, and more. This article shows how to use a MySQL client to create a virtual database for Spark and build reports from Spark data in Google Data Studio.

The CData Cloud Hub provides a pure MySQL, cloud-to-cloud interface for Spark, so you can build reports from live Spark data in Google Data Studio without replicating the data to a natively supported database. As you build visualizations, Google Data Studio generates SQL queries to gather data. The CData Cloud Hub pushes all supported SQL operations (filters, JOINs, etc.) directly to Spark, so the processing happens server-side and Spark data is returned quickly.

Create a Virtual MySQL Database for Spark Data

CData Cloud Hub uses a straightforward, point-and-click interface to connect to data sources and generate APIs.

  1. Log in to Cloud Hub and click Databases.
  2. Select "Spark" from Available Data Sources.
  3. Enter the necessary authentication properties to connect to Spark.

    Set the Server, Database, User, and Password connection properties to connect to SparkSQL.

  4. Click Test Database.
  5. Click Privileges -> Add and add the new user (or an existing user) with the appropriate permissions.

With the virtual database created, you are ready to connect to Spark data from any MySQL client.
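Before moving on to Google Data Studio, it can help to confirm the connection from a script. The following is a minimal sketch, not an official CData sample. It assumes the third-party MySQL Connector/Python package (mysql-connector-python) is installed, that the virtual database answers standard MySQL statements such as SHOW TABLES, and that the host and database name match the example values used later in this article. The helper names build_connection_config and list_tables, and the CA certificate path, are hypothetical.

```python
from typing import Any, Dict, List, Optional


def build_connection_config(
    host: str,
    user: str,
    password: str,
    database: str = "sparkdb",      # example database name from this article
    port: int = 3306,               # default MySQL port
    ssl_ca: Optional[str] = None,   # hypothetical path to a CA certificate file
) -> Dict[str, Any]:
    """Assemble keyword arguments for mysql.connector.connect()."""
    if not host or not user:
        raise ValueError("host and user are required")
    config: Dict[str, Any] = {
        "host": host,
        "port": int(port),
        "user": user,
        "password": password,
        "database": database,
    }
    if ssl_ca:
        # Verify the server certificate against the supplied CA file.
        config["ssl_ca"] = ssl_ca
        config["ssl_verify_cert"] = True
    return config


def list_tables(config: Dict[str, Any]) -> List[str]:
    """Open a connection and return the table names the virtual database exposes."""
    # Imported lazily so the config helper works without the driver installed.
    import mysql.connector  # pip install mysql-connector-python

    conn = mysql.connector.connect(**config)
    try:
        cursor = conn.cursor()
        cursor.execute("SHOW TABLES")
        return [row[0] for row in cursor.fetchall()]
    finally:
        conn.close()


# Example usage (requires a reachable Cloud Hub instance and valid credentials):
# cfg = build_connection_config("myinstance.cdatacloud.net", "your_user", "your_password")
# print(list_tables(cfg))
```

If list_tables returns the Spark tables you expect, the same host, port, database, and credentials should work in the Google Data Studio MySQL connector.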

Visualize Live Spark Data in Google Data Studio

The steps below outline connecting to the CData Cloud Hub from Google Data Studio to create a new Spark data source and build a simple visualization from the data.

  1. Log in to Google Data Studio, click Data Sources, create a new data source, and choose MySQL.
  2. Choose the basic configuration and set the connection properties:
    • Host name or IP: myinstance.cdatacloud.net
    • Port: 3306
    • Database: sparkdb
    • Username: your Cloud Hub username
    • Password: your Cloud Hub password
    • Click Enable SSL, then upload the certificates
  3. Click Authenticate.
  4. Select the table to visualize or enter a custom query and click Connect.
    NOTE: JOINs are not supported in the user interface, but they are supported as custom queries.
  5. If needed, modify columns, click Create Report and add the data source to the report.
  6. Select a visualization style and add it to the report.
  7. Select Dimensions and Measures to customize your visualization.
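The note in step 4 says that JOINs must be entered as custom queries. As an illustration only, the sketch below composes such a query as a string that you could paste into the custom query box. The table and column names (customers, orders, customer_id, name, total) are hypothetical and are not taken from any real Spark schema. The helper names quote_ident and build_join_query are likewise made up for this example.

```python
import re
from typing import List, Tuple

# Accept only plain identifiers so that nothing unexpected ends up in the SQL text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def quote_ident(name: str) -> str:
    """Wrap a validated identifier in MySQL-style backticks."""
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return f"`{name}`"


def build_join_query(
    left: str,
    right: str,
    key: str,
    columns: List[Tuple[str, str]],
) -> str:
    """Build an INNER JOIN of two tables on a shared key column.

    columns is a list of (table, column) pairs to select.
    """
    cols = ", ".join(f"{quote_ident(t)}.{quote_ident(c)}" for t, c in columns)
    return (
        f"SELECT {cols} FROM {quote_ident(left)} "
        f"INNER JOIN {quote_ident(right)} "
        f"ON {quote_ident(left)}.{quote_ident(key)} = {quote_ident(right)}.{quote_ident(key)}"
    )


# Hypothetical tables and columns, for illustration only.
query = build_join_query(
    "customers",
    "orders",
    "customer_id",
    [("customers", "name"), ("orders", "total")],
)
print(query)
```

Restricting names to simple identifiers is a deliberate design choice here: the query is assembled by string formatting, so rejecting anything else keeps the generated SQL predictable.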

SQL Access to Spark Data from Cloud Applications

You now have a direct, cloud-to-cloud connection to live Spark data from your Google Data Studio report. You can create more data sources and new visualizations, build reports, and more, all without replicating Spark data.

To get SQL data access to 100+ SaaS, Big Data, and NoSQL sources directly from your cloud applications, see the CData Cloud Hub.