Connect and Visualize Live Presto Data in Databricks Lakehouse Federation with CData Connect Cloud
Use CData Connect Cloud to integrate live Presto data into the Databricks platform and create visualization dashboards with real-time Presto data.
Databricks Lakehouse Federation enables organizations to query and integrate data from multiple sources without requiring data movement. It allows federated queries across databases, data warehouses, and lakehouses, providing a unified interface for data analysis and management within Databricks. When combined with CData Connect Cloud, it enables seamless access to Presto data for data virtualization, while also supporting data lineage and fine-grained access control.
This article explains how to use CData Connect Cloud to establish a live connection to Presto and how to access live Presto data from the Databricks platform.
About Presto Data Integration
Accessing and integrating live data from Trino and Presto SQL engines has never been easier with CData. Customers rely on CData connectivity to:
- Access data from Trino v345 and above (formerly PrestoSQL) and Presto v0.242 and above (formerly PrestoDB)
- Read and write all of the data underlying their Trino or Presto instances
- Benefit from optimized query generation for maximum throughput
Presto and Trino allow users to access a variety of underlying data sources through a single endpoint. When paired with CData connectivity, users get pure, SQL-92 access to their instances, allowing them to integrate business data with a data warehouse or easily access live data directly from their preferred tools, like Power BI and Tableau.
In many cases, CData's live connectivity surpasses the native import functionality available in tools. One customer was unable to use Power BI effectively because of the size of the datasets needed for reporting. After implementing the CData Power BI Connector for Presto, the company was able to generate reports in real time using the DirectQuery connection mode.
Getting Started
CData Connect Cloud offers a cloud-to-cloud SQL Server interface for Presto, so you can create dashboards and visualizations in Databricks using live Presto data. Databricks issues SQL queries to retrieve the data it needs while you build visualizations. CData Connect Cloud pushes all supported SQL operations (such as filters and JOINs) directly to Presto, so the work is done server-side and only the results are returned.
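To make the idea of pushing a filter to the server concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a local stand-in for a remote engine, and the orders table and its columns are hypothetical. It is not how Connect Cloud is implemented; it only contrasts fetching every row and filtering locally with sending the filter as part of the query.

```python
import sqlite3

# sqlite3 is only a local stand-in for a remote SQL engine.
# The "orders" table and its columns are hypothetical.
def make_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
    rows = [(i, "EMEA" if i % 4 == 0 else "APAC", float(i)) for i in range(1000)]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn

def filter_client_side(conn):
    # Fetch every row, then filter locally: all rows are transferred.
    fetched = conn.execute("SELECT id, region, amount FROM orders").fetchall()
    kept = [row for row in fetched if row[1] == "EMEA"]
    return len(fetched), kept

def filter_server_side(conn):
    # The WHERE clause travels with the query, so only matching rows come back.
    fetched = conn.execute(
        "SELECT id, region, amount FROM orders WHERE region = ?", ("EMEA",)
    ).fetchall()
    return len(fetched), fetched

conn = make_demo_db()
client_transferred, client_rows = filter_client_side(conn)
server_transferred, server_rows = filter_server_side(conn)
print(client_transferred, server_transferred)  # prints "1000 250"
```

Both approaches yield the same 250 rows, but the second transfers a quarter of the data, which is the effect pushdown aims for on much larger tables.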
Configure Presto connectivity for Databricks in CData Connect Cloud
To work with Presto data in Databricks Lakehouse Federation, you need to connect to Presto from Connect Cloud and provide user access to the connection.
- Log into Connect Cloud, click Connections and click Add Connection.
- Select "Presto" from the Add Connection panel.
- Enter the necessary authentication properties to connect to Presto.
Set the Server and Port connection properties to connect, in addition to any authentication properties that may be required.
To enable TLS/SSL, set UseSSL to true.
Authenticating with LDAP
To authenticate with LDAP, set the following connection properties:
- AuthScheme: Set this to LDAP.
- User: The LDAP username to authenticate as.
- Password: The password for that LDAP user.
Authenticating with Kerberos
To authenticate with Kerberos, set the following connection properties:
- AuthScheme: Set this to KERBEROS.
- KerberosKDC: The Kerberos Key Distribution Center (KDC) service used to authenticate the user.
- KerberosRealm: The Kerberos realm used to authenticate the user.
- KerberosSPN: The Service Principal Name for the Kerberos Domain Controller.
- KerberosKeytabFile: The Keytab file containing your pairs of Kerberos principals and encrypted keys.
- User: The user who is authenticating to Kerberos.
- Password: The password used to authenticate to Kerberos.
- Click Create & Test.
- Navigate to the Permissions tab in the Add Presto Connection page and update the User-based permissions.
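Before filling in the connection form, it can help to check that you have every value on hand. The sketch below collects the properties listed above into a dictionary and reports which ones are still missing for the chosen AuthScheme. The helper, the host name, and the port value are hypothetical and not part of any CData API, and the sketch treats every property listed above as required, which may be stricter than the service itself.

```python
# Properties per AuthScheme, taken from the lists in this article.
# This helper is hypothetical; it only checks a local dictionary.
BASE_PROPERTIES = ["Server", "Port"]
AUTH_PROPERTIES = {
    "LDAP": ["User", "Password"],
    "KERBEROS": [
        "KerberosKDC",
        "KerberosRealm",
        "KerberosSPN",
        "KerberosKeytabFile",
        "User",
        "Password",
    ],
}

def missing_properties(props):
    """Return the names of expected properties that are absent or empty."""
    scheme = str(props.get("AuthScheme", "")).upper()
    expected = BASE_PROPERTIES + AUTH_PROPERTIES.get(scheme, [])
    return [name for name in expected if not props.get(name)]

# Example values are placeholders, not real endpoints or credentials.
ldap_props = {
    "Server": "presto.example.com",
    "Port": 8080,
    "UseSSL": True,
    "AuthScheme": "LDAP",
    "User": "analyst",
}
print(missing_properties(ldap_props))  # prints "['Password']"
```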
Add a Personal Access Token
If you are connecting from a service, application, platform, or framework that does not support OAuth authentication, you can create a Personal Access Token (PAT) to use for authentication. As a best practice, create a separate PAT for each service so that access can be granted and revoked per service.
- Click on your username at the top right of the Connect Cloud app and click User Profile.
- On the User Profile page, scroll down to the Personal Access Tokens section and click Create PAT.
- Give your PAT a name and click Create.
- The personal access token is only visible at creation, so be sure to copy it and store it securely for future use.
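One way to follow the one-PAT-per-service practice in scripts is to keep each token in its own environment variable rather than in source code. The sketch below assumes a hypothetical naming scheme (CDATA_PAT_ followed by the service name); neither the variable names nor the helpers come from CData.

```python
import os

def load_pat(service, environ=os.environ):
    """Read the PAT for one service from its own environment variable."""
    # Hypothetical naming scheme: CDATA_PAT_<SERVICE>.
    variable = f"CDATA_PAT_{service.upper()}"
    token = environ.get(variable, "").strip()
    if not token:
        raise RuntimeError(f"Set {variable} to the PAT created for {service}")
    return token

def mask(token, visible=4):
    """Hide all but the last few characters so the token is safe to log."""
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]

# Demonstration with a fake environment and a fake token.
fake_env = {"CDATA_PAT_DATABRICKS": " abc123xyz "}
pat = load_pat("databricks", fake_env)
print(mask(pat))  # prints "*****3xyz"
```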
With the connection configured, you are ready to connect to Presto data from Databricks.
Connecting live Presto data in Databricks
Follow these steps to establish a connection from Databricks to the CData Connect Cloud Virtual SQL Server API.
- Log into Databricks.
- Navigate to SQL Warehouses and start any warehouse of your choice.
- In the navigation pane, select Catalog, then choose Create a connection.
- In the Connection basics section (or Step 1 of the Set up connection page), enter the following connection details and click Next:
  - Connection name: a user-defined connection name.
  - Connection type: select SQL Server from the drop-down list.
  - Auth type: select Username and password.
- In the Authentication section (or Step 2), enter the required authentication details and click Next:
  - Host: tds.cdata.com
  - Port: 14333
  - User: enter your CData Connect Cloud username, displayed in the top-right corner of the CData Connect Cloud interface. For example, [email protected]
  - Password: enter the PAT generated and copied in the previous section.
- In the Connection details section (or Step 3), enable the Trust server certificate checkbox and select the appropriate Application intent. Click Create Connection.
- In the Catalog basics section (or Step 4), enter the required details and click Create catalog:
  - Catalog name: enter a name of your choice.
  - Connection: select the Databricks connection you defined earlier.
  - Database: enter your Presto connection name (for example, Presto1).
- In the Access section (or Step 5), assign the workspace and user access rights, and grant read or edit privileges on the catalog.
- Click Next > Save to save all the details for the catalog.
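The same connection and catalog can also be described in Databricks SQL. The sketch below only builds the two statements as strings, using the CREATE CONNECTION and CREATE FOREIGN CATALOG forms with the host, port, and database values from the steps above. The connection name, catalog name, user, and secret scope and key are hypothetical, and the password is referenced through a Databricks secret (assumed to hold the PAT) rather than written inline. Options set in the UI, such as trusting the server certificate, are not covered here.

```python
def quote_ident(name):
    # Backtick-quote an identifier, doubling any embedded backticks.
    return "`" + name.replace("`", "``") + "`"

def sql_literal(value):
    # Refuse values that would need escaping instead of guessing the rules.
    text = str(value)
    if "'" in text or "\\" in text:
        raise ValueError(f"unsupported character in value: {text!r}")
    return f"'{text}'"

def federation_ddl(connection, catalog, database, user,
                   secret_scope, secret_key,
                   host="tds.cdata.com", port=14333):
    """Build the CREATE CONNECTION and CREATE FOREIGN CATALOG statements."""
    create_connection = (
        f"CREATE CONNECTION {quote_ident(connection)} TYPE sqlserver\n"
        "OPTIONS (\n"
        f"  host {sql_literal(host)},\n"
        f"  port {sql_literal(port)},\n"
        f"  user {sql_literal(user)},\n"
        f"  password secret({sql_literal(secret_scope)}, {sql_literal(secret_key)})\n"
        ")"
    )
    create_catalog = (
        f"CREATE FOREIGN CATALOG IF NOT EXISTS {quote_ident(catalog)} "
        f"USING CONNECTION {quote_ident(connection)}\n"
        f"OPTIONS (database {sql_literal(database)})"
    )
    return [create_connection, create_catalog]

# All names below are placeholders chosen for illustration.
statements = federation_ddl(
    connection="cdata_connect",
    catalog="presto_catalog",
    database="Presto1",
    user="user@example.com",
    secret_scope="cdata",
    secret_key="databricks-pat",
)
for statement in statements:
    print(statement)
    print()
```

In a Databricks notebook, each statement could then be run in a SQL cell or passed to spark.sql, assuming your account has the privileges to create connections and catalogs.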
Access the catalog and visualize live Presto data in Databricks
To access the newly created catalog and create a dashboard to visualize live Presto data in Databricks, follow these steps:
- Select the catalog and expand it. A list of tables from Presto will appear on the screen.
- Choose the desired table and click the Overview tab to view the table metadata.
- Click the Sample Data tab to view real-time data in the table.
- Now, click Create at the top right corner and select Dashboard.
- Manually create a visualization by selecting at least one field in the visualization editor from the widget, or choose one of the visualization options suggested by Databricks AI.
- Once the visualization is created, edit the details in the widget settings of the dashboard.
- Click Publish to publish the dashboard report.
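Dashboard datasets and ad hoc checks address the federated tables by their three-level name (catalog, schema, table). The sketch below builds such a query as a string. The catalog, schema, table, and column names are hypothetical, so substitute the names shown in your own catalog tree.

```python
def quote_ident(name):
    # Backtick-quote an identifier, doubling any embedded backticks.
    return "`" + name.replace("`", "``") + "`"

def sample_query(catalog, schema, table, columns=None, limit=100):
    """Build a SELECT against a table addressed as catalog.schema.table."""
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    column_list = ", ".join(quote_ident(c) for c in columns) if columns else "*"
    full_name = ".".join(quote_ident(part) for part in (catalog, schema, table))
    return f"SELECT {column_list} FROM {full_name} LIMIT {limit}"

# Placeholder names; use the ones displayed under your catalog.
query = sample_query("presto_catalog", "Presto", "Customer", ["Id", "Name"], 10)
print(query)
# prints: SELECT `Id`, `Name` FROM `presto_catalog`.`Presto`.`Customer` LIMIT 10
```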
Live access to Presto data from cloud applications
At this stage, you have established a direct, cloud-to-cloud connection to live Presto data in Databricks. This enables you to create dashboards to monitor and visualize your data seamlessly.
For more details on accessing live data from over 100 SaaS, Big Data, and NoSQL sources through cloud applications like Databricks, visit our Connect Cloud page. As always, let us know if you have any questions during your evaluation. Our world-class CData Support Team is always available to help!