Use ChatGPT to Talk to Your Databricks Data via CData Connect AI
ChatGPT is an AI chatbot developed by OpenAI, launched in November 2022. Based on large language models (LLMs), it enables users to refine and steer conversations through natural language processing. ChatGPT's developer mode, available to Plus and Pro subscribers, provides full Model Context Protocol (MCP) support for connecting to external data sources and tools.
CData Connect AI offers a dedicated cloud-to-cloud interface for connecting to Databricks data. The CData Connect AI Remote MCP Server enables secure communication between ChatGPT and Databricks. This allows you to ask questions and take actions on your Databricks data using ChatGPT, all without replicating the data to a natively supported database. With optimized data processing built in, CData Connect AI pushes all supported SQL operations, including filters and JOINs, directly to Databricks, so the work is done server-side and the requested Databricks data is returned quickly.
About Databricks Data Integration
Accessing and integrating live data from Databricks has never been easier with CData. Customers rely on CData connectivity to:
- Access all versions of Databricks from Runtime Versions 9.1 - 13.X to both the Pro and Classic Databricks SQL versions.
- Leave Databricks in their preferred environment thanks to compatibility with any hosting solution.
- Securely authenticate in a variety of ways, including personal access token, Azure Service Principal, and Azure AD.
- Upload data to Databricks using Databricks File System, Azure Blob Storage, and AWS S3 Storage.
While many customers use CData's solutions to migrate data from different systems into their Databricks data lakehouse, several customers use our live connectivity solutions to federate connectivity between their databases and Databricks. These customers use SQL Server Linked Servers or PolyBase to get live access to Databricks from within their existing RDBMS.
Read more about common Databricks use-cases and how CData's solutions help solve data problems in our blog: What is Databricks Used For? 6 Use Cases.
Getting Started
Step 1: Configure Databricks Connectivity for ChatGPT
Connectivity to Databricks from ChatGPT is made possible through CData Connect AI Remote MCP. To interact with Databricks data from ChatGPT, we start by creating and configuring a Databricks connection in CData Connect AI.
- Log into Connect AI, click Sources, and then click Add Connection
- Select "Databricks" from the Add Connection panel
- Enter the necessary authentication properties to connect to Databricks.
To connect to a Databricks cluster, set the properties as described below.
Note: The required values can be found in your Databricks instance by navigating to Clusters, selecting the desired cluster, and opening the JDBC/ODBC tab under Advanced Options.
- Server: Set to the Server Hostname of your Databricks cluster.
- HTTPPath: Set to the HTTP Path of your Databricks cluster.
- Token: Set to your personal access token (this value can be obtained by navigating to the User Settings page of your Databricks instance and selecting the Access Tokens tab).
- Click Save & Test
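Before pasting the three values into the connection form, it can help to sanity-check them locally. The following is a minimal sketch only: the `DatabricksConnectionProps` class and its `problems` method are hypothetical helpers, not part of CData Connect AI or Databricks, and the rule that Server must be a bare hostname is an assumption based on the "Server Hostname" label above.

```python
from dataclasses import dataclass


# Hypothetical helper for illustration; not a CData or Databricks API.
@dataclass
class DatabricksConnectionProps:
    server: str     # Server Hostname from the JDBC/ODBC tab
    http_path: str  # HTTP Path from the JDBC/ODBC tab
    token: str      # personal access token from User Settings

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means the values look plausible."""
        issues = []
        if not self.server.strip():
            issues.append("Server is empty")
        elif "://" in self.server:
            # Assumption: the field expects a hostname, not a full URL.
            issues.append("Server should be a hostname, not a URL")
        if not self.http_path.strip():
            issues.append("HTTPPath is empty")
        if not self.token.strip():
            issues.append("Token is empty")
        return issues


# Placeholder values; substitute the ones from your own cluster.
props = DatabricksConnectionProps(
    server="example.cloud.databricks.com",
    http_path="sql/protocolv1/o/0/0000-000000-example",
    token="dapi-placeholder",
)
print(props.problems() or "Values look plausible")
```

A check like this only catches empty or obviously malformed values; Save & Test in Connect AI remains the real verification.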
With the connection configured, we are ready to connect to Databricks data from ChatGPT.
Step 2: Connect ChatGPT to CData Connect AI
Follow these steps to add a CData Connect AI connector in ChatGPT:
- Sign in to ChatGPT with a Plus or Pro subscription.
- Navigate to Settings > Connectors.
- Under the "Advanced settings" section, toggle on Developer mode.
- Once developer mode is enabled, return to the Connectors page and click Create.
- Enter a name for the connector (such as Connect AI MCP).
- In the Remote MCP server URL field, enter:
https://mcp.cloud.cdata.com/mcp
- Set Authentication to "OAuth."
- Check I trust this application and click Create
- You will be redirected to CData Connect AI's OAuth authorization page. Sign in with your Connect AI credentials.
- Review the permissions requested and click Authorize to grant ChatGPT access to your Connect AI resources.
- After successful authorization, you will be redirected back to ChatGPT.
- The Connect AI MCP Server will now appear in your available connectors list where you can manage the connector and enable/disable actions (tools).
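For context on what the connector does once it is registered: MCP messages are JSON-RPC 2.0 requests, and a client discovers a server's tools with the `tools/list` method. The sketch below only builds such a message as a string and sends nothing; ChatGPT performs the real exchange, including the OAuth flow and the initialization handshake, which are omitted here. The `jsonrpc_request` helper is hypothetical, and the specific tools that Connect AI exposes are not listed in this article.

```python
import json
from typing import Optional

# URL from the connector setup step above; this sketch never contacts it.
MCP_URL = "https://mcp.cloud.cdata.com/mcp"


def jsonrpc_request(method: str, request_id: int, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request of the kind MCP clients send."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# The message a client would use to ask a server which tools it offers.
list_tools = jsonrpc_request("tools/list", 1)
print(f"Would POST to {MCP_URL}: {list_tools}")
```

The enable/disable switches in the connector list correspond to the individual tools returned by this kind of discovery call.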
Step 3: Explore Live Databricks Data with ChatGPT
- Start a new conversation in ChatGPT.
- In the tool picker, enable "Developer Mode."
- Enable your Connect AI MCP Server.
- You can now start exploring your data with natural language prompts. ChatGPT will use the Connect AI MCP server to query your live Databricks data. Example prompts:
- "Show me all customers from the last 30 days"
- "What are my top performing products?"
- "Analyze sales trends for this quarter"
- "List all active projects and their current status"
- ChatGPT will translate your natural language queries into SQL and execute them against your Databricks data through the Connect AI MCP server.
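To make the translation step concrete, here is a rough sketch of the kind of SQL that the first example prompt might turn into. The `Customers` table, the `CreatedDate` column, and the `customers_since` function are all hypothetical; the SQL ChatGPT actually generates depends on your Databricks schema and may look quite different.

```python
from datetime import date, timedelta


def customers_since(days: int, today: date) -> str:
    """Build an illustrative query for 'customers from the last N days'."""
    cutoff = today - timedelta(days=days)
    # Hypothetical table and column names; replace with your own schema.
    return (
        "SELECT * FROM Customers "
        f"WHERE CreatedDate >= '{cutoff.isoformat()}'"
    )


# A fixed date keeps the example reproducible.
print(customers_since(30, date(2025, 1, 31)))
```

Computing the cutoff date up front and passing it as a literal keeps the query free of dialect-specific date functions, which is a design choice of this sketch rather than something Connect AI requires.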
IMPORTANT: ChatGPT's developer mode provides full read/write MCP access. Be cautious when allowing write operations to your Databricks data. Always review and confirm any data modifications before allowing them to proceed.
NOTE: Developer mode is in beta and only available to ChatGPT Plus and Pro subscribers. Refer to OpenAI's documentation for the latest setup information.
Get CData Connect AI
To get live data access to 300+ SaaS, Big Data, and NoSQL sources directly from your cloud applications, try CData Connect AI today!