Integrate Theia IDE with Live Databricks Data via CData Connect AI
Theia IDE is an open-source, cloud and desktop IDE platform that provides a flexible, extensible development environment with built-in AI capabilities. Its AI features support multiple LLM providers and MCP (Model Context Protocol) tool integrations, allowing developers to interact with live external data sources directly from chat-based agents inside the IDE.
By integrating Theia IDE with CData Connect AI through the built-in MCP server, Theia's AI agents gain governed, real-time access to live Databricks data. This enables developers to list catalogs, explore schemas, and query records from Databricks data without leaving the editor or writing custom integration code.
This article explains how to configure Databricks connectivity in Connect AI, generate the required personal access token, register the CData Connect AI MCP Server in Theia IDE, enable AI features with an LLM provider, and verify the integration by querying live Databricks data from the Theia AI Chat.
About Databricks Data Integration
Accessing and integrating live data from Databricks has never been easier with CData. Customers rely on CData connectivity to:
- Access all versions of Databricks from Runtime Versions 9.1 - 13.X to both the Pro and Classic Databricks SQL versions.
- Leave Databricks in their preferred environment thanks to compatibility with any hosting solution.
- Securely authenticate in a variety of ways, including personal access token, Azure Service Principal, and Azure AD.
- Upload data to Databricks using Databricks File System, Azure Blob Storage, and AWS S3 Storage.
While many customers use CData's solutions to migrate data from different systems into their Databricks data lakehouse, several use our live connectivity solutions to federate connectivity between their databases and Databricks. These customers use SQL Server Linked Servers or PolyBase to get live access to Databricks from within their existing RDBMS.
Read more about common Databricks use cases and how CData's solutions help solve data problems in our blog: What is Databricks Used For? 6 Use Cases.
Getting Started
Step 1: Configure Databricks connectivity for Theia IDE
Connectivity to Databricks from Theia IDE is made possible through Connect AI's Remote MCP Server. To interact with Databricks data from Theia IDE, start by creating and configuring a Databricks connection in Connect AI.
- Log into Connect AI, click Sources, and then click Add Connection
- Select Databricks from the Add Connection panel
- Enter the necessary authentication properties to connect to Databricks.
To connect to a Databricks cluster, set the properties as described below.
Note: To find the required values in your Databricks instance, navigate to Clusters, select the desired cluster, and open the JDBC/ODBC tab under Advanced Options.
- Server: Set to the Server Hostname of your Databricks cluster.
- HTTPPath: Set to the HTTP Path of your Databricks cluster.
- Token: Set to your personal access token (this value can be obtained by navigating to the User Settings page of your Databricks instance and selecting the Access Tokens tab).
- Click Save & Test
- Navigate to the Permissions tab and update user-based permissions
Add a Personal Access Token
A Personal Access Token (PAT) is used to authenticate the connection to Connect AI from Theia IDE. It is best practice to create a separate PAT for each integration to maintain granular access control.
- Click the gear icon at the top right of the Connect AI app to open Settings
- On the Settings page, go to the Access Tokens section and click Create PAT
- Give the PAT a descriptive name and click Create
- Copy the token when displayed and store it securely. It will not be shown again
With the Databricks connection configured and a PAT generated, Theia IDE can now connect to Databricks data through Connect AI.
Step 2: Configure Connect AI MCP in Theia IDE
Next, register the CData Connect AI Remote MCP Server in Theia IDE so that the built-in AI agents can discover and call live data tools through Connect AI.
- Download and install the Theia IDE
- Open Theia IDE and navigate to Settings (or press Ctrl + ,) to open the Settings view
- In the Settings panel, expand AI Features and select MCP
- Click Edit in settings.json to open the configuration file and paste the following JSON:
{
  "ai-features.mcp.mcpServers": {
    "cdata": {
      "serverUrl": "https://mcp.cloud.cdata.com/mcp",
      "serverAuthToken": "Basic your_base64_encoded_email_PAT",
      "serverAuthTokenHeader": "Authorization"
    }
  }
}
Note: Theia IDE uses Basic authentication with Connect AI. Combine your Connect AI user email and the PAT you created earlier in the format email:PAT, base64 encode the combined string, and prefix it with Basic. For example, given user@domain.com:ABC123...XYZ789, the serverAuthToken value becomes: Basic dXNlckBkb21haW4uY29tOkFCQzEyMy4uLlhZWjc4OQ==
- Save the settings.json file
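The token encoding described in the note above can be scripted rather than done by hand. The following is a minimal Python sketch, not part of Theia IDE or Connect AI: the function names build_basic_token and build_mcp_settings are illustrative, and the email and PAT are the placeholder values that the sample base64 string decodes to.

```python
import base64
import json


def build_basic_token(email: str, pat: str) -> str:
    """Return the serverAuthToken value: 'Basic ' + base64(email:PAT)."""
    raw = f"{email}:{pat}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_mcp_settings(email: str, pat: str, name: str = "cdata") -> dict:
    """Build the settings.json fragment for a single MCP server entry."""
    return {
        "ai-features.mcp.mcpServers": {
            name: {
                "serverUrl": "https://mcp.cloud.cdata.com/mcp",
                "serverAuthToken": build_basic_token(email, pat),
                "serverAuthTokenHeader": "Authorization",
            }
        }
    }


# Placeholder credentials; substitute your own Connect AI email and PAT.
settings = build_mcp_settings("user@domain.com", "ABC123...XYZ789")
print(json.dumps(settings, indent=2))
```

Base64 is an encoding, not encryption, so anyone who can read settings.json can recover the PAT. Treat the file like any other file that contains a secret.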
Enable AI and configure an LLM provider
Theia IDE requires AI features to be enabled and at least one LLM provider configured to power the agent's reasoning.
- Return to Settings and under AI Features, select AI Enablement
- Check the Enable AI box to activate Theia's AI capabilities
- Under AI Features, choose your preferred LLM provider (e.g., Anthropic, OpenAI, Google, Hugging Face) and enter your API key
With the MCP server registered and an LLM provider configured, Theia's AI agents are ready to query live Databricks data through Connect AI.
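If the agent later fails to reach the server, it can help to test the endpoint and token outside the IDE. The sketch below builds, but does not send, an MCP initialize request using only the Python standard library. It assumes the endpoint accepts JSON-RPC over HTTP POST as described in the Model Context Protocol specification; the protocol revision string, the client name, and the function name build_initialize_request are assumptions for illustration, not values documented by Connect AI.

```python
import json
import urllib.request

# Endpoint taken from the serverUrl value in settings.json.
MCP_URL = "https://mcp.cloud.cdata.com/mcp"


def build_initialize_request(auth_token: str) -> urllib.request.Request:
    """Build (without sending) an MCP 'initialize' JSON-RPC request."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumption: adjust to a protocol revision the server supports.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # Hypothetical client identity used only for this check.
            "clientInfo": {"name": "connectivity-check", "version": "0.1"},
        },
    }
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            # Same value as serverAuthToken in settings.json.
            "Authorization": auth_token,
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
    )


request = build_initialize_request("Basic your_base64_encoded_email_PAT")
print(request.get_method(), request.full_url)
```

To run the check, pass the request to urllib.request.urlopen(request, timeout=30) and inspect the response. An HTTP 401 would typically point to a malformed or expired token rather than a problem in Theia IDE.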
Step 3: Query live Databricks data from the Theia AI Chat
With the integration complete, use the Theia AI Chat panel to interact with live Databricks data.
- Open the AI Chat panel from the right sidebar of the Theia IDE
- At the bottom of the chat, click the Toggle Capabilities Configuration icon (or press Ctrl + Shift + .) to open the capabilities panel
- Under Generic Capabilities, expand MCP and check the cdata server (and any specific tools you want to expose) to make the Connect AI tools available to the agent
- Type @AppTester in the chat input followed by your prompt, for example:
  - List all catalogs in my cdata mcp
  - Show the available schemas and tables for Databricks
  - Query the top 5 records from a table in Databricks data
- The agent calls the Connect AI MCP Server and returns live results from Databricks data
At this point, your Theia IDE communicates with the Connect AI MCP Server and retrieves live Databricks data through remote MCP directly from the editor.
Get CData Connect AI
To access hundreds of SaaS, Big Data, and NoSQL sources directly from your cloud applications, start a free 14-day trial of CData Connect AI. As always, our world-class Support Team is available to assist you with any questions you may have.