Use ChatGPT to Talk to Your Azure Data Lake Storage Data via CData Connect AI

Jerod Johnson
Director, Technology Evangelism
Leverage the CData Connect AI Remote MCP Server to enable ChatGPT to securely answer questions and take actions on your Azure Data Lake Storage data for you.

ChatGPT is an AI chatbot developed by OpenAI, launched in November 2022. Based on large language models (LLMs), it enables users to refine and steer conversations through natural language processing. ChatGPT's developer mode, available to Plus and Pro subscribers, provides full Model Context Protocol (MCP) support for connecting to external data sources and tools.

CData Connect AI offers a dedicated cloud-to-cloud interface for connecting to Azure Data Lake Storage data. The CData Connect AI Remote MCP Server enables secure communication between ChatGPT and Azure Data Lake Storage, so you can ask questions and take actions on your Azure Data Lake Storage data from ChatGPT without replicating the data to a natively supported database. Connect AI pushes all supported SQL operations, including filters and JOINs, directly to Azure Data Lake Storage, using server-side processing to return the requested data quickly.

Step 1: Configure Azure Data Lake Storage Connectivity for ChatGPT

Connectivity to Azure Data Lake Storage from ChatGPT is made possible through CData Connect AI Remote MCP. To interact with Azure Data Lake Storage data from ChatGPT, we start by creating and configuring an Azure Data Lake Storage connection in CData Connect AI.

  1. Log in to Connect AI, click Sources, and then click Add Connection.
  2. Select "Azure Data Lake Storage" from the Add Connection panel.
  3. Enter the necessary authentication properties to connect to Azure Data Lake Storage.

    Authenticating to a Gen 1 DataLakeStore Account

    Gen 1 uses OAuth 2.0 in Entra ID (formerly Azure AD) for authentication.

    For this, an Active Directory web application is required. You can create one as follows:

    1. Sign in to your Azure account through the Azure portal.
    2. Select "Entra ID" (formerly Azure AD).
    3. Select "App registrations".
    4. Select "New application registration".
    5. Provide a name and URL for the application. Select Web app for the type of application you want to create.
    6. Select "Required permissions" and change the required permissions for this app. At a minimum, "Azure Data Lake" and "Windows Azure Service Management API" are required.
    7. Select "Key" and generate a new key. Add a description, a duration, and take note of the generated key. You won't be able to see it again.

    To authenticate against a Gen 1 DataLakeStore account, the following properties are required:

    • Schema: Set this to ADLSGen1.
    • Account: Set this to the name of the account.
    • OAuthClientId: Set this to the application Id of the app you created.
    • OAuthClientSecret: Set this to the key generated for the app you created.
    • TenantId: Set this to the tenant Id. See the TenantId property for more information on how to acquire this.
    • Directory: Set this to the path which will be used to store the replicated file. If not specified, the root directory will be used.

    Authenticating to a Gen 2 DataLakeStore Account

    To authenticate against a Gen 2 DataLakeStore account, the following properties are required:

    • Schema: Set this to ADLSGen2.
    • Account: Set this to the name of the account.
    • FileSystem: Set this to the file system which will be used for this account.
    • AccessKey: Set this to the access key which will be used to authenticate the calls to the API. See the AccessKey property for more information on how to acquire this.
    • Directory: Set this to the path which will be used to store the replicated file. If not specified, the root directory will be used.
  4. Click Save & Test
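Before clicking Save & Test, it can help to confirm that every required property for the chosen schema has a value. The following is a minimal sketch of such a check, not part of Connect AI itself: the `REQUIRED` mapping and the `missing_properties` helper are hypothetical names introduced for illustration, and the account values are placeholders. Only the property names come from the lists above.

```python
# Hypothetical pre-flight check. The property names are taken from the
# Gen 1 and Gen 2 lists above; everything else is illustrative.
REQUIRED = {
    "ADLSGen1": ("Account", "OAuthClientId", "OAuthClientSecret", "TenantId"),
    "ADLSGen2": ("Account", "FileSystem", "AccessKey"),
}


def missing_properties(props: dict) -> list:
    """Return the required property names that are absent or empty."""
    schema = props.get("Schema")
    if schema not in REQUIRED:
        raise ValueError(f"Schema must be one of {sorted(REQUIRED)}")
    return [name for name in REQUIRED[schema] if not props.get(name)]


# Placeholder values only; Directory is optional and defaults to the root.
gen2 = {
    "Schema": "ADLSGen2",
    "Account": "myaccount",
    "FileSystem": "myfilesystem",
    "AccessKey": "",  # left blank on purpose to show the check
}

print(missing_properties(gen2))
```

Here the check reports `AccessKey` as missing, which is the kind of gap that would otherwise only surface when the connection test fails.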

With the connection configured, we are ready to connect to Azure Data Lake Storage data from ChatGPT.

Step 2: Connect ChatGPT to CData Connect AI

Follow these steps to add a CData Connect AI connector in ChatGPT:

  1. Sign in to ChatGPT with a Plus or Pro subscription.
  2. Navigate to Settings > Connectors.
  3. Under the "Advanced settings" section, toggle on Developer mode.
  4. Once developer mode is enabled, return to the Connectors page and click Create.
  5. Enter a name for the connector (such as Connect AI MCP).
  6. In the Remote MCP server URL field, enter:
    https://mcp.cloud.cdata.com/mcp
  7. Set Authentication to "OAuth."
  8. Check "I trust this application" and click Create.
  9. You will be redirected to CData Connect AI's OAuth authorization page. Sign in with your Connect AI credentials.
  10. Review the permissions requested and click Authorize to grant ChatGPT access to your Connect AI resources.
  11. After successful authorization, you will be redirected back to ChatGPT.
  12. The Connect AI MCP Server will now appear in your available connectors list where you can manage the connector and enable/disable actions (tools).
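ChatGPT handles all MCP traffic for you once the connector is authorized, but it can be useful to know what travels over the wire. MCP messages are JSON-RPC 2.0, and a client discovers a server's tools with the `tools/list` method. The sketch below only builds such a message and does not send it: a real request to the Connect AI endpoint would also need the OAuth access token obtained in the steps above and the MCP initialization handshake, neither of which is shown. The `jsonrpc_request` helper is a hypothetical name used for illustration.

```python
import json

# Endpoint from step 6 above. Nothing is sent to it in this sketch.
MCP_URL = "https://mcp.cloud.cdata.com/mcp"


def jsonrpc_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request message as a plain dict."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools (actions) it exposes.
list_tools = jsonrpc_request("tools/list")
print(json.dumps(list_tools))
```

The tools returned by such a call are what appear in ChatGPT's connector settings, where you can enable or disable them individually.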

Step 3: Explore Live Azure Data Lake Storage Data with ChatGPT

  1. Start a new conversation in ChatGPT.
  2. In the tool picker, enable "Developer Mode."
  3. Enable your Connect AI MCP server.
  4. You can now start exploring your data with natural language prompts. ChatGPT will use the Connect AI MCP server to query your live Azure Data Lake Storage data. Example prompts:
    • "Show me all customers from the last 30 days"
    • "What are my top performing products?"
    • "Analyze sales trends for this quarter"
    • "List all active projects and their current status"
    Refer to CData's prompt library for more prompt ideas.
  5. ChatGPT will translate your natural language queries into SQL and execute them against your Azure Data Lake Storage data through the Connect AI MCP server.
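To make the prompt-to-SQL translation concrete, here is a rough illustration of the kind of query a prompt like "Show me all customers from the last 30 days" might turn into. This is a stand-in, not what Connect AI or ChatGPT actually produces: it runs against an in-memory SQLite database, and the `customers` table and its `name` and `created_at` columns are hypothetical. The real SQL depends on the tables and columns your Azure Data Lake Storage connection exposes.

```python
import sqlite3
from datetime import date, timedelta

# In-memory stand-in for a live data source. Table and columns are
# hypothetical and exist only to make the example runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, created_at TEXT)")

today = date.today()
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [
        ("Recent Co", (today - timedelta(days=5)).isoformat()),
        ("Older Co", (today - timedelta(days=90)).isoformat()),
    ],
)

# A plausible translation of "customers from the last 30 days".
# ISO-8601 date strings sort chronologically, so >= works on text.
cutoff = (today - timedelta(days=30)).isoformat()
generated_sql = (
    "SELECT name FROM customers "
    "WHERE created_at >= ? ORDER BY created_at DESC"
)
rows = conn.execute(generated_sql, (cutoff,)).fetchall()
print(rows)
```

Only the customer created five days ago falls inside the 30-day window, so the query returns a single row.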

IMPORTANT: ChatGPT's developer mode provides full read/write MCP access. Be cautious when allowing write operations to your Azure Data Lake Storage data. Always review and confirm any data modifications before allowing them to proceed.

NOTE: Developer mode is in beta and only available to ChatGPT Plus and Pro subscribers. Refer to OpenAI's documentation for the latest setup information.

Get CData Connect AI

To get live data access to 300+ SaaS, Big Data, and NoSQL sources directly from your cloud applications, try CData Connect AI today!

Ready to get started?

Learn more about CData Connect AI or sign up for free trial access.