Connecting Mastra with Azure Data Lake Storage Data via CData Connect AI MCP Server

Somya Sharma
Technical Marketing Engineer
Leverage the CData Connect AI MCP Server to enable Mastra agents to securely query, read, and act on real-time data across 300+ enterprise sources, with no replication required.

Mastra is designed for developers and enterprise teams building intelligent, composable AI agents. Its modular framework and declarative architecture make it simple to orchestrate agents, integrate LLMs, and automate data-driven workflows. But when agents need to work with data beyond their local memory or predefined APIs, many implementations rely on custom middleware or scheduled syncs to copy data from external systems into local stores. This approach adds complexity, increases maintenance overhead, introduces latency, and limits the real-time potential of your agents.

CData Connect AI bridges this gap with live, direct connectivity to more than 300 enterprise applications, databases, ERPs, and analytics platforms. Through CData's remote Model Context Protocol (MCP) Server, Mastra agents can securely query, read, and act on real-time data without replication. The result is grounded responses, faster reasoning, and automated decision-making across systems, all with stronger governance and fewer moving parts.

This article outlines the steps required to configure CData Connect AI MCP connectivity, register the MCP server in Mastra Studio, and build an agent that queries live Azure Data Lake Storage data in real time.

Prerequisites

Before starting, make sure you have:

  1. A CData Connect AI account
  2. Node.js 18+ and npm installed
  3. A working Mastra project (created via npm create mastra@latest)
  4. Access to Azure Data Lake Storage

Credentials checklist

Ensure you have these credentials ready for the connection:

  1. USERNAME: Your CData email login
  2. PAT: In Connect AI, go to Settings and click Access Tokens (the token is shown only once, so copy it when you create it)
  3. MCP_BASE_URL: https://mcp.cloud.cdata.com/mcp

Step 1: Configure Azure Data Lake Storage connectivity for Mastra

Connectivity to Azure Data Lake Storage from Mastra is made possible through CData Connect AI Remote MCP. To interact with Azure Data Lake Storage data from Mastra, we start by creating and configuring an Azure Data Lake Storage connection in CData Connect AI.

  1. Log into Connect AI, click Sources, and then click Add Connection
  2. Select "Azure Data Lake Storage" from the Add Connection panel
  3. Enter the necessary authentication properties to connect to Azure Data Lake Storage.

    Authenticating to a Gen 1 DataLakeStore Account

    Gen 1 uses OAuth 2.0 in Entra ID (formerly Azure AD) for authentication.

    For this, an Active Directory web application is required. You can create one as follows:

    1. Sign in to your Azure Account through the Azure portal.
    2. Select "Entra ID" (formerly Azure AD).
    3. Select "App registrations".
    4. Select "New application registration".
    5. Provide a name and URL for the application. Select Web app for the type of application you want to create.
    6. Select "Required permissions" and change the required permissions for this app. At a minimum, "Azure Data Lake" and "Windows Azure Service Management API" are required.
    7. Select "Key" and generate a new key. Add a description, a duration, and take note of the generated key. You won't be able to see it again.

    To authenticate against a Gen 1 DataLakeStore account, the following properties are required:

    • Schema: Set this to ADLSGen1.
    • Account: Set this to the name of the account.
    • OAuthClientId: Set this to the application Id of the app you created.
    • OAuthClientSecret: Set this to the key generated for the app you created.
    • TenantId: Set this to the tenant Id. See the property for more information on how to acquire this.
    • Directory: Set this to the path which will be used to store the replicated file. If not specified, the root directory will be used.

    Authenticating to a Gen 2 DataLakeStore Account

    To authenticate against a Gen 2 DataLakeStore account, the following properties are required:

    • Schema: Set this to ADLSGen2.
    • Account: Set this to the name of the account.
    • FileSystem: Set this to the file system which will be used for this account.
    • AccessKey: Set this to the access key which will be used to authenticate the calls to the API. See the property for more information on how to acquire this.
    • Directory: Set this to the path which will be used to store the replicated file. If not specified, the root directory will be used.
  4. Click Save & Test
  5. Navigate to the Permissions tab in the Add Azure Data Lake Storage Connection page and update the User-based permissions.
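
Before entering the properties in Connect AI, it can help to check that nothing required is missing for the schema you chose. The following is a minimal sketch, not part of any CData API: the names REQUIRED_PROPS and missingProps are hypothetical, and the required lists simply mirror the Gen 1 and Gen 2 property lists above (Directory is treated as optional because the root directory is used when it is not specified).

```typescript
// Hypothetical helper: mirrors the Gen 1 / Gen 2 property lists in this article.
type Schema = "ADLSGen1" | "ADLSGen2";

// Required properties per schema, excluding the optional Directory property.
const REQUIRED_PROPS: Record<Schema, readonly string[]> = {
  ADLSGen1: ["Account", "OAuthClientId", "OAuthClientSecret", "TenantId"],
  ADLSGen2: ["Account", "FileSystem", "AccessKey"],
};

// Returns the names of required properties that are absent or blank.
function missingProps(
  schema: Schema,
  props: Record<string, string | undefined>
): string[] {
  return REQUIRED_PROPS[schema].filter((name) => {
    const value = props[name];
    return value === undefined || value.trim() === "";
  });
}

// Example: a Gen 2 connection that is still missing its access key.
const gen2Missing = missingProps("ADLSGen2", {
  Account: "myaccount",
  FileSystem: "data",
});
console.log(gen2Missing);
```

An empty result means every required property for that schema has a non-blank value; it does not confirm that the values are valid, which is what Save & Test checks.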

Add a Personal Access Token

A Personal Access Token (PAT) is used to authenticate the connection to Connect AI from Mastra. It is best practice to create a separate PAT for each service to maintain granularity of access.

  1. Click the gear icon at the top right of the Connect AI app to open the settings page.
  2. On the Settings page, go to the Access Tokens section and click Create PAT.
  3. Give the PAT a name and click Create.
  4. The personal access token is only visible at creation, so be sure to copy it and store it securely for future use.

With the connection configured and a PAT generated, we are ready to connect to Azure Data Lake Storage data from Mastra.
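
The agent code later in this article sends the Connect AI email and PAT as an HTTP Basic Authorization header. As a small sketch of that encoding step in isolation, the function below (buildBasicAuthHeader is a hypothetical name, not a CData or Mastra API) joins the two values with a colon and base64-encodes them. It assumes a Node.js runtime, where Buffer is available globally.

```typescript
// Hypothetical helper: builds a Basic Authorization header value from a
// Connect AI email and a Personal Access Token (PAT).
function buildBasicAuthHeader(user: string, pat: string): string {
  if (!user || !pat) {
    throw new Error("Both the Connect AI email and the PAT are required");
  }
  // Basic auth is base64("user:password"); the PAT takes the password role.
  const encoded = Buffer.from(`${user}:${pat}`, "utf8").toString("base64");
  return `Basic ${encoded}`;
}

// Example with placeholder values (never log a real PAT).
const exampleHeader = buildBasicAuthHeader("a", "b");
console.log(exampleHeader); // prints "Basic YTpi"
```

Keeping the encoding in one small function makes it easy to fail early when either value is empty, rather than sending a malformed header to the MCP server.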

Step 2: Set up the Mastra project

  • Open a terminal and navigate to your desired folder
  • Create a new project:
    npm create mastra@latest
  • Open the folder in VS Code
  • Install the required Mastra dependencies (including @mastra/loggers, which index.ts imports in Step 5):
    npm install @mastra/core @mastra/libsql @mastra/memory @mastra/loggers
  • Then install the MCP integration package separately:
    npm install @mastra/mcp
Step 3: Configure environment variables

    Create a .env file at the project root with the following keys:

    OPENAI_API_KEY=sk-...
    CDATA_CONNECT_AI_USER=your_cdata_email
    CDATA_CONNECT_AI_PASSWORD=your_PAT
    

    Restart your dev server after saving changes:

    npm run dev
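
A missing or blank key in .env tends to surface later as an opaque authentication failure. The sketch below shows one way to fail fast at startup instead. The function names findMissingEnv and assertEnv are hypothetical; the key names are the ones the agent code in this article reads.

```typescript
// Keys this article's agent relies on.
const REQUIRED_ENV_KEYS = [
  "OPENAI_API_KEY",
  "CDATA_CONNECT_AI_USER",
  "CDATA_CONNECT_AI_PASSWORD",
] as const;

// Returns the required keys that are unset or contain only whitespace.
function findMissingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_KEYS.filter((key) => !env[key]?.trim());
}

// Throws a single descriptive error listing every missing key.
function assertEnv(env: Record<string, string | undefined>): void {
  const missing = findMissingEnv(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}

// Example with a partially filled environment object.
const missingKeys = findMissingEnv({
  OPENAI_API_KEY: "sk-test",
  CDATA_CONNECT_AI_USER: " ",
});
console.log(missingKeys);
```

In a Node.js project, calling assertEnv(process.env) near the top of the agent file would stop the dev server with a clear message before any MCP request is attempted.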

Step 4: Add the CData Connect AI agent

    Create a file src/mastra/agents/connect-ai-agent.ts with the following code:

    import { Agent } from "@mastra/core/agent";
    import { Memory } from "@mastra/memory";
    import { LibSQLStore } from "@mastra/libsql";
    import { MCPClient } from "@mastra/mcp";
    
    const mcpClient = new MCPClient({
      servers: {
        cdataConnectAI: {
          url: new URL("https://mcp.cloud.cdata.com/mcp"),
          requestInit: {
            headers: {
              Authorization: `Basic ${Buffer.from(
                `${process.env.CDATA_CONNECT_AI_USER}:${process.env.CDATA_CONNECT_AI_PASSWORD}`
              ).toString("base64")}`,
            },
          },
        },
      },
    });
    
    export const connectAIAgent = new Agent({
      name: "Connect AI Agent",
      instructions: "You are a data exploration and analysis assistant with access to CData Connect AI.",
      model: "openai/gpt-4o-mini",
      tools: await mcpClient.getTools(),
      memory: new Memory({
        storage: new LibSQLStore({ url: "file:../mastra.db" }),
      }),
    });
    

Step 5: Update index.ts to register the agent

    Replace the contents of src/mastra/index.ts with:

    import { Mastra } from "@mastra/core/mastra";
    import { PinoLogger } from "@mastra/loggers";
    import { LibSQLStore } from "@mastra/libsql";
    import { connectAIAgent } from "./agents/connect-ai-agent.js";
    
    export const mastra = new Mastra({
      agents: { connectAIAgent },
      storage: new LibSQLStore({ url: "file:../mastra.db" }),
      logger: new PinoLogger({ name: "Mastra", level: "info" }),
      observability: { default: { enabled: true } },
    });
    

Step 6: Run and verify the connection

    Start your Mastra server, then check that the Connect AI Agent is listed in Mastra Studio once startup completes:

    npm run dev
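
One way to verify the connection beyond seeing the server start is to inspect which tools the MCP client actually returned. The sketch below assumes that the object returned by mcpClient.getTools() is keyed by tool names prefixed with the server key and an underscore (for example cdataConnectAI_...); treat that naming scheme as an assumption to confirm against the Mastra MCP documentation. The helper name toolsForServer and the tool names in the example are hypothetical placeholders.

```typescript
// Hypothetical helper: given all tool names, return the un-prefixed names
// that belong to one MCP server, sorted alphabetically.
function toolsForServer(toolNames: string[], serverName: string): string[] {
  const prefix = `${serverName}_`;
  return toolNames
    .filter((name) => name.startsWith(prefix))
    .map((name) => name.slice(prefix.length))
    .sort();
}

// Example with placeholder tool names (not a list of real Connect AI tools).
const found = toolsForServer(
  ["cdataConnectAI_queryData", "cdataConnectAI_getCatalogs", "other_ping"],
  "cdataConnectAI"
);
console.log(found);
```

In the agent file, this could be called as toolsForServer(Object.keys(await mcpClient.getTools()), "cdataConnectAI"). An empty result would suggest that the MCP server was unreachable or that the credentials were rejected.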

Step 7: Run a live query in Mastra Studio

    In Mastra Studio, open the chat interface and enter a sample prompt such as:

    List available catalogs from my connected data sources.

Build real-time, data-aware agents with Mastra and CData

    Mastra and CData Connect AI together enable powerful AI-driven workflows where agents have live access to enterprise data and act intelligently without sync pipelines or manual integration logic.

    Start your free trial today to see how CData can empower Mastra with live, secure access to 300+ external systems.
