Advanced Developer Guide - Multi-Agent Account Research with CrewAI
Build multi-agent AI crews that query live enterprise data, analyze account health, and generate executive briefs. This guide walks through creating a Python application that combines CrewAI multi-agent orchestration with CData Connect AI to automate account research using three specialized agents.
NOTE: While this guide uses Google Sheets as the data source, the same principles apply to any of the 350+ data sources CData Connect AI supports.
By the end of this guide, the application can:
- Connect a CrewAI agent to 350+ enterprise data sources through CData Connect AI
- Define specialized agents with distinct roles, goals, and tool assignments
- Build MCP tool wrappers using the @tool decorator from crewai.tools
- Chain tasks with context dependencies to create a sequential data pipeline
- Assemble a Crew that orchestrates agents through a research-to-report workflow
- Generate structured Markdown account briefs from a single CLI command
Architecture Overview
The application uses the Model Context Protocol (MCP) to bridge CrewAI agents with enterprise data sources:
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ │ │ │ │ │
│ CrewAI Agents │---->│ CData Connect │---->│ Data Sources │
│ (3 Agents) │ │ AI MCP Server │ │ (350+ types) │
│ │<----│ │<----│ │
└─────────────────┘ └──────────────────┘ └─────────────────┘
| |
| Tool Discovery |
| & Execution |
v |
┌─────────────────┐ |
│ │ |
│ OpenAI API │--------------┘
│ (GPT-4o) │ Natural Language
│ │ to SQL Translation
└─────────────────┘
How it works:
- The Python application connects to CData Connect AI
- The CData Connect AI MCP exposes tools for data discovery and querying
- CrewAI agents use these tools to execute SQL queries against live data sources
- Each agent processes its assigned data domain and passes results downstream
- The final agent produces a formatted Markdown executive brief
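Every tool invocation in this flow is a single JSON-RPC 2.0 POST to the MCP endpoint. As a rough illustration of the wire format (the same payload shape the MCP client in Step 4 sends):

```python
import json

# Sketch of the JSON-RPC 2.0 payload the application POSTs to the MCP server
# when an agent runs a SQL query. The "SELECT 1" query is a placeholder.
payload = {
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
        "name": "queryData",
        "arguments": {"query": "SELECT 1"},
    },
    "id": 1,
}

# The request body is plain JSON; the server streams its answer back
# as Server-Sent Events (parsed in Step 4.2).
body = json.dumps(payload)
print(body)
```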
Prerequisites
This guide requires the following:
- Python 3.10 or later installed on the system (Download Python); CrewAI does not support earlier versions
- pip package installer (included with Python 3.4+). Verify with pip --version
- An OpenAI API key (requires a paid account)
- A CData Connect AI account (free trial available)
- A Google account for the sample Google Sheets data
- Git for cloning the repository
Getting Started
Overview
Here is a quick overview of the steps:
- Set up sample data in Google Sheets
- Configure CData Connect AI and create a Personal Access Token
- Set up the Python project and install dependencies
- Understand the code architecture
- Build and run the crew application
Step 1: Set up sample data in Google Sheets
The sample Google Sheet contains customer data to demonstrate the capabilities. This dataset includes accounts, sales opportunities, support tickets, and usage metrics.
- Navigate to the sample customer health spreadsheet
- Click File > Make a copy to save it to Google Drive
- Give it a memorable name (e.g., "demo_organization"). This name is needed later when configuring the connection.
The spreadsheet contains four sheets:
- account: Company information (name, industry, revenue, employees)
- opportunity: Sales pipeline data (stage, amount, probability)
- tickets: Support tickets (priority, status, description)
- usage: Product usage metrics (job runs, records processed)
Step 2: Configure CData Connect AI
2.1 Sign Up or Log In
- Navigate to https://www.cdata.com/ai/signup/ to create a new account, or https://cloud.cdata.com/ to log in
- Complete the registration process if creating a new account
2.2 Add a Google Sheets connection
- Once logged in, click Sources in the left navigation menu and click Add Connection
- Select Google Sheets from the Add Connection panel
- Configure the connection:
  - Set the Spreadsheet property to the name of the copied sheet (e.g., "demo_organization")
  - Click Sign in to authenticate with Google OAuth
- After authentication, navigate to the Permissions tab and verify the user has access
2.3 Create a Personal Access Token
The Python application uses a Personal Access Token (PAT) to authenticate with Connect AI.
- Click the Gear icon in the top right to open Settings
- Go to the Access Tokens section
- Click Create PAT
- Give the token a name (e.g., "CrewAI Account Research") and click Create
- Important: Copy the token immediately. It is only shown once!
Step 3: Set up the Python project
3.1 Clone from GitHub (recommended)
Clone the complete project with all source files:
git clone https://github.com/CDataSoftware/connectai-crewai-account-research.git
cd connectai-crewai-account-research
pip install -r requirements.txt
3.2 Alternative: Create from scratch
Create a new project directory and install dependencies:
mkdir connectai-crewai-account-research
cd connectai-crewai-account-research
pip install crewai crewai-tools langchain langchain-openai python-dotenv requests
Then create the source files described in Steps 4 and 5.
3.3 Configure environment variables
Create a .env file in the project root:
cp .env.example .env
Open .env in a text editor and fill in the credentials:
# OpenAI Configuration (required for CrewAI)
OPENAI_API_KEY=sk-proj-...
# CData Connect AI Configuration
[email protected]
CDATA_ACCESS_TOKEN=your-personal-access-token-here
Replace the placeholder values with actual credentials.
Step 4: Understanding the code architecture
The project consists of six Python modules that work together:
4.1 Config Class
Handles configuration and credential management:
from dataclasses import dataclass
import os
import base64

from dotenv import load_dotenv

load_dotenv()


@dataclass
class Config:
    """Configuration for the Connect AI CrewAI Account Research crew."""

    openai_api_key: str
    cdata_email: str
    cdata_access_token: str
    openai_model: str = "gpt-4o"
    mcp_server_url: str = "https://mcp.cloud.cdata.com/mcp"

    @classmethod
    def from_env(cls) -> "Config":
        """Create configuration from environment variables."""
        openai_api_key = os.getenv("OPENAI_API_KEY")
        cdata_email = os.getenv("CDATA_EMAIL")
        cdata_access_token = os.getenv("CDATA_ACCESS_TOKEN")

        missing = []
        if not openai_api_key:
            missing.append("OPENAI_API_KEY")
        if not cdata_email:
            missing.append("CDATA_EMAIL")
        if not cdata_access_token:
            missing.append("CDATA_ACCESS_TOKEN")
        if missing:
            raise ValueError(
                f"Missing required environment variables: {', '.join(missing)}\n"
                "Please set these in your .env file or environment.\n"
                "See .env.example for a template."
            )

        return cls(
            openai_api_key=openai_api_key,
            cdata_email=cdata_email,
            cdata_access_token=cdata_access_token,
            openai_model=os.getenv("OPENAI_MODEL", "gpt-4o"),
            mcp_server_url=os.getenv(
                "MCP_SERVER_URL", "https://mcp.cloud.cdata.com/mcp"
            ),
        )

    def get_auth_header(self) -> str:
        """Generate Base64 authentication header."""
        credentials = f"{self.cdata_email}:{self.cdata_access_token}"
        return base64.b64encode(credentials.encode()).decode()
The Config class uses @dataclass to load and validate credentials from environment variables, and generates Base64-encoded Basic Auth headers for the MCP server.
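The Basic Auth header is simply `email:token` Base64-encoded. A standalone sketch of what `get_auth_header` produces, using the placeholder credentials from the `.env` example above:

```python
import base64

# Placeholder credentials for illustration only - use real values from .env.
email = "[email protected]"
token = "your-personal-access-token-here"

# Same scheme as Config.get_auth_header: "email:token", Base64-encoded.
header_value = base64.b64encode(f"{email}:{token}".encode()).decode()
print(f"Authorization: Basic {header_value}")
```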
4.2 MCP Client
Handles HTTP communication with the CData Connect AI MCP server:
import json
import sys

import requests

from config import Config


class MCPClient:
    """HTTP client for CData Connect AI MCP server."""

    def __init__(self, config: Config):
        self.config = config
        self._tools = []
        self.session = requests.Session()
        self.session.headers.update({
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
            "Authorization": f"Basic {config.get_auth_header()}",
            "User-Agent": (
                "CDataConnectAI-CrewAIAgent "
                f"(Python/{sys.version_info.major}.{sys.version_info.minor})"
            ),
        })

    @staticmethod
    def _parse_sse_response(response_text: str) -> dict:
        """Parse Server-Sent Events response."""
        for line in response_text.split("\n"):
            line = line.strip()
            if line.startswith("data: "):
                return json.loads(line[6:])
        raise ValueError(f"No data found in SSE response: {response_text[:200]}")

    def list_tools(self) -> list:
        """Discover available tools from the MCP server."""
        response = self.session.post(
            self.config.mcp_server_url,
            json={"jsonrpc": "2.0", "method": "tools/list", "params": {}, "id": 1},
        )
        response.raise_for_status()
        result = self._parse_sse_response(response.text)
        self._tools = result.get("result", {}).get("tools", [])
        return self._tools

    def call_tool(self, tool_name: str, arguments: dict) -> dict:
        """Execute a tool on the MCP server."""
        response = self.session.post(
            self.config.mcp_server_url,
            json={
                "jsonrpc": "2.0",
                "method": "tools/call",
                "params": {"name": tool_name, "arguments": arguments},
                "id": 2,
            },
        )
        response.raise_for_status()
        result = self._parse_sse_response(response.text)
        return result.get("result", {})

    def get_catalogs(self):
        """List available data source connections."""
        return self.call_tool("getCatalogs", {})

    def get_schemas(self, catalog_name: str):
        """Get schemas for a catalog."""
        return self.call_tool("getSchemas", {"catalogName": catalog_name})

    def get_tables(self, catalog_name: str, schema_name: str):
        """Get tables for a schema."""
        return self.call_tool("getTables", {
            "catalogName": catalog_name, "schemaName": schema_name
        })

    def get_columns(self, catalog_name: str, schema_name: str, table_name: str):
        """Get columns for a table."""
        return self.call_tool("getColumns", {
            "catalogName": catalog_name,
            "schemaName": schema_name,
            "tableName": table_name,
        })

    def query_data(self, query: str):
        """Execute a SQL query."""
        return self.call_tool("queryData", {"query": query})
The MCPClient communicates via JSON-RPC 2.0, parses Server-Sent Events (SSE) responses, and provides convenience methods for common MCP operations.
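To see why the client scans for a `data: ` line, here is the same parsing logic run against a hypothetical SSE response body (the JSON content is an illustrative example, not captured server output):

```python
import json

def parse_sse(response_text: str) -> dict:
    """Scan an SSE stream line by line and decode the first data payload."""
    for line in response_text.split("\n"):
        line = line.strip()
        if line.startswith("data: "):
            # Strip the 6-character "data: " prefix and parse the JSON payload.
            return json.loads(line[6:])
    raise ValueError("No data found in SSE response")

# Hypothetical response body in the shape an SSE endpoint streams back.
sample = 'event: message\ndata: {"jsonrpc": "2.0", "id": 1, "result": {"tools": []}}\n\n'
result = parse_sse(sample)
print(result["result"])
```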
4.3 Tool Wrappers
Each function wraps a CData Connect AI MCP call so CrewAI agents can query enterprise data:
from crewai.tools import tool

from config import Config
from mcp_client import MCPClient

# Shared client instance - created once, reused across all tool calls
_config = Config.from_env()
_client = MCPClient(_config)

# Connection identifiers - update these to match the Connect AI setup
CATALOG = "GoogleSheets1"
SCHEMA = "GoogleSheets"


@tool("get_account_info")
def get_account_info(account_name: str):
    """Query the account table for company profile information.

    Returns account details including name, industry, annual_revenue,
    employee_count, and region.
    """
    query = (
        f"SELECT * FROM [{CATALOG}].[{SCHEMA}].[demo_organization_account] "
        f"WHERE [Name] = '{account_name}'"
    )
    return _client.query_data(query)


@tool("get_opportunities")
def get_opportunities(account_name: str):
    """Query the opportunity table for active sales pipeline data.

    Returns open opportunities including stage, amount, probability,
    and expected close_date for the given account.
    """
    query = (
        f"SELECT * FROM [{CATALOG}].[{SCHEMA}].[demo_organization_opportunity] "
        f"WHERE [AccountId] IN "
        f"(SELECT [Id1] FROM [{CATALOG}].[{SCHEMA}].[demo_organization_account] "
        f"WHERE [Name] = '{account_name}')"
    )
    return _client.query_data(query)


@tool("get_support_tickets")
def get_support_tickets(account_name: str):
    """Query the tickets table for open support requests.

    Returns ticket details including priority, status, subject,
    and created_date for the given account.
    """
    query = (
        f"SELECT * FROM [{CATALOG}].[{SCHEMA}].[demo_organization_tickets] "
        f"WHERE [AccountId] IN "
        f"(SELECT [Id1] FROM [{CATALOG}].[{SCHEMA}].[demo_organization_account] "
        f"WHERE [Name] = '{account_name}')"
    )
    return _client.query_data(query)


@tool("get_usage_metrics")
def get_usage_metrics(account_name: str):
    """Query the usage table for product adoption metrics.

    Returns usage data including date, jobs_run, and
    records_processed for the given account.
    """
    query = (
        f"SELECT * FROM [{CATALOG}].[{SCHEMA}].[demo_organization_usage] "
        f"WHERE [ACCOUNT_ID] IN "
        f"(SELECT [Id1] FROM [{CATALOG}].[{SCHEMA}].[demo_organization_account] "
        f"WHERE [Name] = '{account_name}')"
    )
    return _client.query_data(query)
The @tool decorator converts plain Python functions into BaseTool instances that CrewAI agents can use, with docstrings guiding tool selection.
Learn more about CrewAI tools.
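Note that these wrappers interpolate `account_name` directly into SQL, which is fine for a demo but fragile for names containing quotes. A defensive variation (a sketch, not part of the repository) doubles single quotes before interpolation:

```python
def escape_sql_literal(value: str) -> str:
    """Double any single quotes so the value is safe inside a '...' literal."""
    return value.replace("'", "''")

def build_account_query(account_name: str) -> str:
    # Hypothetical helper mirroring get_account_info's query construction,
    # with the account name escaped before it is embedded in the WHERE clause.
    safe_name = escape_sql_literal(account_name)
    return (
        "SELECT * FROM [GoogleSheets1].[GoogleSheets].[demo_organization_account] "
        f"WHERE [Name] = '{safe_name}'"
    )

print(build_account_query("O'Brien & Co"))
```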
4.4 Agent definitions
Three specialized agents, each with a distinct role:
from crewai import Agent

from tools import (
    get_account_info,
    get_opportunities,
    get_support_tickets,
    get_usage_metrics,
)

# Agent 1: Researcher - gathers raw account & opportunity data
researcher = Agent(
    role="Account Researcher",
    goal="Gather comprehensive account data from CRM",
    backstory=(
        "You are an expert at finding all relevant customer information "
        "from CRM systems. You excel at retrieving account profiles and "
        "sales pipeline data. You always present data in a clear, "
        "structured format so other team members can quickly understand "
        "the account landscape."
    ),
    tools=[get_account_info, get_opportunities],
    verbose=True,
    allow_delegation=False,
)

# Agent 2: Analyst - assesses health signals and risk
analyst = Agent(
    role="Health Analyst",
    goal="Assess account health and identify risk signals",
    backstory=(
        "You are skilled at analyzing customer health metrics. You "
        "identify patterns in support tickets and usage data to determine "
        "account risk levels. You classify risk as LOW, MEDIUM, or HIGH "
        "and always back up your assessment with specific metrics."
    ),
    tools=[get_support_tickets, get_usage_metrics],
    verbose=True,
    allow_delegation=False,
)

# Agent 3: Writer - produces the final executive brief
writer = Agent(
    role="Brief Writer",
    goal="Create concise executive summaries",
    backstory=(
        "You are a clear communicator who distills complex customer "
        "insights into actionable executive briefs. You excel at "
        "identifying key metrics and recommending next steps. Your "
        "documents follow a consistent structure: executive summary, "
        "key metrics, risk assessment, and recommended actions."
    ),
    tools=[],  # No tools - synthesizes other agents' work
    verbose=True,
    allow_delegation=False,
)
Each agent has a distinct role, goal, and backstory that guide how CrewAI's underlying LLM approaches the task. The Researcher gathers CRM data, the Analyst assesses health signals, and the Writer synthesizes everything into a deliverable.
Learn more about CrewAI agents.
4.5 Task definitions with context chaining
Tasks define what each agent does and how data flows between them:
from crewai import Task

from agents import researcher, analyst, writer

# Task 1: Research - gather account profile + opportunities
research_task = Task(
    description=(
        "Research the account named {account_name}.\n"
        "Gather:\n"
        "- Account profile (name, industry, revenue, employees, region)\n"
        "- Active opportunities and total pipeline value\n"
        "Provide a comprehensive account profile with all available data."
    ),
    agent=researcher,
    expected_output="Complete account profile with opportunity summary",
)

# Task 2: Analysis - assess health signals and risk
analysis_task = Task(
    description=(
        "Analyze health signals for {account_name}.\n"
        "Assess:\n"
        "- Support ticket volume and severity\n"
        "- Product usage trends and license utilization\n"
        "- Overall risk level (LOW / MEDIUM / HIGH)\n"
        "Provide a risk assessment with supporting metrics."
    ),
    agent=analyst,
    expected_output="Risk assessment with key health metrics",
    context=[research_task],  # Receives Researcher's output
)

# Task 3: Writing - produce the final executive brief
writing_task = Task(
    description=(
        "Create an executive brief for {account_name}.\n"
        "Include:\n"
        "- Executive summary (2-3 sentences)\n"
        "- Key metrics (ARR, usage, support health)\n"
        "- Risk assessment\n"
        "- 3-5 recommended actions (prioritized)\n"
        "Format the output as a clean Markdown document.\n"
        "Use clear, concise language suitable for executives.\n"
        "IMPORTANT: Output raw markdown only. Do NOT wrap your response "
        "in ```markdown code blocks."
    ),
    agent=writer,
    expected_output="Formatted markdown document",
    context=[research_task, analysis_task],  # Receives both outputs
    output_file="output/{account_name}-brief.md",
)
The context parameter chains tasks together: the Analyst receives the Researcher's output, and the Writer receives both. The {account_name} placeholder is filled at runtime via crew.kickoff().
Learn more about CrewAI tasks.
Context chaining flow:
research_task (no context)
↓
analysis_task (context: [research_task])
↓
writing_task (context: [research_task, analysis_task])
↓
output/{account_name}-brief.md
Step 5: Build the application
The main entry point assembles agents and tasks into a crew with a CLI interface:
import argparse
import sys

from crewai import Crew, Process

from agents import researcher, analyst, writer
from tasks import research_task, analysis_task, writing_task


def create_crew() -> Crew:
    """Create the account research crew."""
    crew = Crew(
        agents=[researcher, analyst, writer],
        tasks=[research_task, analysis_task, writing_task],
        process=Process.sequential,  # Run tasks in order
        verbose=True,
    )
    return crew


def main():
    """Parse CLI arguments and run the crew."""
    parser = argparse.ArgumentParser(
        description="Automated Account Research with CrewAI + CData Connect AI"
    )
    parser.add_argument(
        "--account",
        required=True,
        help="Account name to research (e.g. 'Acme Corp')",
    )
    args = parser.parse_args()

    print(f"\n{'='*60}")
    print(" Account Research Crew")
    print(f" Target: {args.account}")
    print(f"{'='*60}\n")

    crew = create_crew()
    try:
        result = crew.kickoff(inputs={"account_name": args.account})
    except Exception as exc:
        print(f"\n[ERROR] Crew execution failed: {exc}", file=sys.stderr)
        sys.exit(1)

    print(f"\n{'='*60}")
    print(f" Account brief generated: output/{args.account}-brief.md")
    print(f"{'='*60}\n")
    print(result)


if __name__ == "__main__":
    main()
Process.sequential ensures tasks run in order, and crew.kickoff(inputs={"account_name": ...}) fills the placeholder in all task descriptions.
Learn more about CrewAI processes.
Step 6: Run the application
With everything configured, run the crew:
python src/crew.py --account "Acme Corp"
Expected output:
============================================================
 Account Research Crew
 Target: Acme Corp
============================================================

[Researcher Agent]: Gathering account data...
  ✓ Account: Acme Corp
  ✓ ARR: $450K
  ✓ Industry: Manufacturing
  ✓ Open opportunities: $180K

[Analyst Agent]: Analyzing health signals...
  ✓ Support tickets: 12 open (3 high priority)
  ✓ Product usage: 94% license utilization
  ⚠️ Risk score: MEDIUM

[Writer Agent]: Creating account brief...
  ✓ Executive summary written
  ✓ Key metrics compiled
  ✓ Action items identified

============================================================
 Account brief generated: output/Acme Corp-brief.md
============================================================
Note: With verbose=True, the console output includes detailed agent reasoning and thought chains. The output above is a simplified summary of what each agent produces.
Step 7: Example
Run the crew against different accounts:
python src/crew.py --account "Acme Corp"
python src/crew.py --account "TechStart Inc"
python src/crew.py --account "Global Manufacturing Ltd"
Example generated brief
The output file follows this structure (see output/sample_briefs/acme-corp-brief.md for a full example):
# Acme Corp - Account Health Brief

**Generated**: 2025-02-02 | **Risk Level**: MEDIUM

## Executive Summary

Acme Corp is a $450K manufacturing customer showing strong product adoption but concerning support trends. Immediate attention required on escalated tickets before renewal discussion.

## Key Metrics

- ARR: $450,000
- Contract Renewal: 45 days
- Support Health: ⚠️ 3 P1 tickets
- Usage: ✅ 94% license utilization
- Pipeline: $180K expansion opportunity

## Risk Assessment

**Overall Risk**: MEDIUM

## Recommended Actions

1. **[HIGH]** Schedule technical review for P1 tickets
2. **[MEDIUM]** Executive business review within 2 weeks
3. **[LOW]** Qualify expansion opportunity
Step 8: Available MCP Tools
The CData Connect AI MCP server exposes these tools:
| Tool | Description |
|---|---|
| getCatalogs | List available data source connections |
| getSchemas | Get schemas for a specific catalog |
| getTables | Get tables in a schema |
| getColumns | Get column metadata for a table |
| queryData | Execute SQL queries |
| getProcedures | List stored procedures |
| getProcedureParameters | Get procedure parameter details |
| executeProcedure | Execute stored procedures |
The four CrewAI tool wrappers in this project map to specific agents:
| CrewAI Tool | MCP Tool Used | Assigned To | Data Domain |
|---|---|---|---|
| get_account_info | queryData | Researcher | Account profiles |
| get_opportunities | queryData | Researcher | Sales pipeline |
| get_support_tickets | queryData | Analyst | Support requests |
| get_usage_metrics | queryData | Analyst | Product usage |
Step 9: SQL Query format
When querying data, use fully qualified table names:
SELECT [column1], [column2]
FROM [CatalogName].[SchemaName].[TableName]
WHERE [column1] = 'value'
ORDER BY [column2]
For example, to query the account table:
SELECT [Name], [annual_revenue], [industry]
FROM [GoogleSheets1].[GoogleSheets].[demo_organization_account]
ORDER BY [annual_revenue] DESC
To query related tables using subqueries:
SELECT * FROM [GoogleSheets1].[GoogleSheets].[demo_organization_opportunity]
WHERE [AccountId] IN (
SELECT [Id1] FROM [GoogleSheets1].[GoogleSheets].[demo_organization_account]
WHERE [Name] = 'Acme Corp'
)
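When several tables are queried, a small helper (hypothetical, not part of the repository) keeps the bracket-quoted three-part names consistent:

```python
def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Build a fully qualified, bracket-quoted table name."""
    return f"[{catalog}].[{schema}].[{table}]"

# Reusing the catalog and schema names from the examples above.
table = qualified_name("GoogleSheets1", "GoogleSheets", "demo_organization_account")
print(f"SELECT [Name] FROM {table}")
```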
Tip: To discover catalog and schema names programmatically:
from config import Config
from mcp_client import MCPClient
config = Config.from_env()
client = MCPClient(config)
print(client.get_catalogs())
print(client.get_schemas("GoogleSheets1"))
print(client.get_tables("GoogleSheets1", "GoogleSheets"))
Troubleshooting
CrewAI installation errors
CrewAI requires Python 3.10 or later. Check the installed version:
python --version
For older versions, install Python 3.10+ and create a new virtual environment.
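A quick programmatic check (a small sketch) that the running interpreter meets CrewAI's minimum version:

```python
import sys

# CrewAI requires Python 3.10+; compare against the running interpreter.
REQUIRED = (3, 10)
ok = sys.version_info[:2] >= REQUIRED
print(f"Python {sys.version_info.major}.{sys.version_info.minor} - "
      f"{'OK' if ok else 'upgrade required'}")
```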
OpenAI API Errors
- Verify the OPENAI_API_KEY is valid and has available credits.
- CrewAI works best with GPT-4o. A different model can be set via the OPENAI_MODEL environment variable.
Authentication errors
- Verify the CData email and PAT are correct in .env.
- Ensure the PAT has not expired.
- Check that the Connect AI account is active.
No tools available
- Confirm at least one data source is connected in Connect AI.
- Check that the user has permissions to access the connection.
Tool calling failures
- Ensure the CData Connect AI instance has at least one active data source connected.
- Use fully qualified table names: [Catalog].[Schema].[Table].
- Check that the catalog and schema names in src/tools.py match the Connect AI configuration.
- Verify column names exist using client.get_columns().
- Review the verbose output for specific error messages from the Connect AI MCP server.
Output file not generated
- Ensure the output/ directory exists (included in the cloned repository); create it manually if the project was built from scratch.
- Check file permissions on the output/ directory.
- Review the agent's verbose output. If an upstream task fails, downstream tasks may not execute.
Next Steps
Now that the multi-agent crew is up and running:
- Add more agents: Create additional specialists (competitive intelligence, financial analyst) following the pattern in src/agents.py.
- Connect more data sources: Add Salesforce, HubSpot, Snowflake, or any of 350+ supported sources through the CData Connect AI dashboard.
- Switch process type: Change from Process.sequential to Process.hierarchical for manager-led coordination, or set async_execution=True on independent tasks to run them concurrently.
- Change output format: Modify the Writer task description to produce JSON instead of Markdown.
- Add memory and caching: Explore CrewAI's memory features for multi-session research.
- Explore advanced patterns: The CrewAI documentation covers delegation, hierarchical processes, and human-in-the-loop workflows.
Resources
- GitHub Repository - Complete source code
- CrewAI Documentation - Advanced crew patterns, memory, and agent collaboration
- CData Connect AI Documentation - Connect more data sources and configure governed access
- CData Prompt Library - Example prompts for various use cases
- OpenAI API Documentation - OpenAI models and API reference
- Model Context Protocol - MCP specification and documentation
- CData Community Forums - Community support and discussions
Get Started with CData Connect AI
Ready to build AI-powered data applications? CData Connect AI provides governed, secure access to 350+ enterprise data sources for AI applications. Multi-agent crews can query live business data from Salesforce, Snowflake, HubSpot, Google Sheets, databases, and more through a single MCP interface.
Sign up for a free trial and start building intelligent multi-agent crews today!