Advanced Developer Guide - Multi-Agent Account Research with CrewAI

Build multi-agent AI crews that query live enterprise data, analyze account health, and generate executive briefs. This guide walks through creating a Python application that combines CrewAI multi-agent orchestration with CData Connect AI to automate account research using three specialized agents.

NOTE: While this guide uses Google Sheets as the data source, the same principles apply to any of the 350+ data sources CData Connect AI supports.

By the end of this guide, the application can:

  • Connect a CrewAI agent to 350+ enterprise data sources through CData Connect AI
  • Define specialized agents with distinct roles, goals, and tool assignments
  • Build MCP tool wrappers using the @tool decorator from crewai.tools
  • Chain tasks with context dependencies to create a sequential data pipeline
  • Assemble a Crew that orchestrates agents through a research-to-report workflow
  • Generate structured Markdown account briefs from a single CLI command

Architecture Overview

The application uses the Model Context Protocol (MCP) to bridge CrewAI agents with enterprise data sources:

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│                 │     │                  │     │                 │
│  CrewAI Agents  │---->│  CData Connect   │---->│  Data Sources   │
│  (3 Agents)     │     │  AI MCP Server   │     │  (350+ types)   │
│                 │<----│                  │<----│                 │
└─────────────────┘     └──────────────────┘     └─────────────────┘
        |                        |
        |    Tool Discovery      |
        |    & Execution         |
        v                        |
┌─────────────────┐              |
│                 │              |
│  OpenAI API     │--------------┘
│  (GPT-4o)       │   Natural Language
│                 │   to SQL Translation
└─────────────────┘

How it works:

  1. The Python application connects to CData Connect AI
  2. The CData Connect AI MCP server exposes tools for data discovery and querying (the JSON-RPC request shape is sketched after this list)
  3. CrewAI agents use these tools to execute SQL queries against live data sources
  4. Each agent processes its assigned data domain and passes results downstream
  5. The final agent produces a formatted Markdown executive brief
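
Concretely, every tool invocation the agents make reaches the MCP server as a JSON-RPC 2.0 request. A rough sketch of the payload shape (this is the request built and posted by the MCPClient shown in Step 4.2; queryData is one of the MCP tools listed in Step 8):

payload = {
    "jsonrpc": "2.0",
    "method": "tools/call",      # MCP method for executing a tool
    "params": {
        "name": "queryData",     # one of the MCP tools listed in Step 8
        "arguments": {
            "query": "SELECT * FROM [GoogleSheets1].[GoogleSheets].[demo_organization_account]"
        },
    },
    "id": 1,
}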

Prerequisites

This guide requires the following:

  • Python 3.10+ installed on the system (CrewAI requires Python 3.10 or later)
  • pip package installer (included with Python 3.4+). Verify with pip --version
  • An OpenAI API key (requires a paid account)
  • A CData Connect AI account (free trial available)
  • A Google account for the sample Google Sheets data
  • Git for cloning the repository

Getting Started

Overview

Here is a quick overview of the steps:

  1. Set up sample data in Google Sheets
  2. Configure CData Connect AI and create a Personal Access Token
  3. Set up the Python project and install dependencies
  4. Understand the code architecture
  5. Build and run the crew application

Step 1: Set up sample data in Google Sheets

The sample Google Sheet contains customer data to demonstrate the capabilities. This dataset includes accounts, sales opportunities, support tickets, and usage metrics.

  1. Navigate to the sample customer health spreadsheet
  2. Click File > Make a copy to save it to Google Drive
  3. Give it a memorable name (e.g., "demo_organization"). This name is needed later when configuring the connection.

The spreadsheet contains four sheets:

  • account: Company information (name, industry, revenue, employees)
  • opportunity: Sales pipeline data (stage, amount, probability)
  • tickets: Support tickets (priority, status, description)
  • usage: Product usage metrics (job runs, records processed)

Step 2: Configure CData Connect AI

2.1 Sign Up or Log In

  1. Navigate to https://www.cdata.com/ai/signup/ to create a new account, or https://cloud.cdata.com/ to log in
  2. Complete the registration process if creating a new account

2.2 Add a Google Sheets connection

  1. Once logged in, click Sources in the left navigation menu and click Add Connection
  2. Select Google Sheets from the Add Connection panel
  3. Configure the connection:
    • Set the Spreadsheet property to the name of the copied sheet (e.g., "demo_organization")
    • Click Sign in to authenticate with Google OAuth
  4. After authentication, navigate to the Permissions tab and verify the user has access

2.3 Create a Personal Access Token

The Python application uses a Personal Access Token (PAT) to authenticate with Connect AI.

  1. Click the Gear icon in the top right to open Settings
  2. Go to the Access Tokens section
  3. Click Create PAT
  4. Give the token a name (e.g., "CrewAI Account Research") and click Create
  5. Important: Copy the token immediately. It is only shown once!

Step 3: Set up the Python project

3.1 Clone from GitHub (recommended)

Clone the complete project with all source files:

git clone https://github.com/CDataSoftware/connectai-crewai-account-research.git
cd connectai-crewai-account-research
pip install -r requirements.txt

3.2 Alternative: Create from scratch

Create a new project directory and install dependencies:

mkdir connectai-crewai-account-research
cd connectai-crewai-account-research
pip install crewai crewai-tools langchain langchain-openai python-dotenv requests

Then create the source files described in Steps 4 and 5.

3.3 Configure environment variables

Create a .env file in the project root:

cp .env.example .env

Open .env in a text editor and fill in the credentials:

# OpenAI Configuration (required for CrewAI)
OPENAI_API_KEY=sk-proj-...

# CData Connect AI Configuration
CDATA_EMAIL=your-email@example.com
CDATA_ACCESS_TOKEN=your-personal-access-token-here

Replace the placeholder values with actual credentials.


Step 4: Understanding the code architecture

The project consists of six Python modules that work together:

4.1 Config Class

Handles configuration and credential management:

from dataclasses import dataclass
import os
import base64

from dotenv import load_dotenv

load_dotenv()


@dataclass
class Config:
    """Configuration for the Connect AI CrewAI Account Research crew."""

    openai_api_key: str
    cdata_email: str
    cdata_access_token: str
    openai_model: str = "gpt-4o"
    mcp_server_url: str = "https://mcp.cloud.cdata.com/mcp"

    @classmethod
    def from_env(cls) -> "Config":
        """Create configuration from environment variables."""
        openai_api_key = os.getenv("OPENAI_API_KEY")
        cdata_email = os.getenv("CDATA_EMAIL")
        cdata_access_token = os.getenv("CDATA_ACCESS_TOKEN")

        missing = []
        if not openai_api_key:
            missing.append("OPENAI_API_KEY")
        if not cdata_email:
            missing.append("CDATA_EMAIL")
        if not cdata_access_token:
            missing.append("CDATA_ACCESS_TOKEN")

        if missing:
            raise ValueError(
                f"Missing required environment variables: {', '.join(missing)} "
                "Please set these in your .env file or environment. "
                "See .env.example for a template."
            )

        return cls(
            openai_api_key=openai_api_key,
            cdata_email=cdata_email,
            cdata_access_token=cdata_access_token,
            openai_model=os.getenv("OPENAI_MODEL", "gpt-4o"),
            mcp_server_url=os.getenv(
                "MCP_SERVER_URL", "https://mcp.cloud.cdata.com/mcp"
            ),
        )

    def get_auth_header(self) -> str:
        """Generate Base64 authentication header."""
        credentials = f"{self.cdata_email}:{self.cdata_access_token}"
        return base64.b64encode(credentials.encode()).decode()

The Config class uses @dataclass to load and validate credentials from environment variables, and generates Base64-encoded Basic Auth headers for the MCP server.
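
For a quick sanity check, the class can be exercised on its own (a minimal sketch, assuming the .env file from Step 3.3 is in place):

from config import Config

# Loads and validates credentials from .env; raises ValueError if any are missing
config = Config.from_env()

print(config.openai_model)                    # "gpt-4o" unless OPENAI_MODEL overrides it
print(config.get_auth_header()[:12] + "...")  # Base64 of "email:token" sent as Basic Auth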

4.2 MCP Client

Handles HTTP communication with the CData Connect AI MCP server:

import json
import sys

import requests

from config import Config


class MCPClient:
    """HTTP client for CData Connect AI MCP server."""

    def __init__(self, config: Config):
        self.config = config
        self._tools = []
        self.session = requests.Session()
        self.session.headers.update({
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
            "Authorization": f"Basic {config.get_auth_header()}",
            "User-Agent": (
                "CDataConnectAI-CrewAIAgent "
                f"(Python/{sys.version_info.major}.{sys.version_info.minor})"
            ),
        })

    @staticmethod
    def _parse_sse_response(response_text: str) -> dict:
        """Parse Server-Sent Events response."""
        for line in response_text.split("\n"):
            line = line.strip()
            if line.startswith("data: "):
                return json.loads(line[6:])
        raise ValueError(f"No data found in SSE response: {response_text[:200]}")

    def list_tools(self) -> list:
        """Discover available tools from the MCP server."""
        response = self.session.post(
            self.config.mcp_server_url,
            json={"jsonrpc": "2.0", "method": "tools/list", "params": {}, "id": 1},
        )
        response.raise_for_status()
        result = self._parse_sse_response(response.text)
        self._tools = result.get("result", {}).get("tools", [])
        return self._tools

    def call_tool(self, tool_name: str, arguments: dict) -> dict:
        """Execute a tool on the MCP server."""
        response = self.session.post(
            self.config.mcp_server_url,
            json={
                "jsonrpc": "2.0",
                "method": "tools/call",
                "params": {"name": tool_name, "arguments": arguments},
                "id": 2,
            },
        )
        response.raise_for_status()
        result = self._parse_sse_response(response.text)
        return result.get("result", {})

    def get_catalogs(self):
        """List available data source connections."""
        return self.call_tool("getCatalogs", {})

    def get_schemas(self, catalog_name: str):
        """Get schemas for a catalog."""
        return self.call_tool("getSchemas", {"catalogName": catalog_name})

    def get_tables(self, catalog_name: str, schema_name: str):
        """Get tables for a schema."""
        return self.call_tool("getTables", {
            "catalogName": catalog_name,
            "schemaName": schema_name,
        })

    def get_columns(self, catalog_name: str, schema_name: str, table_name: str):
        """Get columns for a table."""
        return self.call_tool("getColumns", {
            "catalogName": catalog_name,
            "schemaName": schema_name,
            "tableName": table_name,
        })

    def query_data(self, query: str):
        """Execute a SQL query."""
        return self.call_tool("queryData", {"query": query})

The MCPClient communicates via JSON-RPC 2.0, parses Server-Sent Events (SSE) responses, and provides convenience methods for common MCP operations.
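
A short standalone sketch of the client in use (the catalog and table names assume the Google Sheets connection from Step 2.2; adjust them to match the actual setup):

from config import Config
from mcp_client import MCPClient

client = MCPClient(Config.from_env())

# Discover what the MCP server exposes (see Step 8 for the full tool list)
tools = client.list_tools()
print([t.get("name") for t in tools])

# Run an ad-hoc SQL query against the Google Sheets connection
result = client.query_data(
    "SELECT [Name] FROM [GoogleSheets1].[GoogleSheets].[demo_organization_account]"
)
print(result)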

4.3 Tool Wrappers

Each function wraps a CData Connect AI MCP call so CrewAI agents can query enterprise data:

from crewai.tools import tool

from config import Config
from mcp_client import MCPClient

# Shared client instance - created once, reused across all tool calls
_config = Config.from_env()
_client = MCPClient(_config)

# Connection identifiers - update these to match the Connect AI setup
CATALOG = "GoogleSheets1"
SCHEMA = "GoogleSheets"


@tool("get_account_info")
def get_account_info(account_name: str):
    """Query the account table for company profile information.

    Returns account details including name, industry, annual_revenue,
    employee_count, and region.
    """
    query = (
        f"SELECT * FROM [{CATALOG}].[{SCHEMA}].[demo_organization_account] "
        f"WHERE [Name] = '{account_name}'"
    )
    return _client.query_data(query)


@tool("get_opportunities")
def get_opportunities(account_name: str):
    """Query the opportunity table for active sales pipeline data.

    Returns open opportunities including stage, amount, probability,
    and expected close_date for the given account.
    """
    query = (
        f"SELECT * FROM [{CATALOG}].[{SCHEMA}].[demo_organization_opportunity] "
        f"WHERE [AccountId] IN "
        f"(SELECT [Id1] FROM [{CATALOG}].[{SCHEMA}].[demo_organization_account] "
        f"WHERE [Name] = '{account_name}')"
    )
    return _client.query_data(query)


@tool("get_support_tickets")
def get_support_tickets(account_name: str):
    """Query the tickets table for open support requests.

    Returns ticket details including priority, status, subject,
    and created_date for the given account.
    """
    query = (
        f"SELECT * FROM [{CATALOG}].[{SCHEMA}].[demo_organization_tickets] "
        f"WHERE [AccountId] IN "
        f"(SELECT [Id1] FROM [{CATALOG}].[{SCHEMA}].[demo_organization_account] "
        f"WHERE [Name] = '{account_name}')"
    )
    return _client.query_data(query)


@tool("get_usage_metrics")
def get_usage_metrics(account_name: str):
    """Query the usage table for product adoption metrics.

    Returns usage data including date, jobs_run, and records_processed
    for the given account.
    """
    query = (
        f"SELECT * FROM [{CATALOG}].[{SCHEMA}].[demo_organization_usage] "
        f"WHERE [ACCOUNT_ID] IN "
        f"(SELECT [Id1] FROM [{CATALOG}].[{SCHEMA}].[demo_organization_account] "
        f"WHERE [Name] = '{account_name}')"
    )
    return _client.query_data(query)

The @tool decorator converts plain Python functions into BaseTool instances that CrewAI agents can use, with docstrings guiding tool selection.
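
The same pattern extends to any other query. For example, a hypothetical get_all_accounts tool (not part of the project, shown only to illustrate the pattern) could be added to src/tools.py and reuse the shared client and connection identifiers defined above:

@tool("get_all_accounts")
def get_all_accounts():
    """List every account name and industry in the CRM.

    Useful when the exact account name is not known in advance.
    """
    # Hypothetical helper - reuses the module-level _client, CATALOG, and SCHEMA
    query = (
        f"SELECT [Name], [industry] "
        f"FROM [{CATALOG}].[{SCHEMA}].[demo_organization_account]"
    )
    return _client.query_data(query)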

Learn more about CrewAI tools.

4.4 Agent definitions

Three specialized agents, each with a distinct role:

from crewai import Agent

from tools import (
    get_account_info,
    get_opportunities,
    get_support_tickets,
    get_usage_metrics,
)

# Agent 1: Researcher - gathers raw account & opportunity data
researcher = Agent(
    role="Account Researcher",
    goal="Gather comprehensive account data from CRM",
    backstory=(
        "You are an expert at finding all relevant customer information "
        "from CRM systems. You excel at retrieving account profiles and "
        "sales pipeline data. You always present data in a clear, "
        "structured format so other team members can quickly understand "
        "the account landscape."
    ),
    tools=[get_account_info, get_opportunities],
    verbose=True,
    allow_delegation=False,
)

# Agent 2: Analyst - assesses health signals and risk
analyst = Agent(
    role="Health Analyst",
    goal="Assess account health and identify risk signals",
    backstory=(
        "You are skilled at analyzing customer health metrics. You "
        "identify patterns in support tickets and usage data to determine "
        "account risk levels. You classify risk as LOW, MEDIUM, or HIGH "
        "and always back up your assessment with specific metrics."
    ),
    tools=[get_support_tickets, get_usage_metrics],
    verbose=True,
    allow_delegation=False,
)

# Agent 3: Writer - produces the final executive brief
writer = Agent(
    role="Brief Writer",
    goal="Create concise executive summaries",
    backstory=(
        "You are a clear communicator who distills complex customer "
        "insights into actionable executive briefs. You excel at "
        "identifying key metrics and recommending next steps. Your "
        "documents follow a consistent structure: executive summary, "
        "key metrics, risk assessment, and recommended actions."
    ),
    tools=[],  # No tools - synthesizes other agents' work
    verbose=True,
    allow_delegation=False,
)

Each agent has a distinct role, goal, and backstory that guide how CrewAI's underlying LLM approaches the task. The Researcher gathers CRM data, the Analyst assesses health signals, and the Writer synthesizes everything into a deliverable.
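
New specialists follow the same recipe. As an illustration only (this agent is not part of the repository), a fourth agent focused on renewals might look like:

from crewai import Agent
from tools import get_opportunities, get_usage_metrics

# Hypothetical fourth agent - shown to illustrate the pattern, not in src/agents.py
renewal_specialist = Agent(
    role="Renewal Specialist",
    goal="Evaluate renewal readiness ahead of contract end dates",
    backstory=(
        "You focus on upcoming renewals. You weigh pipeline value against "
        "adoption trends and flag accounts that need attention before the "
        "renewal conversation."
    ),
    tools=[get_opportunities, get_usage_metrics],
    verbose=True,
    allow_delegation=False,
)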

Learn more about CrewAI agents.

4.5 Task definitions with context chaining

Tasks define what each agent does and how data flows between them:

from crewai import Task

from agents import researcher, analyst, writer

# Task 1: Research - gather account profile + opportunities
research_task = Task(
    description=(
        "Research the account named {account_name}. "
        "Gather: "
        "- Account profile (name, industry, revenue, employees, region) "
        "- Active opportunities and total pipeline value "
        "Provide a comprehensive account profile with all available data."
    ),
    agent=researcher,
    expected_output="Complete account profile with opportunity summary",
)

# Task 2: Analysis - assess health signals and risk
analysis_task = Task(
    description=(
        "Analyze health signals for {account_name}. "
        "Assess: "
        "- Support ticket volume and severity "
        "- Product usage trends and license utilization "
        "- Overall risk level (LOW / MEDIUM / HIGH) "
        "Provide a risk assessment with supporting metrics."
    ),
    agent=analyst,
    expected_output="Risk assessment with key health metrics",
    context=[research_task],  # Receives Researcher's output
)

# Task 3: Writing - produce the final executive brief
writing_task = Task(
    description=(
        "Create an executive brief for {account_name}. "
        "Include: "
        "- Executive summary (2-3 sentences) "
        "- Key metrics (ARR, usage, support health) "
        "- Risk assessment "
        "- 3-5 recommended actions (prioritized) "
        "Format the output as a clean Markdown document. "
        "Use clear, concise language suitable for executives. "
        "IMPORTANT: Output raw markdown only. Do NOT wrap your response "
        "in ```markdown code blocks."
    ),
    agent=writer,
    expected_output="Formatted markdown document",
    context=[research_task, analysis_task],  # Receives both outputs
    output_file="output/{account_name}-brief.md",
)

The context parameter chains tasks together: the Analyst receives the Researcher's output, and the Writer receives both. The {account_name} placeholder is filled at runtime via crew.kickoff().

Learn more about CrewAI tasks.

Context chaining flow:

research_task (no context)
    ↓
analysis_task (context: [research_task])
    ↓
writing_task (context: [research_task, analysis_task])
    ↓
output/{account_name}-brief.md

Step 5: Build the application

The main entry point assembles agents and tasks into a crew with a CLI interface:

import argparse
import sys

from crewai import Crew, Process

from agents import researcher, analyst, writer
from tasks import research_task, analysis_task, writing_task


def create_crew() -> Crew:
    """Create the account research crew."""
    crew = Crew(
        agents=[researcher, analyst, writer],
        tasks=[research_task, analysis_task, writing_task],
        process=Process.sequential,  # Run tasks in order
        verbose=True,
    )
    return crew


def main():
    """Parse CLI arguments and run the crew."""
    parser = argparse.ArgumentParser(
        description="Automated Account Research with CrewAI + CData Connect AI"
    )
    parser.add_argument(
        "--account",
        required=True,
        help="Account name to research (e.g. 'Acme Corp')",
    )
    args = parser.parse_args()

    print(f"\n{'='*60}")
    print("  Account Research Crew")
    print(f"  Target: {args.account}")
    print(f"{'='*60}\n")

    crew = create_crew()

    try:
        result = crew.kickoff(inputs={"account_name": args.account})
    except Exception as exc:
        print(f"\n[ERROR] Crew execution failed: {exc}", file=sys.stderr)
        sys.exit(1)

    print(f"\n{'='*60}")
    print(f"  Account brief generated: output/{args.account}-brief.md")
    print(f"{'='*60}\n")
    print(result)


if __name__ == "__main__":
    main()

Process.sequential ensures tasks run in order, and crew.kickoff(inputs={"account_name": ...}) fills the placeholder in all task descriptions.
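
The crew can also be driven programmatically instead of through the CLI. A minimal sketch reusing create_crew() from above (assuming it is run from the src/ directory so the module imports resolve):

from crew import create_crew

crew = create_crew()

# kickoff() substitutes {account_name} in every task description and
# returns the final task's output (the Writer's Markdown brief)
result = crew.kickoff(inputs={"account_name": "Acme Corp"})
print(result)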

Learn more about CrewAI processes.


Step 6: Run the application

With everything configured, run the crew:

python src/crew.py --account "Acme Corp"

Expected output:

============================================================
  Account Research Crew
  Target: Acme Corp
============================================================

[Researcher Agent]: Gathering account data...
  ✓ Account: Acme Corp
  ✓ ARR: $450K
  ✓ Industry: Manufacturing
  ✓ Open opportunities: $180K

[Analyst Agent]: Analyzing health signals...
  ✓ Support tickets: 12 open (3 high priority)
  ✓ Product usage: 94% license utilization
  ⚠️ Risk score: MEDIUM

[Writer Agent]: Creating account brief...
  ✓ Executive summary written
  ✓ Key metrics compiled
  ✓ Action items identified

============================================================
  Account brief generated: output/Acme Corp-brief.md
============================================================

Note: With verbose=True, the console output includes detailed agent reasoning and thought chains. The output above is a simplified summary of what each agent produces.


Step 7: Example

Run the crew against different accounts:

python src/crew.py --account "Acme Corp"
python src/crew.py --account "TechStart Inc"
python src/crew.py --account "Global Manufacturing Ltd"

Example generated brief

The output file follows this structure (see output/sample_briefs/acme-corp-brief.md for a full example):

# Acme Corp - Account Health Brief
**Generated**: 2025-02-02 | **Risk Level**: MEDIUM

## Executive Summary
Acme Corp is a $450K manufacturing customer showing strong product
adoption but concerning support trends. Immediate attention required
on escalated tickets before renewal discussion.

## Key Metrics
- ARR: $450,000
- Contract Renewal: 45 days
- Support Health: ⚠️ 3 P1 tickets
- Usage: ✅ 94% license utilization
- Pipeline: $180K expansion opportunity

## Risk Assessment
**Overall Risk**: MEDIUM

## Recommended Actions
1. **[HIGH]** Schedule technical review for P1 tickets
2. **[MEDIUM]** Executive business review within 2 weeks
3. **[LOW]** Qualify expansion opportunity

Step 8: Available MCP Tools

The CData Connect AI MCP server exposes these tools:

Tool                     Description
getCatalogs              List available data source connections
getSchemas               Get schemas for a specific catalog
getTables                Get tables in a schema
getColumns               Get column metadata for a table
queryData                Execute SQL queries
getProcedures            List stored procedures
getProcedureParameters   Get procedure parameter details
executeProcedure         Execute stored procedures

The four CrewAI tool wrappers in this project map to specific agents:

CrewAI Tool           MCP Tool Used   Assigned To   Data Domain
get_account_info      queryData       Researcher    Account profiles
get_opportunities     queryData       Researcher    Sales pipeline
get_support_tickets   queryData       Analyst       Support requests
get_usage_metrics     queryData       Analyst       Product usage

Step 9: SQL Query format

When querying data, use fully qualified table names:

SELECT [column1], [column2]
FROM [CatalogName].[SchemaName].[TableName]
WHERE [column1] = 'value'
ORDER BY [column2]

For example, to query the account table:

SELECT [Name], [annual_revenue], [industry]
FROM [GoogleSheets1].[GoogleSheets].[demo_organization_account]
ORDER BY [annual_revenue] DESC

To query related tables using subqueries:

SELECT *
FROM [GoogleSheets1].[GoogleSheets].[demo_organization_opportunity]
WHERE [AccountId] IN (
    SELECT [Id1]
    FROM [GoogleSheets1].[GoogleSheets].[demo_organization_account]
    WHERE [Name] = 'Acme Corp'
)

Tip: To discover catalog and schema names programmatically:

from config import Config
from mcp_client import MCPClient

config = Config.from_env()
client = MCPClient(config)

print(client.get_catalogs())
print(client.get_schemas("GoogleSheets1"))
print(client.get_tables("GoogleSheets1", "GoogleSheets"))
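
The same client also exposes column metadata, which is handy for confirming the exact column names referenced in src/tools.py (such as [Name] and [Id1]) before the agents run. A small sketch, assuming the catalog and table names shown above:

from config import Config
from mcp_client import MCPClient

client = MCPClient(Config.from_env())

# Column metadata for the account table in the sample spreadsheet
print(client.get_columns("GoogleSheets1", "GoogleSheets", "demo_organization_account"))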

Troubleshooting

CrewAI installation errors

CrewAI requires Python 3.10 or later. Check the installed version:

python --version

For older versions, install Python 3.10+ and create a new virtual environment.

OpenAI API Errors

  • Verify the OPENAI_API_KEY is valid and has available credits.
  • The project defaults to GPT-4o (see the Config class). A different model can be set via the OPENAI_MODEL environment variable.

Authentication errors

  • Verify the CData email and PAT are correct in .env (the sketch after this list tests them directly).
  • Ensure the PAT has not expired.
  • Check that the Connect AI account is active.
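
To isolate credential problems from the rest of the crew, the MCP connection can be tested directly with the project's own client. A minimal sketch; with a bad email or token, raise_for_status() surfaces the failure as an HTTP 401 error:

from config import Config
from mcp_client import MCPClient

# If the email/PAT pair is wrong, list_tools() raises requests.HTTPError (401)
client = MCPClient(Config.from_env())
print(f"Authenticated - {len(client.list_tools())} MCP tools available")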

No tools available

  • Confirm at least one data source is connected in Connect AI.
  • Check that the user has permissions to access the connection.

Tool calling failures

  • Ensure the CData Connect AI instance has at least one active data source connected.
  • Use fully qualified table names: [Catalog].[Schema].[Table].
  • Check that the catalog and schema names in src/tools.py match the Connect AI configuration.
  • Verify column names exist using client.get_columns().
  • Review the verbose output for specific error messages from the Connect AI MCP server.

Output file not generated

  • Ensure the output/ directory exists (it is included in the cloned repository; create it manually if the project was built from scratch).
  • Check file permissions on the output/ directory.
  • Review the agent's verbose output. If an upstream task fails, downstream tasks may not execute.

Next Steps

Now that the multi-agent crew is up and running:

  • Add more agents: Create additional specialists (competitive intelligence, financial analyst) following the pattern in src/agents.py.
  • Connect more data sources: Add Salesforce, HubSpot, Snowflake, or any of 350+ supported sources through the CData Connect AI dashboard.
  • Switch process type: Change from Process.sequential to Process.hierarchical (which requires a manager LLM or manager agent) to have a manager coordinate the specialists instead of running tasks in a fixed order.
  • Change output format: Modify the Writer task description to produce JSON instead of Markdown.
  • Add memory and caching: Explore CrewAI's memory features for multi-session research (a minimal sketch follows this list).
  • Explore advanced patterns: The CrewAI documentation covers delegation, hierarchical processes, and human-in-the-loop workflows.
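
As a starting point for the memory idea above, a minimal sketch, assuming a recent CrewAI release where Crew accepts a memory flag (check the CrewAI documentation for the storage backends it uses by default):

from crewai import Crew, Process
from agents import researcher, analyst, writer
from tasks import research_task, analysis_task, writing_task

crew = Crew(
    agents=[researcher, analyst, writer],
    tasks=[research_task, analysis_task, writing_task],
    process=Process.sequential,
    memory=True,   # retain context across runs for multi-session research
    verbose=True,
)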

Get Started with CData Connect AI

Ready to build AI-powered data applications? CData Connect AI provides governed, secure access to 350+ enterprise data sources for AI applications. Multi-agent crews can query live business data from Salesforce, Snowflake, HubSpot, Google Sheets, databases, and more through a single MCP interface.

Sign up for a free trial and start building intelligent multi-agent crews today!