Advanced Developer Guide - Multi-Agent Account Research with CrewAI
Build multi-agent AI crews that query live enterprise data, analyze account health, and generate executive briefs. This guide walks through creating a Python application that combines CrewAI multi-agent orchestration with CData Connect AI to automate account research using three specialized agents.
NOTE: While this guide uses Google Sheets as the data source, the same principles apply to any of the 350+ data sources CData Connect AI supports.
By the end of this guide, the application can:
- Connect a CrewAI agent to 350+ enterprise data sources through CData Connect AI
- Define specialized agents with distinct roles, goals, and tool assignments
- Build MCP tool wrappers using the @tool decorator from crewai.tools
- Chain tasks with context dependencies to create a sequential data pipeline
- Assemble a Crew that orchestrates agents through a research-to-report workflow
- Generate structured Markdown account briefs from a single CLI command
Architecture Overview
The application uses the Model Context Protocol (MCP) to bridge CrewAI agents with enterprise data sources:
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ │ │ │ │ │
│ CrewAI Agents │---->│ CData Connect │---->│ Data Sources │
│ (3 Agents) │ │ AI MCP Server │ │ (350+ types) │
│ │<----│ │<----│ │
└─────────────────┘ └──────────────────┘ └─────────────────┘
| |
| Tool Discovery |
| & Execution |
v |
┌─────────────────┐ |
│ │ |
│ OpenAI API │--------------┘
│ (GPT-4o) │ Natural Language
│ │ to SQL Translation
└─────────────────┘
How it works:
- The Python application connects to CData Connect AI
- The CData Connect AI MCP exposes tools for data discovery and querying
- CrewAI agents use these tools to execute SQL queries against live data sources
- Each agent processes its assigned data domain and passes results downstream
- The final agent produces a formatted Markdown executive brief
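Every tool invocation in this flow is a single JSON-RPC 2.0 POST to the MCP endpoint. As a rough illustration of the wire format (the same payload shape the MCP client in Step 4 sends):

```python
import json

# Sketch of the JSON-RPC 2.0 payload the application POSTs to the MCP server
# when an agent runs a SQL query. The "SELECT 1" query is a placeholder.
payload = {
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
        "name": "queryData",
        "arguments": {"query": "SELECT 1"},
    },
    "id": 1,
}

# The request body is plain JSON; the server streams its answer back
# as Server-Sent Events (parsed in Step 4.2).
body = json.dumps(payload)
print(body)
```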
Prerequisites
This guide requires the following:
- Python 3.10 or later installed on the system (Download Python); CrewAI does not support earlier versions
- pip package installer (included with Python 3.4+). Verify with pip --version
- An OpenAI API key (requires a paid account)
- A CData Connect AI account (free trial available)
- A Google account for the sample Google Sheets data
- Git for cloning the repository
Getting Started
Overview
Here is a quick overview of the steps:
- Set up sample data in Google Sheets
- Configure CData Connect AI and create a Personal Access Token
- Set up the Python project and install dependencies
- Understand the code architecture
- Build and run the crew application
Step 1: Set up sample data in Google Sheets
The sample Google Sheet contains customer data to demonstrate the capabilities. This dataset includes accounts, sales opportunities, support tickets, and usage metrics.
- Navigate to the sample customer health spreadsheet
- Click File > Make a copy to save it to Google Drive
- Give it a memorable name (e.g., "demo_organization"). This name is needed later when configuring the connection.
The spreadsheet contains four sheets:
- account: Company information (name, industry, revenue, employees)
- opportunity: Sales pipeline data (stage, amount, probability)
- tickets: Support tickets (priority, status, description)
- usage: Product usage metrics (job runs, records processed)
Step 2: Configure CData Connect AI
2.1 Sign Up or Log In
- Navigate to https://www.cdata.com/ai/signup/ to create a new account, or https://cloud.cdata.com/ to log in
- Complete the registration process if creating a new account
2.2 Add a Google Sheets connection
- Once logged in, click Sources in the left navigation menu and click Add Connection
- Select Google Sheets from the Add Connection panel
- Configure the connection:
  - Set the Spreadsheet property to the name of the copied sheet (e.g., "demo_organization")
  - Click Sign in to authenticate with Google OAuth
- After authentication, navigate to the Permissions tab and verify the user has access
2.3 Create a Personal Access Token
The Python application uses a Personal Access Token (PAT) to authenticate with Connect AI.
- Click the Gear icon in the top right to open Settings
- Go to the Access Tokens section
- Click Create PAT
- Give the token a name (e.g., "CrewAI Account Research") and click Create
- Important: Copy the token immediately. It is only shown once!
Step 3: Set up the Python project
3.1 Clone from GitHub (recommended)
Clone the complete project with all source files:
git clone https://github.com/CDataSoftware/connectai-crewai-account-research.git
cd connectai-crewai-account-research
pip install -r requirements.txt
3.2 Alternative: Create from scratch
Create a new project directory and install dependencies:
mkdir connectai-crewai-account-research
cd connectai-crewai-account-research
pip install crewai crewai-tools langchain langchain-openai python-dotenv requests
Then create the source files described in Steps 4 and 5.
3.3 Configure environment variables
Create a .env file in the project root:
cp .env.example .env
Open .env in a text editor and fill in the credentials:
# OpenAI Configuration (required for CrewAI)
OPENAI_API_KEY=sk-proj-...
# CData Connect AI Configuration
[email protected]
CDATA_ACCESS_TOKEN=your-personal-access-token-here
Replace the placeholder values with actual credentials.
Step 4: Understanding the code architecture
The project consists of six Python modules that work together:
4.1 Config Class
Handles configuration and credential management:
from dataclasses import dataclass
import os
import base64

from dotenv import load_dotenv

load_dotenv()


@dataclass
class Config:
    """Configuration for the Connect AI CrewAI Account Research crew."""

    openai_api_key: str
    cdata_email: str
    cdata_access_token: str
    openai_model: str = "gpt-4o"
    mcp_server_url: str = "https://mcp.cloud.cdata.com/mcp"

    @classmethod
    def from_env(cls) -> "Config":
        """Create configuration from environment variables."""
        openai_api_key = os.getenv("OPENAI_API_KEY")
        cdata_email = os.getenv("CDATA_EMAIL")
        cdata_access_token = os.getenv("CDATA_ACCESS_TOKEN")

        missing = []
        if not openai_api_key:
            missing.append("OPENAI_API_KEY")
        if not cdata_email:
            missing.append("CDATA_EMAIL")
        if not cdata_access_token:
            missing.append("CDATA_ACCESS_TOKEN")
        if missing:
            raise ValueError(
                f"Missing required environment variables: {', '.join(missing)}\n"
                "Please set these in your .env file or environment.\n"
                "See .env.example for a template."
            )

        return cls(
            openai_api_key=openai_api_key,
            cdata_email=cdata_email,
            cdata_access_token=cdata_access_token,
            openai_model=os.getenv("OPENAI_MODEL", "gpt-4o"),
            mcp_server_url=os.getenv(
                "MCP_SERVER_URL", "https://mcp.cloud.cdata.com/mcp"
            ),
        )

    def get_auth_header(self) -> str:
        """Generate Base64 authentication header."""
        credentials = f"{self.cdata_email}:{self.cdata_access_token}"
        return base64.b64encode(credentials.encode()).decode()
The Config class uses @dataclass to load and validate credentials from environment variables, and generates Base64-encoded Basic Auth headers for the MCP server.
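The Basic Auth header is simply `email:token` Base64-encoded. A standalone sketch of what `get_auth_header` produces, using the placeholder credentials from the `.env` example above:

```python
import base64

# Placeholder credentials for illustration only - use real values from .env.
email = "[email protected]"
token = "your-personal-access-token-here"

# Same scheme as Config.get_auth_header: "email:token", Base64-encoded.
header_value = base64.b64encode(f"{email}:{token}".encode()).decode()
print(f"Authorization: Basic {header_value}")
```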
4.2 MCP Client
Handles HTTP communication with the CData Connect AI MCP server:
import json
import sys

import requests

from config import Config


class MCPClient:
    """HTTP client for CData Connect AI MCP server."""

    def __init__(self, config: Config):
        self.config = config
        self._tools = []
        self.session = requests.Session()
        self.session.headers.update({
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
            "Authorization": f"Basic {config.get_auth_header()}",
            "User-Agent": (
                "CDataConnectAI-CrewAIAgent "
                f"(Python/{sys.version_info.major}.{sys.version_info.minor})"
            ),
        })

    @staticmethod
    def _parse_sse_response(response_text: str) -> dict:
        """Parse Server-Sent Events response."""
        for line in response_text.split("\n"):
            line = line.strip()
            if line.startswith("data: "):
                return json.loads(line[6:])
        raise ValueError(f"No data found in SSE response: {response_text[:200]}")

    def list_tools(self) -> list:
        """Discover available tools from the MCP server."""
        response = self.session.post(
            self.config.mcp_server_url,
            json={"jsonrpc": "2.0", "method": "tools/list", "params": {}, "id": 1},
        )
        response.raise_for_status()
        result = self._parse_sse_response(response.text)
        self._tools = result.get("result", {}).get("tools", [])
        return self._tools

    def call_tool(self, tool_name: str, arguments: dict) -> dict:
        """Execute a tool on the MCP server."""
        response = self.session.post(
            self.config.mcp_server_url,
            json={
                "jsonrpc": "2.0",
                "method": "tools/call",
                "params": {"name": tool_name, "arguments": arguments},
                "id": 2,
            },
        )
        response.raise_for_status()
        result = self._parse_sse_response(response.text)
        return result.get("result", {})

    def get_catalogs(self):
        """List available data source connections."""
        return self.call_tool("getCatalogs", {})

    def get_schemas(self, catalog_name: str):
        """Get schemas for a catalog."""
        return self.call_tool("getSchemas", {"catalogName": catalog_name})

    def get_tables(self, catalog_name: str, schema_name: str):
        """Get tables for a schema."""
        return self.call_tool("getTables", {
            "catalogName": catalog_name, "schemaName": schema_name
        })

    def get_columns(self, catalog_name: str, schema_name: str, table_name: str):
        """Get columns for a table."""
        return self.call_tool("getColumns", {
            "catalogName": catalog_name,
            "schemaName": schema_name,
            "tableName": table_name,
        })

    def query_data(self, query: str):
        """Execute a SQL query."""
        return self.call_tool("queryData", {"query": query})
The MCPClient communicates via JSON-RPC 2.0, parses Server-Sent Events (SSE) responses, and provides convenience methods for common MCP operations.
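To see why the client scans for a `data: ` line, here is the same parsing logic run against a hypothetical SSE response body (the JSON content is an illustrative example, not captured server output):

```python
import json

def parse_sse(response_text: str) -> dict:
    """Scan an SSE stream line by line and decode the first data payload."""
    for line in response_text.split("\n"):
        line = line.strip()
        if line.startswith("data: "):
            # Strip the 6-character "data: " prefix and parse the JSON payload.
            return json.loads(line[6:])
    raise ValueError("No data found in SSE response")

# Hypothetical response body in the shape an SSE endpoint streams back.
sample = 'event: message\ndata: {"jsonrpc": "2.0", "id": 1, "result": {"tools": []}}\n\n'
result = parse_sse(sample)
print(result["result"])
```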
4.3 Tool Wrappers
Each function wraps a CData Connect AI MCP call so CrewAI agents can query enterprise data:
from crewai.tools import tool

from config import Config
from mcp_client import MCPClient

# Shared client instance - created once, reused across all tool calls
_config = Config.from_env()
_client = MCPClient(_config)

# Connection identifiers - update these to match the Connect AI setup
CATALOG = "GoogleSheets1"
SCHEMA = "GoogleSheets"


@tool("get_account_info")
def get_account_info(account_name: str):
    """Query the account table for company profile information.

    Returns account details including name, industry, annual_revenue,
    employee_count, and region.
    """
    query = (
        f"SELECT * FROM [{CATALOG}].[{SCHEMA}].[demo_organization_account] "
        f"WHERE [Name] = '{account_name}'"
    )
    return _client.query_data(query)


@tool("get_opportunities")
def get_opportunities(account_name: str):
    """Query the opportunity table for active sales pipeline data.

    Returns open opportunities including stage, amount, probability,
    and expected close_date for the given account.
    """
    query = (
        f"SELECT * FROM [{CATALOG}].[{SCHEMA}].[demo_organization_opportunity] "
        f"WHERE [AccountId] IN "
        f"(SELECT [Id1] FROM [{CATALOG}].[{SCHEMA}].[demo_organization_account] "
        f"WHERE [Name] = '{account_name}')"
    )
    return _client.query_data(query)


@tool("get_support_tickets")
def get_support_tickets(account_name: str):
    """Query the tickets table for open support requests.

    Returns ticket details including priority, status, subject,
    and created_date for the given account.
    """
    query = (
        f"SELECT * FROM [{CATALOG}].[{SCHEMA}].[demo_organization_tickets] "
        f"WHERE [AccountId] IN "
        f"(SELECT [Id1] FROM [{CATALOG}].[{SCHEMA}].[demo_organization_account] "
        f"WHERE [Name] = '{account_name}')"
    )
    return _client.query_data(query)


@tool("get_usage_metrics")
def get_usage_metrics(account_name: str):
    """Query the usage table for product adoption metrics.

    Returns usage data including date, jobs_run, and
    records_processed for the given account.
    """
    query = (
        f"SELECT * FROM [{CATALOG}].[{SCHEMA}].[demo_organization_usage] "
        f"WHERE [ACCOUNT_ID] IN "
        f"(SELECT [Id1] FROM [{CATALOG}].[{SCHEMA}].[demo_organization_account] "
        f"WHERE [Name] = '{account_name}')"
    )
    return _client.query_data(query)
The @tool decorator converts plain Python functions into BaseTool instances that CrewAI agents can use, with docstrings guiding tool selection.
Learn more about CrewAI tools.
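Note that these wrappers interpolate `account_name` directly into SQL, which is fine for a demo but fragile for names containing quotes. A defensive variation (a sketch, not part of the repository) doubles single quotes before interpolation:

```python
def escape_sql_literal(value: str) -> str:
    """Double any single quotes so the value is safe inside a '...' literal."""
    return value.replace("'", "''")

def build_account_query(account_name: str) -> str:
    # Hypothetical helper mirroring get_account_info's query construction,
    # with the account name escaped before it is embedded in the WHERE clause.
    safe_name = escape_sql_literal(account_name)
    return (
        "SELECT * FROM [GoogleSheets1].[GoogleSheets].[demo_organization_account] "
        f"WHERE [Name] = '{safe_name}'"
    )

print(build_account_query("O'Brien & Co"))
```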
4.4 Agent definitions
Three specialized agents, each with a distinct role:
from crewai import Agent

from tools import (
    get_account_info,
    get_opportunities,
    get_support_tickets,
    get_usage_metrics,
)

# Agent 1: Researcher - gathers raw account & opportunity data
researcher = Agent(
    role="Account Researcher",
    goal="Gather comprehensive account data from CRM",
    backstory=(
        "You are an expert at finding all relevant customer information "
        "from CRM systems. You excel at retrieving account profiles and "
        "sales pipeline data. You always present data in a clear, "
        "structured format so other team members can quickly understand "
        "the account landscape."
    ),
    tools=[get_account_info, get_opportunities],
    verbose=True,
    allow_delegation=False,
)

# Agent 2: Analyst - assesses health signals and risk
analyst = Agent(
    role="Health Analyst",
    goal="Assess account health and identify risk signals",
    backstory=(
        "You are skilled at analyzing customer health metrics. You "
        "identify patterns in support tickets and usage data to determine "
        "account risk levels. You classify risk as LOW, MEDIUM, or HIGH "
        "and always back up your assessment with specific metrics."
    ),
    tools=[get_support_tickets, get_usage_metrics],
    verbose=True,
    allow_delegation=False,
)

# Agent 3: Writer - produces the final executive brief
writer = Agent(
    role="Brief Writer",
    goal="Create concise executive summaries",
    backstory=(
        "You are a clear communicator who distills complex customer "
        "insights into actionable executive briefs. You excel at "
        "identifying key metrics and recommending next steps. Your "
        "documents follow a consistent structure: executive summary, "
        "key metrics, risk assessment, and recommended actions."
    ),
    tools=[],  # No tools - synthesizes other agents' work
    verbose=True,
    allow_delegation=False,
)
Each agent has a distinct role, goal, and backstory that guide how CrewAI's underlying LLM approaches the task. The Researcher gathers CRM data, the Analyst assesses health signals, and the Writer synthesizes everything into a deliverable.
Learn more about CrewAI agents.
4.5 Task definitions with context chaining
Tasks define what each agent does and how data flows between them:
from crewai import Task

from agents import researcher, analyst, writer

# Task 1: Research - gather account profile + opportunities
research_task = Task(
    description=(
        "Research the account named {account_name}.\n"
        "Gather:\n"
        "- Account profile (name, industry, revenue, employees, region)\n"
        "- Active opportunities and total pipeline value\n"
        "Provide a comprehensive account profile with all available data."
    ),
    agent=researcher,
    expected_output="Complete account profile with opportunity summary",
)

# Task 2: Analysis - assess health signals and risk
analysis_task = Task(
    description=(
        "Analyze health signals for {account_name}.\n"
        "Assess:\n"
        "- Support ticket volume and severity\n"
        "- Product usage trends and license utilization\n"
        "- Overall risk level (LOW / MEDIUM / HIGH)\n"
        "Provide a risk assessment with supporting metrics."
    ),
    agent=analyst,
    expected_output="Risk assessment with key health metrics",
    context=[research_task],  # Receives Researcher's output
)

# Task 3: Writing - produce the final executive brief
writing_task = Task(
    description=(
        "Create an executive brief for {account_name}.\n"
        "Include:\n"
        "- Executive summary (2-3 sentences)\n"
        "- Key metrics (ARR, usage, support health)\n"
        "- Risk assessment\n"
        "- 3-5 recommended actions (prioritized)\n"
        "Format the output as a clean Markdown document.\n"
        "Use clear, concise language suitable for executives.\n"
        "IMPORTANT: Output raw markdown only. Do NOT wrap your response "
        "in ```markdown code blocks."
    ),
    agent=writer,
    expected_output="Formatted markdown document",
    context=[research_task, analysis_task],  # Receives both outputs
    output_file="output/{account_name}-brief.md",
)
The context parameter chains tasks together: the Analyst receives the Researcher's output, and the Writer receives both. The {account_name} placeholder is filled at runtime via crew.kickoff().
Learn more about CrewAI tasks.
Context chaining flow:
research_task (no context)
↓
analysis_task (context: [research_task])
↓
writing_task (context: [research_task, analysis_task])
↓
output/{account_name}-brief.md
Step 5: Build the application
The main entry point assembles agents and tasks into a crew with a CLI interface:
import argparse
import sys

from crewai import Crew, Process

from agents import researcher, analyst, writer
from tasks import research_task, analysis_task, writing_task


def create_crew() -> Crew:
    """Create the account research crew."""
    crew = Crew(
        agents=[researcher, analyst, writer],
        tasks=[research_task, analysis_task, writing_task],
        process=Process.sequential,  # Run tasks in order
        verbose=True,
    )
    return crew


def main():
    """Parse CLI arguments and run the crew."""
    parser = argparse.ArgumentParser(
        description="Automated Account Research with CrewAI + CData Connect AI"
    )
    parser.add_argument(
        "--account",
        required=True,
        help="Account name to research (e.g. 'Acme Corp')",
    )
    args = parser.parse_args()

    print(f"\n{'='*60}")
    print(" Account Research Crew")
    print(f" Target: {args.account}")
    print(f"{'='*60}\n")

    crew = create_crew()
    try:
        result = crew.kickoff(inputs={"account_name": args.account})
    except Exception as exc:
        print(f"\n[ERROR] Crew execution failed: {exc}", file=sys.stderr)
        sys.exit(1)

    print(f"\n{'='*60}")
    print(f" Account brief generated: output/{args.account}-brief.md")
    print(f"{'='*60}\n")
    print(result)


if __name__ == "__main__":
    main()
Process.sequential ensures tasks run in order, and crew.kickoff(inputs={"account_name": ...}) fills the placeholder in all task descriptions.
Learn more about CrewAI processes.
Step 6: Run the application
With everything configured, run the crew:
python src/crew.py --account "Acme Corp"
Expected output:
============================================================
 Account Research Crew
 Target: Acme Corp
============================================================

[Researcher Agent]: Gathering account data...
  ✓ Account: Acme Corp
  ✓ ARR: $450K
  ✓ Industry: Manufacturing
  ✓ Open opportunities: $180K

[Analyst Agent]: Analyzing health signals...
  ✓ Support tickets: 12 open (3 high priority)
  ✓ Product usage: 94% license utilization
  ⚠️ Risk score: MEDIUM

[Writer Agent]: Creating account brief...
  ✓ Executive summary written
  ✓ Key metrics compiled
  ✓ Action items identified

============================================================
 Account brief generated: output/Acme Corp-brief.md
============================================================
Note: With verbose=True, the console output includes detailed agent reasoning and thought chains. The output above is a simplified summary of what each agent produces.
Step 7: Example
Run the crew against different accounts:
python src/crew.py --account "Acme Corp"
python src/crew.py --account "TechStart Inc"
python src/crew.py --account "Global Manufacturing Ltd"
Example generated brief
The output file follows this structure (see output/sample_briefs/acme-corp-brief.md for a full example):
# Acme Corp - Account Health Brief

**Generated**: 2025-02-02 | **Risk Level**: MEDIUM

## Executive Summary

Acme Corp is a $450K manufacturing customer showing strong product adoption but concerning support trends. Immediate attention required on escalated tickets before renewal discussion.

## Key Metrics

- ARR: $450,000
- Contract Renewal: 45 days
- Support Health: ⚠️ 3 P1 tickets
- Usage: ✅ 94% license utilization
- Pipeline: $180K expansion opportunity

## Risk Assessment

**Overall Risk**: MEDIUM

## Recommended Actions

1. **[HIGH]** Schedule technical review for P1 tickets
2. **[MEDIUM]** Executive business review within 2 weeks
3. **[LOW]** Qualify expansion opportunity
Step 8: Available MCP Tools
The CData Connect AI MCP server exposes these tools:
| Tool | Description |
|---|---|
| getCatalogs | List available data source connections |
| getSchemas | Get schemas for a specific catalog |
| getTables | Get tables in a schema |
| getColumns | Get column metadata for a table |
| queryData | Execute SQL queries |
| getProcedures | List stored procedures |
| getProcedureParameters | Get procedure parameter details |
| executeProcedure | Execute stored procedures |
The four CrewAI tool wrappers in this project map to specific agents:
| CrewAI Tool | MCP Tool Used | Assigned To | Data Domain |
|---|---|---|---|
| get_account_info | queryData | Researcher | Account profiles |
| get_opportunities | queryData | Researcher | Sales pipeline |
| get_support_tickets | queryData | Analyst | Support requests |
| get_usage_metrics | queryData | Analyst | Product usage |
Step 9: SQL Query format
When querying data, use fully qualified table names:
SELECT [column1], [column2]
FROM [CatalogName].[SchemaName].[TableName]
WHERE [column1] = 'value'
ORDER BY [column2]
For example, to query the account table:
SELECT [Name], [annual_revenue], [industry]
FROM [GoogleSheets1].[GoogleSheets].[demo_organization_account]
ORDER BY [annual_revenue] DESC
To query related tables using subqueries:
SELECT * FROM [GoogleSheets1].[GoogleSheets].[demo_organization_opportunity]
WHERE [AccountId] IN (
SELECT [Id1] FROM [GoogleSheets1].[GoogleSheets].[demo_organization_account]
WHERE [Name] = 'Acme Corp'
)
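When several tables are queried, a small helper (hypothetical, not part of the repository) keeps the bracket-quoted three-part names consistent:

```python
def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Build a fully qualified, bracket-quoted table name."""
    return f"[{catalog}].[{schema}].[{table}]"

# Reusing the catalog and schema names from the examples above.
table = qualified_name("GoogleSheets1", "GoogleSheets", "demo_organization_account")
print(f"SELECT [Name] FROM {table}")
```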
Tip: To discover catalog and schema names programmatically:
from config import Config
from mcp_client import MCPClient
config = Config.from_env()
client = MCPClient(config)
print(client.get_catalogs())
print(client.get_schemas("GoogleSheets1"))
print(client.get_tables("GoogleSheets1", "GoogleSheets"))
Troubleshooting
CrewAI installation errors
CrewAI requires Python 3.10 or later. Check the installed version:
python --version
For older versions, install Python 3.10+ and create a new virtual environment.
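A quick programmatic check (a small sketch) that the running interpreter meets CrewAI's minimum version:

```python
import sys

# CrewAI requires Python 3.10+; compare against the running interpreter.
REQUIRED = (3, 10)
ok = sys.version_info[:2] >= REQUIRED
print(f"Python {sys.version_info.major}.{sys.version_info.minor} - "
      f"{'OK' if ok else 'upgrade required'}")
```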
OpenAI API Errors
- Verify the OPENAI_API_KEY is valid and has available credits.
- CrewAI works best with GPT-4o. A different model can be set via the OPENAI_MODEL environment variable.
Authentication errors
- Verify the CData email and PAT are correct in .env.
- Ensure the PAT has not expired.
- Check that the Connect AI account is active.
No tools available
- Confirm at least one data source is connected in Connect AI.
- Check that the user has permissions to access the connection.
Tool calling failures
- Ensure the CData Connect AI instance has at least one active data source connected.
- Use fully qualified table names: [Catalog].[Schema].[Table].
- Check that the catalog and schema names in src/tools.py match the Connect AI configuration.
- Verify column names exist using client.get_columns().
- Review the verbose output for specific error messages from the Connect AI MCP server.
Output file not generated
- Ensure the output/ directory exists (included in the cloned repository); create it manually if the project was built from scratch.
- Check file permissions on the output/ directory.
- Review the agent's verbose output. If an upstream task fails, downstream tasks may not execute.
Next Steps
Now that the multi-agent crew is up and running:
- Add more agents: Create additional specialists (competitive intelligence, financial analyst) following the pattern in src/agents.py.
- Connect more data sources: Add Salesforce, HubSpot, Snowflake, or any of 350+ supported sources through the CData Connect AI dashboard.
- Switch process type: Change from Process.sequential to Process.hierarchical for manager-led coordination, or set async_execution=True on independent tasks to run them concurrently.
- Change output format: Modify the Writer task description to produce JSON instead of Markdown.
- Add memory and caching: Explore CrewAI's memory features for multi-session research.
- Explore advanced patterns: The CrewAI documentation covers delegation, hierarchical processes, and human-in-the-loop workflows.
Resources
- GitHub Repository - Complete source code
- CrewAI Documentation - Advanced crew patterns, memory, and agent collaboration
- CData Connect AI Documentation - Connect more data sources and configure governed access
- CData Prompt Library - Example prompts for various use cases
- OpenAI API Documentation - OpenAI models and API reference
- Model Context Protocol - MCP specification and documentation
- CData Community Forums - Community support and discussions
Get Started with CData Connect AI
Ready to build AI-powered data applications? CData Connect AI provides governed, secure access to 350+ enterprise data sources for AI applications. Multi-agent crews can query live business data from Salesforce, Snowflake, HubSpot, Google Sheets, databases, and more through a single MCP interface.
Sign up for a free trial and start building intelligent multi-agent crews today!