How to Connect to Live Amazon Athena Data from Sourcegraph Amp (via CData Connect AI)
Sourcegraph Amp is a modern AI agent environment designed for building intelligent, production-ready assistants capable of stateful reasoning, automatic context management, and native MCP (Model Context Protocol) integration. When combined with CData Connect AI, you can leverage Amp to create agents that interact with your Amazon Athena data in real time using natural language or SQL-based queries.
CData Connect AI provides a secure, cloud-to-cloud interface for accessing Amazon Athena data. Through the Connect AI Remote MCP Server, Amp connects directly to Amazon Athena, enabling live data queries and operations without replication. With optimized pushdown capabilities, CData Connect AI executes SQL operations including filters, aggregations, and joins directly in Amazon Athena for fast, real-time performance.
In this article, we demonstrate how to configure the Amp agent to conversationally explore your Amazon Athena data using natural language or SQL. With Connect AI, you can easily build agents that have secure, live access to Amazon Athena along with hundreds of other enterprise data sources.
Prerequisites
- An active CData Connect AI account
- The Sourcegraph Amp VS Code extension or Amp CLI installed
- Node.js v20 or higher installed
- Access to Amazon Athena
About Amazon Athena Data Integration
CData provides the easiest way to access and integrate live data from Amazon Athena. Customers use CData connectivity to:
- Authenticate securely using a variety of methods, including IAM credentials, access keys, and Instance Profiles, catering to diverse security needs and simplifying the authentication process.
- Streamline their setup and quickly resolve issues with detailed error messaging.
- Enhance performance and minimize strain on client resources with server-side query execution.
Users frequently integrate Athena with analytics tools like Tableau, Power BI, and Excel to run in-depth analysis in the tools they already use.
To learn more about unique Amazon Athena use cases with CData, check out our blog post: https://www.cdata.com/blog/amazon-athena-use-cases.
Getting Started
Step 1: Configure Amazon Athena Connectivity for Sourcegraph Amp
Connectivity to Amazon Athena from Amp is made possible through CData Connect AI Remote MCP. To interact with Amazon Athena data from Amp, we start by creating and configuring an Amazon Athena connection in CData Connect AI.
- Log into Connect AI, click Sources, and then click Add Connection
- Select "Amazon Athena" from the Add Connection panel
- Enter the necessary authentication properties to connect to Amazon Athena.
Authenticating to Amazon Athena
To authorize Amazon Athena requests, provide the credentials for an administrator account or for an IAM user with custom permissions. Set AccessKey to the access key ID and SecretKey to the secret access key.
Note: Though you can connect as the AWS account administrator, it is recommended to use IAM user credentials to access AWS services.
Obtaining the Access Key
To obtain the credentials for an IAM user, follow the steps below:
- Sign into the IAM console.
- In the navigation pane, select Users.
- To create or manage the access keys for a user, select the user and then select the Security Credentials tab.
To obtain the credentials for your AWS root account, follow the steps below:
- Sign into the AWS Management console with the credentials for your root account.
- Select your account name or number and select My Security Credentials in the menu that is displayed.
- Click Continue to Security Credentials and expand the Access Keys section to manage or create root account access keys.
Authenticating from an EC2 Instance
If you are using the CData Data Provider for Amazon Athena 2018 from an EC2 Instance and have an IAM Role assigned to the instance, you can use the IAM Role to authenticate. To do so, set UseEC2Roles to true and leave AccessKey and SecretKey empty. The CData Data Provider for Amazon Athena 2018 will automatically obtain your IAM Role credentials and authenticate with them.
Authenticating as an AWS Role
In many situations it may be preferable to use an IAM role for authentication instead of the direct security credentials of an AWS root user. To use an AWS role, specify the RoleARN. This will cause the CData Data Provider for Amazon Athena 2018 to attempt to retrieve credentials for the specified role. If you are connecting to AWS from outside (rather than from an environment that is already connected, such as an EC2 instance), you must additionally specify the AccessKey and SecretKey of the IAM user that will assume the role. Roles may not be used when specifying the AccessKey and SecretKey of an AWS root user.
Authenticating with MFA
For users and roles that require multi-factor authentication, specify the MFASerialNumber and MFAToken connection properties. This will cause the CData Data Provider for Amazon Athena 2018 to submit the MFA credentials in a request to retrieve temporary authentication credentials. Note that the duration of the temporary credentials can be controlled via the TemporaryTokenDuration property (default 3600 seconds).
Connecting to Amazon Athena
In addition to the AccessKey and SecretKey properties, specify Database, S3StagingDirectory and Region. Set Region to the region where your Amazon Athena data is hosted. Set S3StagingDirectory to a folder in S3 where you would like to store the results of queries.
If Database is not set in the connection, the data provider connects to the default database set in Amazon Athena.
- Click Save & Test
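The authentication options described in this step can be summarized in a short Python sketch. The property names (AccessKey, SecretKey, UseEC2Roles, RoleARN, MFASerialNumber, MFAToken, S3StagingDirectory, Region) come from the text above; the checking logic, the function name, and the idea of treating S3StagingDirectory and Region as mandatory are assumptions made for illustration, not the product's actual validation.

```python
# Illustrative sketch only: the property names come from the article, but the
# rules below are a simplified reading of it, not CData's real validation.
REQUIRED_ALWAYS = ("S3StagingDirectory", "Region")  # assumed mandatory


def describe_auth(props: dict) -> str:
    """Return which authentication path a set of connection properties selects."""
    missing = [name for name in REQUIRED_ALWAYS if not props.get(name)]
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))

    has_keys = bool(props.get("AccessKey")) and bool(props.get("SecretKey"))

    if props.get("UseEC2Roles"):
        # The article says to leave AccessKey and SecretKey empty in this mode.
        if props.get("AccessKey") or props.get("SecretKey"):
            raise ValueError("leave AccessKey and SecretKey empty with UseEC2Roles")
        mode = "EC2 instance role"
    elif props.get("RoleARN"):
        mode = "assumed role"
    elif has_keys:
        mode = "access key"
    else:
        raise ValueError("no usable authentication properties supplied")

    # MFA is layered on top of the selected mode when both values are present.
    if props.get("MFASerialNumber") and props.get("MFAToken"):
        mode += " with MFA"
    return mode


if __name__ == "__main__":
    example = {
        "AccessKey": "AKIA-EXAMPLE",            # placeholder value
        "SecretKey": "example-secret",          # placeholder value
        "Region": "us-east-1",                  # placeholder value
        "S3StagingDirectory": "s3://example-bucket/results/",  # placeholder
    }
    print(describe_auth(example))
```

Database is deliberately left out of the required list because, as noted above, the default database is used when it is not set.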
Step 2: Set Up Amp for CData Connect AI
Copy the MCP Endpoint
Amp communicates with Connect AI through the hosted MCP endpoint:
https://mcp.cloud.cdata.com/mcp
This endpoint provides secure, cloud-to-cloud communication between Amp and your Connect AI workspace.
Generate Base64 Credentials
To authenticate Amp with Connect AI, generate your Base64-encoded credentials. For example, in PowerShell:
[Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes("[email protected]:yourPAT"))
Replace [email protected] with your Connect AI email and yourPAT with your Personal Access Token.
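If PowerShell is not available, the same encoding can be produced with Python's standard library. This is a minimal sketch: the function name is hypothetical, and user@example.com and yourPAT are placeholder values, not real credentials.

```python
import base64


def basic_credentials(email: str, pat: str) -> str:
    """Base64-encode "email:PAT" for use in an HTTP Basic Authorization header."""
    raw = f"{email}:{pat}".encode("ascii")  # ASCII, mirroring the PowerShell call
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    # Placeholder values; substitute your Connect AI email and Personal Access Token.
    print(basic_credentials("user@example.com", "yourPAT"))
```

The result is an encoding, not encryption, so treat the output string with the same care as the Personal Access Token itself.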
Register the MCP Server in Amp
Once you have your Base64 string, register the CData Connect AI MCP server with Amp using the following command, placing the Base64 string after "Basic " in the Authorization header:
amp mcp add cdata-connect-ai -- npx -y mcp-remote@latest https://mcp.cloud.cdata.com/mcp --header "Authorization: Basic "
This adds your Connect AI configuration to Amp's settings file, enabling communication with CData Connect AI.
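To avoid quoting mistakes when assembling the registration command, the argument list can be built programmatically. The sketch below only reproduces the command shown above and fills in the Authorization header; the helper name is hypothetical, and the sample credential string is a dummy value.

```python
import shlex
from typing import List

MCP_ENDPOINT = "https://mcp.cloud.cdata.com/mcp"  # endpoint given in the article


def amp_mcp_add_command(encoded_credentials: str,
                        name: str = "cdata-connect-ai") -> List[str]:
    """Build the argument list for the `amp mcp add` command shown above."""
    if not encoded_credentials:
        raise ValueError("encoded_credentials must not be empty")
    return [
        "amp", "mcp", "add", name, "--",
        "npx", "-y", "mcp-remote@latest", MCP_ENDPOINT,
        "--header", f"Authorization: Basic {encoded_credentials}",
    ]


if __name__ == "__main__":
    # Dummy Base64 value for illustration; use your own encoded credentials.
    print(shlex.join(amp_mcp_add_command("dXNlcjpwYXNz")))
```

Passing the list to subprocess.run keeps the header as a single argument, which sidesteps shell quoting differences between platforms.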
Verify Your Connection and Explore Data
- Create a New Thread: Start a new Amp session to begin interacting with your data:
amp thread new
- Enter the Interactive Chat: Connect to the new thread using:
amp
- Verify MCP Servers: Inside the Amp shell, check your registered MCP servers:
list mcp
- Confirm Your Data Source: Confirm that your connected Amazon Athena data appears as a catalog by running:
getCatalogs
Step 3: Build Intelligent Agents with Live Amazon Athena Data Access
With your Amp application configured and connected to CData Connect AI, you can now build sophisticated agents that interact with your Amazon Athena data using natural language. The MCP integration provides your agents with powerful data access capabilities.
Available MCP Tools for your Agent
Your Amp application has access to the following CData Connect AI MCP tools:
- getCatalogs: Lists all data source catalogs (e.g., AmazonAthena1)
- getSchemas: Returns database schemas within the connected catalog
- getTables: Lists all tables and views available under a given schema
- getColumns: Returns column definitions for a specific table or view
- queryData: Executes SQL queries (SELECT, INSERT, UPDATE, DELETE)
- getProcedures: Lists stored procedures or API endpoints
- getProcedureParameters: Returns metadata for stored procedure parameters
- executeProcedure: Invokes stored procedures (e.g., Amazon Athena actions)
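Under MCP, a client invokes a tool by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below shows roughly what such a request could look like for the tools listed above. The tool names come from the list; the "query" argument name for queryData and the table name in the sample SQL are assumptions for illustration, so check the tool schema that the server reports before relying on them.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


if __name__ == "__main__":
    # getCatalogs is called here without arguments.
    print(tool_call_request(1, "getCatalogs", {}))
    # "query" is an assumed parameter name; the table is hypothetical.
    print(tool_call_request(
        2, "queryData",
        {"query": "SELECT * FROM [AmazonAthena1].[default].[orders] LIMIT 10"},
    ))
```

In practice Amp builds and sends these requests for you; the sketch is only meant to show what the agent is doing when it uses a tool.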
Key Features of Amp
Amp provides several production-ready capabilities that make it ideal for building intelligent, data-aware AI agents:
- Automatic Context Management: Amp maintains and recalls conversational context automatically, enabling seamless multi-turn interactions without manual state tracking.
- Stateful Conversations: Preserve context and memory across multiple queries to create natural, human-like conversations.
- Native MCP Integration: Amp natively supports the Model Context Protocol (MCP), allowing secure, real-time access to live data from CData Connect AI and other MCP-compatible servers.
- Tool-Oriented Architecture: Tools are treated as first-class components with managed invocation, input validation, and error handling.
- Efficient Context Handling: Amp optimizes prompts dynamically, ensuring relevant information is preserved even when approaching model token limits.
- Cross-Source Querying: Combine and query multiple connected data sources within a single conversational workflow.
- Fine-Grained Permission Controls: Define and enforce tool access levels to maintain data governance and secure integrations.
- Developer-Friendly CLI and SDK: Manage MCP connections, configure agents, and test workflows easily from the Amp CLI or VS Code extension.
Example Use Cases
Here are some examples of what your Amp agents can do with live data access through CData Connect AI:
- Data Analysis Agent: Identify trends and anomalies in Amazon Athena data.
- Report Generation Agent: Generate reports from natural language prompts.
- Interactive Chatbot: Explain insights conversationally using live data.
- Data Quality Agent: Monitor and flag real-time data inconsistencies.
- Automated Workflow Agent: Trigger alerts based on defined data conditions.
Testing Your Agent
Once your agent is running, you can interact with it through natural language queries. For example:
- "Show me all new leads from the past 30 days."
- "What are the top-performing campaigns this quarter?"
- "Analyze revenue growth and highlight anomalies."
- "Generate a summary report of current opportunities."
- "Find all records where status is pending approval."
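For a prompt such as "Find all records where status is pending approval," the agent would typically end up issuing a SQL statement through queryData. The sketch below shows one plausible shape for that statement. The catalog name AmazonAthena1 appears earlier in this article; the schema, table, and column names are hypothetical, and the bracket-quoted three-part naming is an assumption about the SQL dialect rather than a documented guarantee.

```python
def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Join catalog, schema, and table into a bracket-quoted identifier."""
    parts = (catalog, schema, table)
    for part in parts:
        # Keep the sketch simple: refuse names that would break the quoting.
        if not part or "[" in part or "]" in part:
            raise ValueError(f"unsupported identifier: {part!r}")
    return ".".join(f"[{part}]" for part in parts)


def pending_records_sql(catalog: str, schema: str, table: str) -> str:
    """Build a SELECT for rows whose (hypothetical) status column is pending."""
    return (
        f"SELECT * FROM {qualified_name(catalog, schema, table)} "
        "WHERE [status] = 'pending approval'"
    )


if __name__ == "__main__":
    # "default" and "orders" are placeholder schema and table names.
    print(pending_records_sql("AmazonAthena1", "default", "orders"))
```

The actual statement depends on the tables and columns in your Athena database, which the agent discovers through getSchemas, getTables, and getColumns.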
Get CData Connect AI
To get live data access to 300+ SaaS, Big Data, and NoSQL sources directly from your Amp agent environment, try CData Connect AI today!