Getting Started with the CData Python Connector for Amazon Athena
This guide walks you through installing, licensing, and connecting the CData Python Connector to live Amazon Athena data. You will learn to:
- Install the connector and apply the license.
- Configure a connection to Amazon Athena.
- Explore steps to integrate live Amazon Athena data within your Python applications.
Let's begin.
Prerequisites
- A Python installation (v3.8 or higher) configured for your machine. Download and install here.
- The CData Python Connector for Amazon Athena with a valid license. Download and install here.
- An active Amazon Athena account with valid credentials.
About Amazon Athena Data Integration
CData provides the easiest way to access and integrate live data from Amazon Athena. Customers use CData connectivity to:
- Authenticate securely using a variety of methods, including IAM credentials, access keys, and Instance Profiles, catering to diverse security needs and simplifying the authentication process.
- Streamline their setup and quickly resolve issues with detailed error messaging.
- Enhance performance and minimize strain on client resources with server-side query execution.
Users frequently integrate Athena with analytics tools like Tableau, Power BI, and Excel to run in-depth analysis in the tools they already use.
To learn more about unique Amazon Athena use cases with CData, check out our blog post: https://www.cdata.com/blog/amazon-athena-use-cases.
Getting Started
Step 1: Installation and Licensing
1.1 Install the Connector
Python Dependencies Note: Make sure you have Python installed. The CData Python Connector supports Python versions 3.8, 3.9, 3.10, 3.11, and 3.12. If you are using a version outside this range, you may need to create a virtual environment with virtualenv.
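Before installing, it can help to confirm that the interpreter you plan to use falls in the supported range. The following is a minimal sketch that uses only the standard library; the function name is_supported is illustrative and not part of the connector.

```python
import sys

# Minor versions listed as supported in the note above.
SUPPORTED = {(3, 8), (3, 9), (3, 10), (3, 11), (3, 12)}


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) pair is in the supported list."""
    return (version_info[0], version_info[1]) in SUPPORTED


if __name__ == "__main__":
    current = "%d.%d" % (sys.version_info[0], sys.version_info[1])
    if is_supported():
        print("Python %s is in the supported range." % current)
    else:
        print("Python %s is outside the supported range; "
              "consider a virtual environment with a supported version." % current)
```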
Windows Installation
- Download and extract the Connector ZIP to your desired location.
- Open a terminal or command prompt and navigate to the directory where the .whl file is located (inside the /win/ directory). Example:
~\Downloads\AmazonAthenaPythonConnector\CData.Python.AmazonAthena\win\Python312\64
- Install the .whl file using pip. Ensure it matches your Python version and architecture. Example:
pip install cdata_amazonathena_connector-24.0.9111-cp312-cp312-win_amd64.whl
- Confirm the installation by running pip list. If cdata-amazonathena-connector appears, the installation is successful.
Linux/Mac Installation
- Download and extract the Connector ZIP to your desired location.
- Open a terminal and navigate to the extracted installation directory to locate the .tar.gz file inside the /unix/ or /mac/ folder. Example:
~/Downloads/AmazonAthenaPythonConnector/CData.Python.AmazonAthena/unix/
or
CData.Python.AmazonAthena/mac/
- Install the .tar.gz file using pip. Example:
pip install cdata_amazonathena_connector-24.0.####-python3.tar.gz
- Confirm the installation by running pip list. If cdata-amazonathena-connector appears, the installation is successful.
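The pip list check can also be done from inside Python, which is handy when several interpreters are installed. This sketch uses the standard importlib.metadata module and the distribution name shown in the steps above; the helper name installed_version is illustrative.

```python
from importlib import metadata


def installed_version(dist_name="cdata-amazonathena-connector"):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("Connector not found in this Python environment.")
    else:
        print("Connector version %s is installed." % version)
```

Run it with the same interpreter that will run your application, since each environment keeps its own set of packages.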
1.2 Activate the License
After your purchase, you should have received your license key via email from the CData Orders Team. The license key is a 25-character code that looks like this: XXXXX-XXXXX-XXXXX-XXXXX-XXXXX
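A quick format check can catch copy-and-paste mistakes before you run the license installer. The sketch below only tests the shape shown above (five groups of five characters separated by hyphens). It assumes the characters are letters and digits, which this guide does not state, and it says nothing about whether a key is actually valid.

```python
import re

# Five hyphen-separated groups of five characters.
# Assumption: groups contain only letters and digits.
KEY_SHAPE = re.compile(r"^[A-Za-z0-9]{5}(-[A-Za-z0-9]{5}){4}$")


def looks_like_license_key(text):
    """Return True if text has the XXXXX-XXXXX-XXXXX-XXXXX-XXXXX shape."""
    return KEY_SHAPE.match(text.strip()) is not None
```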
Windows License Activation
- Open a terminal or command prompt and navigate to the cdata folder inside the site-packages directory of your Python installation. Example path:
C:\Users\Username\AppData\Local\Programs\Python\Python312\Lib\site-packages\cdata\installlic_amazonathena
- Run the license-installer.exe file with your license key:
.\license-installer.exe [YOUR LICENSE KEY HERE]
- When prompted, enter your registered name and email to complete the activation.
Linux/MacOS License Activation
- Open a terminal and navigate to the cdata folder inside your Python site-packages directory. This directory is typically located under:
~/Library/Python/3.12/lib/python/site-packages/cdata/installlic_amazonathena
or
/usr/local/lib/python3.12/site-packages/cdata/installlic_amazonathena
- Run the Linux/Mac license installer script:
./install-license.sh [YOUR LICENSE KEY HERE]
- Enter your registered name and email when prompted to complete the activation.
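If you are not sure where site-packages lives for your interpreter, Python can report it. The sketch below lists candidate locations for the installlic_amazonathena folder named in the example paths above. It relies only on the standard site and sysconfig modules; the function name is illustrative, and the folder exists only where the connector is actually installed.

```python
import site
import sysconfig
from pathlib import Path


def candidate_license_dirs(subdir="installlic_amazonathena"):
    """Return possible <site-packages>/cdata/<subdir> paths for this interpreter."""
    roots = []
    purelib = sysconfig.get_paths().get("purelib")
    if purelib:
        roots.append(purelib)
    # These helpers can be missing in some older virtual environments.
    get_site = getattr(site, "getsitepackages", None)
    if get_site is not None:
        roots.extend(get_site())
    get_user = getattr(site, "getusersitepackages", None)
    if get_user is not None:
        roots.append(get_user())

    result = []
    for root in roots:
        path = Path(root) / "cdata" / subdir
        if path not in result:
            result.append(path)
    return result


if __name__ == "__main__":
    for path in candidate_license_dirs():
        status = "found" if path.is_dir() else "not present"
        print("%s (%s)" % (path, status))
```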
Common Licensing Questions
Can I use my license on multiple machines?
Yes, depending on your subscription tier. Check your order confirmation email or contact your account representative for details.
If you are unsure who your account representative is, contact [email protected].
I lost my license key. How do I retrieve it?
Email [email protected] with your order number, and we will resend your license key.
Can I transfer my license to a different machine?
Yes. You will need to submit a License Transfer Request using our license transfer request page linked below:
https://www.cdata.com/lic/transfer/
After your License Transfer Request is submitted and processed, an additional activation will be added to your Product Key.
You will then be able to activate the full license on the new machine.
Once this process is complete, the license on the previous machine will become invalid.
For additional licensing questions, contact [email protected]. You can view and manage your license through our self-service portal at portal.cdata.com.
Step 2: Connection Configuration
After the installation and license activation are complete, you can establish a connection using the CData Python Connector.
2.1 Establish a Connection
The CData Python Connector for Amazon Athena is exposed as a Python module. Import it with the standard import statement and build your application code around it.
The Connector also includes built-in metadata tools such as sys_tables and sys_tablecolumns, which let you discover the available tables, columns, and other structural metadata for Amazon Athena data.
The following example establishes a connection to Amazon Athena using your authentication properties and retrieves column names from a specific table.
Replace or modify the connection string values below with your actual credentials, and update your table name in '[TABLE NAME]' as needed.
If your AWS account requires MFA or has additional security requirements, you may need to include properties such as MFASerialNumber and MFAToken in your connection string. Refer to the Connection String Options section in the Connector Help documentation (also available inside the help directory of the Connector) for a complete list of supported properties.
import cdata.amazonathena as mod

# Establish the connection using your configured properties
conn = mod.connect(
    "AWSAccessKey='a123';"
    "AWSSecretKey='s123';"
    "AWSRegion='IRELAND';"
    "Database='sampledb';"
    "S3StagingDirectory='s3://bucket/staging/';"
)

# Query column names for the specified table
cur = conn.cursor()
cur.execute("SELECT ColumnName FROM sys_tablecolumns WHERE TableName = '[TABLE NAME]'")
print("Columns in your table:")
for row in cur.fetchall():
    print(row[0])

# Release the cursor and the connection
cur.close()
conn.close()
This code connects to Amazon Athena, queries the metadata catalog, and prints all column names for the table you specify. Check out the complete Connector documentation to learn how to modify the SQL query to explore additional schemas, tables, or other supported metadata views.
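In practice you may prefer not to hardcode credentials in the script. The sketch below assembles the same Key='Value'; string from environment variables. The environment variable names (ATHENA_ACCESS_KEY and so on) and both helper functions are hypothetical names chosen for this example; only the property names come from the connection example above.

```python
import os


def build_connection_string(props):
    """Join properties as Key='Value'; pairs, skipping empty values."""
    parts = []
    for key, value in props.items():
        if value is None or value == "":
            continue
        if "'" in str(value):
            # Quote escaping rules are not covered here, so refuse instead of guessing.
            raise ValueError("value for %s contains a single quote" % key)
        parts.append("%s='%s';" % (key, value))
    return "".join(parts)


def athena_props_from_env(env=os.environ):
    """Read connection properties from hypothetical environment variables."""
    return {
        "AWSAccessKey": env.get("ATHENA_ACCESS_KEY"),
        "AWSSecretKey": env.get("ATHENA_SECRET_KEY"),
        "AWSRegion": env.get("ATHENA_REGION"),
        "Database": env.get("ATHENA_DATABASE"),
        "S3StagingDirectory": env.get("ATHENA_S3_STAGING"),
    }


if __name__ == "__main__":
    conn_str = build_connection_string(athena_props_from_env())
    print("Properties set: %d" % conn_str.count(";"))
    # With the connector installed, the string would be passed on like this:
    # import cdata.amazonathena as mod
    # conn = mod.connect(conn_str)
```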
2.2 Available Connection Configuration
Authenticating to Amazon Athena
To authorize Amazon Athena requests, provide the credentials for an administrator account or for an IAM user with custom permissions. Set AWSAccessKey to the access key ID and AWSSecretKey to the secret access key.
Note: Though you can connect as the AWS account administrator, it is recommended to use IAM user credentials to access AWS services.
Obtaining the Access Key
To obtain the credentials for an IAM user, follow the steps below:
- Sign in to the IAM console.
- In the navigation pane, select Users.
- To create or manage the access keys for a user, select the user and then select the Security Credentials tab.
To obtain the credentials for your AWS root account, follow the steps below:
- Sign in to the AWS Management Console with the credentials for your root account.
- Select your account name or number and select My Security Credentials in the menu that is displayed.
- Click Continue to Security Credentials and expand the Access Keys section to manage or create root account access keys.
Authenticating from an EC2 Instance
If you are using the connector from an EC2 instance and have an IAM role assigned to the instance, you can use the IAM role to authenticate. To do so, set UseEC2Roles to true and leave AWSAccessKey and AWSSecretKey empty. The connector automatically obtains your IAM role credentials and authenticates with them.
Authenticating as an AWS Role
In many situations it may be preferable to use an IAM role for authentication instead of the direct security credentials of an AWS root user. To use a role, specify the RoleARN; the connector then attempts to retrieve credentials for that role. If you are connecting to AWS from outside (rather than from an environment that is already authenticated, such as an EC2 instance), you must also specify the AWSAccessKey and AWSSecretKey of an IAM user that will assume the role. Roles cannot be used with the AWSAccessKey and AWSSecretKey of an AWS root user.
Authenticating with MFA
For users and roles that require multi-factor authentication, specify the MFASerialNumber and MFAToken connection properties. The connector then submits the MFA credentials in a request to retrieve temporary authentication credentials. The lifetime of the temporary credentials is controlled by the TemporaryTokenDuration property (default 3600 seconds).
Connecting to Amazon Athena
In addition to the AWSAccessKey and AWSSecretKey properties, specify Database, S3StagingDirectory, and AWSRegion. Set AWSRegion to the region where your Amazon Athena data is hosted. Set S3StagingDirectory to the S3 folder where query results should be stored.
If Database is not set in the connection, the connector connects to the default database set in Amazon Athena.
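The authentication options in this section can be summarized as sets of connection properties. The sketch below is one hedged way to organize them. The mode labels ("keys", "ec2", "role", "mfa") and the function name are hypothetical. The key property names follow the example in section 2.1 (AWSAccessKey, AWSSecretKey) and the remaining names are the ones used in this section. The "role" branch assumes you are connecting from outside AWS, so keys are required. Confirm exact property names against the Connector help before relying on this.

```python
def auth_properties(mode, access_key=None, secret_key=None, role_arn=None,
                    mfa_serial=None, mfa_token=None, token_duration=None):
    """Return a dict of connection properties for one authentication mode."""
    if mode == "ec2":
        # IAM role attached to the EC2 instance: keys are left empty.
        return {"UseEC2Roles": "true"}

    if mode not in ("keys", "role", "mfa"):
        raise ValueError("unknown mode: %r" % mode)
    if not (access_key and secret_key):
        raise ValueError("access_key and secret_key are required for mode %r" % mode)

    props = {"AWSAccessKey": access_key, "AWSSecretKey": secret_key}
    if mode == "role":
        if not role_arn:
            raise ValueError("role_arn is required for mode 'role'")
        props["RoleARN"] = role_arn
    elif mode == "mfa":
        if not (mfa_serial and mfa_token):
            raise ValueError("mfa_serial and mfa_token are required for mode 'mfa'")
        props["MFASerialNumber"] = mfa_serial
        props["MFAToken"] = mfa_token
        if token_duration is not None:
            # Optional; the default described above is 3600 seconds.
            props["TemporaryTokenDuration"] = str(token_duration)
    return props
```

The returned dict still needs Database, S3StagingDirectory, and the region added before it is joined into a connection string.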
2.3 Common Connection Issues
Authentication Failed
Solution: Verify that your access key, secret key, and any additional authentication properties required by Amazon Athena are correct. If your account enforces MFA or role-based access, ensure the corresponding properties are included in the connection string. Refer to the complete Connector documentation for the full list of supported authentication properties, or contact [email protected] for assistance validating authentication settings.
Cannot Reach Server
Solution: Confirm that the region in your connection string is correct and that outbound HTTPS traffic is allowed from your environment. If you are behind a firewall or proxy, ensure that Python is permitted to reach the AWS service endpoints. For network configuration details or port requirements, contact [email protected].
Table Not Found
Solution: Verify the Database, Schema, and table name in your SQL query. Use metadata views such as sys_tables and sys_tablecolumns to confirm the exact table and column names exposed by Amazon Athena data. If the table name is case-sensitive, ensure you are using the correct casing in your query.
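A casing mismatch is easy to check programmatically. The sketch below takes any open DB-API connection, reads sys_tables, and returns the names that match ignoring case. It assumes sys_tables exposes a TableName column, as sys_tablecolumns does in the example in section 2.1. The function name is illustrative.

```python
def find_tables(conn, name):
    """Return table names from sys_tables that equal `name` ignoring case."""
    cur = conn.cursor()
    try:
        # Assumption: sys_tables has a TableName column.
        cur.execute("SELECT TableName FROM sys_tables")
        names = [row[0] for row in cur.fetchall()]
    finally:
        cur.close()
    wanted = name.lower()
    return [n for n in names if n.lower() == wanted]
```

If the function returns a name whose casing differs from the one in your query, use the returned spelling. An empty list suggests the table is not visible in the selected database.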
Module Not Found or Import Errors
Solution: Ensure the Python Connector is installed in the correct environment. Run pip list to verify that the connector (cdata-amazonathena-connector) is present. If you are using virtual environments, activate the correct environment before executing your script.
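When the import fails, it usually helps to see which interpreter is running and whether it can locate the module at all. This sketch uses only the standard library; cdata.amazonathena is the module name imported in section 2.1, and the helper name is illustrative.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the named module can be located by this interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or has no usable spec.
        return False


if __name__ == "__main__":
    print("Interpreter: %s" % sys.executable)
    print("cdata.amazonathena importable: %s" % module_available("cdata.amazonathena"))
```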
Connection String Errors
Solution: Incorrect property formatting or missing semicolons can prevent the connector from parsing your connection settings. Review your connection string to ensure each property follows the correct Key=Value; format. Refer to the Python Connector documentation for property names supported by Amazon Athena.
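A rough pre-flight check can catch the most common formatting slips. The sketch below is not the connector's parser. It splits on semicolons, so values that legitimately contain a semicolon or an escaped quote are reported incorrectly. The function name is illustrative.

```python
def check_connection_string(conn_str):
    """Return a list of likely formatting problems in a Key=Value; string."""
    problems = []
    for segment in conn_str.split(";"):
        segment = segment.strip()
        if not segment:
            continue
        key, sep, value = segment.partition("=")
        value = value.strip()
        if not sep:
            problems.append("missing '=' in %r" % segment)
        elif not key.strip():
            problems.append("missing property name in %r" % segment)
        elif value.count("'") % 2:
            problems.append("unbalanced quote in %r" % segment)
        elif value.startswith("'") and value.find("'", 1) != len(value) - 1:
            # Text after the closing quote often means a semicolon was dropped.
            problems.append("text after closing quote in %r (missing semicolon?)" % segment)
    return problems
```

For example, passing "AWSRegion='IRELAND'Database='sampledb';" yields a single problem that points at the missing semicolon between the two properties.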
For additional connection troubleshooting, contact [email protected] with your full error message (masking sensitive credentials before sending).
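Masking can be automated before you paste a connection string or error message into an email. The sketch below replaces the values of a few sensitive properties with ***. The list of property names is an assumption based on the names used in this guide and should be extended to match your own connection string.

```python
import re

# Property names treated as sensitive in this sketch (extend as needed).
SENSITIVE_KEYS = ("AWSAccessKey", "AWSSecretKey", "AccessKey", "SecretKey", "MFAToken")


def mask_secrets(text, keys=SENSITIVE_KEYS):
    """Replace the values of sensitive Key=Value pairs with ***."""
    names = "|".join(re.escape(k) for k in keys)
    # Match either a quoted value or an unquoted run up to the next ';' or space.
    pattern = re.compile(r"\b(%s)\s*=\s*('[^']*'|[^;\s]*)" % names, re.IGNORECASE)
    return pattern.sub(lambda m: m.group(1) + "=***", text)
```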
Step 3: Explore Next Steps
With the connector installed and your connection configured, you can now begin working with live Amazon Athena data in Python. Explore the resources below to extend your integration and build complex workflows.
| Python Client | Article Title |
|---|---|
| Python MCP Server | Connect Amazon Athena to AI Assistants With the CData Python MCP Server |
| pandas | Use pandas to Visualize Amazon Athena in Python |
| Dash | Use Dash & Python to Build Web Apps on Amazon Athena |
| SQLAlchemy | Use SQLAlchemy ORMs to Access Amazon Athena in Python |
| petl | Extract, Transform, and Load Amazon Athena in Python |
Get Support
If you need assistance at any point:
- Technical Support: [email protected]
- Community Forum: CData Community Site
- Help Documentation: Installed locally and available online
FAQs
Installation & Licensing
- Do I need administrator rights to install the connector?
Administrator rights are not required for installing the Python Connector, but they may be needed when applying the license or installing into system-wide Python environments.
- Can I install the connector in multiple Python environments?
Yes. Install the connector once per environment (venv, Conda, system Python). Each environment maintains its own packages and will use the machine license once activated.
Connecting
- How do I provide authentication details?
Pass authentication properties in the connection string when calling mod.connect(). Refer to the Connector help documentation for the full list of supported properties.
- How do I discover tables and columns?
Use metadata views such as sys_tables and sys_tablecolumns. Example: SELECT * FROM sys_tables;
- Can I connect through a proxy server?
Yes. Include proxy properties in your connection string, such as ProxyServer, ProxyPort, and ProxyUser. See the Firewall & Proxy section in the documentation.
Performance & Troubleshooting
- Why are my queries slow?
Check the following:
  - Add filters (WHERE clauses) to reduce result size.
  - Use caching for data that does not change frequently.
  - Ensure your Database and schema selections are correct.
  - Contact [email protected] for optimization assistance.
- How do I enable logging?
Add logging properties directly to your connection string, for example: Logfile=/path/log.txt;Verbosity=5;
Verbosity controls the detail level. Send this log file to [email protected] when requesting help.
- What ports must be open?
Most cloud services require outbound HTTPS on port 443. For source-specific requirements, contact [email protected].
- Why do I see "Module not found" when importing?
Ensure the connector is installed in the same Python environment where the script is executed. Use pip list or pip show to confirm installation.
General
- Where can I find supported SQL syntax?
See the SQL Compliance section of the Connector documentation.
- How often is the connector updated?
CData releases major updates annually and provides periodic patches. Check your account portal or contact Support for the latest version.
- Where can I find more code examples?
The online documentation includes examples for connecting, querying, filtering, paging, and working with metadata views.
If your question is not covered in this FAQ, contact [email protected].