Extract, Transform, and Load Adobe Analytics Data in Python

The CData Python Connector for Adobe Analytics enables you to create ETL applications and pipelines for Adobe Analytics data in Python with petl.

The rich ecosystem of Python modules lets you get to work quickly and integrate your systems more effectively. With the CData Python Connector for Adobe Analytics and the petl framework, you can build Adobe Analytics-connected applications and pipelines for extracting, transforming, and loading Adobe Analytics data. This article shows how to connect to Adobe Analytics with the CData Python Connector and use petl and pandas to extract, transform, and load Adobe Analytics data.

The CData Python Connector includes built-in, optimized data processing for working with live Adobe Analytics data in Python. When you issue complex SQL queries against Adobe Analytics, the driver pushes supported SQL operations, like filters and aggregations, directly to Adobe Analytics and uses the embedded SQL engine to process unsupported operations (often SQL functions and JOIN operations) client-side.

Connecting to Adobe Analytics Data

Connecting to Adobe Analytics data looks just like connecting to any relational data source. Create a connection string using the required connection properties. For this article, you will pass the connection string as a parameter to the connector's connect function.

Adobe Analytics uses the OAuth authentication standard. To authenticate using OAuth, you will need to create an app to obtain the OAuthClientId, OAuthClientSecret, and CallbackURL connection properties. See the "Getting Started" section of the help documentation for a guide.

Retrieving GlobalCompanyId

GlobalCompanyId is a required connection property. If you do not know your Global Company ID, you can find it in the request URL for the users/me endpoint on the Swagger UI. After logging in at the Swagger UI URL, expand the users endpoint and click the GET users/me button. Click the Try it out and Execute buttons, then note the Global Company ID shown in the Request URL immediately preceding the users/me endpoint.

Retrieving Report Suite ID

Report Suite ID (RSID) is also a required connection property. In the Adobe Analytics UI, navigate to Admin -> Report Suites to see a list of your report suites, each shown with its identifier next to its name.

After setting the GlobalCompanyId, RSID and OAuth connection properties, you are ready to connect to Adobe Analytics.

After installing the CData Adobe Analytics Connector, follow the procedure below to install the other required modules and start accessing Adobe Analytics through Python objects.

Install Required Modules

Use the pip utility to install the required modules and frameworks:

pip install petl
pip install pandas

Build an ETL App for Adobe Analytics Data in Python

Once the required modules and frameworks are installed, we are ready to build our ETL app. Code snippets follow; the full source code is available at the end of the article.

First, be sure to import the modules (including the CData Connector) with the following:

import petl as etl
import pandas as pd
import cdata.adobeanalytics as mod

You can now connect with a connection string. Use the connect function for the CData Adobe Analytics Connector to create a connection for working with Adobe Analytics data.

cnxn = mod.connect("GlobalCompanyId=myGlobalCompanyId; RSID=myRSID; OAuthClientId=myOAuthClientId; OAuthClientSecret=myOAuthClientSecret; CallbackURL=myCallbackURL;")
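
The connection string above hard-codes its values, which is convenient for a demo. As a minimal sketch of one alternative, the helper below assembles the same string from environment variables. The function name build_connection_string and the ADOBE_* variable names are hypothetical and not part of the connector; only the property names come from this article.

```python
import os

# Connection properties named in this article, mapped to HYPOTHETICAL
# environment variable names (choose whatever names suit your deployment).
PROPERTY_ENV_VARS = {
    "GlobalCompanyId": "ADOBE_GLOBAL_COMPANY_ID",
    "RSID": "ADOBE_RSID",
    "OAuthClientId": "ADOBE_OAUTH_CLIENT_ID",
    "OAuthClientSecret": "ADOBE_OAUTH_CLIENT_SECRET",
    "CallbackURL": "ADOBE_CALLBACK_URL",
}


def build_connection_string(env=None):
    """Return a 'Key=value; Key=value;' string; raise if a value is missing."""
    env = os.environ if env is None else env
    missing = [var for var in PROPERTY_ENV_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return " ".join(f"{prop}={env[var]};" for prop, var in PROPERTY_ENV_VARS.items())


# Demonstration with placeholder values instead of the real environment.
demo_env = {
    "ADOBE_GLOBAL_COMPANY_ID": "myGlobalCompanyId",
    "ADOBE_RSID": "myRSID",
    "ADOBE_OAUTH_CLIENT_ID": "myOAuthClientId",
    "ADOBE_OAUTH_CLIENT_SECRET": "myOAuthClientSecret",
    "ADOBE_CALLBACK_URL": "myCallbackURL",
}
print(build_connection_string(demo_env))
```

With real values exported in the environment, the result of build_connection_string() could be passed to mod.connect in place of the literal string.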

Create a SQL Statement to Query Adobe Analytics

Use SQL to create a statement for querying Adobe Analytics. In this article, we read data from the AdsReport entity.

sql = "SELECT Page, PageViews FROM AdsReport WHERE City = 'Chapel Hill'"
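
The statement above embeds the filter value directly in the SQL text. When that value comes from user input or configuration, DB-API parameter binding is the usual alternative. The sketch below uses an in-memory SQLite table as a stand-in for AdsReport so that it runs without credentials. The sample rows are invented for illustration, the ? placeholder style is SQLite's, and whether the CData connector accepts the same style is an assumption to check against its documentation.

```python
import sqlite3

# Stand-in for the connector: an in-memory SQLite table shaped like the
# AdsReport query in this article. The rows are made-up sample data.
cnxn = sqlite3.connect(":memory:")
cnxn.execute("CREATE TABLE AdsReport (Page TEXT, PageViews INTEGER, City TEXT)")
cnxn.executemany(
    "INSERT INTO AdsReport VALUES (?, ?, ?)",
    [
        ("/home", 120, "Chapel Hill"),
        ("/docs", 300, "Chapel Hill"),
        ("/home", 80, "Durham"),
    ],
)

# The filter value is bound as a parameter instead of being pasted into the SQL.
sql = "SELECT Page, PageViews FROM AdsReport WHERE City = ?"
cur = cnxn.cursor()
cur.execute(sql, ("Chapel Hill",))
rows = cur.fetchall()

for page, page_views in rows:
    print(page, page_views)
```

petl's fromdb passes extra positional arguments through to the cursor's execute call, so under the same placeholder assumption the parameter tuple could be supplied as etl.fromdb(cnxn, sql, ("Chapel Hill",)).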

Extract, Transform, and Load the Adobe Analytics Data

With the query defined, we can use petl to extract, transform, and load the Adobe Analytics data. In this example, we extract Adobe Analytics data through the connection, sort the data by the PageViews column (ascending, which is petl's default), and load the data into a CSV file.

Loading Adobe Analytics Data into a CSV File

table1 = etl.fromdb(cnxn, sql)

table2 = etl.sort(table1, 'PageViews')

etl.tocsv(table2, 'adsreport_data.csv')

With the CData Python Connector for Adobe Analytics, you can work with Adobe Analytics data just like you would with any database, including direct access to data in ETL packages like petl.

Free Trial & More Information

Download a free, 30-day trial of the Adobe Analytics Python Connector to start building Python apps and scripts with connectivity to Adobe Analytics data. Reach out to our Support Team if you have any questions.



Full Source Code


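import petl as etl
import pandas as pd
import cdata.adobeanalytics as mod

# Connect to Adobe Analytics; replace the placeholder values with your own
cnxn = mod.connect("GlobalCompanyId=myGlobalCompanyId; RSID=myRSID; OAuthClientId=myOAuthClientId; OAuthClientSecret=myOAuthClientSecret; CallbackURL=myCallbackURL;")

# Query the AdsReport entity
sql = "SELECT Page, PageViews FROM AdsReport WHERE City = 'Chapel Hill'"

# Extract: petl runs the query through the connection
table1 = etl.fromdb(cnxn, sql)

# Transform: sort by PageViews (ascending by default)
table2 = etl.sort(table1, 'PageViews')

# Load: write the sorted table to a CSV file
etl.tocsv(table2, 'adsreport_data.csv')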