Integrate Harvest Data in Pentaho Data Integration
Build ETL pipelines based on Harvest data in the Pentaho Data Integration tool.
The CData API Driver for JDBC enables access to live data from data pipelines. Pentaho Data Integration is an Extraction, Transformation, and Loading (ETL) engine that extracts data, cleanses it, and stores it in a uniform, accessible format. This article shows how to connect to Harvest data as a JDBC data source and build jobs and transformations based on Harvest data in Pentaho Data Integration.
Configure Connectivity to Harvest
Start by setting the Profile connection property to the location of the Harvest Profile on disk (e.g. C:\profiles\Harvest.apip). Next, set the ProfileSettings connection property to the connection string for Harvest (see below).
Harvest API Profile Settings
To authenticate to Harvest, you can use either Token authentication or the OAuth standard. Use Token authentication to connect to your own data. Use OAuth to allow other users to connect to their data.
Using Token Authentication
To use Token Authentication, set the APIKey to your Harvest Personal Access Token in the ProfileSettings connection property. In addition to APIKey, set your AccountId in ProfileSettings to connect.
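As a minimal sketch of how these settings fit together, the Java helper below assembles a JDBC URL for Token authentication. The class and method names (HarvestTokenUrl, build) are hypothetical and not part of the driver, and the values are the article's placeholders. It only shows that ProfileSettings is a nested, quoted list inside the outer connection string.

```java
final class HarvestTokenUrl {

    /** Builds a jdbc:api: URL for Token authentication (illustrative helper, not a driver API). */
    static String build(String profilePath, String apiKey, String accountId) {
        // A quote or semicolon inside a value would break the nested list, so reject it early.
        for (String value : new String[] {apiKey, accountId}) {
            if (value.contains("'") || value.contains(";")) {
                throw new IllegalArgumentException("value must not contain ' or ;");
            }
        }
        // ProfileSettings is itself a semicolon-separated list, so it is wrapped
        // in single quotes to keep it apart from the outer connection string.
        String profileSettings = "APIKey=" + apiKey + ";AccountId=" + accountId;
        return "jdbc:api:Profile=" + profilePath
                + ";ProfileSettings='" + profileSettings + "';";
    }

    public static void main(String[] args) {
        // Placeholder values taken from the article; replace them with real ones.
        System.out.println(build("C:\\profiles\\Harvest.apip",
                "my_personal_key", "_your_account_id"));
    }
}
```

The resulting string can be pasted into any tool that accepts a JDBC URL.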
Using OAuth Authentication
First, register an OAuth2 application with Harvest. The application can be created from the "Developers" section of Harvest ID.
After setting the following connection properties, you are ready to connect:
- ProfileSettings: Set your AccountId in ProfileSettings.
- AuthScheme: Set this to OAuth.
- OAuthClientId: Set this to the client ID that you specified in your app settings.
- OAuthClientSecret: Set this to the client secret that you specified in your app settings.
- CallbackURL: Set this to the Redirect URI that you specified in your app settings.
- InitiateOAuth: Set this to GETANDREFRESH. You can use InitiateOAuth to manage how the driver obtains and refreshes the OAuthAccessToken.
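The OAuth properties listed above can also be collected in a java.util.Properties object instead of one long URL. The sketch below is illustrative only: the class name HarvestOAuthProps, the account ID, the client credentials, and the callback URL are all placeholder assumptions, and it assumes the driver reads these keys from the Properties argument of the standard DriverManager.getConnection(String, Properties) call.

```java
import java.util.Properties;

final class HarvestOAuthProps {

    /** Collects the OAuth connection properties named in the list above (illustrative helper). */
    static Properties build(String profilePath, String accountId,
                            String clientId, String clientSecret, String callbackUrl) {
        Properties props = new Properties();
        props.setProperty("Profile", profilePath);
        // Only AccountId goes into ProfileSettings for OAuth; no APIKey is needed.
        props.setProperty("ProfileSettings", "AccountId=" + accountId);
        props.setProperty("AuthScheme", "OAuth");
        props.setProperty("OAuthClientId", clientId);
        props.setProperty("OAuthClientSecret", clientSecret);
        props.setProperty("CallbackURL", callbackUrl);
        // GETANDREFRESH lets the driver obtain and refresh the access token itself.
        props.setProperty("InitiateOAuth", "GETANDREFRESH");
        return props;
    }

    public static void main(String[] args) {
        // All values below are placeholders, not real credentials.
        Properties props = build("C:\\profiles\\Harvest.apip", "_your_account_id",
                "my_client_id", "my_client_secret", "http://localhost:33333");
        props.stringPropertyNames().stream().sorted()
                .forEach(name -> System.out.println(name + "=" + props.getProperty(name)));
    }
}
```

With the driver JAR on the classpath, the object would typically be passed as DriverManager.getConnection("jdbc:api:", props). Keeping the client secret out of a URL string also makes it less likely to end up in logs.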
Built-in Connection String Designer
For assistance in constructing the JDBC URL, use the connection string designer built into the Harvest JDBC Driver. Either double-click the JAR file or execute it from the command line:
java -jar cdata.jdbc.api.jar
Fill in the connection properties and copy the connection string to the clipboard.

When you configure the JDBC URL, you may also want to set the Max Rows connection property. This will limit the number of rows returned, which is especially helpful for improving performance when designing reports and visualizations.
Below is a typical JDBC URL:
jdbc:api:Profile=C:\profiles\Harvest.apip;ProfileSettings='APIKey=my_personal_key;AccountId=_your_account_id';InitiateOAuth=GETANDREFRESH
Save your connection string for use in Pentaho Data Integration.
Connect to Harvest from Pentaho DI
Open Pentaho Data Integration and select "Database Connection" to configure a connection to the CData API Driver for JDBC:
- Click "General"
- Set Connection name (e.g. Harvest Connection)
- Set Connection type to "Generic database"
- Set Access to "Native (JDBC)"
- Set Custom connection URL to your Harvest connection string (e.g. jdbc:api:Profile=C:\profiles\Harvest.apip;ProfileSettings='APIKey=my_personal_key;AccountId=_your_account_id';InitiateOAuth=GETANDREFRESH)
- Set Custom driver class name to "cdata.jdbc.api.APIDriver"
- Test the connection and click "OK" to save.
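If the connection test fails in Pentaho, it can help to check the same driver class name and URL outside the tool. The following is a rough sketch using only standard JDBC calls. The class name HarvestConnectionCheck is hypothetical, and it assumes the driver JAR (cdata.jdbc.api.jar) is on the classpath when it runs.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

final class HarvestConnectionCheck {

    /** Returns true if the named class can be loaded from the current classpath. */
    static boolean driverOnClasspath(String driverClassName) {
        try {
            Class.forName(driverClassName);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /** Tries to open a connection and reports the outcome as a short message. */
    static String check(String driverClassName, String jdbcUrl) {
        if (!driverOnClasspath(driverClassName)) {
            return "driver class not found: " + driverClassName;
        }
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            // isValid is standard JDBC 4; the argument is a timeout in seconds.
            return conn.isValid(10) ? "ok" : "connection opened but not valid";
        } catch (SQLException e) {
            return "connection failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Defaults mirror the article's placeholder values; pass real ones as arguments.
        String driver = args.length > 0 ? args[0] : "cdata.jdbc.api.APIDriver";
        String url = args.length > 1 ? args[1]
                : "jdbc:api:Profile=C:\\profiles\\Harvest.apip;"
                + "ProfileSettings='APIKey=my_personal_key;AccountId=_your_account_id';";
        System.out.println(check(driver, url));
    }
}
```

A "driver class not found" result points to a classpath problem (in Pentaho, typically a JAR missing from its lib folder), while a "connection failed" message points to the URL or credentials.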
Create a Data Pipeline for Harvest
Once the connection to Harvest is configured using the CData JDBC Driver, you are ready to create a new transformation or job.
- Click "File" >> "New" >> "Transformation/job"
- Drag a "Table input" object into the workflow panel and select your Harvest connection.
- Click "Get SQL select statement" and use the Database Explorer to view the available tables and views.
- Select a table and optionally preview the data for verification.
At this point, you can continue your transformation or job by selecting a suitable destination and adding any transformations to modify, filter, or otherwise alter the data during replication.
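For comparison, the preview that the "Table input" step performs can be approximated in plain JDBC. This is a sketch under stated assumptions rather than what Pentaho does internally: the class name HarvestPreview is hypothetical, the SQL statement is supplied by the caller because the available table names depend on the Harvest profile, and the row cap uses the standard Statement.setMaxRows call.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HarvestPreview {

    /** Runs a query and returns at most maxRows rows as column-label to value maps. */
    static List<Map<String, Object>> preview(String jdbcUrl, String sql, int maxRows)
            throws SQLException {
        List<Map<String, Object>> rows = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement()) {
            stmt.setMaxRows(maxRows); // cap the preview size on the JDBC side
            try (ResultSet rs = stmt.executeQuery(sql)) {
                ResultSetMetaData meta = rs.getMetaData();
                while (rs.next()) {
                    // LinkedHashMap keeps columns in the order the query returned them.
                    Map<String, Object> row = new LinkedHashMap<>();
                    for (int i = 1; i <= meta.getColumnCount(); i++) {
                        row.put(meta.getColumnLabel(i), rs.getObject(i));
                    }
                    rows.add(row);
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 2) {
            System.out.println("usage: HarvestPreview <jdbc-url> <select-statement>");
            return;
        }
        for (Map<String, Object> row : preview(args[0], args[1], 10)) {
            System.out.println(row);
        }
    }
}
```

Running it with the saved connection string and a SELECT statement against one of the tables shown in the Database Explorer prints up to ten rows, which is a quick way to confirm the data looks as expected before building the rest of the pipeline.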

Free Trial & More Information
Download a free, 30-day trial of the CData API Driver for JDBC and start working with your live Harvest data in Pentaho Data Integration today.