Integrate MongoDB Data in Pentaho Data Integration
Build ETL pipelines based on MongoDB data in the Pentaho Data Integration tool.
The CData JDBC Driver for MongoDB enables access to live data from data pipelines. Pentaho Data Integration is an Extraction, Transformation, and Loading (ETL) engine that extracts data, cleanses it, and stores it in a uniform, accessible format. This article shows how to connect to MongoDB data as a JDBC data source and build jobs and transformations based on MongoDB data in Pentaho Data Integration.
About MongoDB Data Integration
Accessing and integrating live data from MongoDB has never been easier with CData. Customers rely on CData connectivity to:
- Access data from MongoDB 2.6 and above, ensuring broad usability across various MongoDB versions.
- Easily manage unstructured data thanks to flexible NoSQL support (learn more here: Leading-Edge Drivers for NoSQL Integration).
- Leverage feature advantages over other NoSQL drivers and realize functional benefits when working with MongoDB data (learn more here: A Feature Comparison of Drivers for NoSQL).
MongoDB's flexibility means that it can be used as a transactional, operational, or analytical database. That means CData customers use our solutions to integrate their business data with MongoDB or integrate their MongoDB data with their data warehouse (or both). Customers also leverage our live connectivity options to analyze and report on MongoDB directly from their preferred tools, like Power BI and Tableau.
For more details on MongoDB use cases and how CData enhances your MongoDB experience, check out our blog post: The Top 10 Real-World MongoDB Use Cases You Should Know in 2024.
Getting Started
Configure MongoDB Connectivity
Set the Server, Database, User, and Password connection properties to connect to MongoDB. To access MongoDB collections as tables you can use automatic schema discovery or write your own schema definitions. Schemas are defined in .rsd files, which have a simple format. You can also execute free-form queries that are not tied to the schema.
Built-in Connection String Designer
For assistance in constructing the JDBC URL, use the connection string designer built into the MongoDB JDBC Driver. Either double-click the JAR file or execute it from the command line:
java -jar cdata.jdbc.mongodb.jar
Fill in the connection properties and copy the connection string to the clipboard.

When you configure the JDBC URL, you may also want to set the Max Rows connection property. This will limit the number of rows returned, which is especially helpful for improving performance when designing reports and visualizations.
Below is a typical JDBC URL:
jdbc:mongodb:Server=MyServer;Port=27017;Database=test;User=test;Password=Password;
Save your connection string for use in Pentaho Data Integration.
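If you prefer to assemble the connection string in code rather than in the designer, the sketch below shows one way to do it. It is a minimal illustration, not part of the driver: the class name MongoDbJdbcUrl and the method build are hypothetical, and the values are the placeholder values from the sample URL above. The commented-out MaxRows entry assumes the Max Rows property is written as MaxRows inside the connection string; check the driver documentation before relying on that spelling.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the CData driver) that joins
// connection properties into a JDBC URL of the form shown above.
public class MongoDbJdbcUrl {

    // Builds "jdbc:mongodb:Name=Value;...;" in insertion order.
    // Values are appended as-is; a value containing ';' is not escaped here.
    public static String build(Map<String, String> props) {
        StringBuilder url = new StringBuilder("jdbc:mongodb:");
        for (Map.Entry<String, String> entry : props.entrySet()) {
            url.append(entry.getKey())
               .append('=')
               .append(entry.getValue())
               .append(';');
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the properties in the order they were added.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("Server", "MyServer");
        props.put("Port", "27017");
        props.put("Database", "test");
        props.put("User", "test");
        props.put("Password", "Password");
        // Assumption: the Max Rows property is spelled "MaxRows" in the URL.
        // props.put("MaxRows", "1000");

        // prints jdbc:mongodb:Server=MyServer;Port=27017;Database=test;User=test;Password=Password;
        System.out.println(build(props));
    }
}
```

Keeping the properties in a map makes it easy to add or drop optional settings without editing the string by hand.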
Connect to MongoDB from Pentaho DI
Open Pentaho Data Integration and select "Database Connection" to configure a connection to the CData JDBC Driver for MongoDB.
- Click "General"
- Set Connection name (e.g. MongoDB Connection)
- Set Connection type to "Generic database"
- Set Access to "Native (JDBC)"
- Set Custom connection URL to your MongoDB connection string (e.g. jdbc:mongodb:Server=MyServer;Port=27017;Database=test;User=test;Password=Password;)
- Set Custom driver class name to "cdata.jdbc.mongodb.MongoDBDriver"
- Test the connection and click "OK" to save.
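If the connection test fails in Pentaho, it can help to try the same URL and driver class outside the tool. The following is a minimal sketch using only the standard java.sql API; it assumes the driver JAR (cdata.jdbc.mongodb.jar) is on the classpath, and the class name MongoDbConnectionCheck and the method check are hypothetical names chosen for illustration. The URL is the placeholder from this article and should be replaced with your own connection string.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical stand-alone check that mirrors the Pentaho settings above.
public class MongoDbConnectionCheck {

    // Driver class name as entered in "Custom driver class name".
    public static final String DRIVER_CLASS = "cdata.jdbc.mongodb.MongoDBDriver";

    // Returns a short status message instead of throwing,
    // so the result is easy to read from a terminal.
    public static String check(String jdbcUrl) {
        try {
            // Fails if the driver JAR is not on the classpath.
            Class.forName(DRIVER_CLASS);
        } catch (ClassNotFoundException e) {
            return "driver not on classpath: " + DRIVER_CLASS;
        }
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            // isValid waits up to 5 seconds for the server to respond.
            return conn.isValid(5) ? "connected" : "connection not valid";
        } catch (SQLException e) {
            return "connection failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Placeholder URL from this article; pass your own as the first argument.
        String url = args.length > 0
                ? args[0]
                : "jdbc:mongodb:Server=MyServer;Port=27017;Database=test;User=test;Password=Password;";
        System.out.println(check(url));
    }
}
```

Run it with the driver JAR on the classpath, for example java -cp cdata.jdbc.mongodb.jar:. MongoDbConnectionCheck on Linux or macOS (use ; instead of : as the separator on Windows). A "driver not on classpath" message points to a missing or misplaced JAR rather than a problem with the connection properties.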
Create a Data Pipeline for MongoDB
Once the connection to MongoDB is configured using the CData JDBC Driver, you are ready to create a new transformation or job.
- Click "File" >> "New" >> "Transformation" (or "Job")
- Drag a "Table input" object into the workflow panel and select your MongoDB connection.
- Click "Get SQL select statement" and use the Database Explorer to view the available tables and views.
- Select a table and optionally preview the data for verification.
At this point, you can continue your transformation or job by selecting a suitable destination and adding any transformations to modify, filter, or otherwise alter the data during replication.

Free Trial & More Information
Download a free, 30-day trial of the CData JDBC Driver for MongoDB and start working with your live MongoDB data in Pentaho Data Integration today.