A Performance Comparison of Drivers for MongoDB



The metrics in this article were found using the most up-to-date drivers available as of June 2019.

In this article, we compare the performance of the CData Drivers for MongoDB to the same technologies produced by two companies (Competitor 1 and Competitor 2), as well as the matching "drivers" produced by MongoDB, Inc. We compared read performance, measuring the amount of time that it takes to query MongoDB for data and process the result set in some way.

Since the drivers are being compared side-by-side, the performance of the machine itself is relatively unimportant; what matters is how the drivers compare relative to one another.

The Data



In order to provide a reproducible comparison, we copied the sample restaurants dataset (made publicly available by MongoDB, Inc.) and then built successively larger datasets based on the sample data. The relevant details for the table(s) queried are below:

Table | Number of Rows
restaurants | 25,360
restaurants_2 | 2,003,362
restaurants_3 | 10,016,805

Queries



The main goal of this investigation was to compare the relative performance of the drivers. We did this by running the same queries with each driver. To simulate actual processing of the data beyond simply reading from MongoDB, we stored the values of each row in a string variable (which was overwritten for each row). The queries are listed below:

  1. SELECT borough, restaurant_id, _id, cuisine, name FROM restaurants
  2. SELECT borough, restaurant_id, _id, cuisine, name FROM restaurants_2
  3. SELECT borough, restaurant_id, _id, cuisine, name FROM restaurants_3
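
The measurement described above can be sketched in Java roughly as follows. This is a minimal illustration, not the harness used for the published numbers: the class name ReadBenchmark, the helper methods, the number of runs, and the connection URL passed on the command line are all assumptions. Only standard java.sql calls are used, so the same code should work with any JDBC driver on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ReadBenchmark {

    /** Runs one query, reads every row into a string, and returns elapsed milliseconds. */
    static double timeQuery(Connection conn, String sql) throws SQLException {
        long start = System.nanoTime();
        String row = ""; // overwritten for each row to simulate processing
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                StringBuilder sb = new StringBuilder();
                for (int i = 1; i <= columns; i++) {
                    sb.append(rs.getString(i)).append('|');
                }
                row = sb.toString();
            }
        }
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    /** Arithmetic mean of the recorded times; returns 0 for an empty array. */
    static double average(double[] timesMs) {
        double sum = 0;
        for (double t : timesMs) {
            sum += t;
        }
        return timesMs.length == 0 ? 0 : sum / timesMs.length;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 1) {
            System.out.println("usage: ReadBenchmark <jdbc-url>");
            return;
        }
        int runs = 5; // assumption: the article does not state how many runs were averaged
        String sql = "SELECT borough, restaurant_id, _id, cuisine, name FROM restaurants";
        double[] times = new double[runs];
        // The JDBC URL is driver-specific and supplied by the caller.
        try (Connection conn = DriverManager.getConnection(args[0])) {
            for (int i = 0; i < runs; i++) {
                times[i] = timeQuery(conn, sql);
            }
        }
        System.out.printf("average: %.1f ms over %d runs%n", average(times), runs);
    }
}
```

The timer covers both query execution and row iteration, which matches the stated goal of measuring the time to query MongoDB and process the result set.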

Results



Below, you can see the performance of the various queries, based on the driver/platform.

JDBC / Java Drivers

All four of the companies compared produce a JDBC driver or other technology that provides a native experience with MongoDB data in Java applications. The results of processing query results in a simple Java application are below.

JDBC/Java Query Times by Company (in milliseconds)
Query | CData Software | Competitor 1 | Competitor 2 | MongoDB, Inc.
1 (~25,000 rows) | 59.3 (-15% to +28%) | 62.8 | 50.4 | 75.7
2 (~2,000,000 rows) | 1,548.9 (up to +200%) | 1,555.6 | 3,035.5 | 4,646.7
3 (~10,000,000 rows) | 10,195.1 (+6% to +174%) | 10,795.3 | 17,798.8 | 27,991.9

As the results show, the CData driver worked with large result sets at least as fast as the other drivers, retrieving and processing the two larger datasets more than twice as fast as the slowest driver. In the one case where the CData driver is slower (query 1, against Competitor 2), the margin is under 10 milliseconds and is due to performing a live schema discovery.
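
The percentage ranges shown next to the CData times are not defined in the table. They appear to be consistent with the difference between each other driver's time and the CData time, expressed as a percentage of the CData time. The sketch below works through that arithmetic for query 1; the class and method names are hypothetical, and the formula is an inference from the published figures rather than a stated method.

```java
public class RelativeTiming {

    /** Percentage by which otherMs exceeds (positive) or undercuts (negative) baselineMs. */
    static double percentDifference(double baselineMs, double otherMs) {
        return (otherMs - baselineMs) / baselineMs * 100.0;
    }

    public static void main(String[] args) {
        double cdataMs = 59.3;                  // query 1, CData Software
        double[] otherMs = {62.8, 50.4, 75.7};  // Competitor 1, Competitor 2, MongoDB, Inc.

        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double t : otherMs) {
            double pct = percentDifference(cdataMs, t);
            min = Math.min(min, pct);
            max = Math.max(max, pct);
        }
        // Print the range of differences relative to the CData time.
        System.out.printf("%+.0f%% to %+.0f%%%n", min, max);
    }
}
```

Under this reading, a negative value means the other driver finished sooner than the CData driver, and a positive value means it took longer.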

The average runtime for each query (of the larger datasets) is compared in the charts below:

Results for ~2,000,000 Rows

Results for ~10,000,000 Rows

Conclusion



The CData Software drivers regularly prove to be faster than the competitors' equivalent products, particularly when dealing with large data sets. When the drivers are slower, the difference is barely noticeable (less than 10 milliseconds) and is the trade-off for live schema discovery that includes deeper drill-down into NoSQL data than most other providers offer. You can read more about our innovative practices for working with NoSQL data (like that stored in MongoDB) in our NoSQL Drivers feature comparison.

We realize that speed is only one measurement, but the performance of our drivers is a strong indicator of the depth and technical prowess embedded in all of our drivers and data access technologies. Our developers have spent countless hours optimizing the performance in processing the results returned by the MongoDB database to the point that the drivers seem to only be hindered by web traffic and server processing times.
