The metrics in this article are from the most up-to-date drivers available as of July 2019.
In this article, we compare the performance of the CData JDBC Driver for Amazon DynamoDB to the same technology produced by another company (Competitor 1). In our testing, the CData Driver outperformed the Competitor driver, querying and processing data more than three times faster on a small table and more than five times faster on a large one. The difference in performance is largely due to better use of client-side and server-side resources. Details of the comparisons follow.
Since the drivers are being compared side-by-side, the performance of the machine itself is relatively unimportant; what matters is how the drivers compare relative to one another.
The Data
To provide a reproducible comparison, we use the sample restaurants dataset made publicly available by MongoDB, Inc. To create a large data set (around 10 million records), we added the original dataset to Amazon DynamoDB multiple times.
| Table | Number of Rows |
| --- | --- |
| restaurants | 25,360 |
| restaurants_2 | 10,020,921 |
JDBC Driver Read Performance
First, we compared the relative performance of the drivers by running the same queries with each driver in a simple Java application. To simulate actual data processing, we read and process the values of every field in each row. The exact queries tested are listed below:
SELECT borough, restaurant_id, cuisine, name FROM restaurants
SELECT borough, restaurant_id, cuisine, name FROM restaurants_2
We set the provisioned read capacity in DynamoDB to 1000 for the tables queried for the duration of our tests. The times taken to retrieve and process the query results are below.
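The read test described above can be sketched roughly as follows. This is a minimal illustration and not the harness used for the published numbers: the class name `ReadBenchmark` and its methods are hypothetical, and the JDBC URL is taken from the command line because connection-string properties differ between drivers.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical benchmark sketch; names are illustrative only.
final class ReadBenchmark {

    // Reads every column of every row so the driver has to materialize each value.
    // Returns the number of rows processed.
    static long drain(ResultSet rs) throws SQLException {
        int columnCount = rs.getMetaData().getColumnCount();
        long rows = 0;
        while (rs.next()) {
            for (int i = 1; i <= columnCount; i++) {
                rs.getObject(i); // JDBC column indexes start at 1
            }
            rows++;
        }
        return rows;
    }

    // Runs the query and returns the elapsed wall-clock time in milliseconds,
    // including the time spent reading the full result set.
    static long timeQueryMillis(Connection conn, String sql) throws SQLException {
        long start = System.nanoTime();
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            drain(rs);
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 1) {
            System.err.println("usage: java ReadBenchmark <jdbc-url>");
            return;
        }
        String[] queries = {
            "SELECT borough, restaurant_id, cuisine, name FROM restaurants",
            "SELECT borough, restaurant_id, cuisine, name FROM restaurants_2"
        };
        // The driver JAR under test is assumed to be on the classpath.
        try (Connection conn = DriverManager.getConnection(args[0])) {
            for (String sql : queries) {
                System.out.println(timeQueryMillis(conn, sql) + " ms: " + sql);
            }
        }
    }
}
```

Running the same class against each driver, with only the JDBC URL changed, keeps the measurement code identical for both sides of the comparison.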
JDBC Query Times by Company (in milliseconds)
| Query | CData Software | Competitor 1 |
| --- | --- | --- |
| 1 (~25k rows) | 2728 (+217.9%) | 8673 |
| 2 (~10m rows) | 139818 (+462.4%) | 786368 |
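The percentages next to the CData times appear to express how much faster the CData Driver finished relative to the Competitor driver, that is, (competitor time / CData time - 1) × 100. The short sketch below reproduces that arithmetic from the published times; the class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper that reproduces the percentages from the reported times.
final class SpeedupMath {

    // Percentage by which the faster run outpaces the slower one.
    static double percentFaster(long fasterMillis, long slowerMillis) {
        return ((double) slowerMillis / fasterMillis - 1.0) * 100.0;
    }

    // Formats with an explicit sign and one decimal place, e.g. "+12.3%".
    static String format(double percent) {
        return String.format(Locale.ROOT, "%+.1f%%", percent);
    }

    public static void main(String[] args) {
        System.out.println(format(percentFaster(2728, 8673)));     // query 1: +217.9%
        System.out.println(format(percentFaster(139818, 786368))); // query 2: +462.4%
    }
}
```

Expressed as ratios, these correspond to roughly 3.2 times faster for the first query and 5.6 times faster for the second.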
Note that these performance numbers are from a non-default configuration of the Competitor driver. By default, the Competitor driver issues each query using a single thread. In contrast, the CData Driver uses up to four threads by default and can be configured to use as many as needed. For this article, we tested only after increasing the thread count for the Competitor driver to four to match the default setting of the CData Driver, so with both drivers as installed, the CData Driver's advantage would be larger than our tests indicate. Even with matched thread counts, the CData Driver retrieves and processes result sets significantly faster than the Competitor driver.
JDBC Driver Resource Usage
While testing the read performance, we also measured client-side and server-side resource usage, looking specifically at client-side memory and CPU usage and at consumption of the allocated read capacity units (RCUs). The charts below were produced by running a sample Java program and using Java VisualVM to capture the CPU and memory usage. We used Java version 8 update 211 with a maximum heap size of 4.27 GB.
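The measurements in this article were captured with Java VisualVM. As a rough alternative for readers who want comparable figures logged from inside the test program, the standard JMX beans expose similar data. The sketch below is an assumption-laden illustration, not the method used here: the class name `ResourceSampler` is hypothetical, and the CPU reading relies on `com.sun.management.OperatingSystemMXBean`, which is available on HotSpot/OpenJDK runtimes but not guaranteed on every JVM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical in-process sampler; VisualVM was the tool actually used in the article.
final class ResourceSampler {
    private static final long MB = 1024L * 1024L;

    // Heap currently in use, in megabytes.
    static long usedHeapMb() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed() / MB;
    }

    // Maximum heap the JVM will attempt to use (controlled by -Xmx), in megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    // CPU load of this JVM process as a percentage, or -1 if the runtime cannot report it.
    static double processCpuPercent() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            double load = ((com.sun.management.OperatingSystemMXBean) os).getProcessCpuLoad();
            return load < 0 ? -1.0 : load * 100.0;
        }
        return -1.0;
    }

    public static void main(String[] args) throws InterruptedException {
        // Print one sample per second; in a real test this would run on a
        // background thread while the query is being read.
        for (int i = 0; i < 5; i++) {
            System.out.printf("heap=%d MB (max %d MB), cpu=%.1f%%%n",
                    usedHeapMb(), maxHeapMb(), processCpuPercent());
            Thread.sleep(1000L);
        }
    }
}
```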
Querying with High Read Capacity
For this comparison, we ran a query for a large number of rows, with a high read capacity allocation for the DynamoDB table:
SELECT borough, restaurant_id, cuisine, name FROM restaurants_2
Chart: CData Driver (CPU and memory usage)
Chart: Competitor 1 Driver (CPU and memory usage)
When we provision a high read capacity (1000 read units), the differences in how each driver utilizes available client-side resources are stark. Based on the graph, the CData Driver sustains high client-side resource usage, using around 37% of the CPU and averaging nearly 700 MB of heap.
In contrast, the Competitor driver uses only around 4% of the CPU and averages around 110 MB of memory. By making better use of client-side resources, the CData Driver requests and processes data more than five times faster than the Competitor driver. Finishing the read faster not only saves time but also means that you are making the best use of the resources provisioned for the DynamoDB table.
DynamoDB Read Capacity Consumption
While testing the client-side resource usage, we also captured the read capacity consumption for each driver (with 1000 read units provisioned). The graph below shows the read capacity consumed by each driver for the same query. The first spike in consumed read capacity represents the CData Driver, while the second, smaller bump represents the Competitor driver.
Chart: DynamoDB CloudWatch Metrics (consumed read capacity)
Based on the graph, we can see that the CData Driver makes significantly better use of the provisioned read capacity, utilizing around 70% of the available read capacity units. The Competitor driver, meanwhile, uses less than 20% of the available read capacity (despite being configured in the JDBC URL to use 100%), which further explains why the Competitor driver takes longer to request and process the table data.
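When reading utilization figures like these off CloudWatch, keep in mind that the ConsumedReadCapacityUnits metric is reported as a sum over each period, so it has to be divided by the period length before it can be compared with the per-second provisioned value. The sketch below shows that conversion. The class name is hypothetical, and the sample value of 42,000 units over a 60-second period is an illustrative assumption chosen to match roughly 70% utilization, not a figure taken from our tests.

```java
// Hypothetical helper for converting a CloudWatch period sum into utilization.
final class ReadCapacityMath {

    // Average read capacity units consumed per second over one CloudWatch period.
    static double averageUnitsPerSecond(double consumedSum, int periodSeconds) {
        return consumedSum / periodSeconds;
    }

    // Share of the provisioned per-second read capacity that was actually consumed.
    static double utilizationPercent(double consumedSum, int periodSeconds,
                                     double provisionedPerSecond) {
        return averageUnitsPerSecond(consumedSum, periodSeconds) * 100.0 / provisionedPerSecond;
    }

    public static void main(String[] args) {
        // Assumed sample: 42,000 units consumed in a 60-second period, 1000 units provisioned.
        System.out.println(utilizationPercent(42_000.0, 60, 1_000.0)); // prints 70.0
    }
}
```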
Conclusion
The CData Software Drivers regularly prove to be faster than the equivalent competitor product, particularly when dealing with large data sets. We realize that speed is only one measurement, but the performance of our drivers is a reliable indicator of the depth and technical prowess embedded in all of our drivers and data access technologies. Our developers have spent countless hours optimizing the performance in processing the results returned by the DynamoDB database to the point that the drivers seem only to be hindered by web traffic and server processing times.
Download a free, 30-day trial of any of our DynamoDB drivers and experience the CData difference for yourself.