HDFS Drivers & Connectors for Data Integration

Connect to files on HDFS from BI, analytics, and reporting tools through standards-based drivers. Easily integrate HDFS data with BI, Reporting, Analytics, ETL Tools, and Custom Solutions.






BI & Analytics



Our drivers offer the fastest and easiest way to connect real-time HDFS data with BI, analytics, reporting and data visualization technologies. They provide unmatched query performance, comprehensive access to HDFS data and metadata, and seamlessly integrate with your favorite analytics tools.

LEARN MORE: Connectivity for BI & Analytics

Popular BI & Analytics Integrations



Alteryx Designer: Prepare, Blend, and Analyze HDFS in Alteryx Designer (ODBC)
Amazon QuickSight: Build Interactive Dashboards from HDFS Data in Amazon QuickSight
Aqua Data Studio: Connect to HDFS in Aqua Data Studio
AWS Databricks: Process & Analyze HDFS Data in Databricks (AWS)
Birst: Build Visualizations of HDFS in Birst
BIRT: Design BIRT Reports on HDFS
Clear Analytics: Build Charts with HDFS in Clear Analytics
DBxtra: Build Dashboards with HDFS in DBxtra
Domo: Create Datasets from HDFS in Domo Workbench
Dundas BI: Build Dashboards with HDFS in Dundas BI
Excel (on Mac OS): Work with HDFS Data in MS Excel on Mac OS X
FineReport: Feed HDFS into FineReport
IBM Cognos BI: Create Data Visualizations in Cognos BI with HDFS
Infragistics Reveal: Analyze HDFS Data in Infragistics Reveal
JasperServer: Create HDFS Reports on JasperReports Server
Jaspersoft BI Suite: Connect to HDFS in Jaspersoft Studio
JReport Designer: Integrate with HDFS in JReport Designer
Klipfolio: Create HDFS-Connected Visualizations in Klipfolio
KNIME: Enable the HDFS JDBC Driver in KNIME
LINQPad: Working with HDFS in LINQPad
Microsoft SSAS: Build an OLAP Cube in SSAS from HDFS
MicroStrategy: Connect to Live HDFS Data in MicroStrategy through Connect Server
MicroStrategy: Use the CData JDBC Driver for HDFS in MicroStrategy
MicroStrategy Desktop: Use the CData JDBC Driver for HDFS in MicroStrategy Desktop
MicroStrategy Web: Use the CData JDBC Driver for HDFS in MicroStrategy Web
OBIEE: HDFS Reporting in OBIEE with the HDFS JDBC Driver
pandas: Use pandas to Visualize HDFS in Python
Pentaho Report Designer: Integrate HDFS in the Pentaho Report Designer
Power BI Desktop: Author Power BI Reports on Real-Time HDFS
Power BI Service: Visualize Live HDFS Data in the Power BI Service
Power Pivot: Access HDFS Data in Microsoft Power Pivot
Power Query: Access HDFS Data in Microsoft Power Query
Qlik Cloud: Create Apps from HDFS Data in Qlik Sense Cloud
QlikView: Connect to and Query HDFS in QlikView over ODBC
R: Analyze HDFS in R (JDBC)
R: Analyze HDFS in R (ODBC)
RapidMiner: Connect to HDFS in RapidMiner
Redash: Query, Visualize, and Share Live HDFS Data in Redash
SAP Analytics Cloud: Analyze HDFS Data in SAP Analytics Cloud
SAP Business Objects: Create an SAP BusinessObjects Universe on the CData JDBC Driver for HDFS
SAP Crystal Reports: Publish Reports with HDFS in Crystal Reports (JDBC)
SAS: Use the CData ODBC Driver for HDFS in SAS for Real-Time Reporting and Analytics
SAS JMP: Use the CData ODBC Driver for HDFS in SAS JMP
Sisense: Visualize Live HDFS in Sisense
Spago BI: Connect to HDFS in SpagoBI
Tableau: Visualize HDFS in Tableau Desktop
Tableau: Visualize HDFS in Tableau Desktop (Connect Server)
Tableau Cloud: Build HDFS Visualizations in Tableau Cloud
Tableau Server: Publish HDFS-Connected Dashboards in Tableau Server
TIBCO Spotfire: Visualize HDFS in TIBCO Spotfire through ADO.NET
TIBCO Spotfire: Visualize HDFS Data in TIBCO Spotfire
TIBCO Spotfire Server: Operational Reporting on HDFS from Spotfire Server

ETL, Replication, & Warehousing



From drivers and adapters that extend your favorite ETL tools with HDFS connectivity to ETL/ELT tools for HDFS data integration — our HDFS integration solutions provide robust, reliable, and secure data movement.

Connect your RDBMS or data warehouse with HDFS to facilitate operational reporting, offload queries and increase performance, support data governance initiatives, archive data for disaster recovery, and more.


Popular Data Warehousing Integrations



Amazon Redshift: Automated Continuous HDFS Replication to Amazon Redshift
Amazon S3: Automated Continuous HDFS Replication to Amazon S3
Apache Airflow: Bridge HDFS Connectivity with Apache Airflow
Apache Camel: Integrate with HDFS using Apache Camel
Apache Cassandra: Automated Continuous HDFS Replication to Apache Cassandra
Apache Kafka: Automated Continuous HDFS Replication to Apache Kafka
Apache NiFi: Bridge HDFS Connectivity with Apache NiFi
Azure Data Lake: Automated Continuous HDFS Replication to Azure Data Lake
Azure Synapse: Automated Continuous HDFS Replication to Azure Synapse
BIML: Use Biml to Build SSIS Tasks to Replicate HDFS to SQL Server
CloverDX: Connect to HDFS in CloverDX (formerly CloverETL)
Couchbase: Automated Continuous HDFS Replication to Couchbase
CSV: Automated Continuous HDFS Replication to Local Delimited Files
Databricks: Automated Continuous HDFS Replication to Databricks
ETL Validator: How to Work with HDFS in ETL Validator
FoxPro: Work with HDFS in FoxPro
Google AlloyDB: Automated Continuous HDFS Replication to Google AlloyDB
Google BigQuery: Automated Continuous HDFS Replication to Google BigQuery
Google Cloud SQL: Automated Continuous HDFS Replication to Google Cloud SQL
Google Data Fusion: Build HDFS-Connected ETL Processes in Google Data Fusion
Heroku / Salesforce Connect: Replicate HDFS for Use in Salesforce Connect
HULFT Integrate: Connect to HDFS in HULFT Integrate
IBM DB2: Automated Continuous HDFS Replication to IBM DB2
Informatica Cloud: Integrate HDFS in Your Informatica Cloud Instance
Informatica PowerCenter: Create Informatica Mappings From/To a JDBC Data Source for HDFS
Jaspersoft ETL: Connect to HDFS in Jaspersoft Studio
Microsoft Access: Automated Continuous HDFS Replication to Microsoft Access
Microsoft Azure Tables: Automated Continuous HDFS Replication to Azure SQL
Microsoft Power Automate: Build HDFS-Connected Automated Tasks with Power Automate (Desktop)
MongoDB: Automated Continuous HDFS Replication to MongoDB
MySQL: Automated Continuous HDFS Replication to MySQL
Oracle Data Integrator: ETL HDFS in Oracle Data Integrator
Oracle Database: Automated Continuous HDFS Replication to Oracle
petl: Extract, Transform, and Load HDFS in Python
PostgreSQL: Automated Continuous HDFS Replication to PostgreSQL
Replicate to MySQL: Replicate HDFS to MySQL with PowerShell
SAP HANA: Automated Continuous HDFS Replication to SAP HANA
SingleStore: Automated Continuous HDFS Replication to SingleStore
SnapLogic: Integrate HDFS with External Services using SnapLogic (JDBC)
Snowflake: Automated Continuous HDFS Replication to Snowflake
SQL Server: Automated Continuous HDFS Replication to SQL Server
SQL Server Linked Server: Connect to HDFS Data as a SQL Server Linked Server
SQLite: Automated Continuous HDFS Replication to SQLite
Talend: Connect to HDFS and Transfer Data in Talend
UiPath Studio: Create an RPA Flow that Connects to HDFS in UiPath Studio
Vertica: Automated Continuous HDFS Replication to a Vertica Database

Workflow & Automation Tools



Connect to HDFS from popular data migration, ESB, iPaaS, and BPM tools.

Our drivers and adapters provide straightforward access to HDFS data from popular applications like BizTalk, MuleSoft, SQL SSIS, Microsoft Flow, Power Apps, Talend, and many more.

Popular Workflow & Automation Tool Integrations



Developer Tools & Technologies



Our HDFS drivers are the easiest way to integrate with HDFS from anywhere. They offer a data-centric model for HDFS that dramatically simplifies integration, allowing developers to build higher-quality applications faster than ever before.



Popular Developer Integrations



AWS Lambda: Access Live HDFS Data in AWS Lambda
.NET Charts: DataBind Charts to HDFS
.NET QueryBuilder: Rapidly Develop HDFS-Driven Apps with Active Query Builder
Angular JS: Using AngularJS to Build Dynamic Web Pages with HDFS
Apache Spark: Work with HDFS in Apache Spark Using SQL
AppSheet: Create HDFS-Connected Business Apps in AppSheet
C++Builder: DataBind Controls to HDFS Data in C++Builder
ColdFusion: Query HDFS in ColdFusion Using JDBC
ColdFusion: Query HDFS in ColdFusion Using ODBC
Dash: Use Dash & Python to Build Web Apps on HDFS
Delphi: DataBind Controls to HDFS Data in Delphi
DevExpress: DataBind HDFS to the DevExpress Data Grid
EF - Code First: Access HDFS with Entity Framework 6
EF - LINQ: LINQ to HDFS
EF - MVC: Build MVC Applications with Connectivity to HDFS
FileMaker Pro: Bidirectional Access to HDFS from FileMaker Pro
FileMaker Pro (on Mac): Bidirectional Access to HDFS from FileMaker Pro (on Mac)
Go: Write a Simple Go Application to Work with HDFS on Linux
Google Apps Script: Connect to HDFS Data in Google Apps Script
Hibernate: Object-Relational Mapping (ORM) with HDFS Entities in Java
IntelliJ: Connect to HDFS in IntelliJ
JBoss: Connect to HDFS from a Connection Pool in JBoss
JDBI: Create a Data Access Object for HDFS using JDBI
JRuby: Connect to HDFS in JRuby
Mendix: Build HDFS-Connected Apps in Mendix (JDBC)
Microsoft Power Apps: Integrate Live HDFS Data into Custom Business Apps Built in Power Apps
NodeJS: Query HDFS Data in Node.js (via Connect Server)
NodeJS: Query HDFS through ODBC in Node.js
PHP: Access HDFS in PHP through Connect Server
PHP: Natively Connect to HDFS in PHP
PowerBuilder: Connect to HDFS from PowerBuilder
PowerShell: Pipe HDFS to CSV in PowerShell
PyCharm: Using the CData ODBC Driver for HDFS in PyCharm
Python: Connect to HDFS in Python on Linux/UNIX
React: Build Dynamic React Apps with HDFS Data
Ruby: Connect to HDFS in Ruby
RunMyProcess: Connect to HDFS Data in RunMyProcess
RunMyProcess DSEC: Connect to HDFS in DigitalSuite Studio through RunMyProcess DSEC
SAP UI5: Integrate Real-Time Access to HDFS in SAPUI5 MVC Apps
Servoy: Build HDFS-Connected Apps in Servoy
Spring Boot: Access Live HDFS Data in Spring Boot Apps
SQLAlchemy: Use SQLAlchemy ORMs to Access HDFS in Python
Tomcat: Configure the CData JDBC Driver for HDFS in a Connection Pool in Tomcat
Unqork: Create HDFS-Connected Applications in Unqork
VCL App (RAD Studio): Build a Simple VCL Application for HDFS
WebLogic: Connect to HDFS from a Connection Pool in WebLogic


When Only the Best HDFS Drivers Will Do

See what customers have to say about our products and support.



Frequently Asked HDFS Driver Questions

Learn more about HDFS drivers & connectors for data and analytics integration


The HDFS driver acts as a bridge between applications and HDFS, allowing an application to read HDFS data as if it were stored in a relational database. The driver abstracts the complexities of HDFS APIs, authentication methods, and data types, making it simple for any application to connect to HDFS data in real time via standard SQL queries.

Working with an HDFS driver is different from connecting to HDFS through other means. HDFS API integrations require technical experience from a software developer or IT resources. Additionally, because APIs and services constantly evolve, once you build your integration you have to keep maintaining the HDFS integration code.

By comparison, our HDFS drivers offer codeless access to live HDFS data for technical and non-technical users alike. Any user can install our drivers and begin working with live HDFS data from any client application. Because our drivers conform to standard data interfaces such as ODBC, JDBC, and ADO.NET, they offer a consistent, maintenance-free interface to HDFS data. We manage all of the complexities of HDFS integration within each driver and deploy updated drivers as systems evolve so your applications continue to run seamlessly.

Many organizations draw attention to their library of connectors. After all, data connectivity is a core capability needed for applications to maximize their business value. However, it is essential to understand exactly what you are getting when evaluating connectivity. Some vendors are happy to offer connectors that implement basic proof-of-concept level connectivity. These connectors may highlight the possibilities of working with HDFS, but they often provide only a fraction of the capability. Finding real value from these connectors usually requires additional IT or development resources.

Unlike these POC-quality connectors, every CData HDFS driver offers full-featured HDFS data connectivity. The CData HDFS drivers support extensive HDFS integration, providing access to all of the HDFS data and metadata needed by enterprise integration or analytics projects. Each driver contains a powerful embedded SQL engine that offers applications easy and high-performance access to all HDFS data. In addition, our drivers offer robust authentication and security capabilities, allowing users to connect securely across a wide range of enterprise configurations. Compare drivers and connectors to read more about some of the benefits of CData's driver connectivity.

With our drivers and connectors, every data source is essentially SQL-based. The CData HDFS driver contains a full SQL-92 compliant engine that translates standard SQL queries into HDFS API calls dynamically. Queries are parsed and optimized for each data source, pushing down as much of the request to HDFS as possible. Any logic that cannot be pushed to HDFS is handled transparently client-side by the driver/connector engine. Ultimately, this means that HDFS looks and acts like a database to any client application or tool. Users can integrate live HDFS connectivity with any software solution that can talk to a standard database.
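
The split between pushed-down and client-side work described above can be sketched in a few lines of Python. This is a toy illustration of the general idea, not CData's engine: the `Filter` type, the `SUPPORTED_BY_SOURCE` set, and the `list_files` stand-in for a remote HDFS listing call are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Filter:
    column: str
    op: str        # "=", ">", or "like_prefix"
    value: object


# Hypothetical assumption: the remote source can only filter on a path prefix.
SUPPORTED_BY_SOURCE = {("path", "like_prefix")}


def split_filters(filters):
    """Partition filters into (pushed to the source, evaluated client-side)."""
    pushed = [f for f in filters if (f.column, f.op) in SUPPORTED_BY_SOURCE]
    local = [f for f in filters if (f.column, f.op) not in SUPPORTED_BY_SOURCE]
    return pushed, local


def matches(row, f):
    """Evaluate one filter against one row."""
    value = row[f.column]
    if f.op == "=":
        return value == f.value
    if f.op == ">":
        return value > f.value
    if f.op == "like_prefix":
        return str(value).startswith(str(f.value))
    raise ValueError(f"unsupported operator: {f.op}")


def list_files(pushed):
    """Stand-in for a remote listing call that honours the pushed filters."""
    catalog = [
        {"path": "/logs/a.log", "size": 120},
        {"path": "/logs/b.log", "size": 4096},
        {"path": "/tmp/c.tmp", "size": 9000},
    ]
    return [r for r in catalog if all(matches(r, f) for f in pushed)]


def run_query(filters):
    pushed, local = split_filters(filters)
    rows = list_files(pushed)  # work done "at the source"
    # Whatever the source could not filter is applied locally.
    return [r for r in rows if all(matches(r, f) for f in local)]


result = run_query([
    Filter("path", "like_prefix", "/logs/"),
    Filter("size", ">", 1000),
])
```

Here the path-prefix filter is handed to the source, while the size comparison, which this hypothetical source cannot evaluate, is applied to the returned rows on the client.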

The HDFS drivers and connectors offer comprehensive access to HDFS data. Our HDFS driver exposes static and dynamic data and metadata, providing universal access to HDFS data for any enterprise analytics or data management use. To explore the HDFS driver data model, please review the edition-specific HDFS driver documentation.

Using the CData HDFS drivers and connectors, HDFS can be easily integrated with almost any application. Any software or technology that can integrate with a database or connect with standards-based drivers like ODBC, JDBC, ADO.NET, etc., can use our drivers for live HDFS data connectivity. Explore some of the more popular HDFS data integrations online.
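
Because the drivers present a standard database interface, client code tends to look like ordinary database code. The following is a minimal sketch, assuming a Python DB-API 2.0 connection. The in-memory SQLite database is only a stand-in so the example runs anywhere, and the `Files` table and its columns are hypothetical, not the driver's documented data model. With an ODBC driver installed, a connection obtained through an ODBC bridge could be passed to the same function.

```python
import sqlite3


def fetch_rows(conn, sql, params=()):
    """Run a query on a DB-API 2.0 connection and return rows as dicts."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        columns = [d[0] for d in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]
    finally:
        cur.close()


# Stand-in data source (assumption: a real deployment would connect
# through the installed driver instead of SQLite).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Files (Path TEXT, Size INTEGER)")
conn.executemany(
    "INSERT INTO Files VALUES (?, ?)",
    [("/logs/a.log", 120), ("/logs/b.log", 4096)],
)

large_files = fetch_rows(
    conn, "SELECT Path, Size FROM Files WHERE Size > ?", (1000,)
)
```

The function depends only on the DB-API cursor protocol and the `?` parameter style, which is why the data source behind the connection can be swapped without changing the query code.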

HDFS analytics is universally supported for BI and data science. CData provides native client connectors for popular analytics applications like Power BI, Tableau, and Excel that simplify HDFS data integration. Native Python connectors are also available for data science and data engineering projects, and they integrate with popular tools like pandas, SQLAlchemy, Dash, and petl.

HDFS data integration is typically enabled with CData Sync, a robust any-to-any data pipeline solution that is easy to set up, runs everywhere, and offers comprehensive enterprise-class features for data engineering. CData Sync makes it easy to replicate HDFS data to any database or data warehouse, and to maintain parity between systems with automated incremental HDFS replication. In addition, our HDFS drivers and connectors can be easily embedded into a wide range of data integration tools to augment existing solutions.
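
One common way to implement incremental replication in general is a high-water mark: remember the newest modification time already copied, and on each run fetch only rows newer than that. The sketch below illustrates that generic technique and is not a description of how CData Sync works internally. The `files` table, its `modified` column, and the `source_rows` list are hypothetical, and SQLite stands in for the destination database.

```python
import sqlite3


def replicate_incremental(source_rows, dest):
    """Copy rows newer than the destination's high-water mark.

    source_rows: iterable of (path, size, modified) tuples.
    Returns the number of rows copied on this run.
    """
    dest.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, size INTEGER, modified INTEGER)"
    )
    # High-water mark: newest modification time already replicated (0 if empty).
    (mark,) = dest.execute(
        "SELECT COALESCE(MAX(modified), 0) FROM files"
    ).fetchone()
    new_rows = [r for r in source_rows if r[2] > mark]
    # Upsert so a file changed since the last run overwrites its old row.
    dest.executemany(
        "INSERT OR REPLACE INTO files (path, size, modified) VALUES (?, ?, ?)",
        new_rows,
    )
    dest.commit()
    return len(new_rows)


dest = sqlite3.connect(":memory:")
source_rows = [("/logs/a.log", 120, 100), ("/logs/b.log", 4096, 200)]
first = replicate_incremental(source_rows, dest)   # initial run copies everything

source_rows.append(("/logs/c.log", 10, 300))       # a new file appears
source_rows[0] = ("/logs/a.log", 150, 350)         # an existing file changes
second = replicate_incremental(source_rows, dest)  # copies only the two newer rows
```

A strict greater-than comparison can miss rows that share the mark's exact timestamp, so a production pipeline would need a tie-breaking strategy; the sketch leaves that out for brevity.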

CData also offers native Excel Add-Ins for HDFS integration. These Add-Ins provide live access to HDFS data directly from Microsoft Excel.