Open Delta Tables: A Stronger Foundation for Fabric and Databricks

by Andrew Petersen | October 13, 2025

It wasn’t that long ago that the concept of a data lake — a low-cost, flexible way of storing large amounts of data — seemed cutting edge.

First came the Hadoop ecosystem of the early 2010s, then the emergence of Parquet files and cloud file storage a decade ago. Data storage via “schema-on-read” was fast and cheap, but it came with some pretty large caveats around reliability and trust.

In the past half-decade, Delta Lake has emerged as the universal format for building modern data lakehouses. But too often, ingestion tools either drop raw files into storage or wrap data in proprietary layers that create lock-in.

With its latest release, CData Sync now delivers data directly into open Delta tables, giving organizations the freedom to replicate once and query anywhere.

Why Delta tables matter

Delta Lake isn’t just another file format. Where Parquet provides cost-effective storage, Delta extends it to make those datasets truly analytics-ready.

Delta combines Parquet with the reliability of a transaction log, ensuring ACID guarantees:

  • Atomicity ensures operations fully succeed or fully fail, preventing partial writes.

  • Consistency enforces schema and integrity across all operations.

  • Isolation allows multiple reads and writes to happen safely in parallel.

  • Durability guarantees that committed data is preserved.

These properties matter because data pipelines aren’t static. New records arrive continuously, schemas evolve, and workloads scale up and down.

Without ACID guarantees, teams risk corrupted tables, unreliable queries, and hours spent reconciling inconsistencies. With Delta, every table is trustworthy by design.
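To make that concrete, here’s a minimal sketch of the transaction log at work, using the open-source deltalake (delta-rs) Python package. The table path and column names are illustrative, not anything CData Sync produces.

```python
import pyarrow as pa
from deltalake import DeltaTable, write_deltalake

path = "/tmp/orders_delta"  # hypothetical local path; s3:// or abfss:// URIs also work

# Commit 0: an atomic write. Either all of this data lands, or none of it does.
write_deltalake(path, pa.table({"order_id": [1, 2], "amount": [9.99, 24.50]}))

# Commit 1: an append. Readers never see a half-written table, because each
# commit only becomes visible once it is recorded in the _delta_log directory.
write_deltalake(path, pa.table({"order_id": [3], "amount": [5.00]}), mode="append")

dt = DeltaTable(path)
print(dt.version())    # 1 -> two commits so far (versions 0 and 1)
print(dt.to_pandas())  # a consistent snapshot spanning both commits
print(dt.history())    # the transaction log doubles as an audit trail
```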

Delta in 90 seconds

What is Delta Lake?

An open-table format built on Parquet that adds a transaction log for reliability and governance.

Why it matters

  • ACID guarantees

    • Atomicity: writes are all-or-nothing

    • Consistency: schema and rules enforced

    • Isolation: safe for concurrent reads/writes

    • Durability: committed data never lost

  • Replicate once, query anywhere

    • One open Delta table can be queried by Databricks, Fabric, Trino, Spark, and more — no conversion needed

  • Enterprise-ready

    • Handles schema evolution, deletes, and versioning

    • Built for both batch and streaming pipelines

Replicate once, query anywhere

The real power of Delta lies in interoperability. Because Delta is open and widely supported, the same table can be queried in Databricks, Microsoft Fabric, Trino, Spark, Presto, and more — no conversions or extra replication required.

Delta Parquet works seamlessly with the leading cloud storage services, including:

  • Amazon S3

  • Azure Blob Storage

  • Azure Data Lake Storage (ADLS)

  • Google Cloud Storage (GCS)

That means your pipeline only needs to land data once. From there, analysts, engineers, and AI models can all work against the same governed dataset using their tool of choice. It’s efficiency, consistency, and control rolled into one.
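As a hedged illustration of that “land once, query anywhere” idea, the sketch below reads one Delta table from two different engines in Python. The bucket and table path are placeholders, and credentials are assumed to come from the environment.

```python
from deltalake import DeltaTable
from pyspark.sql import SparkSession

uri = "s3://analytics-bucket/delta/orders"  # hypothetical table location

# Engine 1: the lightweight delta-rs reader -- no Spark cluster required.
pdf = DeltaTable(uri).to_pandas()
print(len(pdf))

# Engine 2: Apache Spark with the delta-spark package reads the very same
# files. (Assumes delta-spark and an S3 filesystem connector are installed.)
spark = (
    SparkSession.builder
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)
spark.read.format("delta").load(uri).show()
```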

How CData Sync delivers open Delta

CData Sync writes directly to open Delta tables in cloud storage. Some data replication tools abstract the process or layer on multiple metadata systems; Sync, by contrast, keeps it simple and transparent.

Key benefits:

  • Partitioning and layout control: Optimize Delta tables for your workloads rather than accept black-box defaults.

  • Schema evolution management: Decide how changes are handled, so downstream users get clean, predictable tables.

  • Deletes and compaction strategies: Control how historical data is maintained to balance governance and cost.

  • Target flexibility: Replicate to S3, Blob, ADLS, or GCS.

This flexible approach yields pipelines you can govern and trust, producing open Delta tables that are immediately usable in any compatible engine.
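Here’s a minimal sketch of what the partitioning and schema-evolution controls above look like at the table level, again using the open deltalake package. The column and partition names are illustrative, and this is not CData Sync’s configuration surface.

```python
import pyarrow as pa
from deltalake import write_deltalake

path = "/tmp/events_delta"  # hypothetical path

# Partitioning and layout control: pick the partition column deliberately
# rather than accepting a default layout.
write_deltalake(
    path,
    pa.table({"region": ["us", "eu"], "clicks": [10, 7]}),
    partition_by=["region"],
)

# Schema evolution management: merge a new column into the table schema
# instead of failing the write. (schema_mode="merge" needs a recent
# deltalake release.)
write_deltalake(
    path,
    pa.table({"region": ["us"], "clicks": [3], "device": ["mobile"]}),
    mode="append",
    schema_mode="merge",
)
```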

For Databricks users

With Sync, every Delta table landed in cloud storage is immediately available to Databricks SQL, governed through Unity Catalog, and ready for Genie-powered AI exploration. That reduces pipeline friction and ensures data is ready for both analytics and machine learning.
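For example, a table landed in S3 can be registered and queried from a Databricks notebook roughly like this. The catalog, schema, and path are placeholders, and an external location for the bucket is assumed to be configured in Unity Catalog.

```python
# Run inside a Databricks notebook, where `spark` is predefined.
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.analytics.orders
    USING DELTA
    LOCATION 's3://analytics-bucket/delta/orders'
""")
spark.sql("SELECT COUNT(*) AS orders FROM main.analytics.orders").show()
```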

For Fabric users

Microsoft Fabric is built on OneLake, which natively supports Delta. Sync offers two complementary ways to deliver data into Fabric:

  • Open Mirroring: Replicates SQL sources into OneLake as Delta Parquet files, automatically surfaced as Fabric SQL endpoints.

  • Delta replication to cloud file storage: Delivers open Delta tables into S3, Blob, ADLS, or GCS that can also be queried in Fabric and any other Delta-aware engines.

Together, these options give Fabric customers the flexibility to unify operational SQL replication with open-format Delta pipelines for analytics and AI.
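As a rough sketch, a Delta table in OneLake can be read from a Fabric notebook along these lines. The workspace and lakehouse names are placeholders, and the path follows the documented OneLake ABFS convention.

```python
# Run inside a Fabric notebook, where `spark` is predefined.
path = (
    "abfss://MyWorkspace@onelake.dfs.fabric.microsoft.com/"
    "MyLakehouse.Lakehouse/Tables/orders"
)
spark.read.format("delta").load(path).show()
```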

What about other platforms?

Delta Lake is natively supported in Databricks, Fabric, Spark, and other Delta-aware engines. That’s where you get the full set of benefits — ACID transactions, schema enforcement, versioning, and time travel.

But even in platforms without native Delta support, the underlying Parquet files are still accessible. That means your data remains portable and usable, even if you don’t get the richer Delta functionality. With Sync, you land the data once in an open format, then decide how and where to consume it as your architecture evolves.
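To illustrate the difference, the sketch below scans only the raw Parquet files of a Delta table, the way a Parquet-only engine would; the path is illustrative. Because the transaction log is ignored, the scan can include files Delta has logically deleted or replaced, which is exactly the functionality you give up.

```python
import pyarrow.dataset as ds

# Scan the Parquet files directly, skipping the _delta_log directory --
# roughly what a Parquet-only engine does with a Delta table.
raw = ds.dataset(
    "/tmp/orders_delta",            # the Delta table directory
    format="parquet",
    ignore_prefixes=["_delta_log"],
)
print(raw.to_table().num_rows)      # may overcount vs. the Delta snapshot
```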

Delta Lake compatibility at a glance

Engine / Platform | Support Level     | Notes on Usage
Databricks        | Yes (native)      | First-class support; integrates with Unity Catalog and Genie.
Microsoft Fabric  | Yes (native)      | OneLake stores Delta Parquet and exposes SQL endpoints.
Trino / Presto    | Yes (native)      | Query Delta tables directly via connectors.
Apache Spark      | Yes (native)      | Delta Lake open-source library provides full support.
Snowflake         | Partial (Parquet) | Can read Parquet files but not Delta logs; requires external tables, connectors, or conversion.
BigQuery          | Partial (Parquet) | Can load Parquet but ignores Delta logs; no ACID or versioning.
Athena            | Partial (Parquet) | Reads Parquet in S3 but not the Delta transaction log.


Building on an open foundation

Delta has become the default table format of the lakehouse era because it ensures data is both reliable and portable. By adding open Delta replication, CData Sync makes it easier for organizations to design pipelines that are efficient, governed, and ready for the future.

Watch this quick demo to see how a single Delta table pipeline can be queried in Databricks and Fabric.

For more information on how Delta tables work within Fabric (and other ways to get data into and out of OneLake), watch the Sync-to-Fabric demo.

Take the product tour to learn why 400+ customers chose CData for its flexibility, performance, and price-to-value. 

Try CData Sync free

Download your free 30-day trial to see how CData Sync delivers seamless integration.
