CData Sync
Transform Data In-Flight for Cleaner, Analytics-Ready Data
Shape and standardize data as it moves between systems to reduce downstream modeling and cloud compute. CData Sync applies business logic and SQL-based transformations consistently across Snapshot, Delta Snapshot, Change Data Capture (CDC), and reverse ETL jobs, giving teams full control over how data is delivered and stored.
Understanding transformations
Transformations in Sync reduce the need for separate data-preparation pipelines and lower cloud compute by applying business logic, filtering, and standardization as data moves into analytics and operational systems. These transformations are applied consistently across replication types, including Snapshot, Delta Snapshot, and CDC.
Why teams use transformations
Normalize schemas across ERP, CRM, and operational systems so downstream analytics start with consistent, usable tables
Filter high-volume data at the source to reduce storage and compute consumption in cloud data warehouses
Prepare analytics-ready datasets during ingestion rather than relying on downstream modeling tools
Enrich reverse ETL syncs with calculated fields that improve CRM and ERP process automation
Apply governance rules, such as masking or hashing, before data reaches cloud platforms to reduce exposure risk
Use the same transformation logic across Snapshot, Delta Snapshot, CDC, and reverse ETL jobs to reduce maintenance and avoid duplicated SQL
How transformations work
Column expressions
Create cleaner, more usable data during ingestion by applying SQL logic directly within the replication process.
Apply SQL expressions during replication, as in the sketch that follows this list:
- Create derived metrics (for example, total_cost = qty * unit_price)
- Normalize date and time-zone adjustments
- Implement conditional logic using CASE expressions (for example, CASE WHEN…)
- Employ hashing or masking for governance of personally identifiable information (PII)
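A minimal sketch of how these expressions might look in a job's transformation SQL, assuming a hypothetical orders table; the exact hashing and date functions depend on the SQL dialect of the platform involved:
    SELECT
        order_id,
        qty,
        unit_price,
        qty * unit_price AS total_cost,                    -- derived metric
        CAST(order_ts AS DATE) AS order_date,              -- normalized date
        CASE WHEN status = 'O' THEN 'Open'
             WHEN status = 'C' THEN 'Closed'
             ELSE 'Unknown' END AS status_label,           -- conditional logic
        SHA2(customer_email, 256) AS customer_email_hash   -- hash PII before it lands downstream
    FROM orders;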
Row and column filtering
Reduce storage, compute, and processing overhead by filtering data early in the replication pipeline.
Keep only the data that matters, as in the example that follows this list:
- Exclude inactive rows or archived history
- Reduce wide source tables (700+ columns) to only the curated fields that are needed downstream
- Limit high-volume CDC or reverse ETL workloads to the specific data slices that are required
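For example, a filter might keep only active, recently updated rows and a curated set of columns (hypothetical accounts table; date-literal syntax varies by dialect):
    SELECT
        account_id,
        account_name,
        region,
        updated_at
    FROM accounts
    WHERE is_active = 1                      -- drop inactive or archived rows
      AND updated_at >= DATE '2024-01-01';   -- limit to the slice needed downstream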
Joins, lookups, and enrichment
Eliminate downstream modeling steps by enriching datasets before they land in your warehouse or operational systems. Sync supports upstream enrichment by joining reference tables that already exist in the source system.
Common examples (a sketch of the retail case follows this list):
- Finance: Join cost-center tables during ERP warehouse loads
- Retail: Merge location metadata into point-of-sale feeds
- Manufacturing / Energy: Attach asset metadata to high-frequency equipment readings
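A sketch of the retail case, assuming hypothetical pos_sales and store_locations tables in the same source:
    SELECT
        s.sale_id,
        s.store_id,
        s.sale_amount,
        l.store_name,                        -- enrichment columns from the reference table
        l.region
    FROM pos_sales AS s
    JOIN store_locations AS l
        ON l.store_id = s.store_id;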
Schema remapping
Standardize and clean inbound data so that pipelines land in a predictable, analytics-ready form without manual cleanup.
Sync can perform these mapping functions, illustrated after this list:
- Rename columns
- Reorder fields
- Standardize naming conventions, such as snake_case or camelCase
- Convert datatypes for warehouse or SaaS compatibility
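A small illustration, assuming a hypothetical SourceOrders table whose columns are renamed to snake_case and cast for the destination:
    SELECT
        CustomerID                          AS customer_id,    -- renamed and reordered
        CAST(OrderDate AS DATE)             AS order_date,     -- datatype conversion
        CAST(TotalAmount AS DECIMAL(18, 2)) AS total_amount
    FROM SourceOrders;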
Where transformations apply
Apply a single transformation definition across all replication styles to reduce pipeline sprawl, eliminate duplicative SQL, and ensure consistent logic from ingestion through operational syncs.
Snapshot and Delta Snapshot replication
Transformations run as part of Sync's SQL-based change-detection engine, which uses the EXCEPT and MINUS SQL set operators for reverse ETL and warehouse loads.
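Conceptually, EXCEPT-style set comparison finds rows that differ between a source extract and the destination. The following is a simplified illustration with hypothetical tables, not Sync's internal SQL; MINUS is the equivalent operator on platforms such as Oracle:
    SELECT customer_id, customer_name, region FROM source_customers
    EXCEPT                    -- rows present in the source but not the destination are new or changed
    SELECT customer_id, customer_name, region FROM destination_customers;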
CDC jobs
Transformations are applied as changes stream from transaction logs, enabling consistent modeling across inserts, updates, and deletes.
Reverse ETL
Transformations create CRM- or ERP-ready fields before upserts, including external IDs, status indicators, and normalized attributes.
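As a sketch, a reverse ETL transformation might shape warehouse rows into CRM-ready fields like these (hypothetical warehouse_accounts table and field names):
    SELECT
        account_id                 AS external_id,     -- stable key used to match existing CRM records
        UPPER(TRIM(account_name))  AS account_name,    -- normalized attribute
        CASE WHEN open_tickets > 5 THEN 'At Risk'
             ELSE 'Healthy' END    AS health_status    -- status indicator derived during the sync
    FROM warehouse_accounts;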
Business use cases by industry
Organizations across industries use Sync transformations to standardize diverse data sources, reduce downstream modeling work, and prepare analytics- and operations-ready outputs during ingestion.
Energy and utilities
- Normalize Supervisory Control and Data Acquisition (SCADA) or operational logs into analytic structures
- Enrich asset telemetry with equipment metadata
- Downsample or filter high-frequency sensor data
Financial services
- Standardize transaction formats across banking systems
- Mask PII before data ingestion into Snowflake and Databricks
- Calculate derived regulatory metrics during ingestion
Manufacturing
- Build consistent production datasets across factories
- Enrich machine logs with asset master data
- Create feature sets that are ready for predictive maintenance
Retail/CPG
- Normalize point-of-sale, loyalty, and product-catalog data
- Join attribute tables to simplify merchandising analytics
- Prepare marketing-ready insights for reverse ETL