by Alex Pauncz | May 28, 2021

Connecting Salesforce to your Snowflake Data Warehouse

Salesforce is one of the world's most popular cloud-based platforms for business. For many organizations, Salesforce acts as a central hub for all customer data and customer-centric processes. As core infrastructure, the data stored within Salesforce must often be combined with other internal and external systems to fuel analytics and reporting. While real-time integration has its advantages, many organizations choose to consolidate data into a central data lake or data warehouse to facilitate analytics.

Snowflake is a widely used relational data warehouse built on a cloud-native architecture, delivering ease of implementation, flexible deployment, security, and scalability. Whether the goal is simple data backup and archiving, providing access to data for custom app development, or enabling centralized reporting, BI, analytics, AI/ML, and more, data warehousing with Snowflake is core to enterprise IT.

So how can you integrate Salesforce with Snowflake? Given the scale and critical nature of data in your CRM, you need an automated, reliable way to pipeline Salesforce data into Snowflake. ETL/ELT processes are the most popular way to get the job done, and for good reason.

The Data Pipeline: An Easy Way to Manage Salesforce-Snowflake Integration

Data pipelines enable users to automate and schedule data replications from various data sources, such as Salesforce, into a range of databases and data warehouse destinations, including Snowflake. Data pipelines support automated data movement through ETL (Extract, Transform, Load) or ELT (Extract, Load, Transform) processes to replicate massive volumes of data to your data lake or data warehouse. A good ETL/ELT platform will dramatically simplify and speed up the process with several critical features:

  • Easy, point-and-click workflow builders to create replications
  • Abstraction to eliminate the need for custom coding
  • Automated scheduling to regularly replicate data into Snowflake
  • Built-in logging & monitoring to simplify troubleshooting and ensure data reaches its destination on time, every time
  • Enterprise-grade security and reliability

At CData, we have built a leading Data Pipeline solution, CData Sync, which leverages the powerful data connectivity of our high-performance connector technology to help you automate your Salesforce-Snowflake replication processes.

Simplifying Data Replication from Salesforce to Snowflake with CData Sync

Sync simplifies the process of cloud enterprise data replication, from 100+ data sources to 30+ destinations, including Salesforce and Snowflake. Schedule mass-scale workflows with the click of a button and ease your management burden with its purpose-built administration, security, and monitoring.

Ease-of-Use

To set up automated data replications, simply follow five easy steps:

  1. Spin up CData Sync on-demand as part of your cloud infrastructure (AWS, Azure, or Oracle)
  2. Authenticate your Salesforce and Snowflake accounts in CData Sync
  3. Select your Salesforce Tables from a drop-down list in Sync
  4. Select your destination database and schema in Snowflake
  5. Set up a scheduled interval, and Sync will replicate new or updated data from Salesforce to Snowflake by default

That's it. No coding necessary. It's as simple and straightforward a process as you will find to configure jobs that pull over the desired entities from Salesforce and insert them in your Snowflake database. For a tutorial on how to use Sync to set up this replication, click here.

Standard Tables

Salesforce stores opportunities, leads, and other data in "entities" that mimic a relational database table structure and allow you to use APIs to access your Salesforce data. Sync normalizes Salesforce's APIs, making it easy to replicate data from Salesforce into the Snowflake RDBMS data warehouse. By default, the table in Snowflake will be an exact replica of the data in Salesforce.

Automatic Data Model Matching to Custom Snowflake Tables

When replicating data into Snowflake, you may want to rename columns to support the naming scheme in your data warehouse. Sync makes it easy to rename columns in your destination. In addition, Salesforce customers frequently add columns to their tables. Sync automatically alters the destination schema based on any new columns that appear and will load the correct data into the altered table.
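
To make the idea concrete, here is a minimal sketch of column renaming plus automatic schema alteration. It is an illustration only, not how CData Sync is implemented: it uses Python's built-in sqlite3 module as a stand-in for Snowflake, and the function name sync_columns, the rename map, and the sample field Region__c are all hypothetical.

```python
import sqlite3

def sync_columns(conn, table, record, renames=None):
    """Insert `record` into `table`, adding any destination columns that are missing."""
    renames = renames or {}
    # Apply the destination naming scheme (hypothetical rename map).
    row = {renames.get(name, name): value for name, value in record.items()}
    # Column names currently present in the destination table.
    existing = {info[1] for info in conn.execute(f"PRAGMA table_info({table})")}
    for column in row:
        if column not in existing:
            # A new source column appeared: alter the destination schema.
            conn.execute(f'ALTER TABLE {table} ADD COLUMN "{column}"')
    columns = ", ".join(f'"{c}"' for c in row)
    marks = ", ".join("?" for _ in row)
    conn.execute(f"INSERT INTO {table} ({columns}) VALUES ({marks})", list(row.values()))

# In-memory SQLite database standing in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE account ("Id" TEXT PRIMARY KEY, "account_name" TEXT)')
renames = {"Name": "account_name"}

sync_columns(conn, "account", {"Id": "001A", "Name": "Acme"}, renames)
# A later record carries a custom field the destination has not seen yet.
sync_columns(conn, "account", {"Id": "001B", "Name": "Globex", "Region__c": "EMEA"}, renames)

columns_after = [info[1] for info in conn.execute("PRAGMA table_info(account)")]
rows_after = conn.execute('SELECT * FROM account ORDER BY "Id"').fetchall()
```

Rows loaded before the new column existed simply hold NULL for it, so earlier data stays valid after the schema change.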

Incremental Data Transfer with Scheduler

Constantly moving over large amounts of data can wreak havoc on your bandwidth, Salesforce instance, and Snowflake instance. Sync allows you to replicate everything once, then schedule time intervals for moving updated data from Salesforce to Snowflake, which can range from every 10 minutes to once a day. To minimize the impact of data transfer on performance, Sync moves only the data that's been changed or deleted during each interval.
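
The pattern behind incremental replication can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Sync's actual mechanism: it assumes each record carries a modification timestamp (Salesforce's SystemModstamp system field is used here as the example), keeps a "watermark" of the newest timestamp already copied, and uses a plain dictionary in place of the Snowflake table. The function name incremental_sync is hypothetical.

```python
from datetime import datetime, timezone

def incremental_sync(source_rows, destination, watermark):
    """Copy rows modified after `watermark`; return (new_watermark, rows_moved)."""
    newest, moved = watermark, 0
    for row in source_rows:
        modified = row["SystemModstamp"]
        if modified > watermark:
            destination[row["Id"]] = dict(row)  # upsert keyed on the record Id
            newest = max(newest, modified)
            moved += 1
    return newest, moved

def ts(day):
    """Helper for sample timestamps in May 2021 (UTC)."""
    return datetime(2021, 5, day, tzinfo=timezone.utc)

source = [
    {"Id": "006A", "StageName": "Prospecting", "SystemModstamp": ts(1)},
    {"Id": "006B", "StageName": "Negotiation", "SystemModstamp": ts(2)},
]
snowflake_table = {}  # stand-in for the destination table

# Initial run: every row is newer than the starting watermark.
start = datetime(1970, 1, 1, tzinfo=timezone.utc)
watermark, first_run = incremental_sync(source, snowflake_table, start)

# One opportunity changes in Salesforce before the next scheduled run.
source[0] = {"Id": "006A", "StageName": "Closed Won", "SystemModstamp": ts(3)}
watermark, second_run = incremental_sync(source, snowflake_table, watermark)
```

Here the first run moves both rows and the second moves only the one that changed. A real pipeline would push the timestamp filter into the query sent to the source rather than scanning every row locally.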

Capture Deleted Data

You can configure CData Sync such that anytime data is deleted in a Salesforce table, Sync reflects that deletion in the Snowflake destination table. You can either perform a hard delete that removes the data from the destination or a soft delete where the data remains but is marked as deleted in a separate column, allowing you to create reports about deleted data.
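
The difference between the two delete modes can be shown with a short sketch. It is illustrative only: the destination is modeled as a dictionary keyed by record Id, and the function name apply_deletes and the marker column _deleted are hypothetical names, not Sync's own.

```python
def apply_deletes(destination, deleted_ids, mode="soft"):
    """Reflect source deletions in `destination` (a dict keyed by record Id)."""
    if mode not in ("hard", "soft"):
        raise ValueError(f"unknown delete mode: {mode!r}")
    for record_id in deleted_ids:
        if record_id not in destination:
            continue  # record was never replicated, nothing to do
        if mode == "hard":
            del destination[record_id]  # remove the row entirely
        else:
            destination[record_id]["_deleted"] = True  # keep the row, flag it

def new_table():
    """Fresh sample destination table with two contacts."""
    return {
        "003A": {"Name": "Ada", "_deleted": False},
        "003B": {"Name": "Grace", "_deleted": False},
    }

hard = new_table()
apply_deletes(hard, ["003A"], mode="hard")

soft = new_table()
apply_deletes(soft, ["003A"], mode="soft")

# With soft deletes, reports can include or exclude deleted rows as needed.
active = [row["Name"] for row in soft.values() if not row["_deleted"]]
deleted_report = [row["Name"] for row in soft.values() if row["_deleted"]]
```

With a hard delete the record is gone from the destination, while a soft delete keeps it available for reporting on what was removed.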

Circumvent API Limitations with Bulk Incremental Replications

Every time a report that queries Salesforce APIs is loaded or refreshed, that action counts toward your daily API limits in Salesforce. When you use Sync to load your data into Snowflake, only the Sync jobs that move incremental data count toward those limits. Reports that query Snowflake can be reloaded or refreshed as many times as you want, even if they pull all the data, without consuming Salesforce API calls.

Consolidate Data with Broad Connectivity

If you're like most organizations, you use a wide range of operational systems to run your business. Data warehouses are designed to consolidate data from multiple systems to provide a more holistic context for your decisions. Sync supports more than 200 different data sources, so you can easily schedule jobs that pull data from HubSpot, SAP, Salesforce, and many other applications, move that data to Snowflake, and then ultimately run queries against the consolidated data.

Notifications & Alerts for Simple Troubleshooting

Sync will send an email after each job completes that gives you the rundown of how many records were added to your database, how the job went, and whether there were any errors. We've built Sync to simplify troubleshooting and reporting, as you can always check logs to see each precise replication and record.

Deploy Anywhere, Protect Your Data

CData Sync can deploy anywhere in minutes – on-premises, in your private cloud, even inside AWS, Azure, and Oracle. Because you can host Sync anywhere, no third party, including CData, has access to your data. You maintain total security and control.

Use Sync to Easily Replicate Data from Salesforce to Snowflake

With Sync, you no longer need to write complicated scripts to replicate Salesforce data into Snowflake. Data transfer is as easy as the click of a button. Sync provides sophisticated management capabilities as well. Find out more about how Sync can bring you faster access to data for analytics.

Get a 30-Day Free Trial of CData Sync