Apache Spark SSIS Components

Powerful SSIS Components that allow you to easily connect SQL Server with Apache Spark through SSIS workflows.

Use the Spark Data Flow Components to synchronize with Apache Spark data. Perfect for data synchronization, local backups, workflow automation, and more!

Export Data from SQL Server to Spark through SSIS

Easily push SQL Server data to Spark using the CData SSIS Tasks for Spark.

SQL Server databases are commonly used to store enterprise records, and it is often necessary to move this data to other locations. The CData SSIS Task for Spark makes it easy to push that data into Spark. In this article, you will export data from SQL Server to Spark.

Add Source and Destination Components

To get started, add a new ADO.NET Source control and a new Spark Destination control to the data flow task.

Configure the ADO.NET Source

Follow the steps below to specify properties required to connect to the SQL Server instance.

  1. Open the ADO.NET Source and add a new connection. Enter your server and database information here.
  2. In the Data access mode menu, select "Table or view" and select the table or view to export into Spark.
  3. Close the ADO.NET Source wizard and connect it to the destination component.
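The server and database information entered in step 1 corresponds to a standard ADO.NET (SqlClient) connection string. A minimal sketch with placeholder server and database names (adjust the authentication settings to match your environment):

```text
Data Source=MySqlServer;Initial Catalog=MyDatabase;Integrated Security=SSPI;
```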

Create a New Connection Manager for Spark

Follow the steps below to set required connection properties in the Connection Manager.

  1. Create a new connection manager: In the Connection Manager window, right-click and then click New Connection. The Add SSIS Connection Manager dialog is displayed.
  2. Select CData SparkSQL Connection Manager in the menu.
  3. Configure the connection properties.

    Set the Server, Database, User, and Password connection properties to connect to SparkSQL.
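The four properties above can also be written out as a single connection string. A hedged example with placeholder values (your Server address, Database name, and credentials will differ):

```text
Server=127.0.0.1;Database=default;User=admin;Password=<your password>;
```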

Configure the Spark Destination

In the destination component Connection Manager, define mappings from the SQL Server source table into the Spark destination table, along with the action you want to perform on the Spark data. In this article, you will insert Customers entities into Spark.

  1. Double-click the Spark destination to open the destination component editor.
  2. In the Connection Managers tab, select the connection manager previously created.
  3. In the Use a Table menu, select Customers. In the Action menu, select Insert.
  4. On the Column Mappings tab, configure the mappings from the input columns to the destination columns.
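Conceptually, the Column Mappings tab in step 4 pairs each input column with a destination column, typically matching by name. A minimal Python sketch of that idea (the column names below are illustrative, not taken from the actual Customers table):

```python
# Sketch of SSIS-style column mapping: pair source columns with destination
# columns by case-insensitive name. Unmatched source columns are dropped,
# just as unmapped columns are ignored in the destination editor.
def map_columns(source_cols, dest_cols):
    dest_lookup = {c.lower(): c for c in dest_cols}
    return {src: dest_lookup[src.lower()]
            for src in source_cols if src.lower() in dest_lookup}

source = ["CustomerId", "CompanyName", "City"]
dest = ["customerid", "companyname", "country"]
print(map_columns(source, dest))
# → {'CustomerId': 'customerid', 'CompanyName': 'companyname'}
# "City" has no matching destination column, so it is left unmapped.
```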

Run the Project

You can now run the project. After the SSIS Task has finished executing, data from your SQL table will be exported to the chosen table.
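Outside of Visual Studio, a saved package can also be executed from the command line with Microsoft's dtexec utility. A sketch, assuming the package is stored as a .dtsx file at the path shown (the file name and path are placeholders):

```text
dtexec /F "C:\SSIS\ExportToSpark.dtsx"
```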