Use Biml to Build SSIS Tasks That Replicate Amazon S3 Data to SQL Server


Use the CData SSIS Components with Biml to create tasks that replicate Amazon S3 data to SQL Server.


古川えりか
Content Specialist


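Keeping a backup of your line-of-business data in SQL Server gives the business a safety net, and users can easily run reporting and analytics against the backup data in SQL Server. Biml is an XML dialect for describing Microsoft SQL Server BI objects such as SSIS packages. By combining the CData SSIS Components with Biml, you can easily create SSIS packages that connect to Amazon S3 data. The benefits include: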

  • Built-in metadata discovery — CData SSIS components expose metadata just like working with SQL Server, even dynamically generating schema for schema-less data sources
  • Dynamic SSIS task generation — Use code nuggets in Biml to build SSIS tasks by iterating over discovered metadata
  • Read from Amazon S3 — Native source components make Amazon S3 look just like a database

This article demonstrates how to use Biml with the CData SSIS Tasks for AmazonS3 to dynamically build SSIS tasks (one for each Amazon S3 entity) to replicate Amazon S3 data to a Microsoft SQL Server database. We step through the Biml file one section at a time but have included the complete Biml file at the end of the article.

Getting Started

In order to use Biml in an SSIS Project in Visual Studio, install BimlExpress. Once you install BimlExpress, open Visual Studio, create a new Integration Services project, and add a new Biml file.

Add a new Biml file to the SSIS project

Building the Biml File

With Biml, you can write scripts that dynamically generate SSIS projects, packages, and tasks. To see the Biml file for an existing project (and gain insight into using Biml with the CData SSIS Tasks), create your tasks, then right-click the project and select Convert SSIS Packages to Biml.

C# Code

  1. Use directives <#@ .. #> to import necessary namespaces and the assembly for the CData SSIS Components for AmazonS3.
    <#@ template language="C#" hostspecific="true"#>
    <#@ import namespace="System.Data"#>
    <#@ import namespace="System.IO"#>
    <#@ import namespace="System.Collections"#>
    <#@ import namespace="System.Data.CData.AmazonS3"#>
    <#@ assembly name="C:\Program Files\CData\CData SSIS Components for AmazonS3 2018\lib\CData.SSIS2017.AmazonS3.dll"#>
    
  2. In a new control nugget <# ... #>, create variables for values that will be used throughout the Biml script, including a connection string for AmazonS3 and structures to store the Amazon S3 metadata.

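    To authorize Amazon S3 requests, provide the credentials for an administrator account or for an IAM user with custom permissions. Set AccessKey to the access key ID and SecretKey to the secret access key.

    Note: Although you can connect as the AWS account administrator, it is recommended that you use IAM user credentials to access AWS services.

    Note also that this product is intended for listing files in Amazon S3 and retrieving user management information. To read the data inside files stored in S3, such as Excel, CSV, or JSON files, use the Excel Driver, CSV Driver, or JSON Driver instead.

    Obtaining the Access Key

    To obtain the credentials for an IAM user:

    1. Sign in to the IAM console.
    2. In the navigation pane, select "Users".
    3. To create or manage a user's access keys, select the user and then select the "Security credentials" tab.

    To obtain the credentials for your AWS root account:

    1. Sign in to the AWS Management Console with the credentials for your root account.
    2. Select your account name or number and, in the menu that appears, select "My Security Credentials".
    3. Click "Continue to Security Credentials" and expand the "Access Keys" section to manage or create root account access keys.

    Authenticating as an AWS Role

    In many cases, it is preferable to authenticate with an IAM role rather than with the direct security credentials of an AWS root user. Specify the RoleARN to use an AWS role instead; the product then attempts to retrieve credentials for the specified role.

    If you are connecting to AWS from outside (rather than from an environment that is already connected, such as an EC2 instance), you must additionally specify the AccessKey and SecretKey of an IAM user that assumes the role. Roles cannot be used when you specify the AccessKey and SecretKey of an AWS root user.

    SSO Authentication

    For users and roles that require SSO authentication, specify the RoleARN and PrincipalArn connection properties. You must also specify the SSOProperties specific to your identity provider and leave AccessKey and SecretKey empty. The product then submits the SSO credentials in a request to retrieve temporary authentication credentials.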

    
    var amazons3ConnectionString = "AccessKey=a123;SecretKey=s123;";
    var replicationServer = "SERVER";
    var replicationCatalog = "CATALOG";
    var replicationUserID = "sqluser";
    var replicationPassword = "sqlpassword";
    
    List<string> allEntityNames = new List<string>();
    Hashtable entitySchema = new Hashtable();
    
  3. In the same control nugget used to define the variables, use ADO.NET code to programmatically query the Amazon S3 entities (tables) and fields (columns).
    using (AmazonS3Connection connection = new AmazonS3Connection(amazons3ConnectionString)) {
      connection.Open();
      var entities = connection.GetSchema("Tables").Rows;
      foreach (DataRow entity in entities)
      {
        allEntityNames.Add(entity["TABLE_NAME"].ToString());
      }
      foreach (string entity in allEntityNames){
        var columns = connection.GetSchema("Columns", new string [] {entity}).Rows;
        entitySchema.Add(entity,columns);
      }
    }
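The authentication options described in step 2 differ only in which connection properties are set. The following is a minimal sketch, not part of the original Biml file, of one way to assemble those connection strings with the standard `System.Data.Common.DbConnectionStringBuilder` class instead of concatenating text by hand. The `Build` method, the key values, and the role ARN are hypothetical placeholders; only the property names `AccessKey`, `SecretKey`, and `RoleARN` come from the article.

```csharp
using System;
using System.Data.Common;

class ConnectionStringSketch
{
    // Hypothetical helper: builds a connection string for either plain
    // IAM-user authentication or IAM-user-plus-role authentication.
    static string Build(string accessKey, string secretKey, string roleArn)
    {
        var builder = new DbConnectionStringBuilder();
        builder["AccessKey"] = accessKey;
        builder["SecretKey"] = secretKey;

        // RoleARN is optional; add it only when a role should be assumed.
        if (!string.IsNullOrEmpty(roleArn))
        {
            builder["RoleARN"] = roleArn;
        }
        return builder.ConnectionString;
    }

    static void Main()
    {
        // Placeholder credentials, matching the sample values used in the article.
        Console.WriteLine(Build("a123", "s123", null));

        // Placeholder ARN for illustration only.
        Console.WriteLine(Build("a123", "s123", "arn:aws:iam::123456789012:role/ExampleRole"));
    }
}
```

Inside the Biml control nugget, the resulting string could be assigned to `amazons3ConnectionString` in place of the literal; using the builder also takes care of quoting values that contain characters such as semicolons.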
    

Class Nugget

In our Biml script to create the replication tasks, there are several places where repeated XML elements are created dynamically (mostly for columns in SSIS tasks). Instead of repeating the code, add a class nugget <#+ ... #> and create a helper class with methods to consolidate repeated code (full code at the end of the article).

  1. Add public static variables to determine which type of XML element to create.
    public static int OUTPUT_WITH_ERROR = 0;
    public static int EXTERNAL = 1;
    public static int OUTPUT = 2;
    public static int DATAOVERRIDE_COLUMN = 4;
    
  2. Add a public method to build a SQL statement for use in the ExecuteSQL task used to drop existing tables and create a new table for the replicated data.
    // Dynamically builds a DROP TABLE and CREATE TABLE statement
    // for each entity (table) in Amazon S3 using the table name and metadata.
    public static string GetDeleteAndCreateStatement(string tableName, DataRowCollection columns) {
      ...
    }
    
  3. Add a public method to build the collection of column-based XML elements.
     
    // Dynamically build various column-based XML elements
    // for each entity (table) in Amazon S3 based on the column 
    // metadata and the parent element
    public static string GetColumnDefs(DataRowCollection columns, int columnType){
      ...
    }
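To get a feel for what the statement-building helper produces before wiring it into Biml, it can help to run the same type-mapping rules against hand-built metadata. The sketch below is a simplified, standalone approximation rather than the article's implementation: the `DataTable` stands in for the rows returned by `GetSchema("Columns")`, and the table name `Objects`, the column names, and the `MapType` method are hypothetical.

```csharp
using System;
using System.Data;
using System.Text;

class CreateStatementSketch
{
    // Maps a source type name to a SQL Server column type using the same rules
    // as the article's helper: bool* -> bit, real -> float, *varchar* -> nvarchar.
    static string MapType(DataRow column)
    {
        string raw = column["DATA_TYPE"].ToString();
        string lower = raw.ToLower();
        if (lower.StartsWith("bool")) return "bit";
        if (lower.Equals("real")) return "float";
        if (lower.Contains("varchar"))
        {
            int length = Convert.ToInt32(column["CHARACTER_MAXIMUM_LENGTH"]);
            return "nvarchar(" + (length > 4000 ? "MAX" : length.ToString()) + ")";
        }
        return raw; // Any other type name is passed through unchanged.
    }

    static void Main()
    {
        // Hand-built stand-in for the column metadata of one entity.
        var columns = new DataTable();
        columns.Columns.Add("COLUMN_NAME", typeof(string));
        columns.Columns.Add("DATA_TYPE", typeof(string));
        columns.Columns.Add("CHARACTER_MAXIMUM_LENGTH", typeof(int));
        columns.Rows.Add("Name", "varchar", 255);
        columns.Rows.Add("IsPublic", "bool", 0);
        columns.Rows.Add("Notes", "varchar", 8000);

        // Join the column definitions with a comma and line break between them.
        var defs = new StringBuilder();
        foreach (DataRow column in columns.Rows)
        {
            if (defs.Length > 0) defs.Append(",\r\n");
            defs.AppendFormat("    [{0}] {1}", column["COLUMN_NAME"], MapType(column));
        }

        Console.WriteLine("DROP TABLE [Objects];");
        Console.WriteLine("CREATE TABLE [Objects]\r\n(\r\n" + defs + "\r\n)");
    }
}
```

Appending the separator before every definition except the first avoids having to trim a trailing comma afterwards, and it also behaves sensibly when an entity reports no columns.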
    

Biml Script

Now that you have the table metadata and a Helper class to reduce repeated code, write the Biml script to dynamically create your replication packages.

  1. Start by adding a CustomSsisConnection element for the CData SSIS Tasks. Note that the ObjectData attribute must be XML encoded. Before encoding, a typical ObjectData value looks similar to the following (note the use of the amazons3ConnectionString variable for the ConnectionString property):
    <AmazonS3ConnectionManager>
      <Property Name="ConnectionString"><#=amazons3ConnectionString#></Property>
    </AmazonS3ConnectionManager>
    

    After configuring the connection to the CData SSIS Task, configure a connection to the replication database. The completed Connections element looks like the following (note the use of text nuggets <#= ... #> to add variables for connection string values):

    <Connections>
      <CustomSsisConnection Name="CData AmazonS3 Connection Manager" CreationName = "CDATA_AMAZONS3" ObjectData = "&lt;AmazonS3ConnectionManager&gt; &lt;Property Name=&quot;ConnectionString&quot;&gt; <#=amazons3ConnectionString#>&lt;/Property&gt; &lt;/AmazonS3ConnectionManager&gt;" />
      <Connection Name="Destination" ConnectionString="Data Source=<#=replicationServer#>;User ID=<#=replicationUserID#>;Password=<#=replicationPassword#>;Initial Catalog=<#=replicationCatalog#>;Provider=SQLNCLI11.1;"/>
    </Connections>
    
  2. With the Connections element configured, you are ready to build the replication package. In the package, the Biml script creates an ExecuteSQL task and a Dataflow task for each table to be replicated.

    To build each set of tasks, use a while loop in a control nugget to iterate through the entity (table) names:

    int entityCounter = 0; while(entityCounter < allEntityNames.Count){
    var tableName = allEntityNames[entityCounter].ToString();
    DataRowCollection columns = ((DataRowCollection)entitySchema[tableName]);
    
    • ExecuteSQL Task

      In the ExecuteSQL task, execute a SQL query to drop any existing tables that have the same name as our Amazon S3 entity (table) and create a new table based on the metadata discovered using the CData SSIS Component.

      To create the query dynamically, use the HelperClass.GetDeleteAndCreateStatement() helper function.

    • Dataflow Task

      Within the Dataflow task, use a CustomComponent as the source component and an OleDbDestination as the destination.

      • CustomComponent Element

        The CustomComponent element uses the CData SSIS Source component to retrieve Amazon S3 data. Start by declaring the element with the type name of the CData source component.

        
        <CustomComponent Name="CData AmazonS3 Source" ComponentTypeName="CData.SSIS.AmazonS3.AmazonS3Source" Version="18" ContactInfo="support@cdata.com" UsesDispositions="true">
        ...
        </CustomComponent>
        

        DataflowOverrides and OutputPaths Elements

        The next step is to add Column elements to the Columns child of the OutputPath element inside the DataflowOverrides element. To do so, call the HelperClass.GetColumnDefs() helper function.

        Use the same helper class to add columns to the OutputColumns and ExternalColumns child elements of each OutputPath in the OutputPaths element.

        The generated definitions describe the output, external, and error-output columns of the SSIS component.

        
        <DataflowOverrides>
          <OutputPath OutputPathName="CData AmazonS3 Source Output">
            <Columns>
        <#=HelperClass.GetColumnDefs(columns,HelperClass.DATAOVERRIDE_COLUMN) #>
            </Columns>
          </OutputPath>
        </DataflowOverrides>
        ...
        <OutputPaths>
          <OutputPath Name="CData AmazonS3 Source Output">
            <OutputColumns>
        <#=HelperClass.GetColumnDefs(columns,HelperClass.OUTPUT_WITH_ERROR) #>
            </OutputColumns>
            <ExternalColumns>
        <#=HelperClass.GetColumnDefs(columns,HelperClass.EXTERNAL) #>
            </ExternalColumns>
          </OutputPath>
          <OutputPath Name="CData AmazonS3 Source Error Output" IsErrorOutput="true">
            <OutputColumns>
        <#=HelperClass.GetColumnDefs(columns,HelperClass.OUTPUT) #>
            </OutputColumns>
          </OutputPath>
        </OutputPaths>
        

        CustomProperties Element

        The CData SSIS tasks are surfaced in SSIS as custom components with a series of required CustomProperties:

        
        <CustomProperties>
          <CustomProperty Name="SQLStatement" DataType="Null" UITypeEditor="Microsoft.DataTransformationServices.Controls.ModalMultilineStringEditor, Microsoft.DataTransformationServices.Controls, Version= 10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91" SupportsExpression="true"></CustomProperty>
          <CustomProperty Name="AccessMode" DataType="Int32" TypeConverter="CData.SSIS.AmazonS3.AccessModeToStringConverter">0</CustomProperty>
          <CustomProperty Name="TableOrView" DataType="String" UITypeEditor="Microsoft.DataTransformationServices.Controls.ModalMultilineStringEditor, Microsoft.DataTransformationServices.Controls, Version= 10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91" SupportsExpression="true">[<#=tableName#>]</CustomProperty>
          <CustomProperty Name="ExecStoredProcedure" DataType="Boolean">false</CustomProperty>
        </CustomProperties>
        

        Connections Element

        The last element to add to the CustomComponent element is a Connections element, attaching the previously defined connection to the task:

        
        <Connections>
          <Connection Name="AmazonS3 2018 Connection" ConnectionName="CData AmazonS3 Connection Manager" />
        </Connections>
        
      • OleDbDestination Element

        The final piece of the Dataflow task is the OleDbDestination element. Attach the previously defined Destination connection to the element, then set the InputPath and ExternalTableOutput:

        
        <OleDbDestination Name="OLE DB Destination" ConnectionName="Destination" CheckConstraints="false">
          <InputPath OutputPathName="CData AmazonS3 Source.CData AmazonS3 Source Output" />
          <ExternalTableOutput Table="[<#=tableName#>]" />
        </OleDbDestination>
        

  3. Use a control nugget to increment the counter used to iterate over the collection of entity (table) names. Do this within the Tasks element, after the end of the Dataflow element:
    
    ...
              </Dataflow>          
    <# entityCounter++;}#>
            </Tasks>
        </Package>
      </Packages>
    </Biml>
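Step 1 notes that the ObjectData attribute must be XML encoded. As a rough illustration of what that encoding involves, the sketch below uses the standard `System.Security.SecurityElement.Escape` method to turn the plain connection manager XML into a string that is safe to place inside an attribute. This is an assumption about one convenient way to produce the value, not a description of how BimlExpress or the CData components do it; the class name is hypothetical.

```csharp
using System;
using System.Security;

class ObjectDataEncodingSketch
{
    static void Main()
    {
        // Placeholder connection string, matching the sample value in the article.
        string connectionString = "AccessKey=a123;SecretKey=s123;";

        // Inner escape: the connection string becomes element text, so any
        // '<', '>', '&' or quote characters in it must be escaped once.
        string objectData =
            "<AmazonS3ConnectionManager>" +
            "<Property Name=\"ConnectionString\">" +
            SecurityElement.Escape(connectionString) +
            "</Property>" +
            "</AmazonS3ConnectionManager>";

        // Outer escape: the whole XML fragment becomes an attribute value,
        // so its markup characters are escaped again (for example '<' to "&lt;").
        string encoded = SecurityElement.Escape(objectData);

        Console.WriteLine(encoded);
    }
}
```

With the sample connection string, which contains no special characters, the printed value has the same shape as the ObjectData attribute shown in the Connections element. The inner escape only matters when a credential contains characters such as `&` or `"`.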
    

Build the SSIS Project

Once the Biml file is written, right-click the Biml file in Solution Explorer and select Generate SSIS Packages. Visual Studio and BimlExpress then translate the Biml file into one or more SSIS packages, ready to be run.

Generate SSIS Package(s) from Biml

Run the package to begin replicating your Amazon S3 data to a SQL Server database (or any other destination you choose).

Free Trial & More Information

With the CData SSIS Components for AmazonS3, you get SQL access to your Amazon S3 data directly from SSIS packages. And with Biml, you can automatically generate those packages. For more information about the CData SSIS Components for AmazonS3, refer to the product page. You can always get started with a free, 30-day trial. As always, our world-class CData Support Team is available if you have any questions.

Complete Biml File


<#@ template language="C#" hostspecific="true"#>
<#@ import namespace="System.Data"#>
<#@ import namespace="System.IO"#>
<#@ import namespace="System.Collections"#>
<#@ import namespace="System.Data.CData.AmazonS3"#>
<#@ assembly name="C:\Program Files\CData\CData SSIS Components for AmazonS3 2018\lib\CData.SSIS2017.AmazonS3.dll"#>
<#
var amazons3ConnectionString = "AccessKey=a123;SecretKey=s123;";
var replicationServer = "JDG";
var replicationCatalog = "BIML";
var replicationUserID = "sqltest";
var replicationPassword = "sqltest";

List<string> allEntityNames = new List<string>();
Hashtable entitySchema = new Hashtable();
using (AmazonS3Connection connection = new AmazonS3Connection(amazons3ConnectionString)) {
    connection.Open();
    var entities = connection.GetSchema("Tables").Rows;
    foreach (DataRow entity in entities)
    {
        allEntityNames.Add(entity["TABLE_NAME"].ToString());
    }
    foreach (string entity in allEntityNames){
        var columns = connection.GetSchema("Columns", new string [] {entity}).Rows;
        entitySchema.Add(entity,columns);
    }
}#>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Connections>
    <CustomSsisConnection Name="CData AmazonS3 Connection Manager" CreationName="CDATA_AMAZONS3" ObjectData="&lt;AmazonS3ConnectionManager&gt;&lt;Property Name=&quot;ConnectionString&quot;&gt;<#=amazons3ConnectionString#>&lt;/Property&gt;&lt;/AmazonS3ConnectionManager&gt;"/>
    <Connection Name="Destination" ConnectionString="Data Source=<#=replicationServer#>;User ID=<#=replicationUserID#>;Password=<#=replicationPassword#>;Initial Catalog=<#=replicationCatalog#>;Provider=SQLNCLI11.1;"/>
  </Connections>
  <Packages>
    <Package Name="Replicate AmazonS3 Package" Language="None" ConstraintMode="LinearOnCompletion" ProtectionLevel="EncryptSensitiveWithUserKey">
      <Tasks>
<# int entityCounter = 0; while(entityCounter < allEntityNames.Count){
   var tableName = allEntityNames[entityCounter].ToString();
   if (tableName.Equals("IdpEventLog")) break;
   DataRowCollection columns = ((DataRowCollection)entitySchema[tableName]);#>
        <ExecuteSQL Name="Create <#=tableName#> Replication Table" ConnectionName="Destination">
          <DirectInput>
<#=HelperClass.GetDeleteAndCreateStatement(tableName,columns)#>
          </DirectInput>
        </ExecuteSQL>
        <Dataflow Name="Replicate <#=tableName#>">
          <Transformations>
            <CustomComponent Name="CData AmazonS3 Source" ComponentTypeName="CData.SSIS.AmazonS3.AmazonS3Source" Version="18" ContactInfo="support@cdata.com" UsesDispositions="true">
              <DataflowOverrides>
                <OutputPath OutputPathName="CData AmazonS3 Source Output">
                  <Columns>
<#=HelperClass.GetColumnDefs(columns,HelperClass.DATAOVERRIDE_COLUMN) #>
                  </Columns>
                </OutputPath>
              </DataflowOverrides>
              <CustomProperties>
                <CustomProperty Name="SQLStatement" DataType="Null" UITypeEditor="Microsoft.DataTransformationServices.Controls.ModalMultilineStringEditor, Microsoft.DataTransformationServices.Controls, Version= 10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91" SupportsExpression="true"></CustomProperty>
                <CustomProperty Name="AccessMode" DataType="Int32" TypeConverter="CData.SSIS.AmazonS3.AccessModeToStringConverter">0</CustomProperty>
                <CustomProperty Name="TableOrView" DataType="String" UITypeEditor="Microsoft.DataTransformationServices.Controls.ModalMultilineStringEditor, Microsoft.DataTransformationServices.Controls, Version= 10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91" SupportsExpression="true">[<#=tableName#>]</CustomProperty>
                <CustomProperty Name="ExecStoredProcedure" DataType="Boolean">false</CustomProperty>
              </CustomProperties>
              <OutputPaths>
                <OutputPath Name="CData AmazonS3 Source Output">
                  <OutputColumns>
<#=HelperClass.GetColumnDefs(columns,HelperClass.OUTPUT_WITH_ERROR) #>
                  </OutputColumns>
                  <ExternalColumns>
<#=HelperClass.GetColumnDefs(columns,HelperClass.EXTERNAL) #>
                  </ExternalColumns>
                </OutputPath>
                <OutputPath Name="CData AmazonS3 Source Error Output" IsErrorOutput="true">
                  <OutputColumns>
<#=HelperClass.GetColumnDefs(columns,HelperClass.OUTPUT) #>                     
                  </OutputColumns>
                </OutputPath>
              </OutputPaths>
              <Connections>
                <Connection Name="AmazonS3 2018 Connection" ConnectionName="CData AmazonS3 Connection Manager" />
              </Connections>
            </CustomComponent>
            <OleDbDestination Name="OLE DB Destination" ConnectionName="Destination" CheckConstraints="false">
              <InputPath OutputPathName="CData AmazonS3 Source.CData AmazonS3 Source Output" />
              <ExternalTableOutput Table="[<#=tableName#>]" />
            </OleDbDestination>
          </Transformations>
        </Dataflow>          
<# entityCounter++;}#>
      </Tasks>
    </Package>
  </Packages>
</Biml>

<#+
public static class HelperClass {
    
    public static int OUTPUT_WITH_ERROR = 0;
    public static int EXTERNAL = 1;
    public static int OUTPUT = 2;
    public static int DATAOVERRIDE_COLUMN = 4;
    
    public static string GetDeleteAndCreateStatement(string tableName, DataRowCollection columns) {
        var dropAndCreateStatement = 
            "IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[{0}]') AND type IN (N'U'))\r\n" + 
            "DROP TABLE [{0}];\r\n" + 
            "CREATE TABLE [{0}]\r\n" + 
            "(\r\n" + 
            "{1}\r\n" + 
            ")\r\n" + 
            "ON \"default\";";
        string columnDefs = "";
        foreach (DataRow column in columns){
            string columnDef = "    [{0}] {1}";
            string dataType = column["DATA_TYPE"].ToString();
            if (dataType.ToLower().StartsWith("bool")) {
                dataType = "bit";
            } else if (dataType.ToLower().Equals("real")) {
                dataType = "float";
            } else if (dataType.ToLower().Contains("varchar")) {
                var columnLength = column["CHARACTER_MAXIMUM_LENGTH"];
                dataType = "nvarchar(" + ((int)columnLength > 4000 ? "MAX" : columnLength) + ")";         
            } 
            columnDefs += String.Format(columnDef,column["COLUMN_NAME"],dataType) + ",\r\n";
            
        }
        columnDefs = columnDefs.Remove(columnDefs.LastIndexOf(",\r\n"),",\r\n".Length);
        return String.Format(dropAndCreateStatement,tableName,columnDefs);
    }
    
    public static string GetColumnDefs(DataRowCollection columns, int columnType){
        var columnDefTemplate = "";
        var columnElements = "";
        
        if (columnType == DATAOVERRIDE_COLUMN) {
            columnDefTemplate = "                      <Column ErrorRowDisposition=\"FailComponent\" TruncationRowDisposition=\"FailComponent\" ColumnName=\"{0}\" />\r\n";
            foreach(DataRow column in columns) {
                var columnName = column["COLUMN_NAME"];
                columnElements += String.Format(columnDefTemplate,columnName);
            }
            return columnElements;
        } 
        if (columnType == OUTPUT_WITH_ERROR)
            columnDefTemplate = "                      <OutputColumn Name=\"{0}\" {1} ExternalMetadataColumnName=\"{0}\" ErrorRowDisposition=\"FailComponent\" TruncationRowDisposition=\"FailComponent\" />\r\n";
        else if (columnType == EXTERNAL)
            columnDefTemplate = "                      <ExternalColumn Name=\"{0}\" {1} />\r\n";
        else if (columnType == OUTPUT)
            columnDefTemplate = "                      <OutputColumn Name=\"{0}\" {1} />\r\n";
        
        foreach(DataRow column in columns){ 
            var columnName = column["COLUMN_NAME"];
            var dataTypeRaw = column["DATA_TYPE"].ToString().ToLower();
            var typeAndRelatedInfo = "";
            if (dataTypeRaw.Equals("bool")) {
                typeAndRelatedInfo = "DataType=\"Boolean\"";
            } else if (dataTypeRaw.Equals("date")) {
                typeAndRelatedInfo = "DataType=\"Date\" SsisDataTypeOverride=\"DT_DBDATE\"";
            } else if (dataTypeRaw.Equals("datetime")) {
                typeAndRelatedInfo = "DataType=\"DateTime\"";
            } else if (dataTypeRaw.Equals("real")) {
                typeAndRelatedInfo = ((int)column["NumericPrecision"] > 0 ? "Precision=\"18\" " : " ") + ((int)column["NumericScale"] > 0 ? "Scale=\"15\" " : " ") + "DataType=\"Decimal\"";
            } else if (dataTypeRaw.Equals("varchar")) {
                var columnLength = column["CHARACTER_MAXIMUM_LENGTH"];
                if ((int)columnLength > 4000) {
                    typeAndRelatedInfo = "DataType=\"String\"";
                } else {
                    typeAndRelatedInfo = "Length=\"" + columnLength + "\" DataType=\"String\" CodePage=\"1252\"";
                }
            }
            columnElements += String.Format(columnDefTemplate,columnName,typeAndRelatedInfo);
        }
        return columnElements;
    }
}
#>
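One detail in the helper class worth knowing about: the `(int)columnLength` casts unbox the metadata value directly, which only succeeds when the value is a boxed `int`. Whether the CData provider ever returns `DBNull` or another numeric type for CHARACTER_MAXIMUM_LENGTH is not stated here, so the following is only a defensive sketch under that assumption. The `ToNVarChar` method is hypothetical, and treating a missing or non-positive length as `MAX` is a choice made for illustration.

```csharp
using System;

class ColumnLengthSketch
{
    // Hypothetical helper (not part of the article's HelperClass): converts the
    // raw CHARACTER_MAXIMUM_LENGTH metadata value into a SQL Server string type.
    static string ToNVarChar(object rawLength)
    {
        // Missing lengths are treated as unbounded.
        if (rawLength == null || rawLength is DBNull)
        {
            return "nvarchar(MAX)";
        }

        // Convert.ToInt32 accepts boxed int, long, short, or numeric strings,
        // whereas a direct (int) cast only succeeds for a boxed int.
        int length = Convert.ToInt32(rawLength);
        return (length <= 0 || length > 4000)
            ? "nvarchar(MAX)"
            : "nvarchar(" + length + ")";
    }

    static void Main()
    {
        Console.WriteLine(ToNVarChar(255));          // prints nvarchar(255)
        Console.WriteLine(ToNVarChar(8000L));        // prints nvarchar(MAX)
        Console.WriteLine(ToNVarChar(DBNull.Value)); // prints nvarchar(MAX)
    }
}
```

If the generated packages fail with an InvalidCastException during metadata handling, swapping the direct casts for a conversion along these lines is one thing to try.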