Connect to Apache Spark as a JDBC Data Source in Informatica

Create Apache Spark data objects in Informatica using the standard JDBC connection process: copy the JAR and then connect.

Informatica provides a powerful, elegant means of transporting and transforming your data. With the CData JDBC Driver for Apache Spark, you gain access to a standards-based, industry-proven driver that integrates seamlessly with Informatica's powerful data transport and manipulation features. This tutorial shows how to transfer and browse Apache Spark data in Informatica PowerCenter.

Deploy the Driver

To deploy the driver to the Informatica PowerCenter server, copy the CData JAR and .lic file, located in the lib subfolder in the installation directory, to the following folder: Informatica-installation-directory\services\shared\jars\thirdparty.

To work with Apache Spark data in the Developer tool, you will need to copy the CData JAR and .lic file, located in the lib subfolder in the installation directory, into the following folders:

  • Informatica-installation-directory\client\externaljdbcjars
  • Informatica-installation-directory\externaljdbcjars
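
If you prefer to script the deployment instead of copying the files by hand, the sketch below does the same copy with standard java.nio.file calls. The source folder, the destination folder, and the file names are placeholders based on a typical installation; substitute the paths from your own environment.

    import java.nio.file.*;

    public class DeployCDataDriver {
        public static void main(String[] args) throws Exception {
            // Placeholder paths: point these at your CData "lib" folder and the
            // Informatica thirdparty folder listed above.
            Path lib = Paths.get("C:\\Program Files\\CData\\CData JDBC Driver for Apache Spark\\lib");
            Path dest = Paths.get("C:\\Informatica\\services\\shared\\jars\\thirdparty");

            // File names assumed from a typical install; adjust if yours differ.
            for (String name : new String[] { "cdata.jdbc.sparksql.jar", "cdata.jdbc.sparksql.lic" }) {
                Files.copy(lib.resolve(name), dest.resolve(name), StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }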

Create a JDBC Connection

Follow the steps below to connect from Informatica Developer:

  1. In the Connection Explorer pane, right-click your domain and click Create a Connection.
  2. In the New Database Connection wizard that is displayed, enter a name and ID for the connection and, in the Type menu, select JDBC.
  3. In the JDBC Driver Class Name property, enter: cdata.jdbc.sparksql.SparkSQLDriver
  4. In the Connection String property, enter the JDBC URL, using the connection properties for Apache Spark.

    Set the Server, Database, User, and Password connection properties to connect to SparkSQL.

    Built-in Connection String Designer

    For assistance in constructing the JDBC URL, use the connection string designer built into the Apache Spark JDBC Driver. Either double-click the JAR file or execute the JAR file from the command line.

    java -jar cdata.jdbc.sparksql.jar

    Fill in the connection properties and copy the connection string to the clipboard.

    A typical connection string is below:

    jdbc:sparksql:Server=127.0.0.1;
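
    Once you have a working URL, you can verify it outside Informatica with a short, standalone JDBC program. The sketch below is a minimal example, assuming the example server address, database, credentials, and a hypothetical Customers table; replace them with values from your own Spark instance.

    import java.sql.*;

    public class SparkSqlConnectionTest {
        public static void main(String[] args) throws Exception {
            // Assumed example values -- set your own Server, Database, User, and Password.
            String url = "jdbc:sparksql:Server=127.0.0.1;Database=default;User=admin;Password=admin;";

            try (Connection conn = DriverManager.getConnection(url);
                 Statement stmt = conn.createStatement();
                 // "Customers" is a hypothetical table; query one that exists in your instance.
                 ResultSet rs = stmt.executeQuery("SELECT * FROM Customers")) {
                int columns = rs.getMetaData().getColumnCount();
                while (rs.next()) {
                    for (int i = 1; i <= columns; i++) {
                        System.out.print(rs.getString(i) + "\t");
                    }
                    System.out.println();
                }
            }
        }
    }

    Run it with the driver JAR on the classpath, for example: java -cp .;cdata.jdbc.sparksql.jar SparkSqlConnectionTest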

Browse Apache Spark Tables

After you have added the driver JAR to the classpath and created a JDBC connection, you can access Apache Spark entities in Informatica. Follow the steps below to connect to Apache Spark and browse Apache Spark tables:

  1. Connect to your repository.
  2. In the Connection Explorer, right-click the connection and click Connect.
  3. Clear the Show Default Schema Only option.

You can now browse Apache Spark tables in the Data Viewer: right-click the node for the table and then click Open. On the Data Viewer view, click Run.
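
If you want to confirm outside the Developer tool which tables the driver exposes, the same list can be retrieved programmatically through standard JDBC metadata calls. This is a minimal sketch, again assuming the example connection properties used earlier.

    import java.sql.*;

    public class ListSparkSqlTables {
        public static void main(String[] args) throws Exception {
            // Assumed example URL -- substitute your own connection properties.
            String url = "jdbc:sparksql:Server=127.0.0.1;Database=default;User=admin;Password=admin;";

            try (Connection conn = DriverManager.getConnection(url);
                 // The same entities you see in the Data Viewer, listed by name.
                 ResultSet tables = conn.getMetaData().getTables(null, null, "%", new String[] { "TABLE" })) {
                while (tables.next()) {
                    System.out.println(tables.getString("TABLE_NAME"));
                }
            }
        }
    }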

Create Apache Spark Data Objects

Follow the steps below to add Apache Spark tables to your project:

  1. Select the tables you want in Apache Spark, then right-click one of them and click Add to Project.
  2. In the resulting dialog, select the option to create a data object for each resource.
  3. In the Select Location dialog, select your project.

Create a Mapping

Follow the steps below to add the Apache Spark source to a mapping:

  1. In the Object Explorer, right-click your project and then click New -> Mapping.
  2. Expand the node for the Apache Spark connection and then drag the data object for the table onto the editor.
  3. In the dialog that appears, select the Read option.

Follow the steps below to map Apache Spark columns to a flat file:

  1. In the Object Explorer, right-click your project and then click New -> Data Object.
  2. Select Flat File Data Object -> Create as Empty -> Fixed Width.
  3. In the properties for the Apache Spark object, select the rows you want, right-click, and then click Copy. Paste the rows into the flat file properties.
  4. Drag the flat file data object onto the mapping. In the dialog that appears, select the Write option.
  5. Click and drag to connect columns.

To transfer Apache Spark data, right-click in the workspace and then click Run Mapping.

 
 