
Access Live Azure Data Lake Storage Data in TIBCO Data Virtualization



Use the CData TIBCO DV Adapter for Azure Data Lake Storage to create an Azure Data Lake Storage data source in TIBCO Data Virtualization Studio and gain access to live Azure Data Lake Storage data from your TDV Server.

TIBCO Data Virtualization (TDV) is an enterprise data virtualization solution that orchestrates access to multiple and varied data sources. When paired with the CData TIBCO DV Adapter for Azure Data Lake Storage, you get federated access to live Azure Data Lake Storage data directly within TIBCO Data Virtualization. This article walks through deploying an adapter and creating a new data source based on Azure Data Lake Storage.

With built-in optimized data processing, the CData TIBCO DV Adapter offers unmatched performance for interacting with live Azure Data Lake Storage data. When you issue complex SQL queries to Azure Data Lake Storage, the adapter pushes supported SQL operations, like filters and aggregations, directly to Azure Data Lake Storage. Its built-in dynamic metadata querying allows you to work with and analyze Azure Data Lake Storage data using native data types.
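
For example, in a query like the following, the adapter can evaluate the filter on the Azure Data Lake Storage side instead of in TDV. The table and column names here are illustrative; the actual resources exposed depend on your account and connection:

-- Illustrative query; resource names depend on your data source
SELECT FullPath, Type
FROM Resources
WHERE Type = 'FILE';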

Deploy the Azure Data Lake Storage TIBCO DV Adapter

  1. In a console, navigate to the bin folder in the TDV Server installation directory. If a previous version of the adapter is already deployed, undeploy it first.

    .\server_util.bat -server localhost -user admin -password ******** -undeploy -version 1 -name ADLS
    
  2. Extract the CData TIBCO DV Adapter to a local folder and deploy the JAR file (tdv.adls.jar) to the server from that folder.

    .\server_util.bat -server localhost -user admin -password ******** -deploy -package /PATH/TO/tdv.adls.jar
    

You may need to restart the server to ensure the new JAR file is loaded properly. You can do this by running the composite.bat script located at C:\Program Files\TIBCO\TDV Server <version>\bin. Note that you will need to reauthenticate in TDV Studio after restarting the server.

Sample Restart Call

.\composite.bat monitor restart

Authenticate with Azure Data Lake Storage Using OAuth

Since Azure Data Lake Storage authenticates using the OAuth protocol and TDV Studio does not support browser-based authentication internally, you will need to create and run a simple Java application to retrieve the OAuth tokens. Once retrieved, the tokens are used to connect to Azure Data Lake Storage directly from the adapter.

The following code sample shows how to authenticate with Azure Data Lake Storage. Compile and run the Java application with the tdv.adls.jar file on the classpath (sample commands follow the code).

public class GetOAuthTokens {  // class name is arbitrary
    public static void main(String[] args) throws Exception {
        ADLSOAuth oauth = new ADLSOAuth();
        oauth.generateOAuthSettingsFile("InitiateOAuth=GETANDREFRESH;" +
                "Schema=ADLSGen2;Account=myAccount;FileSystem=myFileSystem;AccessKey=myAccessKey;" +
                "OAuthSettingsLocation=C:\\adls\\OAuthSettings.txt;");
    }
}

Once you deploy the adapter and authenticate, you can create a new data source for Azure Data Lake Storage in TDV Studio.

Create an Azure Data Lake Storage Data Source in TDV Studio

With the CData TIBCO DV Adapter for Azure Data Lake Storage, you can easily create a data source for Azure Data Lake Storage and introspect it to add resources to TDV.

Create the Data Source

  1. Right-click on the folder you wish to add the data source to and select New -> New Data Source.
  2. Scroll until you find the adapter (e.g. Azure Data Lake Storage) and click Next.
  3. Name the data source (e.g. CData Azure Data Lake Storage Source).
  4. Fill in the required connection properties.

    Authenticating to a Gen 1 DataLakeStore Account

    Gen 1 uses OAuth 2.0 in Azure AD for authentication.

    For this, an Active Directory web application is required. You can create one as follows:

    1. Sign in to your Azure Account through the Azure Portal.
    2. Select "Azure Active Directory".
    3. Select "App registrations".
    4. Select "New application registration".
    5. Provide a name and URL for the application. Select Web app for the type of application you want to create.
    6. Select "Required permissions" and change the required permissions for this app. At a minimum, "Azure Data Lake" and "Windows Azure Service Management API" are required.
    7. Select "Key" and generate a new key. Add a description and a duration, and take note of the generated key; you won't be able to see it again.

    To authenticate against a Gen 1 DataLakeStore account, the following properties are required (a sample connection string follows the list):

    • Schema: Set this to ADLSGen1.
    • Account: Set this to the name of the account.
    • OAuthClientId: Set this to the application Id of the app you created.
    • OAuthClientSecret: Set this to the key generated for the app you created.
    • TenantId: Set this to the tenant Id. See the help documentation for this property for more information on how to acquire it.
    • Directory: Set this to the path which will be used to store the replicated file. If not specified, the root directory will be used.
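
    As a sketch, a Gen 1 connection string combining these properties might look like the following (all values are placeholders):

      Schema=ADLSGen1;Account=myAccount;OAuthClientId=myApplicationId;OAuthClientSecret=myGeneratedKey;TenantId=myTenantId;Directory=/myDirectory;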

    Authenticating to a Gen 2 DataLakeStore Account

    To authenticate against a Gen 2 DataLakeStore account, the following properties are required (a sample connection string follows the list):

    • Schema: Set this to ADLSGen2.
    • Account: Set this to the name of the account.
    • FileSystem: Set this to the file system which will be used for this account.
    • AccessKey: Set this to the access key which will be used to authenticate the calls to the API. See the help documentation for this property for more information on how to acquire it.
    • Directory: Set this to the path which will be used to store the replicated file. If not specified, the root directory will be used.

    NOTE: Set the OAuthSettingsLocation property in the DV Adapter to the same value you used when performing the OAuth authentication (see above).
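
    As a sketch, a Gen 2 connection string combining these properties and the OAuthSettingsLocation from the note above might look like the following (all values are placeholders):

      Schema=ADLSGen2;Account=myAccount;FileSystem=myFileSystem;AccessKey=myAccessKey;OAuthSettingsLocation=C:\adls\OAuthSettings.txt;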

  5. Click Create & Close.

Introspect the Data Source

Once the data source is created, you can introspect the data source by right-clicking and selecting Open. In the dashboard, click Add/Remove Resources and select the Tables, Views, and Stored Procedures to include as part of the data source. Click Next and Finish to add the selected Azure Data Lake Storage tables, views, and stored procedures as resources.

After creating and introspecting the data source, you are ready to work with Azure Data Lake Storage data in TIBCO Data Virtualization just like you would any other relational data source. You can create views, query using SQL, publish the data source, and more.