Connect to Azure Data Lake Storage Data as a Linked Server



Use the SQL Gateway and the ODBC Driver to set up a linked server for Azure Data Lake Storage data.

You can use the SQL Gateway to configure a TDS (SQL Server) remoting service and set up a linked server for Azure Data Lake Storage data. After you have started the service, you can use the UI in SQL Server Management Studio or call stored procedures to create the linked server. You can then work with Azure Data Lake Storage data just as you would a linked SQL Server instance.

Connect to Azure Data Lake Storage as an ODBC Data Source

If you have not already, first specify connection properties in an ODBC DSN (data source name). This is the last step of the driver installation. You can use the Microsoft ODBC Data Source Administrator to create and configure ODBC DSNs.

Authenticating to a Gen 1 DataLakeStore Account

Gen 1 uses OAuth 2.0 in Azure AD for authentication.

For this, an Active Directory web application is required. You can create one as follows:

  1. Sign in to your Azure Account through the .
  2. Select "Azure Active Directory".
  3. Select "App registrations".
  4. Select "New application registration".
  5. Provide a name and URL for the application. Select Web app for the type of application you want to create.
  6. Select "Required permissions" and change the required permissions for this app. At a minimum, "Azure Data Lake" and "Windows Azure Service Management API" are required.
  7. Select "Key" and generate a new key. Add a description, a duration, and take note of the generated key. You won't be able to see it again.

To authenticate against a Gen 1 DataLakeStore account, the following properties are required:

  • Schema: Set this to ADLSGen1.
  • Account: Set this to the name of the account.
  • OAuthClientId: Set this to the application Id of the app you created.
  • OAuthClientSecret: Set this to the key generated for the app you created.
  • TenantId: Set this to the tenant Id. See the property for more information on how to acquire this.
  • Directory: Set this to the path which will be used to store the replicated file. If not specified, the root directory will be used.

Authenticating to a Gen 2 DataLakeStore Account

To authenticate against a Gen 2 DataLakeStore account, the following properties are required:

  • Schema: Set this to ADLSGen2.
  • Account: Set this to the name of the account.
  • FileSystem: Set this to the file system which will be used for this account.
  • AccessKey: Set this to the access key which will be used to authenticate the calls to the API. See the property for more information on how to acquire this.
  • Directory: Set this to the path which will be used to store the replicated file. If not specified, the root directory will be used.

Configure the TDS Remoting Service

See the SQL Gateway Overview for a guide to configure a TDS remoting service in the SQL Gateway UI. The TDS remoting service is a daemon process that listens for TDS requests from clients.

Create a Linked Server for Azure Data Lake Storage Data

After you have configured and started the daemon, create the linked server and connect. You can use the UI in SQL Server Management Studio or call stored procedures.

Create a Linked Server from the UI

Follow the steps below to create a linked server from the Object Explorer.

  1. Open SQL Server Management Studio and connect to an instance of SQL Server.
  2. In the Object Explorer, expand the node for the SQL Server database. In the Server Objects node, right-click Linked Servers and click New Linked Server. The New Linked Server dialog is displayed.
  3. In the General section, click the Other Data Source option and enter the following information after naming the linked server:
    • Provider: Select "Microsoft ODBC Driver for SQL Server" or "Microsoft OLE DB Driver for SQL Server"
    • Data Source: Enter the host and port the TDS remoting service is running on, separated by a comma.

      Note that a value of "localhost" in this input refers to the machine where SQL Server is running so be careful when creating a linked server in Management Studio when not running on the same machine as SQL Server.

    • Catalog: Enter the CData system DSN, CData ADLS Sys.
  4. In the Security section, select the option to have the connection "made using this security context" and enter the username and password of a user you created in the Users tab of the SQL Gateway.

Create a Linked Server Programmatically

In addition to using the SQL Server Management Studio UI to create a linked server, you can use stored procedures. The following inputs are required:

  • server: The linked server name.
  • provider: Enter "MSOLEDBSQL", for the Microsoft OLE DB Driver for SQL Server.
  • datasrc: The host and port the service is running on, separated by a comma.

    Note that a value of "localhost" in the datasrc input refers to the machine where SQL Server is running, so be careful when creating a linked server in Management Studio when not running on the same machine as SQL Server.

  • catalog: Enter the system DSN configured for the service.
  • srvproduct: Enter the product name of the data source; this can be an arbitrary value, such as "CData SQL Gateway" or an empty string.
Follow the steps below to create the linked server and configure authentication to the SQL Gateway:
  1. Call sp_addlinkedserver to create the linked server:

    EXEC sp_addlinkedserver @server='ADLS', @provider='MSOLEDBSQL', @datasrc='< MachineIPAddress >,1434', @catalog='CData ADLS Sys', @srvproduct=''; GO
  2. Call the sp_addlinkedsrvlogin stored procedure to allow SQL Server users to connect with the credentials of an authorized user of the service. Note that the credentials you use to connect to the service must specify a user you configured on the Users tab of the SQL Gateway.

    EXEC sp_addlinkedsrvlogin @rmtsrvname='ADLS', @rmtuser='admin', @rmtpassword='test', @useself='FALSE', @locallogin=NULL; GO

Connect from SQL Server Management Studio

SQL Server Management Studio uses the SQL Server Client OLE DB provider, which requires the ODBC driver to be used inprocess. You must enable the "Allow inprocess" option for the SQL Server Native Client Provider in Management Studio to query the linked server from SQL Server Management Studio. To do this, open the properties for the provider you are using under Server Objects -> Linked Servers -> Providers. Check the "Allow inprocess" option and save the changes.

Execute Queries

You can now execute queries to the Azure Data Lake Storage linked server from any tool that can connect to SQL Server. Set the table name accordingly:

SELECT * FROM [linked server name].[CData ADLS Sys].[ADLS].[Resources]

Ready to get started?

Download a free trial of the Azure Data Lake Storage ODBC Driver to get started:

 Download Now

Learn more:

Azure Data Lake Storage Icon Azure Data Lake Storage ODBC Driver

The Azure Data Lake Storage ODBC Driver is a powerful tool that allows you to connect with live data from Azure Data Lake Storage, directly from any applications that support ODBC connectivity.

Access Azure Data Lake Storage data like you would a database - read, write, and update Azure Data Lake Storage ADLSData, etc. through a standard ODBC Driver interface.