Automate Azure Data Lake Storage Integration Tasks from PowerShell

Are you looking for a quick and easy way to access Azure Data Lake Storage data from PowerShell? This article shows how to use the CData Cmdlets for Azure Data Lake Storage and the CData ADO.NET Provider for Azure Data Lake Storage to connect to Azure Data Lake Storage data and automate tasks such as synchronization and downloads.

The CData Cmdlets for Azure Data Lake Storage are standard PowerShell cmdlets that make it easy to accomplish data cleansing, normalization, backup, and other integration tasks by enabling real-time access to Azure Data Lake Storage.

Cmdlets or ADO.NET?

The cmdlets are not only a PowerShell interface to the Azure Data Lake Storage API, but also an SQL interface; this tutorial shows how to use both to retrieve Azure Data Lake Storage data. We also show examples of the ADO.NET equivalent, which is possible with the CData ADO.NET Provider for Azure Data Lake Storage. To access Azure Data Lake Storage data from other .NET applications, like LINQPad, use the CData ADO.NET Provider for Azure Data Lake Storage.

After obtaining the needed connection properties, accessing Azure Data Lake Storage data in PowerShell consists of three basic steps.

Authenticating to a Gen 1 DataLakeStore Account

Gen 1 uses OAuth 2.0 in Azure AD for authentication.

For this, an Active Directory web application is required. You can create one as follows:

  1. Sign in to your Azure Account through the Azure portal.
  2. Select "Azure Active Directory".
  3. Select "App registrations".
  4. Select "New application registration".
  5. Provide a name and URL for the application. Select Web app for the type of application you want to create.
  6. Select "Required permissions" and change the required permissions for this app. At a minimum, "Azure Data Lake" and "Windows Azure Service Management API" are required.
  7. Select "Key" and generate a new key. Add a description and a duration, then take note of the generated key. You won't be able to see it again.

To authenticate against a Gen 1 DataLakeStore account, the following properties are required:

  • Schema: Set this to ADLSGen1.
  • Account: Set this to the name of the account.
  • OAuthClientId: Set this to the application Id of the app you created.
  • OAuthClientSecret: Set this to the key generated for the app you created.
  • TenantId: Set this to the tenant Id. See the TenantId property in the product documentation for more information on how to acquire this.
  • Directory: Set this to the path which will be used to store the replicated file. If not specified, the root directory will be used.

Authenticating to a Gen 2 DataLakeStore Account

To authenticate against a Gen 2 DataLakeStore account, the following properties are required:

  • Schema: Set this to ADLSGen2.
  • Account: Set this to the name of the account.
  • FileSystem: Set this to the file system which will be used for this account.
  • AccessKey: Set this to the access key which will be used to authenticate the calls to the API. See the AccessKey property in the product documentation for more information on how to acquire this.
  • Directory: Set this to the path which will be used to store the replicated file. If not specified, the root directory will be used.
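
The properties listed above can be gathered into a single connection string of semicolon-separated Key=Value pairs, the form used in the ADO.NET section of this article. The following is a minimal sketch, assuming a hypothetical helper named New-AdlsConnectionString that is not part of the CData cmdlets or provider. It only checks that the properties listed as required for each schema are present and then joins them. The account, file system, and key values are placeholders.

```powershell
# Hypothetical helper (not part of the CData tooling): validates the properties
# this article lists for each schema and builds a connection string from them.
function New-AdlsConnectionString {
    param(
        [Parameter(Mandatory)]
        [hashtable]$Properties
    )

    # Required properties per schema, taken from the lists above.
    # Directory is optional in both cases.
    $required = @{
        'ADLSGen1' = @('Account', 'OAuthClientId', 'OAuthClientSecret', 'TenantId')
        'ADLSGen2' = @('Account', 'FileSystem', 'AccessKey')
    }

    $schema = $Properties['Schema']
    if (-not $required.ContainsKey("$schema")) {
        throw "Schema must be ADLSGen1 or ADLSGen2."
    }

    # Collect any required property that is absent or empty.
    $missing = @($required[$schema] | Where-Object { -not $Properties[$_] })
    if ($missing.Count -gt 0) {
        throw "Missing required properties for ${schema}: $($missing -join ', ')"
    }

    # Emit Schema first, then the remaining properties in sorted order so the
    # result is deterministic.
    $pairs = @("Schema=$schema")
    $pairs += $Properties.Keys |
        Where-Object { $_ -ne 'Schema' } |
        Sort-Object |
        ForEach-Object { "$_=$($Properties[$_])" }
    $pairs -join ';'
}

# Placeholder values for a Gen 2 account.
$connectionString = New-AdlsConnectionString -Properties @{
    Schema     = 'ADLSGen2'
    Account    = 'myAccount'
    FileSystem = 'myFileSystem'
    AccessKey  = 'myAccessKey'
}
$connectionString
```

Validating up front means that a forgotten property surfaces as a readable error message before any connection attempt is made.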

PowerShell

  1. Install the module:

    Install-Module ADLSCmdlets
  2. Connect:

    $adls = Connect-ADLS -Schema "$Schema" -Account "$Account" -FileSystem "$FileSystem" -AccessKey "$AccessKey"
  3. Search for and retrieve data:

    $type = "FILE"
    $resources = Select-ADLS -Connection $adls -Table "Resources" -Where "Type = `'$type`'"
    $resources

    You can also use the Invoke-ADLS cmdlet to execute SQL commands:

    $resources = Invoke-ADLS -Connection $adls -Query 'SELECT * FROM Resources WHERE Type = @Type' -Params @{'@Type'='FILE'}
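
Backup is one of the tasks mentioned at the start of this article. As a rough sketch of what that could look like, the rows returned by Select-ADLS or Invoke-ADLS can be piped to PowerShell's built-in Export-Csv. The rows below are stand-in data so that the snippet runs without a connection. The column names FullPath, Permission, and Type come from the queries in this article, while the values and the output path are assumptions.

```powershell
# Stand-in rows; in practice, use the $resources collection returned by
# Select-ADLS or Invoke-ADLS. The values here are made up for illustration.
$resources = @(
    [pscustomobject]@{ FullPath = '/reports/2023.csv'; Permission = 'rw-r-----'; Type = 'FILE' }
    [pscustomobject]@{ FullPath = '/reports/2024.csv'; Permission = 'rw-r-----'; Type = 'FILE' }
)

# Hypothetical output location in the system temp directory.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'adls-resources.csv'

# Keep only the columns of interest and write them out as CSV.
$resources |
    Select-Object -Property FullPath, Permission, Type |
    Export-Csv -Path $csvPath -NoTypeInformation

# Read the file back to confirm what was written.
$exported = @(Import-Csv -Path $csvPath)
$exported | Format-Table
```

Selecting the columns explicitly keeps the CSV limited to the fields you care about, whatever other properties the returned objects carry.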

ADO.NET

  1. Load the provider's assembly:

    [Reflection.Assembly]::LoadFile("C:\Program Files\CData\CData ADO.NET Provider for Azure Data Lake Storage\lib\System.Data.CData.ADLS.dll")
  2. Connect to Azure Data Lake Storage:

    $conn = New-Object System.Data.CData.ADLS.ADLSConnection("Schema=ADLSGen2;Account=myAccount;FileSystem=myFileSystem;AccessKey=myAccessKey;InitiateOAuth=GETANDREFRESH")
    $conn.Open()
  3. Instantiate the ADLSDataAdapter, execute an SQL query, and output the results:

    $sql = "SELECT FullPath, Permission FROM Resources"
    $da = New-Object System.Data.CData.ADLS.ADLSDataAdapter($sql, $conn)
    $dt = New-Object System.Data.DataTable
    $da.Fill($dt)
    $dt.Rows | foreach { Write-Host $_.fullpath $_.permission }
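
To see how the last step walks the filled table, here is a small sketch that builds a stand-in System.Data.DataTable with the two columns from the query above, so it runs without the provider installed. The row values are invented for illustration. It also uses [void], which discards return values; the same cast can be applied to $da.Fill($dt), which otherwise prints the number of rows it loaded.

```powershell
# Stand-in for the table that $da.Fill($dt) would populate.
$dt = New-Object System.Data.DataTable
[void]$dt.Columns.Add('FullPath', [string])
[void]$dt.Columns.Add('Permission', [string])
[void]$dt.Rows.Add('/reports/2023.csv', 'rw-r-----')
[void]$dt.Rows.Add('/reports/2024.csv', 'rwxr-x---')

# PowerShell exposes each column as a property on the row, case-insensitively,
# which is why $_.fullpath works in the example above.
$lines = @($dt.Rows | ForEach-Object { '{0} {1}' -f $_.FullPath, $_.Permission })
$lines
```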