Automate Parquet Integration Tasks from PowerShell

Are you looking for a quick and easy way to access Parquet data from PowerShell? This article shows how to use the CData Cmdlets for Parquet and the CData ADO.NET Provider for Parquet to connect to Parquet data and automate integration tasks such as synchronization and download.

The CData Cmdlets for Parquet are standard PowerShell cmdlets that make it easy to accomplish data cleansing, normalization, backup, and other integration tasks by enabling real-time access to Parquet.

Cmdlets or ADO.NET?

The cmdlets provide both a PowerShell interface and an SQL interface to Parquet data; this tutorial shows how to use both to retrieve Parquet data. We also show the equivalent ADO.NET code, which uses the CData ADO.NET Provider for Parquet. To access Parquet data from other .NET applications, like LINQPad, use the CData ADO.NET Provider for Parquet.

After obtaining the needed connection properties, accessing Parquet data in PowerShell consists of three basic steps.

Connect to your local Parquet file(s) by setting the URI connection property to the location of the Parquet file.

PowerShell

  1. Install the module:

    Install-Module ParquetCmdlets
  2. Connect:

    $URI = "C:/folder/table.parquet"   # location of your Parquet file
    $parquet = Connect-Parquet -URI $URI
  3. Search for and retrieve data:

    $column2 = "SAMPLE_VALUE"
    $sampletable_1 = Select-Parquet -Connection $parquet -Table "SampleTable_1" -Where "Column2 = '$column2'"
    $sampletable_1

    You can also use the Invoke-Parquet cmdlet to execute SQL commands:

    $sampletable_1 = Invoke-Parquet -Connection $parquet -Query 'SELECT * FROM SampleTable_1 WHERE Column2 = @Column2' -Params @{'@Column2'='SAMPLE_VALUE'}
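
Because the cmdlets are standard PowerShell cmdlets, their results can be piped into built-in cmdlets such as Export-Csv. The following is a minimal sketch of a backup-style export, assuming the ParquetCmdlets module is installed and imported. Export-ParquetTableToCsv is a hypothetical helper name introduced here for illustration; it is not part of the module.

```powershell
# Sketch only: assumes the ParquetCmdlets module is installed and imported.
# Export-ParquetTableToCsv is a hypothetical helper, not a CData cmdlet.
function Export-ParquetTableToCsv {
    param(
        [Parameter(Mandatory)] [string] $Uri,    # location of the Parquet file
        [Parameter(Mandatory)] [string] $Table,  # table name, e.g. "SampleTable_1"
        [Parameter(Mandatory)] [string] $Path    # destination CSV file
    )

    # Connect and select as in steps 2 and 3 above.
    $connection = Connect-Parquet -URI $Uri
    $rows = Select-Parquet -Connection $connection -Table $Table

    # Export-Csv is built in; -NoTypeInformation omits the type header line.
    $rows | Export-Csv -Path $Path -NoTypeInformation
}
```

A call might look like `Export-ParquetTableToCsv -Uri "C:/folder/table.parquet" -Table "SampleTable_1" -Path "C:\temp\SampleTable_1.csv"`, where the output path is only an example. If the exported file contains columns you do not want, add a Select-Object -Property step before Export-Csv.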

ADO.NET

  1. Load the provider's assembly:

    [Reflection.Assembly]::LoadFile("C:\Program Files\CData\CData ADO.NET Provider for Parquet\lib\System.Data.CData.Parquet.dll")
  2. Connect to Parquet:

    $conn = New-Object System.Data.CData.Parquet.ParquetConnection("URI=C:/folder/table.parquet;")
    $conn.Open()
  3. Instantiate the ParquetDataAdapter, execute an SQL query, and output the results:

    $sql = "SELECT Id, Column1 FROM SampleTable_1"
    $da = New-Object System.Data.CData.Parquet.ParquetDataAdapter($sql, $conn)
    $dt = New-Object System.Data.DataTable
    $da.Fill($dt)
    $dt.Rows | foreach { Write-Host $_.id $_.column1 }
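
Once the adapter has filled the DataTable, further processing is plain ADO.NET and does not depend on the provider. The sketch below uses an in-memory DataTable with made-up rows as a stand-in for the filled $dt, so it runs without the provider installed. The sample values and the $objects and $filtered variable names are illustrative assumptions.

```powershell
# Stand-in for the $dt that $da.Fill($dt) would populate; the rows are made up.
$dt = New-Object System.Data.DataTable
[void]$dt.Columns.Add("Id", [int])
[void]$dt.Columns.Add("Column1", [string])
[void]$dt.Rows.Add(1, "alpha")
[void]$dt.Rows.Add(2, "beta")

# Project each DataRow into a plain object so it can be piped to other cmdlets.
$objects = $dt.Rows | ForEach-Object {
    [pscustomobject]@{ Id = $_.Id; Column1 = $_.Column1 }
}

# DataTable.Select filters the rows in memory with an expression string.
$filtered = $dt.Select("Column1 = 'beta'")

# Print the matching rows, as in step 3 above.
$filtered | ForEach-Object { Write-Host $_.Id $_.Column1 }
```

Projecting rows into objects is useful when the results feed cmdlets such as Where-Object or Export-Csv, while DataTable.Select keeps the filtering inside the DataTable.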