Pipe HDFS Data to CSV in PowerShell

Use standard PowerShell cmdlets to access HDFS tables.

The CData Cmdlets Module for HDFS is a standard PowerShell module offering straightforward integration with HDFS. Below, you will find examples of using our HDFS Cmdlets with native PowerShell cmdlets.

Creating a Connection to Your HDFS Data

In order to authenticate, set the following connection properties:

  • Host: Set this value to the host of your HDFS installation.
  • Port: Set this value to the port of your HDFS installation. Default port: 50070

$conn = Connect-HDFS  -Host "$Host" -Port "$Port" -Path "$Path" -User "$User"
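The placeholders in the command above can also be supplied through splatting. This is a minimal sketch, not the module's prescribed usage: the parameter names come from the command above, every value is a hypothetical placeholder, and the `Get-Command` guard is only there so the snippet does not fail when the module is not installed. Note that `$Host` is a read-only automatic variable in PowerShell, so you cannot assign your host name to it; keeping the values in a hashtable avoids that.

```powershell
# Hypothetical placeholder values; replace them with the details of your own HDFS installation.
$connectionParams = @{
    Host = 'namenode.example.com'   # assumed host name
    Port = '50070'                  # default port stated above
    Path = '/user/demo'             # assumed path
    User = 'hdfsuser'               # assumed user name
}

# Connect only if the cmdlet is available in this session.
if (Get-Command -Name Connect-HDFS -ErrorAction SilentlyContinue) {
    # Splatting passes each hashtable key as a named parameter.
    $conn = Connect-HDFS @connectionParams
}
else {
    Write-Warning 'Connect-HDFS was not found. Install and import the HDFS Cmdlets module first.'
}
```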

Selecting Data

Run the command below to retrieve data from the Files table and pipe the result into a CSV file:

Select-HDFS -Connection $conn -Table Files | Select-Object -Property * -ExcludeProperty Connection,Table,Columns | Export-Csv -Path c:\myFilesData.csv -NoTypeInformation

You will notice that we piped the results from Select-HDFS into a Select-Object cmdlet and excluded some properties before piping them into an Export-Csv cmdlet. We do this because the CData Cmdlets append Connection, Table, and Columns information onto each "row" in the result set, and we do not necessarily want that information in our CSV file.
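To see the effect of the exclusion without a live connection, the sketch below builds mock rows that stand in for Select-HDFS output. The `FileId` and `PathSuffix` columns, their values, and the temporary file location are assumptions made for illustration; only the Connection, Table, and Columns property names come from the description above.

```powershell
# Mock rows standing in for Select-HDFS results. FileId and PathSuffix are
# hypothetical column names; Connection, Table, and Columns mimic the extra
# properties the cmdlets append to each row.
$rows = @(
    [pscustomobject]@{ FileId = 1; PathSuffix = 'a.txt'; Connection = 'conn'; Table = 'Files'; Columns = 'FileId,PathSuffix' }
    [pscustomobject]@{ FileId = 2; PathSuffix = 'b.txt'; Connection = 'conn'; Table = 'Files'; Columns = 'FileId,PathSuffix' }
)

# Write to the temp directory so the sketch runs without special permissions.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'myFilesData.csv'

# Keep every property except the three helper properties, then export.
$rows |
    Select-Object -Property * -ExcludeProperty Connection, Table, Columns |
    Export-Csv -Path $csvPath -NoTypeInformation

# Read the file back and list the column names that were actually written.
$imported = @(Import-Csv -Path $csvPath)
$columns  = $imported[0].PSObject.Properties.Name
$columns
```

Only the data columns should remain in the exported file; the three helper properties are dropped before Export-Csv sees the rows.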

The Connection, Table, and Columns are appended to the results in order to facilitate piping results from one of the CData Cmdlets directly into another one.