Automate Google Data Catalog Integration Tasks from PowerShell

Ready to get started?

Download for a free trial:

Download Now

Learn more:

Google Data Catalog ADO.NET Provider

Rapidly create and deploy powerful .NET applications that integrate with Google Data Catalog.



Are you looking for a quick and easy way to access Google Data Catalog data from PowerShell? We show how to use the Cmdlets for Google Data Catalog and the CData ADO.NET Provider for Google Data Catalog to connect to Google Data Catalog data and synchronize, automate, download, and more.

The CData Cmdlets for Google Data Catalog are standard PowerShell cmdlets that make it easy to accomplish data cleansing, normalization, backup, and other integration tasks by enabling real-time access to Google Data Catalog.

Cmdlets or ADO.NET?

The cmdlets are not only a PowerShell interface to the Google Data Catalog API, but also an SQL interface; this tutorial shows how to use both to retrieve Google Data Catalog data. We also show examples of the ADO.NET equivalent, which is possible with the CData ADO.NET Provider for Google Data Catalog. To access Google Data Catalog data from other .NET applications, like LINQPad, use the CData ADO.NET Provider for Google Data Catalog.

After obtaining the needed connection properties, accessing Google Data Catalog data in PowerShell consists of three basic steps.

Google Data Catalog uses the OAuth authentication standard. Authorize access to Google APIs on behalf on individual users or on behalf of users in a domain.

Before connecting, specify the following to identify the organization and project you would like to connect to:

  • OrganizationId: The ID associated with the Google Cloud Platform organization resource you would like to connect to. Find this by navigating to the cloud console.

    Click the project selection drop-down, and select your organization from the list. Then, click More -> Settings. The organization ID is displayed on this page.

  • ProjectId: The ID associated with the Google Cloud Platform project resource you would like to connect to.

    Find this by navigating to the cloud console dashboard and selecting your project from the Select from drop-down. The project ID will be present in the Project info card.

When you connect, the OAuth endpoint opens in your default browser. Log in and grant permissions to the application to completes the OAuth process. For more information, refer to the OAuth section in the Help documentation.

PowerShell

  1. Install the module:

    Install-Module GoogleDataCatalogCmdlets
  2. Connect:

    $googledatacatalog = Connect-GoogleDataCatalog -ProjectId "$ProjectId"
  3. Search for and retrieve data:

    $projectid = "bigquery-public-data" $schemas = Select-GoogleDataCatalog -Connection $googledatacatalog -Table "Schemas" -Where "ProjectId = `'$ProjectId`'" $schemas

    You can also use the Invoke-GoogleDataCatalog cmdlet to execute SQL commands:

    $schemas = Invoke-GoogleDataCatalog -Connection $googledatacatalog -Query 'SELECT * FROM Schemas WHERE ProjectId = @ProjectId' -Params @{'@ProjectId'='bigquery-public-data'}

ADO.NET

  1. Load the provider's assembly:

    [Reflection.Assembly]::LoadFile("C:\Program Files\CData\CData ADO.NET Provider for Google Data Catalog\lib\System.Data.CData.GoogleDataCatalog.dll")
  2. Connect to Google Data Catalog:

    $conn= New-Object System.Data.CData.GoogleDataCatalog.GoogleDataCatalogConnection("ProjectId=YourProjectId;InitiateOAuth=GETANDREFRESH") $conn.Open()
  3. Instantiate the GoogleDataCatalogDataAdapter, execute an SQL query, and output the results:

    $sql="SELECT Type, DatasetName from Schemas" $da= New-Object System.Data.CData.GoogleDataCatalog.GoogleDataCatalogDataAdapter($sql, $conn) $dt= New-Object System.Data.DataTable $da.Fill($dt) $dt.Rows | foreach { Write-Host $_.type $_.datasetname }