Connect to Bitbucket Data in Airbyte ELT Pipelines
Airbyte lets you load data into any data warehouse, data lake, or database. Combined with CData Connect AI, Airbyte can build Extract, Load, Transform (ELT) pipelines directly from live Bitbucket data. This article shows how to connect to Bitbucket through Connect AI and build ELT pipelines for Bitbucket data in Airbyte.
CData Connect AI offers a dedicated SQL Server interface for Bitbucket, so you can query data without replicating it to a native database. Connect AI pushes all supported SQL operations, including filters and JOINs, directly to Bitbucket, using server-side processing to return the requested Bitbucket data quickly.
Configure Bitbucket Connectivity for Airbyte
Connectivity to Bitbucket from Airbyte is made possible through CData Connect AI. To work with Bitbucket data from Airbyte, we start by creating and configuring a Bitbucket connection.
- Log into Connect AI, click Sources, and then click Add Connection
- Select "Bitbucket" from the Add Connection panel
- Enter the necessary authentication properties to connect to Bitbucket.
For most queries, you must set the Workspace. The only exception is the Workspaces table, which does not require this property because querying it returns the list of workspace slugs that can be used to set Workspace. To query this table, set Schema to 'Information' and execute the query SELECT * FROM Workspaces.
To connect to Bitbucket, set these parameters:
- Schema: To show general information about a workspace, such as its users, repositories, and projects, set this to Information. Otherwise, set this to the schema of the repository or project you are querying. To get a full set of available schemas, query the sys_schemas table.
- Workspace: Required unless you are querying the Workspaces table, which only returns the list of workspace slugs that can be used to set Workspace.
Authenticating to Bitbucket
Bitbucket supports OAuth authentication only. For all OAuth flows, you must create a custom OAuth application and set AuthScheme to OAuth.
Be sure to review the Help documentation for the connection properties required for your specific authentication needs (desktop applications, web applications, and headless machines).
Creating a custom OAuth application
From your Bitbucket account:
- Go to Settings (the gear icon) and select Workspace Settings.
- In the Apps and Features section, select OAuth Consumers.
- Click Add Consumer.
- Enter a name and description for your custom application.
- Set the callback URL:
- For desktop applications and headless machines, use http://localhost:33333 or another port number of your choice. The URI you set here becomes the CallbackURL property.
- For web applications, set the callback URL to a trusted redirect URL. This URL is the web location the user returns to with the token that verifies that your application has been granted access.
- If you plan to use client credentials to authenticate, you must select This is a private consumer. In the driver, you must set AuthScheme to client.
- Select which permissions to give your OAuth application. These determine what data you can read and write with it.
- To save the new custom application, click Save.
- After the application has been saved, you can select it to view its settings. The application's Key and Secret are displayed. Record these for future use. You will use the Key to set the OAuthClientId and the Secret to set the OAuthClientSecret.
- Click Save & Test
- Navigate to the Permissions tab in the Add Bitbucket Connection page and update the User-based permissions.
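The property rules in the steps above can be summarized in a short Python sketch. It only collects the property names mentioned in this article (Schema, Workspace, AuthScheme, OAuthClientId, OAuthClientSecret, CallbackURL) into a dictionary. The helper function itself is hypothetical and is not part of any CData API. In Connect AI you enter these values in the connection form.

```python
from typing import Dict, Optional


def bitbucket_connection_props(
    oauth_key: str,
    oauth_secret: str,
    workspace: Optional[str] = None,
    schema: str = "Information",
    callback_url: str = "http://localhost:33333",
) -> Dict[str, str]:
    """Collect the Bitbucket connection properties named in this article.

    Hypothetical helper for illustration only.
    """
    # Without a Workspace, only the Workspaces table can be queried, and that
    # table lives in the 'Information' schema. As a simple approximation,
    # reject any other schema when no Workspace is given.
    if workspace is None and schema != "Information":
        raise ValueError("Workspace is required unless Schema is 'Information'")

    props = {
        "AuthScheme": "OAuth",
        "OAuthClientId": oauth_key,         # the OAuth consumer's Key
        "OAuthClientSecret": oauth_secret,  # the OAuth consumer's Secret
        "CallbackURL": callback_url,        # must match the consumer's callback URL
        "Schema": schema,
    }
    if workspace is not None:
        props["Workspace"] = workspace
    return props


if __name__ == "__main__":
    # Placeholder values; substitute the Key and Secret you recorded.
    first_pass = bitbucket_connection_props("my-key", "my-secret")
    print(first_pass)  # no Workspace yet: enough to run SELECT * FROM Workspaces
```

A typical flow under these assumptions is to connect once without a Workspace, list the workspace slugs, and then call the helper again with the chosen slug and the schema of the repository or project you want to query.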
Add a Personal Access Token
When connecting to Connect AI through the REST API, the OData API, or the Virtual SQL Server, a Personal Access Token (PAT) is used to authenticate the connection to Connect AI. It is best practice to create a separate PAT for each service to maintain granularity of access.
- Click the gear icon at the top right of the Connect AI app to open the settings page.
- On the Settings page, go to the Access Tokens section and click Create PAT.
- Give the PAT a name and click Create.
- The personal access token is only visible at creation, so be sure to copy it and store it securely for future use.
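One way to keep a separate PAT per service without pasting tokens into scripts is to read each one from its own environment variable. The sketch below assumes a hypothetical naming convention, CONNECT_AI_PAT_<SERVICE>. Neither the variable names nor the helper come from Connect AI.

```python
import os


def pat_for_service(service: str) -> str:
    """Return the PAT stored for one service.

    The CONNECT_AI_PAT_<SERVICE> naming scheme is a hypothetical convention.
    """
    var_name = f"CONNECT_AI_PAT_{service.upper()}"
    token = os.environ.get(var_name)
    if not token:
        # Fail early with a clear message instead of sending an empty password.
        raise RuntimeError(f"Set {var_name} to the PAT created for {service}")
    return token
```

With this convention, the token created for Airbyte would be stored in CONNECT_AI_PAT_AIRBYTE and retrieved with pat_for_service("airbyte").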
With the connection configured and a PAT generated, you are ready to connect to Bitbucket data from Airbyte.
Connect to Bitbucket from Airbyte
To establish a connection from Airbyte to CData Connect AI, follow these steps.
- Log in to your Airbyte account
- On the left panel, click Sources, then Add New Source
- Set Source Type to MSSQL Server to connect to the TDS endpoint
- Set Source Name
- Set Host URL to tds.cdata.com
- Set Port to 14333
- Set Database to the name of the connection you previously configured, e.g. Bitbucket1.
- Set Username to your Connect AI username
- Set SSL Method to Encrypted (trust server certificate), leave the Replication Method as standard, and set SSH Tunnel Method to No Tunnel
- (Optional) Set Schema to anything you want to apply to the source
- Set Password to your Connect AI PAT
- (Optional) Enter any needed JDBC URL Params
- Click Test and Save to create the data source.
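As a quick checklist, the required values from the steps above can be gathered in one place before you fill in the Airbyte form. The sketch below uses the field labels from this article as dictionary keys. These keys are illustrative labels, not Airbyte's configuration schema, and both helper functions are hypothetical.

```python
from typing import Dict, List, Union

Setting = Union[str, int]


def connect_ai_source_settings(database: str, username: str, pat: str) -> Dict[str, Setting]:
    """Gather the Airbyte source values listed in the steps above."""
    return {
        "Source Type": "MSSQL Server",
        "Host": "tds.cdata.com",
        "Port": 14333,
        "Database": database,  # name of the Connect AI connection, e.g. Bitbucket1
        "Username": username,  # your Connect AI username
        "Password": pat,       # the Personal Access Token, not your account password
        "SSL Method": "Encrypted (trust server certificate)",
        "Replication Method": "Standard",
        "SSH Tunnel Method": "No Tunnel",
    }


def missing_settings(settings: Dict[str, Setting]) -> List[str]:
    """Return the labels whose values are still empty."""
    return [name for name, value in settings.items() if value in ("", None)]


if __name__ == "__main__":
    # Placeholder username; the PAT is deliberately left empty here.
    settings = connect_ai_source_settings("Bitbucket1", "user@example.com", "")
    print(missing_settings(settings))  # prints ['Password']
```

The optional Schema and JDBC URL Params fields are left out because the article does not prescribe values for them.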
Create ELT Pipelines for Bitbucket Data
To connect Bitbucket data with a new destination, click Sources and then Set Up Connection to connect to your destination. Select the source created above and your desired destination, then allow Airbyte to process. When it is done, your connection is ready for use.
Get CData Connect AI
To get live data access to 300+ SaaS, Big Data, and NoSQL sources directly from Airbyte, try CData Connect AI today!