Edit and Search Databricks External Objects in Salesforce Connect (API Server)



Use the API Server to securely provide OData feeds of Databricks data to smart devices and cloud-based applications. Use the API Server and Salesforce Connect to create Databricks objects that you can access from apps and the dashboard.

The CData API Server enables you to access Databricks data from cloud-based applications like the Salesforce console and mobile applications like the Salesforce1 Mobile App. In this article, you will use the API Server and Salesforce Connect to access Databricks external objects alongside standard Salesforce objects.

About Databricks Data Integration

Accessing and integrating live data from Databricks has never been easier with CData. Customers rely on CData connectivity to:

  • Access all versions of Databricks, from Runtime versions 9.1 through 13.X to both the Pro and Classic Databricks SQL versions.
  • Leave Databricks in their preferred environment thanks to compatibility with any hosting solution.
  • Securely authenticate in a variety of ways, including personal access token, Azure Service Principal, and Azure AD.
  • Upload data to Databricks using Databricks File System, Azure Blob Storage, and AWS S3 Storage.

While many customers use CData's solutions to migrate data from different systems into their Databricks data lakehouse, several customers use our live connectivity solutions to federate connectivity between their databases and Databricks. These customers use SQL Server Linked Servers or PolyBase to get live access to Databricks from within their existing RDBMS.

Read more about common Databricks use-cases and how CData's solutions help solve data problems in our blog: What is Databricks Used For? 6 Use Cases.


Getting Started


Set Up the API Server

If you have not already done so, download the CData API Server. Once you have installed the API Server, follow the steps below to begin producing secure Databricks OData services:

Connect to Databricks

To work with Databricks data from Salesforce Connect, we start by creating and configuring a Databricks connection. Follow the steps below to configure the API Server to connect to Databricks data:

  1. First, navigate to the Connections page.
  2. Click Add Connection and then search for and select the Databricks connection.
  3. Enter the necessary authentication properties to connect to Databricks.

    To connect to a Databricks cluster, set the properties as described below.

    Note: You can find the required values in your Databricks instance by navigating to Clusters, selecting the desired cluster, and then selecting the JDBC/ODBC tab under Advanced Options.

    • Server: Set to the Server Hostname of your Databricks cluster.
    • HTTPPath: Set to the HTTP Path of your Databricks cluster.
    • Token: Set to your personal access token (this value can be obtained by navigating to the User Settings page of your Databricks instance and selecting the Access Tokens tab).
  4. After configuring the connection, click Save & Test to confirm a successful connection.
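Before entering the values in the API Server, it can help to confirm that all three properties are filled in. The following is a minimal sketch, not part of the API Server itself: the helper name `missing_properties` and the placeholder values are assumptions made for illustration; only the property names Server, HTTPPath, and Token come from the steps above.

```python
# Hypothetical helper: checks that the three connection properties named
# in the steps above are present before they are entered in the API Server.
REQUIRED_PROPERTIES = ("Server", "HTTPPath", "Token")


def missing_properties(settings: dict) -> list:
    """Return the required property names that are absent or blank."""
    return [
        name for name in REQUIRED_PROPERTIES
        if not str(settings.get(name, "")).strip()
    ]


# Placeholder values only; real ones come from the JDBC/ODBC tab and the
# User Settings page of your Databricks instance.
example = {
    "Server": "your-workspace.example.com",
    "HTTPPath": "your/http/path",
    "Token": "",
}
print(missing_properties(example))  # the blank Token is reported
```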

Configure API Server Users

Next, create a user to access your Databricks data through the API Server. You can add and configure users on the Users page. Follow the steps below to configure and create a user:

  1. On the Users page, click Add User to open the Add User dialog.
  2. Next, set the Role, Username, and Privileges properties and then click Add User.
  3. An Authtoken is then generated for the user. You can find the Authtoken and other information for each user on the Users page.

Creating API Endpoints for Databricks

Having created a user, you are ready to create API endpoints for the Databricks tables:

  1. First, navigate to the API page and then click Add Table.
  2. Select the connection you wish to access and click Next.
  3. With the connection selected, create endpoints by selecting each table and then clicking Confirm.

Gather the OData URL

Having configured a connection to Databricks data, created a user, and added resources to the API Server, you now have an easily accessible REST API based on the OData protocol for those resources. From the API page in API Server, you can view and copy the API Endpoints for the API.
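Because the endpoints follow the OData protocol, a request URL is the base URL plus a resource name and optional system query options such as $top or $filter. The sketch below only builds such a URL; it does not send a request. The function name `odata_url`, the port 8080, and the resource name Customers are hypothetical and stand in for whatever your API page lists.

```python
from urllib.parse import quote, urlencode


def odata_url(base_url: str, resource: str, **options) -> str:
    """Build an OData request URL such as <base>/<resource>?$top=5.

    Keyword arguments are turned into OData system query options by
    prefixing each name with "$" (top=5 becomes $top=5).
    """
    url = base_url.rstrip("/") + "/" + quote(resource)
    if options:
        query = urlencode(
            {"$" + name: value for name, value in options.items()},
            safe="$",          # keep the "$" prefix readable
            quote_via=quote,   # encode spaces as %20 rather than "+"
        )
        url += "?" + query
    return url


# Hypothetical server, port, and table name, for illustration only.
print(odata_url("https://your-server:8080/api.rsc", "Customers", top=5))
```

The same helper accepts other standard options, for example `select="Id,Name"`, assuming the resource exposes those columns.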

Connect to Databricks Data as an External Data Source

Follow the steps below to connect to the feed produced by the API Server.

  1. Log into Salesforce and click Setup -> Develop -> External Data Sources.
  2. Click New External Data Source.
  3. Enter values for the following properties:
    • External Data Source: Enter a label to be used in list views and reports.
    • Name: Enter a unique identifier.
    • Type: Select the option "Salesforce Connect: OData 4.0".
    • URL: Enter the URL to the OData endpoint of the API Server. The format of the OData URL is https://your-server:your-port/api.rsc.

      Note that a plain-text (unencrypted) connection is suitable only for testing; for production, use TLS.

  4. Select the Writable External Objects option.
  5. Select JSON in the Format menu.

  6. In the Authentication section, set the following properties:
    • Identity Type: If all members of your organization will use the same credentials to access the API Server, select "Named Principal". If the members of your organization will connect with their own credentials, select "Per User".
    • Authentication Protocol: Select Password Authentication to use basic authentication.
    • Certificate: Enter or browse to the certificate to be used to encrypt and authenticate communications from Salesforce to your server.
    • Username: Enter the username for a user known to the API Server.
    • Password: Enter the user's authtoken.
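With Password Authentication, the username and authtoken travel as standard HTTP Basic credentials. If you want to reproduce the request from another client while testing, the header can be formed as in the sketch below. The function name `basic_auth_header` and the sample credentials are hypothetical; the encoding itself is the standard Basic scheme (Base64 of `username:password`).

```python
import base64


def basic_auth_header(username: str, authtoken: str) -> dict:
    """Return an HTTP Basic Authorization header for the given credentials."""
    credentials = f"{username}:{authtoken}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": "Basic " + encoded}


# Placeholder credentials; use an API Server username and its authtoken.
header = basic_auth_header("your-user", "your-authtoken")
print(header["Authorization"].split(" ")[0])  # scheme name only, not the secret
```

Basic credentials are only encoded, not encrypted, which is one reason the connection should be protected with TLS outside of testing.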

Synchronize Databricks Objects

After you have created the external data source, follow the steps below to create Databricks external objects that reflect any changes in the data source. You will synchronize the definitions for the Databricks external objects with the definitions for Databricks tables.

  1. Click the link for the external data source you created.
  2. Click Validate and Sync.
  3. Select the Databricks tables you want to work with as external objects.

Access Databricks Data as Salesforce Objects

After adding Databricks data as an external data source and syncing Databricks tables with Databricks external objects, you can use the external objects just as you would standard Salesforce objects.

  • Create a new tab with a filter list view.

  • Display related lists of Databricks external objects alongside standard Salesforce objects.

  • Create, read, update, and delete Databricks objects from tabs on the Salesforce dashboard.

Troubleshooting

You can use the following checklist to avoid typical connection problems:

  • Ensure that your server has a publicly accessible IP address. Related to this check, but one layer up, at the operating system layer, you will also need to ensure that your firewall has an opening for the port the API Server is running on. At the application layer, ensure that you have added trusted IP addresses on the Settings -> Security tab of the administration console.
  • Ensure that you are using a connection secured by an SSL certificate from a commercial, trusted CA. Salesforce does not currently accept self-signed certificates or internal CAs.
  • Ensure that the server you are hosting the API Server on is using TLS 1.1 or above. If you are using the .NET API Server, you can accomplish this by using the .NET API Server's embedded server.

    If you are using IIS, TLS 1.1 and 1.2 are supported but not enabled by default. To enable these protocols, refer to the how-to on MSDN and the Microsoft technical reference.

    If you are using the Java edition, note that TLS 1.2 is enabled by default in Java 8 but not in Java 6 or 7. If you are using these earlier versions, you can refer to this Oracle how-to.
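Several of the checks above (reachable port, trusted certificate, TLS version) can be probed from a machine outside your network. The following is a rough sketch using Python's standard `ssl` and `socket` modules; the function names and the host and port in the usage comment are hypothetical. Note two assumptions: a successful handshake only shows that the certificate is trusted by the local trust store, which may differ from what Salesforce accepts, and recent Python builds may refuse older TLS versions by default, in which case the connection fails instead of reporting them.

```python
import socket
import ssl

# Ordered from oldest to newest; names match what SSLSocket.version() returns.
TLS_ORDER = ["TLSv1", "TLSv1.1", "TLSv1.2", "TLSv1.3"]


def meets_minimum(version, minimum="TLSv1.1"):
    """Return True if the negotiated version is at least the minimum."""
    if version not in TLS_ORDER:
        return False
    return TLS_ORDER.index(version) >= TLS_ORDER.index(minimum)


def negotiated_tls_version(host, port, timeout=5.0):
    """Connect to host:port, verify the certificate, and return the TLS version.

    Raises OSError if the port is unreachable and ssl.SSLError (for example
    ssl.SSLCertVerificationError) if the handshake or verification fails.
    """
    context = ssl.create_default_context()  # verifies against the local trust store
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()


# Usage (hypothetical host and port; performs a real network connection):
#   version = negotiated_tls_version("your-server", 8443)
#   print(version, meets_minimum(version))
```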
