Databricks cloud data source
This article describes how to set up the Databricks cloud data source.
For a general overview of setting up a cloud data source, see Manage a cloud data source.
Data types
The Databricks data source supports all Databricks data types. To ensure data is imported correctly, map the Databricks data types according to the following guidelines:
| Databricks | Tealium |
|---|---|
| Numeric data types | Number attributes |
| String and binary data types | String attributes |
| Logical data types | Boolean attributes |
| Date and time data types | Date attributes |
| Arrays | Array of strings, array of numbers, or array of booleans |
| Map, struct, object, variant | String attributes |
For more information, see Databricks: Data Types (AWS, Azure, GCP).
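Because map, struct, object, and variant columns import as string attributes, you can make that mapping explicit by serializing complex columns before import. A minimal sketch, assuming a hypothetical `events` table with a `payload` struct column (all table and column names are illustrative):

```sql
-- Hypothetical view that exposes each column as a type Tealium can map.
CREATE OR REPLACE VIEW events_for_import AS
SELECT
  event_id,                          -- numeric    -> Number attribute
  event_name,                        -- string     -> String attribute
  is_test,                           -- boolean    -> Boolean attribute
  event_time,                        -- timestamp  -> Date attribute
  to_json(payload) AS payload_json   -- struct     -> String attribute
FROM events;
```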
Multi-table joins are not currently supported. You can achieve the same result with a Databricks view. For more information, see Databricks: Create and Manage Views (AWS, Azure, GCP).
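For example, to import data that spans two tables, you could define the join in a view and select that view as the data source table. A minimal sketch with hypothetical `customers` and `orders` tables:

```sql
-- Hypothetical view that joins two tables so the data source can
-- import the combined result as a single table.
CREATE OR REPLACE VIEW customer_orders AS
SELECT
  c.customer_id,
  c.email,
  o.order_id,
  o.order_total,
  o.created_at
FROM customers c
JOIN orders o
  ON o.customer_id = c.customer_id;
```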
Create a connection
Tealium uses a service principal to access your Databricks compute resource. Before you proceed, you must create a service principal in Databricks and generate an OAuth secret. For more information, see Databricks: Authorize access with a service principal using OAuth.
Databricks personal access tokens (PAT) are not supported. For more information, see Databricks: Authenticate with Databricks personal access token (legacy) (AWS, Azure, GCP).
To configure a new connection, enter the following connection details:
- Hostname: The hostname of your Databricks compute resource. Examples:
  - AWS: `MY_ACCOUNT.cloud.databricks.com`
  - Azure: `MY_ACCOUNT.azuredatabricks.net`
  - GCP: `MY_ACCOUNT.gcp.databricks.com`
- HTTP Path: The HTTP path to your compute resource. For example: `/sql/1.0/warehouses/3fbc78304284503a`. To find the HTTP path, go to the SQL Warehouses screen in Databricks, select the warehouse where your table is located, and click Connection details.
- Catalog: The name of the catalog for this connection.
- Schema: The name of the schema for this connection.
- OAuth Client ID: The service principal’s UUID or Application ID.
- OAuth Client Secret: The service principal’s generated secret.
For more information, see Databricks: Compute settings (AWS, Azure, GCP).
After you connect to Databricks, select the data source table from the Table Selection list.
Query mode
For a general overview, see Query modes.
For Databricks, note the following requirements:
- Timestamp + Incrementing and Timestamp modes: The selected timestamp column must be of type `TIMESTAMP`. For more information, see Databricks: TIMESTAMP type (AWS, Azure, GCP).
- Incrementing mode: The selected numeric column must increment in value for every row added. A recommended definition for an auto-increment column is `COL1 BIGINT GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1)`. For more information, see Databricks: CREATE TABLE (AWS, Azure, GCP).
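Putting both requirements together, a table suitable for either query mode could be defined as follows. This is an illustrative sketch; the table and column names are hypothetical:

```sql
-- Hypothetical table with an auto-incrementing key (for Incrementing
-- mode) and a TIMESTAMP column (for Timestamp-based modes).
CREATE TABLE import_events (
  id BIGINT GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1),
  event_name STRING,
  updated_at TIMESTAMP
);
```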
WHERE clause
For a general overview, see SQL Query.
The WHERE clause does not support subqueries from multiple tables. To import data from multiple Databricks tables, create a view in Databricks and select the view in the data source configuration.
For more information, see Databricks: What is a view? (AWS, Azure, GCP).
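To illustrate the distinction, a filter that references only the selected table is supported, while a subquery against another table is not. The column and table names below are hypothetical:

```sql
-- Supported: a filter on columns of the selected table.
WHERE event_time > '2024-01-01' AND event_name = 'purchase'

-- Not supported: a subquery that reads from another table.
-- WHERE customer_id IN (SELECT customer_id FROM vip_customers)
-- Move this logic into a Databricks view and select the view instead.
```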
IP access list
If your Databricks workspace is restricted by IP addresses, add the Tealium IP addresses to your Databricks IP access list.
For more information, see Databricks: Manage IP access list (AWS, Azure, GCP).
This page was last updated: January 20, 2026