Automate Unity Catalog setup using Terraform
You can automate Unity Catalog setup by using the Databricks Terraform provider. This article provides links to the Terraform provider Unity Catalog deployment guide and resource reference documentation, along with requirements ("Before you begin") and validation and deployment tips.
Before you begin
To automate Unity Catalog setup using Terraform, you must meet the following requirements:
- Your Azure Databricks account must be on the Premium plan.
- In your Azure tenant, you must have permission to create:
  - A storage account to use with Azure Data Lake Storage Gen2. See Create a storage account to use with Azure Data Lake Storage Gen2.
  - A new resource to hold a system-assigned managed identity. This requires that you be a Contributor or Owner of a resource group in any subscription in the tenant.
To use the Databricks Terraform provider to configure a metastore for Unity Catalog, storage for the metastore, any external storage, and all of their related access credentials, you must have the following:
- An Azure account.
- An account-level admin user in your Azure account.
- On your local development machine, you must have:
  - The Terraform CLI. See Download Terraform on the Terraform website.
  - The Azure CLI, signed in through the az login command with a user that has Contributor or Owner rights to your subscription. See How to install the Azure CLI. To sign in by using a Microsoft Entra ID service principal, see Azure CLI login with a Microsoft Entra ID service principal. To sign in by using an Azure Databricks user account, see Azure CLI login with an Azure Databricks user account.
    Note
    When you authenticate with automated tools, systems, scripts, and apps, it is a security best practice to sign in through the az login command with a Microsoft Entra ID service principal. See Sign in with a service principal and Authenticating with Azure Service Principal.
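The service principal sign-in recommended in the note above can be sketched as a small shell function. This is an illustration, not the documented procedure: the function name az_sp_login is hypothetical, and the variable names ARM_CLIENT_ID, ARM_CLIENT_SECRET, and ARM_TENANT_ID are an assumed convention for passing in the service principal's application ID, client secret, and tenant ID.

```shell
# Sign in to the Azure CLI as a Microsoft Entra ID service principal.
# Assumption: the three ARM_* variables are supplied by your CI system's
# secret store; they are not defined anywhere in this article.
az_sp_login() {
  # Stop with a clear message if any required value is missing.
  : "${ARM_CLIENT_ID:?set ARM_CLIENT_ID to the application (client) ID}"
  : "${ARM_CLIENT_SECRET:?set ARM_CLIENT_SECRET to the client secret}"
  : "${ARM_TENANT_ID:?set ARM_TENANT_ID to the tenant ID}"

  az login --service-principal \
    --username "$ARM_CLIENT_ID" \
    --password "$ARM_CLIENT_SECRET" \
    --tenant "$ARM_TENANT_ID" \
    --output none
}
```

Call az_sp_login once at the start of an automation job, before running any Terraform commands. Keeping the secret in an environment variable rather than in the script avoids committing it to source control.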
To use the Databricks Terraform provider to configure all other Unity Catalog infrastructure components, you must have the following:
- An Azure Databricks workspace.
- On your local development machine, you must have:
  - The Terraform CLI. See Download Terraform on the Terraform website.
  - One of the following:
    - Databricks CLI version 0.205 or above, configured with your Azure Databricks personal access token by running databricks configure --host <workspace-url> --profile <some-unique-profile-name>. See Install or update the Databricks CLI and Azure Databricks personal access token authentication.
      Note
      As a security best practice, when you authenticate with automated tools, systems, scripts, and apps, Databricks recommends that you use personal access tokens belonging to service principals instead of workspace users. To create tokens for service principals, see Manage tokens for a service principal.
    - The Azure CLI, signed in through the az login command. See How to install the Azure CLI. To sign in by using a Microsoft Entra ID service principal, see Azure CLI login with a Microsoft Entra ID service principal. To sign in by using an Azure Databricks user account, see Azure CLI login with an Azure Databricks user account.
      Note
      When you authenticate with automated tools, systems, scripts, and apps, it is a security best practice to sign in through the az login command with a Microsoft Entra ID service principal. See Sign in with a service principal and Authenticating with Azure Service Principal.
    - The following two Azure Databricks environment variables:
      - DATABRICKS_HOST, set to the value of your workspace instance URL, for example https://adb-1234567890123456.7.azuredatabricks.net
      - DATABRICKS_TOKEN, set to the value of your Azure Databricks personal access token or Microsoft Entra ID (formerly Azure Active Directory) token. See also Monitor and revoke personal access tokens.
      To set these environment variables, see your operating system's documentation.
      Note
      As a security best practice, when you authenticate with automated tools, systems, scripts, and apps, Databricks recommends that you use personal access tokens belonging to service principals instead of workspace users. To create tokens for service principals, see Manage tokens for a service principal.
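As one possible illustration of the environment variable option, the following sketch sets both variables for the current session in a POSIX-style shell such as bash or zsh on Linux or macOS. Both values are placeholders that you must replace, and other operating systems and shells use different syntax.

```shell
# Placeholder values (assumptions): replace both before use, and avoid
# committing a real token to source control.
export DATABRICKS_HOST="https://<workspace-instance-url>"
export DATABRICKS_TOKEN="<personal-access-token-or-entra-id-token>"

# Confirm both variables are set without printing the token itself.
for name in DATABRICKS_HOST DATABRICKS_TOKEN; do
  if [ -n "$(printenv "$name")" ]; then
    echo "$name is set"
  else
    echo "$name is missing" >&2
  fi
done
```

Variables set this way last only for the current shell session. In automation, they are typically injected from a secret store instead of being written into a script.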
Terraform provider Unity Catalog deployment guide and resource reference documentation
To learn how to deploy all prerequisites and enable Unity Catalog for a workspace, see Deploying pre-requisite resources and enabling Unity Catalog in the Databricks Terraform provider documentation.
If you already have some Unity Catalog infrastructure components in place, you can use Terraform to deploy additional Unity Catalog infrastructure components as needed. See each section of the guide referenced in the previous paragraph and the Unity Catalog section of the Databricks Terraform provider documentation.
Validate, plan, deploy, or destroy the resources
- To validate the syntax of the Terraform configurations without deploying them, run the terraform validate command.
- To show the actions that Terraform would take to deploy the configurations, run the terraform plan command. This command does not actually deploy the configurations.
- To deploy the configurations, run the terraform apply command.
- To delete the deployed resources, run the terraform destroy command.
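The validate, plan, and deploy steps can be chained into one function, sketched below under a few assumptions: the function name deploy_with_terraform and the plan file name tfplan are hypothetical, and the function is run from the directory that contains your Terraform configuration files. It also runs terraform init first, which downloads the providers that the configuration declares, and it uses terraform apply, which is Terraform's command for deploying.

```shell
# Validate, preview, and deploy the configuration in the current directory.
# Each step must succeed before the next one runs.
deploy_with_terraform() {
  terraform init || return              # download the declared providers
  terraform validate || return          # check syntax; nothing is deployed
  terraform plan -out=tfplan || return  # preview the changes and save the plan
  terraform apply tfplan                # deploy exactly the saved plan
}
```

Saving the plan to a file and applying that file means the deployed changes match what was previewed. The terraform destroy command is deliberately left out of the function so that deleting resources stays a separate, explicit step.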