Optimize the cluster utilization of Delta Live Tables pipelines with Enhanced Autoscaling
This article discusses how to use Enhanced Autoscaling to optimize your Delta Live Tables pipelines on Azure Databricks.
What is Enhanced Autoscaling?
Databricks Enhanced Autoscaling optimizes cluster utilization by automatically allocating cluster resources based on workload volume, with minimal impact on the data processing latency of your pipelines.
Enhanced Autoscaling improves on the Azure Databricks cluster autoscaling functionality with the following features:
- Enhanced Autoscaling optimizes streaming workloads and adds enhancements that improve the performance of batch workloads. It optimizes costs by adding or removing machines as the workload changes.
- Enhanced Autoscaling proactively shuts down under-utilized nodes while guaranteeing there are no failed tasks during shutdown. The existing cluster autoscaling feature scales down a node only if it is idle.
Enhanced Autoscaling is the default autoscaling mode when you create a new pipeline in the Delta Live Tables UI. You can enable Enhanced Autoscaling for existing pipelines by editing the pipeline settings in the UI. You can also enable Enhanced Autoscaling when you create or edit pipelines with the Delta Live Tables API.
Which metrics does Enhanced Autoscaling use to make a scale-up or scale-down decision?
Enhanced Autoscaling uses two metrics to decide whether to scale up or scale down:
- Task slot utilization: This is the average ratio of the number of busy task slots to the total task slots available in the cluster.
- Task queue size: This is the number of tasks waiting to be executed in task slots.
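To make the two metrics concrete, the following is a minimal Python sketch that computes them from a series of sampled values. The `SlotSample` type, its field names, and the idea of sampling at intervals are hypothetical illustrations, not part of any Databricks API, and the thresholds that Enhanced Autoscaling actually applies to these metrics are not covered here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SlotSample:
    """One hypothetical observation of the cluster's task slots."""
    busy_slots: int      # task slots currently running a task
    total_slots: int     # all task slots available in the cluster
    queued_tasks: int    # tasks waiting for a free slot


def task_slot_utilization(samples: Sequence[SlotSample]) -> float:
    """Average ratio of busy task slots to total task slots."""
    if not samples:
        raise ValueError("at least one sample is required")
    ratios = []
    for sample in samples:
        if sample.total_slots <= 0:
            raise ValueError("total_slots must be positive")
        ratios.append(sample.busy_slots / sample.total_slots)
    return sum(ratios) / len(ratios)


def task_queue_size(samples: Sequence[SlotSample]) -> int:
    """Number of waiting tasks, taken from the most recent sample.

    Using the latest sample is an assumption of this sketch.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    return samples[-1].queued_tasks


samples = [SlotSample(4, 8, 0), SlotSample(8, 8, 3)]
utilization = task_slot_utilization(samples)  # (0.5 + 1.0) / 2 = 0.75
queue = task_queue_size(samples)              # 3
```

In this made-up sample, high utilization combined with a non-empty queue is the kind of signal that would suggest the cluster is under-provisioned.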
On which type of pipelines does Enhanced Autoscaling operate?
Enhanced Autoscaling is available for pipelines that use classic compute.
Pipeline compute | Enhanced Autoscaling enabled? |
---|---|
Classic (user-configured) | Enabled by default, but can be disabled |
Enhanced Autoscaling for classic pipelines
For Delta Live Tables pipelines that use classic compute, Enhanced Autoscaling is the default autoscaling mode when you create a new pipeline in the Delta Live Tables UI. You can enable Enhanced Autoscaling for existing classic pipelines by editing the pipeline settings in the UI. You can also enable Enhanced Autoscaling when you create or edit classic pipelines with the Delta Live Tables API.
Enable Enhanced Autoscaling for DLT pipelines using classic compute
Note
Because compute resources are automatically optimized for serverless DLT pipelines, settings for Databricks Enhanced Autoscaling are not available when you select Serverless for a pipeline.
To use Enhanced Autoscaling, do one of the following:
- Set Cluster mode to Enhanced autoscaling when you create a pipeline or edit a pipeline in the Delta Live Tables UI.
- Add the autoscale setting to the pipeline cluster configuration and set the mode field to ENHANCED. See Configure your compute settings.
Use the following guidelines when configuring Enhanced Autoscaling for production pipelines:
- Leave the Min workers setting at the default.
- Set the Max workers setting to a value based on budget and pipeline priority.
The following example configures an Enhanced Autoscaling cluster with a minimum of 5 workers and a maximum of 10 workers. max_workers must be greater than or equal to min_workers.
Note
- Enhanced Autoscaling is available for updates clusters only. The existing autoscaling feature is used for maintenance clusters.
- The autoscale configuration has two modes:
  - LEGACY: Use cluster autoscaling.
  - ENHANCED: Use Enhanced Autoscaling.
{
  "clusters": [
    {
      "autoscale": {
        "min_workers": 5,
        "max_workers": 10,
        "mode": "ENHANCED"
      }
    }
  ]
}
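Before submitting a configuration like this, it can help to check the constraints stated above locally. The following Python sketch is one way to do that; the function name `check_autoscale` and the wording of its messages are hypothetical, and it checks only the rules mentioned in this article (the two modes, and max_workers being at least min_workers), not everything the service itself may validate.

```python
import json
from typing import List

# The two autoscale modes described in this article.
VALID_MODES = {"LEGACY", "ENHANCED"}


def check_autoscale(settings: dict) -> List[str]:
    """Return a list of problems found in the autoscale blocks of pipeline settings."""
    problems = []
    for index, cluster in enumerate(settings.get("clusters", [])):
        autoscale = cluster.get("autoscale")
        if autoscale is None:
            continue  # cluster does not use autoscaling
        # Requiring an explicit mode is a choice made by this sketch.
        mode = autoscale.get("mode")
        if mode not in VALID_MODES:
            problems.append(f"clusters[{index}]: mode must be LEGACY or ENHANCED, got {mode!r}")
        low = autoscale.get("min_workers")
        high = autoscale.get("max_workers")
        if not isinstance(low, int) or not isinstance(high, int):
            problems.append(f"clusters[{index}]: min_workers and max_workers must be integers")
        elif high < low:
            problems.append(f"clusters[{index}]: max_workers ({high}) is less than min_workers ({low})")
    return problems


example = json.loads("""
{
  "clusters": [
    {"autoscale": {"min_workers": 5, "max_workers": 10, "mode": "ENHANCED"}}
  ]
}
""")

# The example from this article passes; swapping the bounds does not.
ok_problems = check_autoscale(example)
bad_problems = check_autoscale(
    {"clusters": [{"autoscale": {"min_workers": 10, "max_workers": 5, "mode": "ENHANCED"}}]}
)
```

Here `ok_problems` is an empty list, while `bad_problems` contains a single entry reporting that max_workers is less than min_workers.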
If the pipeline is configured for continuous execution, it is automatically restarted after the autoscaling configuration changes. After the restart, expect a short period of increased latency. Following this period, the cluster size should be updated based on your autoscale configuration, and pipeline latency should return to its previous characteristics.
Limit costs for classic pipelines that use Enhanced Autoscaling
For classic pipelines that use Enhanced Autoscaling, you can decrease your costs if you accept greater processing latency. By tuning the Max workers parameter in the Pipelines Compute pane, you can achieve a better cost-latency trade-off for your particular needs.
Monitor Enhanced Autoscaling enabled classic pipelines
You can use the event log in the Delta Live Tables user interface to monitor Enhanced Autoscaling metrics for classic pipelines. Enhanced Autoscaling events have the autoscale event type. The following are example events:
Event | Message |
---|---|
Cluster resize request started | Scaling [up or down] to <y> executors from current cluster size of <x> |
Cluster resize request succeeded | Achieved cluster size <x> for cluster <cluster-id> with status SUCCEEDED |
Cluster resize request partially succeeded | Achieved cluster size <x> for cluster <cluster-id> with status PARTIALLY_SUCCEEDED |
Cluster resize request failed | Achieved cluster size <x> for cluster <cluster-id> with status FAILED |
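If you export these messages for your own analysis, a small parser can turn them into structured records. The following Python sketch assumes the messages follow the templates in the table above exactly; the function name `parse_autoscale_message`, the returned keys, and the sample strings are hypothetical, and real messages may differ in wording.

```python
import re
from typing import Optional

# Patterns derived from the message templates in the table above.
_RESIZE_STARTED = re.compile(
    r"Scaling (?P<direction>up|down) to (?P<target>\d+) executors "
    r"from current cluster size of (?P<current>\d+)"
)
_RESIZE_RESULT = re.compile(
    r"Achieved cluster size (?P<size>\d+) for cluster (?P<cluster_id>\S+) "
    r"with status (?P<status>SUCCEEDED|PARTIALLY_SUCCEEDED|FAILED)"
)


def parse_autoscale_message(message: str) -> Optional[dict]:
    """Parse an autoscale event message, or return None if it matches no template."""
    match = _RESIZE_STARTED.search(message)
    if match:
        return {
            "kind": "started",
            "direction": match.group("direction"),
            "target": int(match.group("target")),
            "current": int(match.group("current")),
        }
    match = _RESIZE_RESULT.search(message)
    if match:
        return {
            "kind": "result",
            "size": int(match.group("size")),
            "cluster_id": match.group("cluster_id"),
            "status": match.group("status"),
        }
    return None


# Made-up sample messages that follow the templates.
started = parse_autoscale_message("Scaling up to 8 executors from current cluster size of 5")
result = parse_autoscale_message(
    "Achieved cluster size 8 for cluster example-cluster with status PARTIALLY_SUCCEEDED"
)
```

Pairing each "started" record with the "result" record that follows it gives a simple view of how often resize requests fully succeed.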
You can also view Enhanced Autoscaling events by directly querying the event log:
- To query the event log for backlog metrics, see Monitor data backlog by querying the event log.
- To monitor cluster resizing requests and responses during Enhanced Autoscaling operations, see Monitor Enhanced Autoscaling events from the event log for pipelines without serverless enabled.