May 2021
These features and Azure Databricks platform improvements were released in May 2021.
Note
In most cases, the release dates and content listed below correspond to actual deployment in the Azure Public Cloud.
They are provided as a reference history of the Azure Databricks service on the Azure Public Cloud and may not apply to Azure operated by 21Vianet.
Note
Releases are staged. Your Azure Databricks account may not be updated until a week or more after the initial release date.
Databricks Mosaic AI: a data-native and collaborative solution for the full ML lifecycle
May 27, 2021
The new Machine Learning persona, selectable in the sidebar of the Azure Databricks UI, gives you easy access to a new purpose-built environment for ML, including the model registry and four new features in Public Preview:
- A new dashboard page with convenient resources, recents, and getting started links.
- A new Experiments page that centralizes experiment discovery and management.
- AutoML, a way to automatically generate ML models from data and accelerate the path to production.
- Feature Store, a way to catalog ML features and make them available for training and serving, increasing reuse. With a data-lineage-based feature search that leverages automatically-logged data sources, you can make features available for training and serving with simplified model deployment that doesn't require changes to the client application.
For details, see AI and machine learning on Databricks.
SQL Analytics is renamed to Databricks SQL
May 27, 2021
SQL Analytics is renamed to Databricks SQL. For more details, see the Databricks SQL release notes.
Create and manage ETL pipelines using Delta Live Tables (Public Preview)
May 26, 2021
Databricks is pleased to introduce Delta Live Tables, a cloud service that makes extract, transform, and load (ETL) development simple, reliable, and scalable. Delta Live Tables:
- Provides an intuitive and familiar declarative interface to build pipelines.
- Enables you to monitor data processing pipelines, visualize dependencies, and manage pipelines and dependencies across different environments.
- Enables test-driven development, enforcement of data quality constraints, and application of uniform data error handling policies.
- Automates deployment of your data processing pipelines so you can easily upgrade, roll back, and incrementally reprocess data.
See What is Delta Live Tables? for details.
Azure Spot VMs are GA
May 24, 2021
The ability to create Azure Databricks clusters with Azure Spot Virtual Machines is now generally available. You can now get the benefit of significantly lower-cost Azure spot instances and reduce your total cost of ownership (TCO) of Azure Databricks. You can choose to use Azure spot instances when you:
- Use the UI to create a cluster, selecting the Spot instances checkbox.
- Use the API to create a cluster, specifying the azure_attributes field in the cluster attributes of the request.
- Use the UI to create an instance pool, selecting the All Spot option.
- Use the API to create an instance pool, specifying the azure_attributes field in the create instance pool request.
Note
The Azure Spot VMs feature is not available on Azure Operated by 21Vianet currently.
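The two API options above can be sketched as request bodies. This is a minimal illustration, not the definitive request format: the nested field names (`availability`, `first_on_demand`, `spot_bid_max_price`) and the values `SPOT_WITH_FALLBACK_AZURE` and `SPOT_AZURE` are assumptions based on the Clusters and Instance Pools API references and should be verified against the current documentation. The helper function names, cluster name, pool name, runtime version, and node type are hypothetical placeholders.

```python
import json


def spot_cluster_request(cluster_name, spark_version, node_type_id,
                         num_workers, first_on_demand=1,
                         spot_bid_max_price=-1):
    """Build a create-cluster request body that asks for Spot VMs.

    Assumption: azure_attributes accepts the three keys used below.
    """
    return {
        "cluster_name": cluster_name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        "azure_attributes": {
            # Number of nodes (starting with the driver) kept on demand.
            "first_on_demand": first_on_demand,
            # Assumed value: use Spot VMs, fall back to on-demand VMs.
            "availability": "SPOT_WITH_FALLBACK_AZURE",
            # Assumed convention: -1 means "do not evict based on price".
            "spot_bid_max_price": spot_bid_max_price,
        },
    }


def spot_pool_request(pool_name, node_type_id, spot_bid_max_price=-1):
    """Build a create-instance-pool request body that uses only Spot VMs."""
    return {
        "instance_pool_name": pool_name,
        "node_type_id": node_type_id,
        "azure_attributes": {
            "availability": "SPOT_AZURE",  # assumed value
            "spot_bid_max_price": spot_bid_max_price,
        },
    }


# Placeholder values for illustration only.
cluster_body = spot_cluster_request(
    "spot-demo", "8.3.x-scala2.12", "Standard_DS3_v2", num_workers=2)
pool_body = spot_pool_request("spot-pool-demo", "Standard_DS3_v2")

print(json.dumps(cluster_body, indent=2))
print(json.dumps(pool_body, indent=2))
```

The sketch only builds and prints the JSON bodies. Sending them to a workspace (for example with an HTTP client and a bearer token) is left out so that the example runs without credentials.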
Encrypt Databricks SQL queries and query history using your own key (Public Preview)
May 20, 2021
For details, see the Databricks SQL release notes.
Increased limit for the number of terminated all-purpose clusters
May 18, 2021: Version 3.46
You can now have up to 150 terminated all-purpose clusters in an Azure Databricks workspace. Previously the limit was 120. For details, see Terminate a compute. The limit on the number of terminated all-purpose clusters returned by the Clusters API request is also now 150.
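As a rough illustration of checking a workspace against this limit, the sketch below counts terminated all-purpose clusters in a response shaped like the Clusters API list output. It assumes the response has a `clusters` array whose entries carry `state` and `cluster_source` fields, and that job clusters are marked with `cluster_source` equal to `JOB`. The function names and the sample data are hypothetical.

```python
TERMINATED_ALL_PURPOSE_LIMIT = 150  # limit stated in this release note


def count_terminated_all_purpose(list_response):
    """Count terminated clusters that were not created by the job scheduler."""
    clusters = list_response.get("clusters", [])
    return sum(
        1
        for c in clusters
        # Assumption: job clusters report cluster_source == "JOB".
        if c.get("state") == "TERMINATED" and c.get("cluster_source") != "JOB"
    )


def remaining_before_limit(list_response, limit=TERMINATED_ALL_PURPOSE_LIMIT):
    """Return how many more terminated all-purpose clusters fit under the limit."""
    return max(limit - count_terminated_all_purpose(list_response), 0)


# Hand-written sample standing in for a real list response.
sample = {
    "clusters": [
        {"cluster_id": "a", "state": "TERMINATED", "cluster_source": "UI"},
        {"cluster_id": "b", "state": "RUNNING", "cluster_source": "API"},
        {"cluster_id": "c", "state": "TERMINATED", "cluster_source": "JOB"},
    ]
}

print(count_terminated_all_purpose(sample))
print(remaining_before_limit(sample))
```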
Increased limit for the number of pinned clusters
May 18, 2021: Version 3.46
You can now have up to 70 pinned clusters in an Azure Databricks workspace. Previously the limit was 50. For details, see Pin a compute.
Manage where notebook results are stored (Public Preview)
May 18, 2021: Version 3.46
You can now choose to store all notebook results in your root Azure Storage instance regardless of size or run type. By default, some results for interactive notebooks are stored in Azure Databricks. A new configuration enables you to store these in the root Azure Storage instance in your own account. For details, see Configure notebook result storage location.
This feature has no impact on notebooks run as jobs, whose results are always stored in the root Azure Storage instance.
Encrypt notebook and secret data in the control plane with your own key (Public Preview)
May 10, 2021
An Azure Databricks workspace comprises a control plane that is hosted in an Azure Databricks-managed subscription and a compute plane that is deployed in your Azure subscription. The control plane stores your managed services data, which includes notebook commands, secrets, and other workspace configuration data. By default, this data is encrypted with an Azure Databricks-managed key, but you can now add a key from your Azure Key Vault instance to encrypt this data. See Enable customer-managed keys for managed services.
Databricks Runtime 7.4 series support ends
May 3, 2021
Support for Databricks Runtime 7.4, Databricks Runtime 7.4 for Machine Learning, and Databricks Runtime 7.4 for Genomics ended on May 3. See Databricks support lifecycles.
Repos users can now integrate with Azure DevOps using personal access tokens
May 3-10, 2021: Version 3.45
In addition to Microsoft Entra ID access tokens, you can now use a personal access token to authenticate with Azure DevOps. For details, see Set up Databricks Git folders (Repos).