CLI (v2) Azure Data Lake Gen2 YAML schema
APPLIES TO: Azure CLI ml extension v2 (current)
The source JSON schema can be found at https://azuremlschemas.azureedge.net/latest/azureDataLakeGen2.schema.json.
Note
The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed only to work with the latest version of the ML CLI v2 extension. You can find the schemas for older extension versions at https://azuremlschemasprod.azureedge.net/.
YAML syntax
Key | Type | Description | Allowed values | Default value |
---|---|---|---|---|
$schema | string | The YAML schema. If you use the Azure Machine Learning Visual Studio Code extension to author the YAML file, you can invoke schema and resource completions if you include $schema at the top of your file. | | |
type | string | Required. The datastore type. | azure_data_lake_gen2 | |
name | string | Required. The datastore name. | | |
description | string | The datastore description. | | |
tags | object | The datastore tag dictionary. | | |
account_name | string | Required. The Azure storage account name. | | |
filesystem | string | Required. The file system name. The parent directory containing the files and folders, equivalent to an Azure Blob storage container. | | |
endpoint | string | The endpoint suffix of the storage service, used to build the storage account endpoint URL by combining the storage account name with endpoint. Example storage account URL: https://<storage-account-name>.dfs.core.chinacloudapi.cn. | | core.chinacloudapi.cn |
protocol | string | Protocol for connection to the file system. | https, abfss | https |
credentials | object | Service principal credentials for connecting to the Azure storage account. Credential secrets are stored in the workspace key vault. | | |
credentials.tenant_id | string | The service principal tenant ID. Required if credentials is specified. | | |
credentials.client_id | string | The service principal client ID. Required if credentials is specified. | | |
credentials.client_secret | string | The service principal client secret. Required if credentials is specified. | | |
credentials.resource_url | string | The resource URL that specifies the operations that will be performed on the Azure Data Lake Storage Gen2 account. | | https://storage.azure.com/ |
credentials.authority_url | string | The authority URL used for user authentication. | | https://login.chinacloudapi.cn |
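The table lists nested keys in dotted form (for example, credentials.tenant_id), while in a YAML file they are written as children of the parent key. The following is a minimal sketch, not an official sample, that writes out the optional keys explicitly using the values listed in the table. The datastore name, tag, account name, and file system name are hypothetical placeholders.

```yaml
# Illustrative sketch only: name, tags, account_name, and filesystem are placeholders.
$schema: https://azuremlschemas.azureedge.net/latest/azureDataLakeGen2.schema.json
name: adls_gen2_full_example        # hypothetical datastore name
type: azure_data_lake_gen2
description: Datastore with the optional keys written out explicitly.
tags:
  team: data-science                # hypothetical tag
account_name: mytestdatalakegen2
filesystem: my-gen2-container
endpoint: core.chinacloudapi.cn     # value listed in the table
protocol: https                     # allowed values: https, abfss
credentials:                        # dotted keys in the table nest under this key
  tenant_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
  client_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
  client_secret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  resource_url: https://storage.azure.com/
  authority_url: https://login.chinacloudapi.cn
```

Assuming the file is saved as adls_gen2_full_example.yml (a hypothetical file name), it could be registered with az ml datastore create --file adls_gen2_full_example.yml --resource-group <resource-group> --workspace-name <workspace-name>.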
Remarks
The az ml datastore command can be used for managing Azure Machine Learning datastores.
Examples
Examples are available in the examples GitHub repository. Several are shown here:
YAML: identity-based access
$schema: https://azuremlschemas.azureedge.net/latest/azureDataLakeGen2.schema.json
name: adls_gen2_credless_example
type: azure_data_lake_gen2
description: Credential-less datastore pointing to an Azure Data Lake Storage Gen2.
account_name: mytestdatalakegen2
filesystem: my-gen2-container
YAML: tenant ID, client ID, client secret
$schema: https://azuremlschemas.azureedge.net/latest/azureDataLakeGen2.schema.json
name: adls_gen2_example
type: azure_data_lake_gen2
description: Datastore pointing to an Azure Data Lake Storage Gen2.
account_name: mytestdatalakegen2
filesystem: my-gen2-container
credentials:
  tenant_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
  client_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
  client_secret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX