CLI (v2) job schedule YAML schema
APPLIES TO: Azure CLI ml extension v2 (current)
The source JSON schema can be found at https://azuremlschemas.azureedge.net/latest/schedule.schema.json.
Note
The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed only to work with the latest version of the ML CLI v2 extension. You can find the schemas for older extension versions at https://azuremlschemasprod.azureedge.net/.
YAML syntax
| Key | Type | Description | Allowed values |
| --- | ---- | ----------- | -------------- |
| `$schema` | string | The YAML schema. | |
| `name` | string | **Required.** Name of the schedule. | |
| `description` | string | Description of the schedule. | |
| `tags` | object | Dictionary of tags for the schedule. | |
| `trigger` | object | The trigger configuration that defines when the schedule triggers the job. One of `RecurrenceTrigger` or `CronTrigger` is required. | |
| `create_job` | object or string | **Required.** The definition of the job that the schedule triggers. One of string or `JobDefinition` is required. | |
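To show how the top-level keys fit together, here's a minimal sketch of a schedule YAML. The schedule name, tag, and cron expression are illustrative, and `./simple-pipeline-job.yml` is assumed to be an existing pipeline job definition; full examples appear in the Examples section later in this article.

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/schedule.schema.json
name: nightly_training_schedule            # required
description: Runs the training pipeline every night
tags:
  team: ml-platform
trigger:                                   # one of RecurrenceTrigger or CronTrigger is required
  type: cron
  expression: "0 2 * * *"                  # every day at 02:00
create_job: ./simple-pipeline-job.yml      # required - local job YAML or azureml:<job_name>
```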
Trigger configuration
Recurrence trigger
| Key | Type | Description | Allowed values |
| --- | ---- | ----------- | -------------- |
| `type` | string | **Required.** Specifies the schedule type. | `recurrence` |
| `frequency` | string | **Required.** Specifies the unit of time that describes how often the schedule fires. | `minute`, `hour`, `day`, `week`, `month` |
| `interval` | integer | **Required.** Specifies the interval at which the schedule fires. | |
| `start_time` | string | Describes the start date and time, with time zone. If `start_time` is omitted, the first job runs instantly and future jobs trigger based on the schedule; in other words, `start_time` equals the schedule creation time. If the start time is in the past, the first job runs at the next calculated run time. | |
| `end_time` | string | Describes the end date and time, with time zone. If `end_time` is omitted, the schedule continues to run until it's explicitly disabled. | |
| `timezone` | string | Specifies the time zone of the recurrence. Defaults to UTC if omitted. | See the appendix for time zone values |
| `pattern` | object | Specifies the pattern of the recurrence. If `pattern` is omitted, jobs are triggered according to the logic of `start_time`, `frequency`, and `interval`. | |
Recurrence schedule
Recurrence schedule defines the recurrence pattern, containing hours
, minutes
, and weekdays
.
- When frequency is
day
, pattern can specifyhours
andminutes
. - When frequency is
week
andmonth
, pattern can specifyhours
,minutes
andweekdays
.
| Key | Type | Allowed values |
| --- | ---- | -------------- |
| `hours` | integer or array of integer | 0-23 |
| `minutes` | integer or array of integer | 0-59 |
| `week_days` | string or array of string | `monday`, `tuesday`, `wednesday`, `thursday`, `friday`, `saturday`, `sunday` |
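As a sketch, a weekly recurrence that fires at 08:30 on Mondays and Fridays could look like the following. Following the convention of the full example later in this article, the pattern object is nested under the `schedule` key of the trigger; the specific hours, minutes, and days are illustrative.

```yaml
trigger:
  type: recurrence
  frequency: week
  interval: 1          # every week
  schedule:
    hours: [8]
    minutes: [30]
    week_days: [monday, friday]
```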
CronTrigger
| Key | Type | Description | Allowed values |
| --- | ---- | ----------- | -------------- |
| `type` | string | **Required.** Specifies the schedule type. | `cron` |
| `expression` | string | **Required.** Specifies the cron expression that defines how to trigger jobs. `expression` uses a standard crontab expression to express a recurring schedule. A single expression is composed of five space-delimited fields: `MINUTES HOURS DAYS MONTHS DAYS-OF-WEEK` | |
| `start_time` | string | Describes the start date and time, with time zone. If `start_time` is omitted, the first job runs instantly and future jobs trigger based on the schedule; in other words, `start_time` equals the schedule creation time. If the start time is in the past, the first job runs at the next calculated run time. | |
| `end_time` | string | Describes the end date and time, with time zone. If `end_time` is omitted, the schedule continues to run until it's explicitly disabled. | |
| `timezone` | string | Specifies the time zone of the recurrence. Defaults to UTC if omitted. | See the appendix for time zone values |
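To illustrate the five fields, a sketch of a cron trigger that fires at 09:30 on weekdays could look like this; the specific expression is illustrative.

```yaml
trigger:
  type: cron
  # field order: MINUTES HOURS DAYS MONTHS DAYS-OF-WEEK
  expression: "30 9 * * 1-5"   # 09:30 on Monday through Friday
```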
Job definition
You can use `create_job: azureml:<job_name>` directly, or you can use the following properties to define the job.
| Key | Type | Description | Allowed values |
| --- | ---- | ----------- | -------------- |
| `type` | string | **Required.** Specifies the job type. Only pipeline jobs are supported. | `pipeline` |
| `job` | string | **Required.** Defines how to reference the job. It can be `azureml:<job_name>` or a local pipeline job YAML file, such as `file:hello-pipeline.yml`. | |
| `experiment_name` | string | Experiment name to organize the job under. Each job's run record is organized under the corresponding experiment in the studio's "Experiments" tab. If omitted, the schedule name is used as the default value. | |
| `inputs` | object | Dictionary of inputs to the job. The key is a name for the input within the context of the job, and the value is the input value. | |
| `outputs` | object | Dictionary of output configurations of the job. The key is a name for the output within the context of the job, and the value is the output configuration. | |
| `settings` | object | Default settings for the pipeline job. See Attributes of the `settings` key for the set of configurable properties. | |
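As a sketch of the object form, the following defines the job inline instead of passing a string. The job name `sample_pipeline` and the experiment name are illustrative placeholders, not part of the schema.

```yaml
create_job:
  type: pipeline
  job: azureml:sample_pipeline        # or a local file, such as file:hello-pipeline.yml
  experiment_name: scheduled_training
```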
Attributes of the settings key
| Key | Type | Description | Default value |
| --- | ---- | ----------- | ------------- |
| `default_datastore` | string | Name of the datastore to use as the default datastore for the pipeline job. This value must be a reference to an existing datastore in the workspace using the `azureml:<datastore-name>` syntax. Any outputs defined in the `outputs` property of the parent pipeline job or child step jobs will be stored in this datastore. If omitted, outputs will be stored in the workspace blob datastore. | |
| `default_compute` | string | Name of the compute target to use as the default compute for all steps in the pipeline. If compute is defined at the step level, it will override this default compute for that specific step. This value must be a reference to an existing compute in the workspace using the `azureml:<compute-name>` syntax. | |
| `continue_on_step_failure` | boolean | Whether the execution of steps in the pipeline should continue if one step fails. The default value is `False`, which means that if one step fails, the pipeline execution will be stopped, canceling any running steps. | `False` |
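A sketch of the `settings` block within a job definition; the compute and datastore names (`cpu-cluster`, `workspaceblobstore`) are assumptions for illustration and must exist in your workspace.

```yaml
create_job:
  type: pipeline
  job: ./simple-pipeline-job.yml
  settings:
    default_compute: azureml:cpu-cluster
    default_datastore: azureml:workspaceblobstore
    continue_on_step_failure: false
```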
Job inputs
| Key | Type | Description | Allowed values | Default value |
| --- | ---- | ----------- | -------------- | ------------- |
| `type` | string | The type of job input. Specify `uri_file` for input data that points to a single file source, or `uri_folder` for input data that points to a folder source. | `uri_file`, `uri_folder` | `uri_folder` |
| `path` | string | The path to the data to use as input. This can be specified in a few ways: <br><br> - A local path to the data source file or folder, for example `path: ./iris.csv`. The data will get uploaded during job submission. <br><br> - A URI of a cloud path to the file or folder to use as the input. Supported URI types are `azureml`, `https`, `wasbs`, `abfss`, `adl`. For more information on how to use the `azureml://` URI format, see Core yaml syntax. <br><br> - An existing registered Azure Machine Learning data asset to use as the input. To reference a registered data asset, use the `azureml:<data_name>:<data_version>` syntax or `azureml:<data_name>@latest` (to reference the latest version of that data asset), for example `path: azureml:cifar10-data:1` or `path: azureml:cifar10-data@latest`. | | |
| `mode` | string | Mode of how the data should be delivered to the compute target. <br><br> For read-only mount (`ro_mount`), the data will be consumed as a mount path. A folder will be mounted as a folder and a file will be mounted as a file. Azure Machine Learning will resolve the input to the mount path. <br><br> For `download` mode, the data will be downloaded to the compute target. Azure Machine Learning will resolve the input to the downloaded path. <br><br> If you only want the URL of the storage location of the data artifact(s) rather than mounting or downloading the data itself, you can use the `direct` mode. This will pass in the URL of the storage location as the job input. In this case, you're fully responsible for handling credentials to access the storage. | `ro_mount`, `download`, `direct` | `ro_mount` |
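A sketch of the `inputs` dictionary inside the job definition; the input names (`raw_data`, `config_file`), the data asset, and the local path are illustrative placeholders.

```yaml
create_job:
  type: pipeline
  job: ./simple-pipeline-job.yml
  inputs:
    raw_data:
      type: uri_folder
      path: azureml:cifar10-data@latest   # registered data asset, latest version
      mode: ro_mount
    config_file:
      type: uri_file
      path: ./config/train.yml            # local file, uploaded at submission time
```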
Job outputs
| Key | Type | Description | Allowed values | Default value |
| --- | ---- | ----------- | -------------- | ------------- |
| `type` | string | The type of job output. For the default `uri_folder` type, the output corresponds to a folder. | `uri_folder` | `uri_folder` |
| `path` | string | The path to the data to use for the output. This can be specified in a few ways: <br><br> - A local path to the data file or folder, for example `path: ./iris.csv`. The data will get uploaded during job submission. <br><br> - A URI of a cloud path to the file or folder to use for the output. Supported URI types are `azureml`, `https`, `wasbs`, `abfss`, `adl`. For more information on how to use the `azureml://` URI format, see Core yaml syntax. <br><br> - An existing registered Azure Machine Learning data asset to use for the output. To reference a registered data asset, use the `azureml:<data_name>:<data_version>` syntax or `azureml:<data_name>@latest` (to reference the latest version of that data asset), for example `path: azureml:cifar10-data:1` or `path: azureml:cifar10-data@latest`. | | |
| `mode` | string | Mode of how output file(s) will get delivered to the destination storage. For read-write mount mode (`rw_mount`), the output directory will be a mounted directory. For `upload` mode, the file(s) written will get uploaded at the end of the job. | `rw_mount`, `upload` | `rw_mount` |
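A sketch of the `outputs` dictionary inside the job definition; the output name `trained_model` and the datastore path are illustrative placeholders.

```yaml
create_job:
  type: pipeline
  job: ./simple-pipeline-job.yml
  outputs:
    trained_model:
      type: uri_folder
      path: azureml://datastores/workspaceblobstore/paths/models/
      mode: rw_mount
```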
Remarks
The `az ml schedule` command can be used for managing Azure Machine Learning schedules.
Examples
Examples are available in the examples GitHub repository. A couple are shown below.
YAML: Schedule with recurrence pattern
APPLIES TO: Azure CLI ml extension v2 (current)
```yaml
$schema: https://azuremlschemas.azureedge.net/latest/schedule.schema.json
name: simple_recurrence_job_schedule
display_name: Simple recurrence job schedule
description: a simple hourly recurrence job schedule

trigger:
  type: recurrence
  frequency: day #can be minute, hour, day, week, month
  interval: 1 #every day
  schedule:
    hours: [4, 5, 10, 11, 12]
    minutes: [0, 30]
  start_time: "2022-07-10T10:00:00" # optional - default will be schedule creation time
  time_zone: "Pacific Standard Time" # optional - default will be UTC

create_job: ./simple-pipeline-job.yml
# create_job: azureml:simple-pipeline-job
```
YAML: Schedule with cron expression
APPLIES TO: Azure CLI ml extension v2 (current)
```yaml
$schema: https://azuremlschemas.azureedge.net/latest/schedule.schema.json
name: simple_cron_job_schedule
display_name: Simple cron job schedule
description: a simple hourly cron job schedule

trigger:
  type: cron
  expression: "0 * * * *"
  start_time: "2022-07-10T10:00:00" # optional - default will be schedule creation time
  time_zone: "Pacific Standard Time" # optional - default will be UTC

# create_job: azureml:simple-pipeline-job
create_job: ./simple-pipeline-job.yml
```
Appendix
Timezone
Schedules currently support the following time zones. The key can be used directly in the Python SDK, while the value can be used in the schedule YAML. The table is organized by UTC (Coordinated Universal Time) offset.
| UTC | Key | Value |
| --- | --- | ----- |
| UTC +08:00 | CHINA_STANDARD_TIME | "China Standard Time" |