CLI (v2) batch deployment YAML schema

Article
08/29/2024

The source JSON schema can be found at https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json.

Note

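The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed to work only with the latest version of the ML CLI v2 extension. You can find the schemas for older extension versions at https://azuremlschemasprod.azureedge.net/.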

YAML syntax

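Key	Type	Description	Allowed values	Default value
`$schema`	string	The YAML schema. If you use the Azure Machine Learning VS Code extension to author the YAML file, including `$schema` at the top of the file enables schema and resource completions.
`name`	string	Required. Name of the deployment.
`description`	string	Description of the deployment.
`tags`	object	Dictionary of tags for the deployment.
`endpoint_name`	string	Required. Name of the endpoint to create the deployment under.
`type`	string	Required. Type of the batch deployment. Use `model` for model deployments and `pipeline` for pipeline component deployments. New in version 1.7.	`model`, `pipeline`	`model`
`settings`	object	Configuration of the deployment. For allowed values, see the specific YAML reference for model and pipeline component deployments. New in version 1.7.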

Tip

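The key `type` was introduced in version 1.7 of the CLI extension. To fully support backward compatibility, this property defaults to `model`. However, if `type` isn't explicitly indicated, the key `settings` isn't enforced, and all the properties of the model deployment settings should be indicated at the root of the YAML specification.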

YAML syntax for model deployments

When `type: model`, the following syntax is enforced:

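Key	Type	Description	Allowed values	Default value
`model`	string or object	Required. The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification. To reference an existing model, use the `azureml:<model-name>:<version>` syntax. To define a model inline, follow the Model schema. As a best practice for production scenarios, create the model separately and reference it here.
`code_configuration`	object	Configuration for the scoring code logic. This property isn't required if the model is in MLflow format.
`code_configuration.code`	string	The local directory that contains all the Python source code used to score the model.
`code_configuration.scoring_script`	string	The Python file in the above directory. This file must have an `init()` function and a `run()` function. Use the `init()` function for any costly or common preparation (for example, loading the model into memory). `init()` is called once at the beginning of the process. Use `run(mini_batch)` to score each entry; the value of `mini_batch` is a list of file paths. The `run()` function should return a pandas DataFrame or an array. Each returned element indicates one successful run of an input element in the `mini_batch`.
`environment`	string or object	The environment to use for the job. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification. This property isn't required if the model is in MLflow format. To reference an existing environment, use the `azureml:<environment-name>:<environment-version>` syntax. To define an environment inline, follow the Environment schema. As a best practice for production scenarios, create the environment separately and reference it here.
`compute`	string	Required. Name of the compute target to execute the batch scoring jobs on. This value should be a reference to an existing compute in the workspace, using the `azureml:<compute-name>` syntax.
`resources.instance_count`	integer	The number of nodes to use for each batch scoring job.		`1`
`settings`	object	Specific configuration of the model deployment. Changed in version 1.7.
`settings.max_concurrency_per_instance`	integer	The maximum number of parallel `scoring_script` runs per instance.		`1`
`settings.error_threshold`	integer	The number of file failures that should be ignored. If the error count for the entire input goes above this value, the batch scoring job is terminated. `error_threshold` applies to the entire input, not to individual mini batches. If omitted, any number of file failures is allowed without terminating the job.		`-1`
`settings.logging_level`	string	The log verbosity level.	`warning`, `info`, `debug`	`info`
`settings.mini_batch_size`	integer	The number of files that `code_configuration.scoring_script` can process in one `run()` call.		`10`
`settings.retry_settings`	object	Retry settings for scoring each mini batch.
`settings.retry_settings.max_retries`	integer	The maximum number of retries for a failed or timed-out mini batch.		`3`
`settings.retry_settings.timeout`	integer	The timeout, in seconds, for scoring a single mini batch. Use larger values when the mini batch size is bigger or the model is more expensive to run.		`30`
`settings.output_action`	string	Indicates how the output should be organized in the output file. Use `summary_only` if you generate the output files yourself, as indicated in Customize outputs in model deployments. Use `append_row` if you return predictions as part of the `run()` function `return` statement.	`append_row`, `summary_only`	`append_row`
`settings.output_file_name`	string	Name of the batch scoring output file.		`predictions.csv`
`settings.environment_variables`	object	Dictionary of environment variable key-value pairs to set for each batch scoring job.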
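
The `init()` and `run(mini_batch)` contract described for `code_configuration.scoring_script` can be sketched roughly as follows. This is a minimal illustration, not the script used by any of the examples: the "model" here is a hypothetical placeholder (Python's built-in `len`), and the comma-separated row format is an assumption chosen only to show one returned element per input file.

```python
import os
from typing import List

# Hypothetical stand-in for a real model object. A real scoring script
# would typically load a serialized model from disk in init().
model = None


def init() -> None:
    """Called once when the scoring process starts: do costly setup here."""
    global model
    # Assumption: the "model" is a trivial callable that measures text length.
    model = len


def run(mini_batch: List[str]) -> List[str]:
    """Score every file path in the mini batch; return one element per file."""
    results = []
    for file_path in mini_batch:
        with open(file_path, "r", encoding="utf-8") as handle:
            prediction = model(handle.read())
        # One returned element per successfully processed input file.
        results.append(f"{os.path.basename(file_path)},{prediction}")
    return results
```

With `settings.mini_batch_size: 10`, each `run()` call in a script like this would receive a list of up to 10 file paths. Returning a list keeps the sketch free of third-party dependencies; a pandas DataFrame is the other return type the table mentions.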

Remarks

The `az ml batch-deployment` commands can be used to manage Azure Machine Learning batch deployments.

Examples

Examples are available in the examples GitHub repository. Some of them are referenced below:

YAML: MLflow model deployment

A model deployment that contains an MLflow model, which doesn't require you to indicate `code_configuration` or `environment`:

$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
endpoint_name: heart-classifier-batch
name: classifier-xgboost-mlflow
description: A heart condition classifier based on XGBoost
type: model
model: azureml:heart-classifier-mlflow@latest
compute: azureml:batch-cluster
resources:
  instance_count: 2
settings:
  max_concurrency_per_instance: 2
  mini_batch_size: 2
  output_action: append_row
  output_file_name: predictions.csv
  retry_settings:
    max_retries: 3
    timeout: 300
  error_threshold: -1
  logging_level: info

YAML: Custom model deployment with scoring script

A model deployment that indicates the scoring script and the environment to use:

$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
name: mnist-torch-dpl
description: A deployment using Torch to solve the MNIST classification dataset.
endpoint_name: mnist-batch
type: model
model:
  name: mnist-classifier-torch
  path: model
code_configuration:
  code: code
  scoring_script: batch_driver.py
environment:
  name: batch-torch-py38
  image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest
  conda_file: environment/conda.yaml
compute: azureml:batch-cluster
resources:
  instance_count: 1
settings:
  max_concurrency_per_instance: 2
  mini_batch_size: 10
  output_action: append_row
  output_file_name: predictions.csv
  retry_settings:
    max_retries: 3
    timeout: 30
  error_threshold: -1
  logging_level: info

YAML: Legacy model deployment

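If the property `type` isn't indicated in the YAML, a model deployment is inferred. However, the key `settings` isn't available, and its properties should be placed at the root of the YAML, as shown in this example. It's highly advisable to always specify the property `type`.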

$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
endpoint_name: heart-classifier-batch
name: classifier-xgboost-mlflow
description: A heart condition classifier based on XGBoost
model: azureml:heart-classifier-mlflow@latest
compute: azureml:batch-cluster
resources:
  instance_count: 2
max_concurrency_per_instance: 2
mini_batch_size: 2
output_action: append_row
output_file_name: predictions.csv
retry_settings:
  max_retries: 3
  timeout: 300
error_threshold: -1
logging_level: info
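
The relationship between the legacy layout and the `settings` layout can be sketched as a small transformation over an already-parsed YAML document. This is only an illustration: `to_settings_layout` and `SETTINGS_KEYS` are hypothetical names, not part of the CLI or any SDK, and the sketch assumes that each legacy root-level property maps one-to-one to the property of the same name under `settings`, as the examples in this article suggest.

```python
from typing import Any, Dict

# Keys that the model deployment table lists under `settings`.
SETTINGS_KEYS = (
    "max_concurrency_per_instance",
    "error_threshold",
    "logging_level",
    "mini_batch_size",
    "retry_settings",
    "output_action",
    "output_file_name",
    "environment_variables",
)


def to_settings_layout(spec: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a legacy spec with root-level settings nested."""
    # Keep everything that is not a settings property at the root.
    migrated = {k: v for k, v in spec.items() if k not in SETTINGS_KEYS}
    # A missing `type` is treated as a model deployment, per the default.
    migrated.setdefault("type", "model")
    # Merge root-level settings into any existing `settings` mapping.
    settings = dict(migrated.get("settings", {}))
    for key in SETTINGS_KEYS:
        if key in spec:
            settings[key] = spec[key]
    if settings:
        migrated["settings"] = settings
    return migrated
```

Applied to the legacy example above (parsed into a dictionary), a helper like this would produce a structure that matches the MLflow model deployment example, with `type: model` made explicit.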

YAML: Pipeline component deployment

A simple pipeline component deployment:

$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
name: hello-batch-dpl
endpoint_name: hello-pipeline-batch
type: pipeline
component: azureml:hello_batch@latest
settings:
    default_compute: batch-cluster

Next steps

Install and use the CLI (v2)
