使用 ONNX 对来自 AutoML 的计算机视觉模型进行预测

项目
08/29/2024

本文将介绍如何使用 Open Neural Network Exchange (ONNX) 对从 Azure 机器学习中的自动机器学习 (AutoML) 生成的计算机视觉模型进行预测。

若要使用 ONNX 进行预测，需执行以下操作：

从 AutoML 训练运行下载 ONNX 模型文件。
了解 ONNX 模型的输入和输出。
预处理数据，使其成为输入图像所需的格式。
对 Python 的 ONNX 运行时执行推理。
可视化对象检测和实例分段任务的预测。

ONNX 是机器学习和深度学习模型的开放标准。它支持在常用的 AI 框架中进行模型导入和导出（互操作性）。更多详细信息，请浏览 ONNX GitHub 项目。

ONNX 运行时是一个支持跨平台推理的开源项目。 ONNX 运行时提供跨编程语言（包括 Python、C++、C#、C、Java 和 JavaScript）的 API。可以使用这些 API 对输入图像执行推理。将模型导出为 ONNX 格式后，便可以在项目需要的任何编程语言中使用这些 API。

本指南介绍如何使用用于 ONNX 运行时的 Python API 对常见视觉任务的图像进行预测。可以跨语言使用这些 ONNX 导出的模型。

先决条件

对任何受支持的图像任务（分类、对象检测或实例分段）获取经 AutoML 训练的计算机视觉模型。详细了解 AutoML 对计算机视觉任务的支持。
安装 onnxruntime 包。本文中的方法已使用 1.3.0-1.8.0 版本进行了测试。

下载 ONNX 模型文件

可以使用 Azure 机器学习工作室 UI 或 Azure 机器学习 Python SDK 从 AutoML 运行下载 ONNX 模型文件。建议使用具有实验名称和父运行 ID 的 SDK 进行下载。

Azure 机器学习工作室

在 Azure 机器学习工作室中，通过训练笔记本中生成的指向实验的超链接进入实验，或选择“资产”下的“实验”选项卡中实验名称进入实验。然后，选择最佳子运行。

在最佳子运行中，转到“输出+日志”>“train_artifacts” 。使用“下载”按钮手动下载以下文件：

labels.json：包含训练数据集中所有类或标签的文件。
model.onnx：ONNX 格式的模型。

显示下载 ONNX 模型文件的选项的屏幕截图。

将下载的模型文件保存到目录。本文中的示例使用 ./automl_models 目录。

Azure 机器学习 Python SDK

在 SDK 中，可以使用实验名称和父运行 ID 选择最佳子运行（按主要指标）。然后，可以下载 labels.json 和 model.onnx 文件。

以下代码根据相关的主要指标返回最佳子运行。

from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient
mlflow_client = MlflowClient()

credential = DefaultAzureCredential()
ml_client = None
try:
    ml_client = MLClient.from_config(credential)
except Exception as ex:
    print(ex)
    # Enter details of your Azure Machine Learning workspace
    subscription_id = ''   
    resource_group = ''  
    workspace_name = ''
    ml_client = MLClient(credential, subscription_id, resource_group, workspace_name)

import mlflow
from mlflow.tracking.client import MlflowClient

# Obtain the tracking URL from MLClient
MLFLOW_TRACKING_URI = ml_client.workspaces.get(
    name=ml_client.workspace_name
).mlflow_tracking_uri

mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)

# Specify the job name
job_name = ''

# Get the parent run
mlflow_parent_run = mlflow_client.get_run(job_name)
best_child_run_id = mlflow_parent_run.data.tags['automl_best_child_run_id']
# get the best child run
best_run = mlflow_client.get_run(best_child_run_id)

下载 labels.json 文件，其中包含训练数据集中的所有类和标签。

local_dir = './automl_models'
if not os.path.exists(local_dir):
    os.mkdir(local_dir)

labels_file = mlflow_client.download_artifacts(
    best_run.info.run_id, 'train_artifacts/labels.json', local_dir
)

下载 model.onnx 文件。

onnx_model_path = mlflow_client.download_artifacts(
    best_run.info.run_id, 'train_artifacts/model.onnx', local_dir
)

如果使用 ONNX 模型对对象检测和实例分段进行批量推理，请参阅有关批量评分的模型生成的部分。

生成模型进行批量评分

默认情况下，AutoML for Images 支持分类的批量评分。但是对象检测和实例分割 ONNX 模型不支持批量推理。若要对于对象检测和实例分段进行批量推断，请使用以下过程为所需的批大小生成 ONNX 模型。为特定批大小生成的模型不能用于其他批大小。

下载 conda 环境文件，并创建用于命令作业的环境对象。

#  Download conda file and define the environment

conda_file = mlflow_client.download_artifacts(
    best_run.info.run_id, "outputs/conda_env_v_1_0_0.yml", local_dir
)
from azure.ai.ml.entities import Environment
env = Environment(
    name="automl-images-env-onnx",
    description="environment for automl images ONNX batch model generation",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-cuda11.1-cudnn8-ubuntu18.04",
    conda_file=conda_file,
)

使用以下模型特定参数提交脚本。有关参数的更多详细信息，请参阅模型特定超参数；有关受支持的对象检测模型名称，请参阅受支持的模型体系结构部分。

若要获取创建批处理评分模型所需的参数值，请参阅 AutoML 训练运行 outputs 文件夹下生成的评分脚本。使用最佳子运行评分文件内模型设置变量中提供的超参数值。

对于多类图像分类，为最佳子运行生成的 ONNX 模型默认支持批量评分。因此，此任务类型不需要模型特定的参数，可以跳到加载标签和 ONNX 模型文件部分。

inputs = {'model_name': 'fasterrcnn_resnet34_fpn',  # enter the faster rcnn or retinanet model name
         'batch_size': 8,  # enter the batch size of your choice
         'height_onnx': 600,  # enter the height of input to ONNX model
         'width_onnx': 800,  # enter the width of input to ONNX model
         'job_name': job_name,
         'task_type': 'image-object-detection',
         'min_size': 600,  # minimum size of the image to be rescaled before feeding it to the backbone
         'max_size': 1333,  # maximum size of the image to be rescaled before feeding it to the backbone
         'box_score_thresh': 0.3,  # threshold to return proposals with a classification score > box_score_thresh
         'box_nms_thresh': 0.5,  # NMS threshold for the prediction head
         'box_detections_per_img': 100   # maximum number of detections per image, for all classes
         }

inputs = {'model_name': 'yolov5',  # enter the yolo model name
          'batch_size': 8,  # enter the batch size of your choice
          'height_onnx': 640,  # enter the height of input to ONNX model
          'width_onnx': 640,  # enter the width of input to ONNX model
          'job_name': job_name,
          'task_type': 'image-object-detection',
          'img_size': 640,  # image size for inference
          'model_size': 'small',  # size of the yolo model
          'box_score_thresh': 0.1,  # threshold to return proposals with a classification score > box_score_thresh
          'box_iou_thresh': 0.5
        }

inputs = {'model_name': 'maskrcnn_resnet50_fpn',  # enter the maskrcnn model name
         'batch_size': 8,  # enter the batch size of your choice
         'height_onnx': 600,  # enter the height of input to ONNX model
         'width_onnx': 800,  # enter the width of input to ONNX model
         'job_name': job_name,
         'task_type': 'image-instance-segmentation',
         'min_size': 600,  # minimum size of the image to be rescaled before feeding it to the backbone
         'max_size': 1333,  # maximum size of the image to be rescaled before feeding it to the backbone
         'box_score_thresh': 0.3,  # threshold to return proposals with a classification score > box_score_thresh
         'box_nms_thresh': 0.5,  # NMS threshold for the prediction head
         'box_detections_per_img': 100  # maximum number of detections per image, for all classes
         }

下载 ONNX_batch_model_generator_automl_for_images.py 文件并将其保留在当前目录中，然后提交脚本。使用以下命令作业提交 azureml-examples GitHub 存储库中可用的脚本 ONNX_batch_model_generator_automl_for_images.py，以生成特定批大小的 ONNX 模型。在以下代码中，训练的模型环境用来提交此脚本，以生成 ONNX 模型并将其保存到 outputs 目录。

from azure.ai.ml import command

job = command(
    code="./onnx_generator_files",  # local path where the code is stored
    command="python ONNX_batch_model_generator_automl_for_images.py --model_name ${{inputs.model_name}} --batch_size ${{inputs.batch_size}} --height_onnx ${{inputs.height_onnx}} --width_onnx ${{inputs.width_onnx}} --job_name ${{inputs.job_name}} --task_type ${{inputs.task_type}} --min_size ${{inputs.min_size}} --max_size ${{inputs.max_size}} --box_score_thresh ${{inputs.box_score_thresh}} --box_nms_thresh ${{inputs.box_nms_thresh}} --box_detections_per_img ${{inputs.box_detections_per_img}}",
    inputs=inputs,
    environment=env,
    compute=compute_name,
    display_name="ONNX-batch-model-generation-rcnn",
    description="Use the PyTorch to generate ONNX batch scoring model.",
)
returned_job = ml_client.create_or_update(job)
ml_client.jobs.stream(returned_job.name)

from azure.ai.ml import command

job = command(
    code="./onnx_generator_files",  # local path where the code is stored
    command="python ONNX_batch_model_generator_automl_for_images.py --model_name ${{inputs.model_name}} --batch_size ${{inputs.batch_size}} --height_onnx ${{inputs.height_onnx}} --width_onnx ${{inputs.width_onnx}} --job_name ${{inputs.job_name}} --task_type ${{inputs.task_type}} --img_size ${{inputs.img_size}} --model_size ${{inputs.model_size}} --box_score_thresh ${{inputs.box_score_thresh}} --box_iou_thresh ${{inputs.box_iou_thresh}}",
    inputs=inputs,
    environment=env,
    compute=compute_name,
    display_name="ONNX-batch-model-generation",
    description="Use the PyTorch to generate ONNX batch scoring model.",
)
returned_job = ml_client.create_or_update(job)
ml_client.jobs.stream(returned_job.name)

from azure.ai.ml import command

job = command(
    code="./onnx_generator_files",  # local path where the code is stored
    command="python ONNX_batch_model_generator_automl_for_images.py --model_name ${{inputs.model_name}} --batch_size ${{inputs.batch_size}} --height_onnx ${{inputs.height_onnx}} --width_onnx ${{inputs.width_onnx}} --job_name ${{inputs.job_name}} --task_type ${{inputs.task_type}} --min_size ${{inputs.min_size}} --max_size ${{inputs.max_size}} --box_score_thresh ${{inputs.box_score_thresh}} --box_nms_thresh ${{inputs.box_nms_thresh}} --box_detections_per_img ${{inputs.box_detections_per_img}}",
    inputs=inputs,
    environment=env,
    compute=compute_name,
    display_name="ONNX-batch-model-generation-maskrcnn",
    description="Use the PyTorch to generate ONNX batch scoring model.",
)
returned_job = ml_client.create_or_update(job)
ml_client.jobs.stream(returned_job.name)

生成批模型后，可手动通过 UI 从 Outputs+>logsoutputs 中下载，或者使用以下方法：

batch_size = 8  # use the batch size used to generate the model
returned_job_run = mlflow_client.get_run(returned_job.name)

# Download run's artifacts/outputs
onnx_model_path = mlflow_client.download_artifacts(
    returned_job_run.info.run_id, 'outputs/model_'+str(batch_size)+'.onnx', local_dir
)

完成模型下载步骤后，使用 ONNX 运行时 Python 包并使用 model.onnx 文件执行推理。为进行演示，本文使用如何为每个视觉任务准备图像数据集中的数据集。

我们已使用所有视觉任务各自的数据集为其训练了模型，以演示 ONNX 模型推理。

加载标签和 ONNX 模型文件

以下代码片段加载 labels.json，其中类名已排序。也就是说，如果 ONNX 模型预测标签 ID 为 2，则它对应于 labels.json 文件中的第三个索引给出的标签名称。

import json
import onnxruntime

labels_file = "automl_models/labels.json"
with open(labels_file) as f:
    classes = json.load(f)
print(classes)
try:
    session = onnxruntime.InferenceSession(onnx_model_path)
    print("ONNX model loaded...")
except Exception as e: 
    print("Error loading ONNX file: ", str(e))

获取 ONNX 模型的预期输入和输出详细信息

使用模型时，务必了解一些特定于模型和特定于任务的详细信息。这些详细信息包括输入数量和输出数量、用于预处理图像的预期输入形状或格式，以及输出形状，确保你了解特定于模型或特定于任务的输出。

sess_input = session.get_inputs()
sess_output = session.get_outputs()
print(f"No. of inputs : {len(sess_input)}, No. of outputs : {len(sess_output)}")

for idx, input_ in enumerate(range(len(sess_input))):
    input_name = sess_input[input_].name
    input_shape = sess_input[input_].shape
    input_type = sess_input[input_].type
    print(f"{idx} Input name : { input_name }, Input shape : {input_shape}, \
    Input type  : {input_type}")  

for idx, output in enumerate(range(len(sess_output))):
    output_name = sess_output[output].name
    output_shape = sess_output[output].shape
    output_type = sess_output[output].type
    print(f" {idx} Output name : {output_name}, Output shape : {output_shape}, \
    Output type  : {output_type}")

ONNX 模型的预期输入和输出格式

每个 ONNX 模型都有一组预定义的输入和输出格式。

此示例应用具有 134 个图像和 4 个类/标签的 fridgeObjects 数据集上训练的模型，以说明 ONNX 模型推理。有关训练图像分类任务的详细信息，请参阅多类图像分类笔记本。

输入格式

输入是经过预处理的图像。

输入名称	输入形状	输入类型	描述
input1	`(batch_size, num_channels, height, width)`	ndarray(float)	输入是经过预处理的图像，形状为 `(1, 3, 224, 224)`，批大小为 1，高度和宽度为 224。这些数字对应于训练示例中 `crop_size` 所用的值。

输出格式

输出是所有类/标签的 logit 数组。

输出名称	输出形状	输出类型	描述
output1	`(batch_size, num_classes)`	ndarray(float)	模型返回 logit（没有 `softmax`）。例如，对于批大小为 1 和 4 的类，它返回 `(1, 4)`。

此示例使用具有 128 个图像和 4 个类/标签的多标签 fridgeObjects 数据集上训练的模型，以说明 ONNX 模型推理。有关多标签图像分类的模型训练的详细信息，请参阅多标签图像分类笔记本。

输入格式

输入是经过预处理的图像。

输入名称	输入形状	输入类型	描述
input1	`(batch_size, num_channels, height, width)`	ndarray(float)	输入是经过预处理的图像，形状为 `(1, 3, 224, 224)`，批大小为 1，高度和宽度为 224。这些数字对应于训练示例中 `crop_size` 所用的值。

输出格式

输出是所有类/标签的 logit 数组。

输出名称	输出形状	输出类型	描述
output1	`(batch_size, num_classes)`	ndarray(float)	模型返回 logit（没有 `sigmoid`）。例如，对于批大小为 1 和 4 的类，它返回 `(1, 4)`。

此对象检测示例使用具有 128 个图像和 4 个类/标签的 fridgeObjects 检测数据集上训练的模型，以说明 ONNX 模型推理。此示例训练 Faster R-CNN 模型，以演示推理步骤。有关训练对象检测模型的详细信息，请参阅对象检测笔记本。

输入格式

输入是经过预处理的图像。

输入名称	输入形状	输入类型	说明
输入	`(batch_size, num_channels, height, width)`	ndarray(float)	输入是经过预处理的图像，形状为 `(1, 3, 600, 800)`，批大小为 1，高度为 600，宽度为 800。

输出格式

输出是 output_names 和预测的元组。对此，output_names 和 predictions 都是长度为 3*batch_size 的列表。对于 Faster R-CNN，输出顺序是框、标签和分数；而对于 RetinaNet，输出顺序是框、分数、标签。

输出名称	输出形状	输出类型	描述
`output_names`	`(3*batch_size)`	关键值列表	批大小为 2 时，`output_names` 为 `['boxes_0', 'labels_0', 'scores_0', 'boxes_1', 'labels_1', 'scores_1']`
`predictions`	`(3*batch_size)`	ndarray(float) 列表	批大小为 2 时，`predictions` 的形状为 `[(n1_boxes, 4), (n1_boxes), (n1_boxes), (n2_boxes, 4), (n2_boxes), (n2_boxes)]`。对此，每个索引的值与 `output_names` 中相同的索引对应。

下表描述了为图像批中的每个示例返回的框、标签和分数。

名称	形状	类型	描述
框	`(n_boxes, 4)`，其中每个框都设有 `x_min, y_min, x_max, y_max`	ndarray(float)	模型返回 n 个框及其左上角和右下角坐标。
标签	`(n_boxes)`	ndarray(float)	每个框中对象的标签或类 ID。
分数	`(n_boxes)`	ndarray(float)	每个框中对象的置信度分数。

此对象检测示例使用具有 128 个图像和 4 个类/标签的 fridgeObjects 检测数据集上训练的模型，以说明 ONNX 模型推理。此示例训练 YOLO 模型，以演示推理步骤。有关训练对象检测模型的详细信息，请参阅对象检测笔记本。

输入格式

输入是经过预处理的图像，形状为 (1, 3, 640, 640)，批大小为 1，高度和宽度为 640。这些数字对应于训练示例中所用的值。

输入名称	输入形状	输入类型	说明
输入	`(batch_size, num_channels, height, width)`	ndarray(float)	输入是经过预处理的图像，形状为 `(1, 3, 640, 640)`，批大小为 1，高度为 640，宽度为 640。

输出格式

ONNX 模型预测包含多个输出。需要第一个输出才能对检测执行非最大抑制。为方便使用，自动化 ML 将在执行 NMS 后处理步骤后显示输出格式。对于批中每个示例，NMS 后的输出是框、标签和分数列表。

输出名称	输出形状	输出类型	描述
输出	`(batch_size)`	ndarray(float) 列表	模型返回批中每个示例的框检测

列表中的每个单元格指示形状为 (n_boxes, 6) 的示例的框检测，其中每个框都有 x_min, y_min, x_max, y_max, confidence_score, class_id。

对于该实例分段示例，使用具有 128 个图像和 4 个类/标签的 fridgeObjects 数据集上训练的 Mask R-CNN 模型，以说明 ONNX 模型推理。有关训练实例分段模型的详细信息，请参阅实例分段笔记本。

重要

实例分段任务仅支持掩码 R-CNN。输入和输出格式仅基于掩码 R-CNN。

输入格式

输入是经过预处理的图像。已导出适用于 Mask R-CNN 的 ONNX 模型，用于处理各种形状的图像。建议将其调整为与训练图像大小一致的固定大小，以提高性能。

输入名称	输入形状	输入类型	说明
输入	`(batch_size, num_channels, height, width)`	ndarray(float)	输入是经过预处理的图像，形状为 `(1, 3, input_image_height, input_image_width)`，批大小为 1，高度和宽度与输入图像类似。

输出格式

输出是 output_names 和预测的元组。对此，output_names 和 predictions 都是长度为4*batch_size 的列表。

输出名称	输出形状	输出类型	描述
`output_names`	`(4*batch_size)`	关键值列表	批大小为 2 时，`output_names` 为 `['boxes_0', 'labels_0', 'scores_0', 'masks_0', 'boxes_1', 'labels_1', 'scores_1', 'masks_1']`
`predictions`	`(4*batch_size)`	ndarray(float) 列表	批大小为 2 时，`predictions` 的形状为 `[(n1_boxes, 4), (n1_boxes), (n1_boxes), (n1_boxes, 1, height_onnx, width_onnx), (n2_boxes, 4), (n2_boxes), (n2_boxes), (n2_boxes, 1, height_onnx, width_onnx)]`。对此，每个索引的值与 `output_names` 中相同的索引对应。

名称	形状	类型	描述
框	`(n_boxes, 4)`，其中每个框都设有 `x_min, y_min, x_max, y_max`	ndarray(float)	模型返回 n 个框及其左上角和右下角坐标。
标签	`(n_boxes)`	ndarray(float)	每个框中对象的标签或类 ID。
分数	`(n_boxes)`	ndarray(float)	每个框中对象的置信度分数。
掩码	`(n_boxes, 1, height_onnx, width_onnx)`	ndarray(float)	检测到的对象的掩码（多边形），这些对象具有输入图像的形状高度和宽度。

预处理

执行以下预处理步骤，以实现 ONNX 模型推理：

将图像转换为 RGB。
将图像大小调整为 valid_resize_size 和 valid_resize_size 值，这些值对应于训练期间验证数据集转换时使用的值。 valid_resize_size 的默认值为 256。
将图像中心裁剪为 height_onnx_crop_size 和 width_onnx_crop_size。它与 valid_crop_size 对应，默认值为 224。
将 HxWxC 更改为 CxHxW。
转换为 float 型。
使用 ImageNet 的 mean = [0.485, 0.456, 0.406] 和 std = [0.229, 0.224, 0.225] 进行规范化。

如果在训练期间为超参数 valid_resize_size 和 valid_crop_size 选择了不同的值，则应使用这些值。

获取 ONNX 模型所需的输入形状。

batch, channel, height_onnx_crop_size, width_onnx_crop_size = session.get_inputs()[0].shape
batch, channel, height_onnx_crop_size, width_onnx_crop_size

无 PyTorch

import glob
import numpy as np
from PIL import Image

def preprocess(image, resize_size, crop_size_onnx):
    """Perform pre-processing on raw input image
    
    :param image: raw input image
    :type image: PIL image
    :param resize_size: value to resize the image
    :type image: Int
    :param crop_size_onnx: expected height of an input image in onnx model
    :type crop_size_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """

    image = image.convert('RGB')
    # resize
    image = image.resize((resize_size, resize_size))
    #  center  crop
    left = (resize_size - crop_size_onnx)/2
    top = (resize_size - crop_size_onnx)/2
    right = (resize_size + crop_size_onnx)/2
    bottom = (resize_size + crop_size_onnx)/2
    image = image.crop((left, top, right, bottom))

    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1) # CxHxW
    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i,:,:] = (np_image[i,:,:]/255 - mean_vec[i])/std_vec[i]
             
    np_image = np.expand_dims(norm_img_data, axis=0) # 1xCxHxW
    return np_image

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images

test_images_path = "automl_models_multi_cls/test_images_dir/*" # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size 

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

有 PyTorch

import glob
import torch
import numpy as np
from PIL import Image
from torchvision import transforms

def _make_3d_tensor(x) -> torch.Tensor:
    """This function is for images that have less channels.

    :param x: input tensor
    :type x: torch.Tensor
    :return: return a tensor with the correct number of channels
    :rtype: torch.Tensor
    """
    return x if x.shape[0] == 3 else x.expand((3, x.shape[1], x.shape[2]))

def preprocess(image, resize_size, crop_size_onnx):
    transform = transforms.Compose([
        transforms.Resize(resize_size),
        transforms.CenterCrop(crop_size_onnx),
        transforms.ToTensor(),
        transforms.Lambda(_make_3d_tensor),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
    
    img_data = transform(image)
    img_data = img_data.numpy()
    img_data = np.expand_dims(img_data, axis=0)
    return img_data

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images

test_images_path = "automl_models_multi_cls/test_images_dir/*"  # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size 

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

执行以下预处理步骤，以实现 ONNX 模型推理。对于多类图像分类，这些步骤相同。

将图像转换为 RGB。
将图像大小调整为 valid_resize_size 和 valid_resize_size 值，这些值对应于训练期间验证数据集转换时使用的值。 valid_resize_size 的默认值为 256。
将图像中心裁剪为 height_onnx_crop_size 和 width_onnx_crop_size。这对应于默认值为 224 的 valid_crop_size。
将 HxWxC 更改为 CxHxW。
转换为 float 型。
使用 ImageNet 的 mean = [0.485, 0.456, 0.406] 和 std = [0.229, 0.224, 0.225] 进行规范化。

如果在训练期间为超参数 valid_resize_size 和 valid_crop_size 选择了不同的值，则应使用这些值。

获取 ONNX 模型所需的输入形状。

batch, channel, height_onnx_crop_size, width_onnx_crop_size = session.get_inputs()[0].shape
batch, channel, height_onnx_crop_size, width_onnx_crop_size

无 PyTorch

import glob
import numpy as np
from PIL import Image

def preprocess(image, resize_size, crop_size_onnx):
    """Perform pre-processing on raw input image
    
    :param image: raw input image
    :type image: PIL image
    :param resize_size: value to resize the image
    :type image: Int
    :param crop_size_onnx: expected height of an input image in onnx model
    :type crop_size_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """

    image = image.convert('RGB')
    # resize
    image = image.resize((resize_size, resize_size))
    # center  crop
    left = (resize_size - crop_size_onnx)/2
    top = (resize_size - crop_size_onnx)/2
    right = (resize_size + crop_size_onnx)/2
    bottom = (resize_size + crop_size_onnx)/2
    image = image.crop((left, top, right, bottom))

    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1) # CxHxW

    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i,:,:] = (np_image[i,:,:] / 255 - mean_vec[i]) / std_vec[i]    
    np_image = np.expand_dims(norm_img_data, axis=0) # 1xCxHxW
    return np_image

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images

test_images_path = "automl_models_multi_label/test_images_dir/*" # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size 

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

有 PyTorch

import glob
import torch
import numpy as np
from PIL import Image
from torchvision import transforms

def _make_3d_tensor(x) -> torch.Tensor:
    """This function is for images that have less channels.

    :param x: input tensor
    :type x: torch.Tensor
    :return: return a tensor with the correct number of channels
    :rtype: torch.Tensor
    """
    return x if x.shape[0] == 3 else x.expand((3, x.shape[1], x.shape[2]))

def preprocess(image, resize_size, crop_size_onnx):
    """Perform pre-processing on raw input image
    
    :param image: raw input image
    :type image: PIL image
    :param resize_size: value to resize the image
    :type image: Int
    :param crop_size_onnx: expected height of an input image in onnx model
    :type crop_size_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """
    transform = transforms.Compose([
        transforms.Resize(resize_size),
        transforms.CenterCrop(crop_size_onnx),
        transforms.ToTensor(),
        transforms.Lambda(_make_3d_tensor),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
    
    img_data = transform(image)
    img_data = img_data.numpy()
    img_data = np.expand_dims(img_data, axis=0)
    
    return img_data

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images

test_images_path = "automl_models_multi_label/test_images_dir/*"  # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size 

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

对于具有 Faster R-CNN 体系结构的对象检测，请遵循与图像分类相同的预处理步骤，但图像裁剪除外。可以将图像大小调整为高 600 和宽 800。可以使用以下代码获取预期的输入高度和宽度。

batch, channel, height_onnx, width_onnx = session.get_inputs()[0].shape
batch, channel, height_onnx, width_onnx

然后，执行预处理步骤。

import glob
import numpy as np
from PIL import Image

def preprocess(image, height_onnx, width_onnx):
    """Perform pre-processing on raw input image
    
    :param image: raw input image
    :type image: PIL image
    :param height_onnx: expected height of an input image in onnx model
    :type height_onnx: Int
    :param width_onnx: expected width of an input image in onnx model
    :type width_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """

    image = image.convert('RGB')
    image = image.resize((width_onnx, height_onnx))
    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1) # CxHxW
    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i,:,:] = (np_image[i,:,:] / 255 - mean_vec[i]) / std_vec[i]
    np_image = np.expand_dims(norm_img_data, axis=0) # 1xCxHxW
    return np_image

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images

test_images_path = "automl_models_od/test_images_dir/*" # replace with path to images
image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, height_onnx, width_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

对于具有 YOLO 体系结构的对象检测，请遵循与图像分类相同的预处理步骤，但图像裁剪除外。可以将图像调整为高度 600 和宽度 800，并使用以下代码获取预期的输入高度和宽度。

batch, channel, height_onnx, width_onnx = session.get_inputs()[0].shape
batch, channel, height_onnx, width_onnx

有关 YOLO 所需的预处理，请参阅 yolo_onnx_preprocessing_utils.py。

import glob
import numpy as np
from yolo_onnx_preprocessing_utils import preprocess

# use height and width based on the generated model
test_images_path = "automl_models_od_yolo/test_images_dir/*" # replace with path to images
image_files = glob.glob(test_images_path)
img_processed_list = []
pad_list = []
for i in range(batch_size):
    img_processed, pad = preprocess(image_files[i])
    img_processed_list.append(img_processed)
    pad_list.append(pad)
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

重要

实例分段任务仅支持掩码 R-CNN。预处理步骤仅基于掩码 R-CNN。

执行以下预处理步骤，以实现 ONNX 模型推理：

将图像转换为 RGB。
调整图像大小。
将 HxWxC 更改为 CxHxW。
转换为 float 型。
使用 ImageNet 的 mean = [0.485, 0.456, 0.406] 和 std = [0.229, 0.224, 0.225] 进行规范化。

对于 resize_height 和 resize_width，还可以使用训练期间使用的值，但受 Mask R-CNN 的 min_size 和 max_size 超参数限制。

import glob
import numpy as np
from PIL import Image

def preprocess(image, resize_height, resize_width):
    """Perform pre-processing on raw input image
    
    :param image: raw input image
    :type image: PIL image
    :param resize_height: resize height of an input image
    :type resize_height: Int
    :param resize_width: resize width of an input image
    :type resize_width: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray of shape 1xCxHxW
    """

    image = image.convert('RGB')
    image = image.resize((resize_width, resize_height))
    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1)  # CxHxW
    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i,:,:] = (np_image[i,:,:]/255 - mean_vec[i])/std_vec[i]
    np_image = np.expand_dims(norm_img_data, axis=0)  # 1xCxHxW
    return np_image

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images
# use height and width based on the trained model
# use height and width based on the generated model
test_images_path = "automl_models_is/test_images_dir/*" # replace with path to images
image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, height_onnx, width_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

使用 ONNX 运行时进行推理

使用 ONNX 运行时进行推理因各个计算机视觉任务而异。

def get_predictions_from_ONNX(onnx_session, img_data):
    """Perform predictions with ONNX runtime
    
    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: scores with shapes
            (1, No. of classes in training dataset) 
    :rtype: numpy array
    """

    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    print(f"No. of inputs : {len(sess_input)}, No. of outputs : {len(sess_output)}")    
    # predict with ONNX Runtime
    output_names = [ output.name for output in sess_output]
    scores = onnx_session.run(output_names=output_names,\
                                               input_feed={sess_input[0].name: img_data})
    
    return scores[0]

scores = get_predictions_from_ONNX(session, img_data)

def get_predictions_from_ONNX(onnx_session,img_data):
    """Perform predictions with ONNX runtime
    
    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: scores with shapes
            (1, No. of classes in training dataset) 
    :rtype: numpy array
    """
    
    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    print(f"No. of inputs : {len(sess_input)}, No. of outputs : {len(sess_output)}")    
    # predict with ONNX Runtime
    output_names = [ output.name for output in sess_output]
    scores = onnx_session.run(output_names=output_names,\
                                               input_feed={sess_input[0].name: img_data})
    
    return scores[0]

scores = get_predictions_from_ONNX(session, img_data)

def get_predictions_from_ONNX(onnx_session, img_data):
    """perform predictions with ONNX runtime
    
    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: boxes, labels , scores 
            (No. of boxes, 4) (No. of boxes,) (No. of boxes,)
    :rtype: tuple
    """

    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    
    # predict with ONNX Runtime
    output_names = [output.name for output in sess_output]
    predictions = onnx_session.run(output_names=output_names,\
                                               input_feed={sess_input[0].name: img_data})

    return output_names, predictions

output_names, predictions = get_predictions_from_ONNX(session, img_data)

def get_predictions_from_ONNX(onnx_session,img_data):
    """perform predictions with ONNX Runtime
    
    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: boxes, labels , scores 
    :rtype: list
    """
    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    # predict with ONNX Runtime
    output_names = [ output.name for output in sess_output]
    pred = onnx_session.run(output_names=output_names,\
                                               input_feed={sess_input[0].name: img_data})
    return pred[0]

result = get_predictions_from_ONNX(session, img_data)

实例分段模型预测框、标签、分数和掩码。 ONNX 输出每个实例的预测掩码，以及相应的边界框和类置信度分数。必要时，可能需要从二进制掩码转换为 polygon。


def get_predictions_from_ONNX(onnx_session, img_data):
    """Perform predictions with ONNX runtime
    
    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: boxes, labels , scores , masks with shapes
            (No. of instances, 4) (No. of instances,) (No. of instances,)
            (No. of instances, 1, HEIGHT, WIDTH))  
    :rtype: tuple
    """
    
    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    # predict with ONNX Runtime
    output_names = [ output.name for output in sess_output]
    predictions = onnx_session.run(output_names=output_names,\
                                               input_feed={sess_input[0].name: img_data})
    return output_names, predictions

output_names, predictions = get_predictions_from_ONNX(session, img_data)

后期处理

对 softmax() 应用预测值，以获取每个类的分类置信度分数（概率）。然后，将预测出概率最高的类。

无 PyTorch

def softmax(x):
    e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return e_x / np.sum(e_x, axis=1, keepdims=True)

conf_scores = softmax(scores)
class_preds = np.argmax(conf_scores, axis=1)
print("predicted classes:", ([(class_idx, classes[class_idx]) for class_idx in class_preds]))

有 PyTorch

conf_scores = torch.nn.functional.softmax(torch.from_numpy(scores), dim=1)
class_preds = torch.argmax(conf_scores, dim=1)
print("predicted classes:", ([(class_idx.item(), classes[class_idx]) for class_idx in class_preds]))

该步骤不同于多类分类。需要将 sigmoid 应用于 logit（ONNX 输出），以获取多标签图像分类的置信度分数。

无 PyTorch

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# we apply a threshold of 0.5 on confidence scores
score_threshold = 0.5
conf_scores = sigmoid(scores)
image_wise_preds = np.where(conf_scores > score_threshold)
for image_idx, class_idx in zip(image_wise_preds[0], image_wise_preds[1]):
    print('image: {}, class_index: {}, class_name: {}'.format(image_files[image_idx], class_idx, classes[class_idx]))

有 PyTorch

# we apply a threshold of 0.5 on confidence scores
score_threshold = 0.5
conf_scores = torch.sigmoid(torch.from_numpy(scores))
image_wise_preds = torch.where(conf_scores > score_threshold)
for image_idx, class_idx in zip(image_wise_preds[0], image_wise_preds[1]):
    print('image: {}, class_index: {}, class_name: {}'.format(image_files[image_idx], class_idx, classes[class_idx]))

对于多类和多标签分类，可以对 AutoML 中支持的所有模型体系结构执行上述步骤。

对于对象检测，预测将自动采用 height_onnx、width_onnx 的比例。若要将预测框坐标转换为原始尺寸，可以执行以下计算。

Xmin * original_width/width_onnx
Ymin * original_height/height_onnx
Xmax * original_width/width_onnx
Ymax * original_height/height_onnx

另一种方法是使用以下代码将框尺寸缩放到 [0, 1] 的范围内。这样做，可以用框坐标与原始图像相应的高度和宽度坐标相乘（如将预测结果可视化部分所述），以获取原始图像维度中的框。

def _get_box_dims(image_shape, box):
    box_keys = ['topX', 'topY', 'bottomX', 'bottomY']
    height, width = image_shape[0], image_shape[1]

    box_dims = dict(zip(box_keys, [coordinate.item() for coordinate in box]))

    box_dims['topX'] = box_dims['topX'] * 1.0 / width
    box_dims['bottomX'] = box_dims['bottomX'] * 1.0 / width
    box_dims['topY'] = box_dims['topY'] * 1.0 / height
    box_dims['bottomY'] = box_dims['bottomY'] * 1.0 / height

    return box_dims

def _get_prediction(boxes, labels, scores, image_shape, classes):
    bounding_boxes = []
    for box, label_index, score in zip(boxes, labels, scores):
        box_dims = _get_box_dims(image_shape, box)

        box_record = {'box': box_dims,
                      'label': classes[label_index],
                      'score': score.item()}

        bounding_boxes.append(box_record)

    return bounding_boxes

# Filter the results with threshold.
# Please replace the threshold for your test scenario.
score_threshold = 0.8
filtered_boxes_batch = []
for batch_sample in range(0, batch_size*3, 3):
    # in case of retinanet change the order of boxes, labels, scores to boxes, scores, labels
    # confirm the same from order of boxes, labels, scores output_names 
    boxes, labels, scores = predictions[batch_sample], predictions[batch_sample + 1], predictions[batch_sample + 2]
    bounding_boxes = _get_prediction(boxes, labels, scores, (height_onnx, width_onnx), classes)
    filtered_bounding_boxes = [box for box in bounding_boxes if box['score'] >= score_threshold]
    filtered_boxes_batch.append(filtered_bounding_boxes)

以下代码创建框、标签和分数。使用这些边界框详细信息执行对 Faster R-CNN 模型所执行的后期处理步骤。

from yolo_onnx_preprocessing_utils import non_max_suppression, _convert_to_rcnn_output

result_final = non_max_suppression(
    torch.from_numpy(result),
    conf_thres=0.1,
    iou_thres=0.5)

def _get_box_dims(image_shape, box):
    box_keys = ['topX', 'topY', 'bottomX', 'bottomY']
    height, width = image_shape[0], image_shape[1]

    box_dims = dict(zip(box_keys, [coordinate.item() for coordinate in box]))

    box_dims['topX'] = box_dims['topX'] * 1.0 / width
    box_dims['bottomX'] = box_dims['bottomX'] * 1.0 / width
    box_dims['topY'] = box_dims['topY'] * 1.0 / height
    box_dims['bottomY'] = box_dims['bottomY'] * 1.0 / height

    return box_dims

def _get_prediction(label, image_shape, classes):
    
    boxes = np.array(label["boxes"])
    labels = np.array(label["labels"])
    labels = [label[0] for label in labels]
    scores = np.array(label["scores"])
    scores = [score[0] for score in scores]

    bounding_boxes = []
    for box, label_index, score in zip(boxes, labels, scores):
        box_dims = _get_box_dims(image_shape, box)

        box_record = {'box': box_dims,
                      'label': classes[label_index],
                      'score': score.item()}

        bounding_boxes.append(box_record)

    return bounding_boxes

bounding_boxes_batch = []
for result_i, pad in zip(result_final, pad_list):
    label, image_shape = _convert_to_rcnn_output(result_i, height_onnx, width_onnx, pad)
    bounding_boxes_batch.append(_get_prediction(label, image_shape, classes))
print(json.dumps(bounding_boxes_batch, indent=1))

将预测结果可视化

使用标签将输入图像可视化。

import matplotlib.image as mpimg
import matplotlib.pyplot as plt
%matplotlib inline

sample_image_index = 0 # change this for an image of interest from image_files list
IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img_np = mpimg.imread(image_files[sample_image_index])

img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size

fig,ax = plt.subplots(1, figsize=(15, 15))
# Display the image
ax.imshow(img_np)

label = class_preds[sample_image_index]
if torch.is_tensor(label):
    label = label.item()
    
conf_score = conf_scores[sample_image_index]
if torch.is_tensor(conf_score):
    conf_score = np.max(conf_score.tolist())
else:
    conf_score = np.max(conf_score)

display_text = '{} ({})'.format(label, round(conf_score, 3))
print(display_text)

color = 'red'
plt.text(30, 30, display_text, color=color, fontsize=30)

plt.show()

使用标签将输入图像可视化。

import matplotlib.image as mpimg
import matplotlib.pyplot as plt
%matplotlib inline

sample_image_index = 0 # change this for an image of interest from image_files list
IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img_np = mpimg.imread(image_files[sample_image_index])
img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size

fig,ax = plt.subplots(1, figsize=(15, 15))
# Display the image
ax.imshow(img_np)
# we apply a threshold of 0.5 on confidence scores
score_threshold = 0.5
label_offset_x = 30
label_offset_y = 30
if torch.is_tensor(conf_scores):
    sample_image_scores = conf_scores[sample_image_index].tolist()
else:
    sample_image_scores = conf_scores[sample_image_index]
    
for index, score in enumerate(sample_image_scores):
    if score > score_threshold:
        label = classes[index]
        display_text = '{} ({})'.format(label, round(score, 3))
        print(display_text)

        color = 'red'
        plt.text(label_offset_x, label_offset_y, display_text, color=color, fontsize=30)
        label_offset_y += 30

plt.show()

使用框和标签将输入图像可视化。

import matplotlib.image as mpimg
import matplotlib.patches as patches
import matplotlib.pyplot as plt
%matplotlib inline

img_np = mpimg.imread(image_files[1])  # replace with desired image index
image_boxes = filtered_boxes_batch[1]  # replace with desired image index

IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size
print(img.size)

fig,ax = plt.subplots(1)
# Display the image
ax.imshow(img_np)

# Draw box and label for each detection 
for detect in image_boxes:
    label = detect['label']
    box = detect['box']
    ymin, xmin, ymax, xmax =  box['topY'], box['topX'], box['bottomY'], box['bottomX']
    topleft_x, topleft_y = x * xmin, y * ymin
    width, height = x * (xmax - xmin), y * (ymax - ymin)
    print('{}: {}, {}, {}, {}'.format(detect['label'], topleft_x, topleft_y, width, height))
    rect = patches.Rectangle((topleft_x, topleft_y), width, height, 
                             linewidth=1, edgecolor='green', facecolor='none')

    ax.add_patch(rect)
    color = 'green'
    plt.text(topleft_x, topleft_y, label, color=color)

plt.show()

使用框和标签将输入图像可视化。

import matplotlib.image as mpimg
import matplotlib.patches as patches
import matplotlib.pyplot as plt
%matplotlib inline

img_np = mpimg.imread(image_files[1])  # replace with desired image index
image_boxes = bounding_boxes_batch[1]  # replace with desired image index

IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size
print(img.size)

fig,ax = plt.subplots(1)
# Display the image
ax.imshow(img_np)

# Draw box and label for each detection 
for detect in image_boxes:
    label = detect['label']
    box = detect['box']
    ymin, xmin, ymax, xmax =  box['topY'], box['topX'], box['bottomY'], box['bottomX']
    topleft_x, topleft_y = x * xmin, y * ymin
    width, height = x * (xmax - xmin), y * (ymax - ymin)
    print('{}: {}, {}, {}, {}'.format(detect['label'], topleft_x, topleft_y, width, height))
    rect = patches.Rectangle((topleft_x, topleft_y), width, height, 
                             linewidth=1, edgecolor='green', facecolor='none')

    ax.add_patch(rect)
    color = 'green'
    plt.text(topleft_x, topleft_y, label, color=color)

plt.show()

使用掩码和标签将示例输入图像可视化。

import matplotlib.patches as patches
import matplotlib.pyplot as plt
%matplotlib inline

def display_detections(image, boxes, labels, scores, masks, resize_height, 
                       resize_width, classes, score_threshold):
    """Visualize boxes and masks
    
    :param image: raw image
    :type image: PIL image
    :param boxes: box with shape (No. of instances, 4) 
    :type boxes: ndarray 
    :param labels: classes with shape (No. of instances,) 
    :type labels: ndarray
    :param scores: scores with shape (No. of instances,)
    :type scores: ndarray
    :param masks: masks with shape (No. of instances, 1, HEIGHT, WIDTH) 
    :type masks:  ndarray
    :param resize_height: expected height of an input image in onnx model
    :type resize_height: Int
    :param resize_width: expected width of an input image in onnx model
    :type resize_width: Int
    :param classes: classes with shape (No. of classes) 
    :type classes:  list
    :param score_threshold: threshold on scores in the range of 0-1
    :type score_threshold: float
    :return: None
    """

    _, ax = plt.subplots(1, figsize=(12,9))

    image = np.array(image)
    original_height = image.shape[0]
    original_width = image.shape[1]

    for mask, box, label, score in zip(masks, boxes, labels, scores):        
        if score <= score_threshold:
            continue
        mask = mask[0, :, :, None]        
        # resize boxes to original raw input size
        box = [box[0]*original_width/resize_width, 
               box[1]*original_height/resize_height, 
               box[2]*original_width/resize_width, 
               box[3]*original_height/resize_height]
        
        mask = cv2.resize(mask, (image.shape[1], image.shape[0]), 0, 0, interpolation = cv2.INTER_NEAREST)
        # mask is a matrix with values in the range of [0,1]
        # higher values indicate presence of object and vice versa
        # select threshold or cut-off value to get objects present       
        mask = mask > score_threshold
        image_masked = image.copy()
        image_masked[mask] = (0, 255, 255)
        alpha = 0.5  # alpha blending with range 0 to 1
        cv2.addWeighted(image_masked, alpha, image, 1 - alpha,0, image)
        rect = patches.Rectangle((box[0], box[1]), box[2] - box[0], box[3] - box[1],\
                                 linewidth=1, edgecolor='b', facecolor='none')
        ax.annotate(classes[label] + ':' + str(np.round(score, 2)), (box[0], box[1]),\
                    color='w', fontsize=12)
        ax.add_patch(rect)
        
    ax.imshow(image)
    plt.show()

score_threshold = 0.5
img = Image.open(image_files[1])  # replace with desired image index
image_boxes = filtered_boxes_batch[1]  # replace with desired image index
boxes, labels, scores, masks = predictions[4:8]  # replace with desired image index
display_detections(img, boxes.copy(), labels, scores, masks.copy(), 
                   height_onnx, width_onnx, classes, score_threshold)

通过

使用 ONNX 对来自 AutoML 的计算机视觉模型进行预测

先决条件

下载 ONNX 模型文件

Azure 机器学习工作室

Azure 机器学习 Python SDK

生成模型进行批量评分

加载标签和 ONNX 模型文件

获取 ONNX 模型的预期输入和输出详细信息

ONNX 模型的预期输入和输出格式

输入格式

输出格式

输入格式

输出格式

输入格式

输出格式

输入格式

输出格式

输入格式

输出格式

预处理

无 PyTorch

有 PyTorch

无 PyTorch

有 PyTorch

使用 ONNX 运行时进行推理

后期处理

无 PyTorch

有 PyTorch

无 PyTorch

有 PyTorch

将预测结果可视化

后续步骤

其他资源