Set up MLOps with Azure DevOps
APPLIES TO: Azure CLI ml extension v2 (current), Python SDK azure-ai-ml v2 (current)
Azure Machine Learning allows you to integrate with Azure DevOps pipelines to automate the machine learning lifecycle. Some of the operations you can automate are:
- Deployment of Azure Machine Learning infrastructure
- Data preparation (extract, transform, load operations)
- Training machine learning models with on-demand scale-out and scale-up
- Deployment of machine learning models as public or private web services
- Monitoring deployed machine learning models (such as for performance analysis)
In this article, you learn about using Azure Machine Learning to set up an end-to-end MLOps pipeline that runs a linear regression to predict taxi fares in NYC. The pipeline is made up of components, each serving different functions, which can be registered with the workspace, versioned, and reused with various inputs and outputs. You're going to be using the recommended Azure architecture for MLOps and the Azure MLOps (v2) solution accelerator to quickly set up an MLOps project in Azure Machine Learning.
Tip
We recommend you understand some of the recommended Azure architectures for MLOps before implementing any solution. You'll need to pick the best architecture for your given Machine learning project.
Prerequisites
- An Azure subscription. If you don't have an Azure subscription, create a trial subscription before you begin.
- An Azure Machine Learning workspace.
- Git running on your local machine.
- An organization in Azure DevOps.
- An Azure DevOps project that will host the source repositories and pipelines.
- The Terraform extension for Azure DevOps if you're using Azure DevOps + Terraform to spin up infrastructure
Note
Git version 2.27 or newer is required. For more information on installing the Git command, see https://git-scm.com/downloads and select your operating system
Important
The CLI commands in this article were tested using Bash. If you use a different shell, you may encounter errors.
Set up authentication with Azure and DevOps
Before you can set up an MLOps project with Azure Machine Learning, you need to set up authentication for Azure DevOps.
Create service principal
For this demo, you need to create one or two service principals, depending on how many environments you want to work in (Dev, Prod, or both). You can create these principals in the Azure portal:
Navigate to Azure App Registrations
Select New Registration.
Go through the process of creating a service principal (SP), selecting Accounts in any organizational directory (Any Microsoft Entra directory - Multitenant), and name it Azure-ARM-Dev-ProjectName. Once created, repeat and create a new SP named Azure-ARM-Prod-ProjectName. Replace ProjectName with the name of your project so that the service principal can be uniquely identified.
Go to Certificates & Secrets and, for each SP, add a New client secret, then store the value and secret separately.
To assign the necessary permissions to these principals, select your respective subscription and go to IAM. Select +Add then select Add Role Assignment.
Select Contributor and add members by selecting + Select Members. Add the member Azure-ARM-Dev-ProjectName as created before.
Repeat this step if you deploy Dev and Prod into the same subscription; otherwise, change to the Prod subscription and repeat with Azure-ARM-Prod-ProjectName. The basic SP setup is successfully finished.
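If you prefer the command line, the same setup can be done with the Azure CLI. The following is a minimal sketch, assuming you sign in with an account that can assign the Contributor role on the subscription; note the appId, password, and tenant values in the output, because the Azure DevOps service connection in the next section asks for them.

```bash
# Sign in and select the subscription that hosts the Dev environment.
az login
az account set --subscription "<your-dev-subscription-id>"

# Create the Dev service principal with Contributor rights on the subscription.
az ad sp create-for-rbac \
  --name "Azure-ARM-Dev-ProjectName" \
  --role contributor \
  --scopes "/subscriptions/$(az account show --query id --output tsv)"

# Repeat for Prod (switch to the Prod subscription first if it's a different one).
az ad sp create-for-rbac \
  --name "Azure-ARM-Prod-ProjectName" \
  --role contributor \
  --scopes "/subscriptions/$(az account show --query id --output tsv)"
```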
Set up Azure DevOps
Navigate to Azure DevOps.
Select Create a new project (name the project mlopsv2 for this tutorial).
In the project, under Project Settings (at the bottom left of the project page), select Service Connections.
Select Create Service Connection.
Select Azure Resource Manager, select Next, select Service principal (manual), select Next and select the Scope Level Subscription.
- Subscription Name - Use the name of the subscription where your service principal is stored.
- Subscription Id - Use the subscriptionId you used in Step 1 input as the Subscription ID.
- Service Principal Id - Use the appId from Step 1 output as the Service Principal ID.
- Service principal key - Use the password from Step 1 output as the Service Principal Key.
- Tenant ID - Use the tenant from Step 1 output as the Tenant ID.
Name the service connection Azure-ARM-Prod.
Select Grant access permission to all pipelines, then select Verify and Save.
The Azure DevOps setup is successfully finished.
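The service connection can also be created from the command line. The following is a sketch using the Azure DevOps CLI extension; the organization URL, project name, and placeholder values are assumptions to adjust for your setup, and the service principal key is passed through an environment variable rather than a flag.

```bash
# Requires the Azure DevOps CLI extension: az extension add --name azure-devops
# The service principal key is read from this environment variable.
export AZURE_DEVOPS_EXT_AZURE_RM_SERVICE_PRINCIPAL_KEY="<password-from-step-1>"

az devops service-endpoint azurerm create \
  --name "Azure-ARM-Prod" \
  --azure-rm-service-principal-id "<appId-from-step-1>" \
  --azure-rm-subscription-id "<subscription-id>" \
  --azure-rm-subscription-name "<subscription-name>" \
  --azure-rm-tenant-id "<tenant-from-step-1>" \
  --organization "https://dev.azure.com/<your-organization>" \
  --project "mlopsv2"
```

You still need to grant the connection access permission to all pipelines, as described above.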
Set up source repository with Azure DevOps
Open the project you created in Azure DevOps
Open the Repos section and select Import Repository
Enter https://github.com/Azure/mlops-v2-ado-demo into the Clone URL field. Select Import at the bottom of the page.
Open the Project settings at the bottom of the left hand navigation pane
Under the Repos section, select Repositories. Select the repository you created in the previous step, then select the Security tab.
Under the User permissions section, select the mlopsv2 Build Service user. Change the Contribute permission to Allow and the Create branch permission to Allow.
Open the Pipelines section in the left-hand navigation pane and select the three vertical dots next to the Create Pipelines button. Select Manage Security.
Select the mlopsv2 Build Service account for your project under the Users section. Change the Edit build pipeline permission to Allow.
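The project creation and repository import above can also be scripted with the Azure DevOps CLI extension. The following is a minimal sketch, assuming the organization URL placeholder below; the build service permission changes still need to be made in the portal as described in the previous steps.

```bash
# Requires the Azure DevOps CLI extension: az extension add --name azure-devops
ORG_URL="https://dev.azure.com/<your-organization>"

# Create the project; a default Git repository named mlopsv2 is created with it.
az devops project create --name mlopsv2 --organization "$ORG_URL"

# Import the demo repository into that default repository.
az repos import create \
  --git-source-url "https://github.com/Azure/mlops-v2-ado-demo" \
  --repository mlopsv2 \
  --project mlopsv2 \
  --organization "$ORG_URL"
```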
Note
This finishes the prerequisites. You can now deploy the solution accelerator.
Deploying infrastructure via Azure DevOps
This step uses an Azure DevOps pipeline to deploy the Azure infrastructure for your MLOps project, including the Azure Machine Learning workspace used in the following steps.
Tip
Make sure you understand the Architectural Patterns of the solution accelerator before you check out the MLOps v2 repo and deploy the infrastructure. In the examples, you'll use the classical ML project type.
Run Azure infrastructure pipeline
Go to your repository, mlops-v2-ado-demo, and select the config-infra-prod.yml file.
Important
Make sure you've selected the main branch of the repo.
This config file uses the namespace and postfix values in the names of the artifacts to ensure uniqueness. Update the following section in the config to your liking.
```yaml
namespace: [5 max random new letters]
postfix: [4 max random new digits]
location: chinaeast2
```
Note
If you are running a Deep Learning workload such as CV or NLP, ensure your GPU compute is available in your deployment zone.
Select Commit and push code to get these values into the pipeline.
Go to Pipelines section
Select Create Pipeline.
Select Azure Repos Git.
Select the repository that you cloned in the previous section: mlops-v2-ado-demo.
Select Existing Azure Pipelines YAML file
Select the main branch and choose mlops/devops-pipelines/cli-ado-deploy-infra.yml, then select Continue.
Run the pipeline; it will take a few minutes to finish. The pipeline should create the following artifacts:
- Resource group for your workspace, including a storage account, container registry, Application Insights, key vault, and the Azure Machine Learning workspace itself.
- A compute cluster is also created in the workspace.
Now the infrastructure for your MLOps project is deployed.
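You can confirm the deployment from your terminal with the Azure CLI ml extension. The resource group and workspace names below are placeholders; use the names that the infrastructure pipeline generated from your namespace and postfix values.

```bash
# Placeholders: substitute the names generated from your namespace/postfix values.
RESOURCE_GROUP="<your-resource-group>"
WORKSPACE="<your-workspace>"

# Show the workspace and list the compute created by the infrastructure pipeline.
az ml workspace show --name "$WORKSPACE" --resource-group "$RESOURCE_GROUP"
az ml compute list --resource-group "$RESOURCE_GROUP" --workspace-name "$WORKSPACE" --output table
```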
Note
The "Unable move and reuse existing repository to required location" warnings may be ignored.
Sample Training and Deployment Scenario
The solution accelerator includes code and data for a sample end-to-end machine learning pipeline, which runs a linear regression to predict taxi fares in NYC. The pipeline is made up of components, each serving different functions, which can be registered with the workspace, versioned, and reused with various inputs and outputs. Sample pipelines and workflows for the computer vision and NLP scenarios have different training and deployment steps.
This training pipeline contains the following steps:
Prepare Data
- This component takes multiple taxi datasets (yellow and green), merges and filters the data, and prepares the train/val and evaluation datasets.
- Input: Local data under ./data/ (multiple .csv files)
- Output: Single prepared dataset (.csv) and train/val/test datasets.
Train Model
- This component trains a Linear Regressor with the training set.
- Input: Training dataset
- Output: Trained model (pickle format)
Evaluate Model
- This component uses the trained model to predict taxi fares on the test set.
- Input: ML model and Test dataset
- Output: Performance of the model and a deploy flag indicating whether to deploy it or not.
- This component compares the performance of the model with all previously deployed models on the new test dataset and decides whether or not to promote the model into production. Promoting the model into production happens by registering the model in the Azure Machine Learning workspace.
Register Model
- This component registers the trained model in the Azure Machine Learning workspace when the deploy flag from the evaluation step allows it.
- Input: Trained model and the deploy flag.
- Output: Registered model in Azure Machine Learning.
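If you want to iterate on this training pipeline before wiring it into Azure DevOps, you can also submit it directly with the Azure CLI ml extension. The pipeline file path below is an assumption about the accelerator's layout; adjust it to match your repository.

```bash
# Submit the sample training pipeline job to the workspace (path is illustrative).
az ml job create \
  --file mlops/azureml/train/pipeline.yml \
  --resource-group "<your-resource-group>" \
  --workspace-name "<your-workspace>"
```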
Deploying model training pipeline
Go to ADO pipelines
Select New Pipeline.
Select Azure Repos Git.
Select the repository that you cloned in the previous section: mlopsv2.
Select Existing Azure Pipelines YAML file
Select main as a branch and choose /mlops/devops-pipelines/deploy-model-training-pipeline.yml, then select Continue.
Save and run the pipeline.
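When the Azure DevOps run finishes, it submits a training pipeline job to the Azure Machine Learning workspace. You can track that job in the studio or, as a sketch, from the CLI:

```bash
# List recent jobs in the workspace to find the training pipeline run.
az ml job list \
  --resource-group "<your-resource-group>" \
  --workspace-name "<your-workspace>" \
  --output table

# Show details for a specific run once you have its name.
az ml job show \
  --name "<job-name>" \
  --resource-group "<your-resource-group>" \
  --workspace-name "<your-workspace>"
```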
Note
At this point, the infrastructure is configured and the Prototyping Loop of the MLOps Architecture is deployed. You're ready to move your trained model to production.
Deploying the Trained model
This scenario includes prebuilt workflows for two approaches to deploying a trained model: batch scoring, or deploying a model to an endpoint for real-time scoring. You can run either or both of these workflows to test the performance of the model in your Azure Machine Learning workspace. In this example, you'll use real-time scoring.
Deploy ML model endpoint
Go to ADO pipelines
Select New Pipeline.
Select Azure Repos Git.
Select the repository that you cloned in the previous section: mlopsv2.
Select Existing Azure Pipelines YAML file
Select main as a branch and choose the Managed Online Endpoint pipeline /mlops/devops-pipelines/deploy-online-endpoint-pipeline.yml, then select Continue.
Online endpoint names need to be unique, so change taxi-online-$(namespace)$(postfix)$(environment) to another unique name and then select Run. There's no need to change the default if it doesn't fail.
Important
If the run fails due to an existing online endpoint name, recreate the pipeline as described previously and change [your endpoint-name] to [your endpoint-name (random number)]
When the run completes, you'll see output similar to the following image:
To test this deployment, go to the Endpoints tab in your Azure Machine Learning workspace, select the endpoint, and select the Test tab. You can use the sample input data located in the cloned repo at /data/taxi-request.json to test the endpoint.
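You can also invoke the endpoint from the CLI with the same sample request file. This is a sketch; the endpoint name is whatever unique value you chose when running the deployment pipeline.

```bash
# Send the sample request to the deployed online endpoint and print the response.
az ml online-endpoint invoke \
  --name "<your-endpoint-name>" \
  --request-file data/taxi-request.json \
  --resource-group "<your-resource-group>" \
  --workspace-name "<your-workspace>"
```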
Clean up resources
- If you're not going to continue to use your pipeline, delete your Azure DevOps project.
- In the Azure portal, delete your resource group and Azure Machine Learning instance. A CLI sketch for both cleanup steps follows.
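Both cleanup steps can also be done from the command line. A minimal sketch, assuming the Azure DevOps CLI extension and the placeholder names used earlier:

```bash
# Delete the Azure resources created by the infrastructure pipeline.
az group delete --name "<your-resource-group>" --yes --no-wait

# Delete the Azure DevOps project (the delete command needs the project's id, not its name).
ORG_URL="https://dev.azure.com/<your-organization>"
PROJECT_ID=$(az devops project show --project mlopsv2 --organization "$ORG_URL" --query id --output tsv)
az devops project delete --id "$PROJECT_ID" --organization "$ORG_URL" --yes
```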
Next steps
- Install and set up Python SDK v2
- Install and set up Python CLI v2
- Azure MLOps (v2) solution accelerator on GitHub
- Learn more about Azure Pipelines with Azure Machine Learning
- Learn more about GitHub Actions with Azure Machine Learning