Store access credentials securely on the Azure Data Science Virtual Machine

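Cloud application code often contains credentials used to authenticate to cloud services. Managing and securing these credentials is a well-known challenge when building cloud applications. Ideally, credentials should never appear on developer workstations, and they should never be checked in to source control.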

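The managed identities for Azure resources feature helps solve this problem. It gives Azure services an automatically managed identity in Microsoft Entra ID. You can use this identity to authenticate to any service that supports Microsoft Entra authentication, without embedding any credentials in your code.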

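To secure credentials, combine a managed identity (MSI) with Azure Key Vault. Azure Key Vault is a managed Azure service that securely stores secrets and cryptographic keys. You can access a key vault by using the managed identity, and then retrieve the authorized secrets and cryptographic keys from that key vault.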

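The documentation for managed identities for Azure resources and for Key Vault is the comprehensive resource for in-depth information about these services. This article covers the basic use of MSI and Key Vault on the Data Science Virtual Machine (DSVM) to access Azure resources.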

Create a managed identity on the DSVM

# Prerequisite: You already created a Data Science VM in the usual way.

# Create an identity principal for the VM.
az vm identity assign -g <Resource Group Name> -n <Name of the VM>
# Get the principal ID of the DSVM.
az resource list -n <Name of the VM> --query "[*].identity.principalId" --out tsv

Assign Key Vault access permissions to the VM principal

# Prerequisite: You already created an empty Key Vault resource on Azure through use of the Azure portal or Azure CLI.

# Assign only the get and set secret permissions, not the permission to list secrets.
az keyvault set-policy --object-id <Principal ID of the DSVM from previous step> --name <Key Vault Name> -g <Resource Group of Key Vault> --secret-permissions get set

Access a secret in the Key Vault from the DSVM

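# Get the access token for the VM from the Azure Instance Metadata Service (IMDS).
# IMDS replaces the retired VM-extension endpoint at http://localhost:50342/oauth2/token.
x=$(curl -s 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fvault.azure.cn' -H Metadata:true)
token=$(echo "$x" | python -c "import sys, json; print(json.load(sys.stdin)['access_token'])")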

# Access the key vault by using the access token. The URL is quoted so that the shell does not interpret the "?".
curl "https://<Vault Name>.vault.azure.cn/secrets/SQLPasswd?api-version=2016-10-01" -H "Authorization: Bearer $token"
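
The same two steps (request a token, then call the Key Vault REST API) can be sketched in Python with only the standard library. This is a minimal sketch, not the article's own method: it assumes the token is served by the Azure Instance Metadata Service at 169.254.169.254 with api-version 2018-02-01, and the helper names (build_token_url, parse_access_token, build_secret_url, get_secret_value) are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the IMDS token endpoint, reachable only from inside an Azure VM.
IMDS_TOKEN_ENDPOINT = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_token_url(resource, api_version="2018-02-01"):
    """Return the IMDS URL that requests a token for the given resource."""
    query = urllib.parse.urlencode(
        {"api-version": api_version, "resource": resource}
    )
    return IMDS_TOKEN_ENDPOINT + "?" + query


def parse_access_token(body):
    """Extract the bearer token from the JSON token response."""
    return json.loads(body)["access_token"]


def build_secret_url(vault_name, secret_name, api_version="2016-10-01"):
    """Return the REST URL of the latest version of a secret."""
    return "https://{}.vault.azure.cn/secrets/{}?api-version={}".format(
        vault_name, secret_name, api_version
    )


def get_secret_value(vault_name, secret_name):
    """Fetch a secret value. Works only on a VM that has a managed identity."""
    token_request = urllib.request.Request(
        build_token_url("https://vault.azure.cn"),
        headers={"Metadata": "true"},
    )
    with urllib.request.urlopen(token_request, timeout=10) as response:
        token = parse_access_token(response.read().decode("utf-8"))

    secret_request = urllib.request.Request(
        build_secret_url(vault_name, secret_name),
        headers={"Authorization": "Bearer " + token},
    )
    with urllib.request.urlopen(secret_request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))["value"]
```

On the DSVM, get_secret_value("<Vault Name>", "SQLPasswd") would return the secret string, provided the VM identity holds the get permission on the vault.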

Access storage keys from the DSVM

# Prerequisite: You have granted your VM's MSI access to the storage account access keys by following the instructions at https://learn.microsoft.com/active-directory/managed-service-identity/tutorial-linux-vm-access-storage. That article describes the process in more detail.

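# Get an access token for Azure Resource Manager from the Azure Instance Metadata Service (IMDS).
# IMDS replaces the retired VM-extension endpoint at http://localhost:50342/oauth2/token.
y=$(curl -s 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.chinacloudapi.cn%2F' -H Metadata:true)
ytoken=$(echo "$y" | python -c "import sys, json; print(json.load(sys.stdin)['access_token'])")

# List the storage account keys by using the access token.
curl "https://management.chinacloudapi.cn/subscriptions/<SubscriptionID>/resourceGroups/<ResourceGroup of Storage account>/providers/Microsoft.Storage/storageAccounts/<Storage Account Name>/listKeys?api-version=2016-12-01" --request POST -d "" -H "Authorization: Bearer $ytoken"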

# Now you can access the data in the storage account from the retrieved storage account keys.
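
To make the last step concrete, the following Python sketch turns the JSON body returned by listKeys into something a storage client can use. It is a sketch under stated assumptions: the response is expected to have the shape {"keys": [{"keyName": ..., "value": ...}]}, the connection string uses the Azure China endpoint suffix core.chinacloudapi.cn, and the helper names (extract_keys, build_connection_string) are hypothetical.

```python
import json


def extract_keys(list_keys_body):
    """Map each key name (for example "key1") to its value.

    Assumes the listKeys response shape
    {"keys": [{"keyName": "...", "permissions": "...", "value": "..."}]}.
    """
    payload = json.loads(list_keys_body)
    return {entry["keyName"]: entry["value"] for entry in payload["keys"]}


def build_connection_string(account_name, account_key,
                            endpoint_suffix="core.chinacloudapi.cn"):
    """Build a storage connection string for the given account and key.

    The default endpoint suffix is an assumption that matches the
    Azure China endpoints used elsewhere in this article.
    """
    return (
        "DefaultEndpointsProtocol=https;"
        "AccountName={};AccountKey={};EndpointSuffix={}"
    ).format(account_name, account_key, endpoint_suffix)


if __name__ == "__main__":
    # Placeholder response body; real key values are base64 strings.
    sample = json.dumps({
        "keys": [
            {"keyName": "key1", "permissions": "Full", "value": "PLACEHOLDER1"},
            {"keyName": "key2", "permissions": "Full", "value": "PLACEHOLDER2"},
        ]
    })
    keys = extract_keys(sample)
    print(build_connection_string("<Storage Account Name>", keys["key1"]))
```

Treat the resulting connection string as a secret in its own right: keep it in memory or in the key vault rather than writing it to disk or to source control.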

Access the Key Vault from Python

"""MSI authentication example."""

# Requires the legacy azure-keyvault package (versions earlier than 4.0) and msrestazure.
from azure.keyvault import KeyVaultClient
from msrestazure.azure_active_directory import MSIAuthentication

# Get credentials.
credentials = MSIAuthentication(
    resource='https://vault.azure.cn'
)

# Create a Key Vault client.
key_vault_client = KeyVaultClient(
    credentials
)

key_vault_uri = "https://<Key Vault Name>.vault.azure.cn/"

secret = key_vault_client.get_secret(
    key_vault_uri,  # Your key vault URL.
    # The name of your secret that already exists in the key vault.
    "SQLPasswd",
    ""              # The version of the secret; empty string for latest.
)
print("My secret value is {}".format(secret.value))
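
Newer SDK releases replace KeyVaultClient and MSIAuthentication with the azure-identity and azure-keyvault-secrets packages. The following is a minimal sketch of the same lookup, assuming both packages are installed (pip install azure-identity azure-keyvault-secrets) and the code runs on a VM with a managed identity. The helper names vault_url and read_secret are hypothetical, and the SDK imports sit inside the function so that the URL helper can be used without the packages.

```python
def vault_url(vault_name):
    """Return the base URL of a key vault in Azure China."""
    return "https://{}.vault.azure.cn".format(vault_name)


def read_secret(vault_name, secret_name):
    """Return the latest value of a secret by using the VM's managed identity."""
    # Imported here so this module still loads where the SDK is not installed.
    from azure.identity import ManagedIdentityCredential
    from azure.keyvault.secrets import SecretClient

    client = SecretClient(
        vault_url=vault_url(vault_name),
        credential=ManagedIdentityCredential(),
    )
    # get_secret returns the latest version when no version is given.
    return client.get_secret(secret_name).value
```

On the DSVM, read_secret("<Key Vault Name>", "SQLPasswd") would return the secret string; avoid printing it outside of a quick test.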

Access the Key Vault from Azure CLI

# With managed identities for Azure resources set up on the DSVM, users on the DSVM can use Azure CLI to perform the authorized functions. The following commands enable access to the key vault from Azure CLI, without a required Azure account login.
# Prerequisites: MSI is already set up on the DSVM, as indicated earlier. Specific permissions, like accessing storage account keys, reading specific secrets, and writing new secrets, are provided to the MSI.

# Sign in to Azure CLI with the VM's managed identity; no Azure account is required.
az login --identity

# Retrieve a secret from the key vault.
az keyvault secret show --vault-name <Vault Name> --name SQLPasswd

# Create a new secret in the key vault.
az keyvault secret set --name MySecret --vault-name <Vault Name> --value "Helloworld"

# List access keys for the storage account.
az storage account keys list -g <Storage Account Resource Group> -n <Storage Account Name>