Quickstart: Create an Azure Data Factory using ARM template
APPLIES TO: Azure Data Factory Azure Synapse Analytics
Tip
Try out Data Factory in Microsoft Fabric, an all-in-one analytics solution for enterprises. Microsoft Fabric covers everything from data movement to data science, real-time analytics, business intelligence, and reporting. Learn how to start a new trial for free!
This quickstart describes how to use an Azure Resource Manager template (ARM template) to create an Azure data factory. The pipeline you create in this data factory copies data from one folder to another folder in an Azure blob storage. For a tutorial on how to transform data using Azure Data Factory, see Tutorial: Transform data using Spark.
An Azure Resource Manager template is a JavaScript Object Notation (JSON) file that defines the infrastructure and configuration for your project. The template uses declarative syntax. You describe your intended deployment without writing the sequence of programming commands to create the deployment.
Note
This article does not provide a detailed introduction of the Data Factory service. For an introduction to the Azure Data Factory service, see Introduction to Azure Data Factory.
If your environment meets the prerequisites and you're familiar with using ARM templates, select the Deploy to Azure button. The template will open in the Azure portal.
Prerequisites
Azure subscription
If you don't have an Azure subscription, create a trial account before you begin.
Create a file
Open a text editor such as Notepad, and create a file named emp.txt with the following content:
John, Doe
Jane, Doe
Save the file in the C:\ADFv2QuickStartPSH folder. (If the folder doesn't already exist, create it.)
Review template
The template used in this quickstart is from Azure Quickstart Templates.
{
"$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"metadata": {
"_generator": {
"name": "bicep",
"version": "0.4.1.14562",
"templateHash": "8367564219536411224"
}
},
"parameters": {
"dataFactoryName": {
"type": "string",
"defaultValue": "[format('datafactory{0}', uniqueString(resourceGroup().id))]",
"metadata": {
"description": "Data Factory Name"
}
},
"location": {
"type": "string",
"defaultValue": "[resourceGroup().location]",
"metadata": {
"description": "Location of the data factory."
}
},
"storageAccountName": {
"type": "string",
"defaultValue": "[format('storage{0}', uniqueString(resourceGroup().id))]",
"metadata": {
"description": "Name of the Azure storage account that contains the input/output data."
}
},
"blobContainerName": {
"type": "string",
"defaultValue": "[format('blob{0}', uniqueString(resourceGroup().id))]",
"metadata": {
"description": "Name of the blob container in the Azure Storage account."
}
}
},
"functions": [],
"variables": {
"dataFactoryLinkedServiceName": "ArmtemplateStorageLinkedService",
"dataFactoryDataSetInName": "ArmtemplateTestDatasetIn",
"dataFactoryDataSetOutName": "ArmtemplateTestDatasetOut",
"pipelineName": "ArmtemplateSampleCopyPipeline"
},
"resources": [
{
"type": "Microsoft.Storage/storageAccounts",
"apiVersion": "2021-04-01",
"name": "[parameters('storageAccountName')]",
"location": "[parameters('location')]",
"sku": {
"name": "Standard_LRS"
},
"kind": "StorageV2"
},
{
"type": "Microsoft.Storage/storageAccounts/blobServices/containers",
"apiVersion": "2021-04-01",
"name": "[format('{0}/default/{1}', parameters('storageAccountName'), parameters('blobContainerName'))]",
"dependsOn": [
"[resourceId('Microsoft.Storage/storageAccounts', parameters('storageAccountName'))]"
]
},
{
"type": "Microsoft.DataFactory/factories",
"apiVersion": "2018-06-01",
"name": "[parameters('dataFactoryName')]",
"location": "[parameters('location')]",
"identity": {
"type": "SystemAssigned"
}
},
{
"type": "Microsoft.DataFactory/factories/linkedservices",
"apiVersion": "2018-06-01",
"name": "[format('{0}/{1}', parameters('dataFactoryName'), variables('dataFactoryLinkedServiceName'))]",
"properties": {
"type": "AzureBlobStorage",
"typeProperties": {
"connectionString": "[format('DefaultEndpointsProtocol=https;AccountName={0};AccountKey={1};EndpointSuffix=core.chinacloudapi.cn', parameters('storageAccountName'), listKeys(resourceId('Microsoft.Storage/storageAccounts', parameters('storageAccountName')), '2021-04-01').keys[0].value)]"
}
},
"dependsOn": [
"[resourceId('Microsoft.DataFactory/factories', parameters('dataFactoryName'))]",
"[resourceId('Microsoft.Storage/storageAccounts', parameters('storageAccountName'))]"
]
},
{
"type": "Microsoft.DataFactory/factories/datasets",
"apiVersion": "2018-06-01",
"name": "[format('{0}/{1}', parameters('dataFactoryName'), variables('dataFactoryDataSetInName'))]",
"properties": {
"linkedServiceName": {
"referenceName": "[variables('dataFactoryLinkedServiceName')]",
"type": "LinkedServiceReference"
},
"type": "Binary",
"typeProperties": {
"location": {
"type": "AzureBlobStorageLocation",
"container": "[format('{0}/default/{1}', parameters('storageAccountName'), parameters('blobContainerName'))]",
"folderPath": "input",
"fileName": "emp.txt"
}
}
},
"dependsOn": [
"[resourceId('Microsoft.Storage/storageAccounts/blobServices/containers', split(format('{0}/default/{1}', parameters('storageAccountName'), parameters('blobContainerName')), '/')[0], split(format('{0}/default/{1}', parameters('storageAccountName'), parameters('blobContainerName')), '/')[1], split(format('{0}/default/{1}', parameters('storageAccountName'), parameters('blobContainerName')), '/')[2])]",
"[resourceId('Microsoft.DataFactory/factories', parameters('dataFactoryName'))]",
"[resourceId('Microsoft.DataFactory/factories/linkedservices', parameters('dataFactoryName'), variables('dataFactoryLinkedServiceName'))]"
]
},
{
"type": "Microsoft.DataFactory/factories/datasets",
"apiVersion": "2018-06-01",
"name": "[format('{0}/{1}', parameters('dataFactoryName'), variables('dataFactoryDataSetOutName'))]",
"properties": {
"linkedServiceName": {
"referenceName": "[variables('dataFactoryLinkedServiceName')]",
"type": "LinkedServiceReference"
},
"type": "Binary",
"typeProperties": {
"location": {
"type": "AzureBlobStorageLocation",
"container": "[format('{0}/default/{1}', parameters('storageAccountName'), parameters('blobContainerName'))]",
"folderPath": "output"
}
}
},
"dependsOn": [
"[resourceId('Microsoft.Storage/storageAccounts/blobServices/containers', split(format('{0}/default/{1}', parameters('storageAccountName'), parameters('blobContainerName')), '/')[0], split(format('{0}/default/{1}', parameters('storageAccountName'), parameters('blobContainerName')), '/')[1], split(format('{0}/default/{1}', parameters('storageAccountName'), parameters('blobContainerName')), '/')[2])]",
"[resourceId('Microsoft.DataFactory/factories', parameters('dataFactoryName'))]",
"[resourceId('Microsoft.DataFactory/factories/linkedservices', parameters('dataFactoryName'), variables('dataFactoryLinkedServiceName'))]"
]
},
{
"type": "Microsoft.DataFactory/factories/pipelines",
"apiVersion": "2018-06-01",
"name": "[format('{0}/{1}', parameters('dataFactoryName'), variables('pipelineName'))]",
"properties": {
"activities": [
{
"name": "MyCopyActivity",
"type": "Copy",
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"typeProperties": {
"source": {
"type": "BinarySource",
"storeSettings": {
"type": "AzureBlobStorageReadSettings",
"recursive": true
}
},
"sink": {
"type": "BinarySink",
"storeSettings": {
"type": "AzureBlobStorageWriterSettings"
}
},
"enableStaging": false
},
"inputs": [
{
"referenceName": "[variables('dataFactoryDataSetInName')]",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "[variables('dataFactoryDataSetOutName')]",
"type": "DatasetReference"
}
]
}
]
},
"dependsOn": [
"[resourceId('Microsoft.DataFactory/factories', parameters('dataFactoryName'))]",
"[resourceId('Microsoft.DataFactory/factories/datasets', parameters('dataFactoryName'), variables('dataFactoryDataSetInName'))]",
"[resourceId('Microsoft.DataFactory/factories/datasets', parameters('dataFactoryName'), variables('dataFactoryDataSetOutName'))]"
]
}
]
}
There are Azure resources defined in the template:
- Microsoft.Storage/storageAccounts: Defines a storage account.
- Microsoft.DataFactory/factories: Create an Azure Data Factory.
- Microsoft.DataFactory/factories/linkedServices: Create an Azure Data Factory linked service.
- Microsoft.DataFactory/factories/datasets: Create an Azure Data Factory dataset.
- Microsoft.DataFactory/factories/pipelines: Create an Azure Data Factory pipeline.
More Azure Data Factory template samples can be found in the quickstart template gallery.
Deploy the template
Select the following image to sign in to Azure and open a template. The template creates an Azure Data Factory account, a storage account, and a blob container.
Select or enter the following values.
Unless it's specified, use the default values to create the Azure Data Factory resources:
- Subscription: Select an Azure subscription.
- Resource group: Select Create new, enter a unique name for the resource group, and then select OK.
- Region: Select a location. For example, China East 2.
- Data Factory Name: Use default value.
- Location: Use default value.
- Storage Account Name: Use default value.
- Blob Container: Use default value.
Review deployed resources
Select Go to resource group.
Verify your Azure Data Factory is created.
- Your Azure Data Factory name is in the format - datafactory<uniqueid>.
Verify your storage account is created.
- The storage account name is in the format - storage<uniqueid>.
Select the storage account created and then select Containers.
- On the Containers page, select the blob container you created.
- The blob container name is in the format - blob<uniqueid>.
- On the Containers page, select the blob container you created.
Upload a file
On the Containers page, select Upload.
In te right pane, select the Files box, and then browse to and select the emp.txt file that you created earlier.
Expand the Advanced heading.
In the Upload to folder box, enter input.
Select the Upload button. You should see the emp.txt file and the status of the upload in the list.
Select the Close icon (an X) to close the Upload blob page.
Keep the container page open, because you can use it to verify the output at the end of this quickstart.
Start Trigger
Navigate to the Data factories page, and select the data factory you created.
Select Open on the Open Azure Data Factory Studio tile.
Select the Author tab .
Select the pipeline created - ArmtemplateSampleCopyPipeline.
Select Add Trigger > Trigger Now.
In the right pane under Pipeline run, select OK.
Monitor the pipeline
Select the Monitor tab .
You see the activity runs associated with the pipeline run. In this quickstart, the pipeline has only one activity of type: Copy. As such, you see a run for that activity.
Verify the output file
The pipeline automatically creates an output folder in the blob container. Then, it copies the emp.txt file from the input folder to the output folder.
In the Azure portal, on the Containers page, select Refresh to see the output folder.
Select output in the folder list.
Confirm that the emp.txt is copied to the output folder.
Clean up resources
You can clean up the resources that you created in the Quickstart in two ways. You can delete the Azure resource group, which includes all the resources in the resource group. If you want to keep the other resources intact, delete only the data factory you created in this tutorial.
Deleting a resource group deletes all resources including data factories in it. Run the following command to delete the entire resource group:
Remove-AzResourceGroup -ResourceGroupName $resourcegroupname
If you want to delete just the data factory, and not the entire resource group, run the following command:
Remove-AzDataFactoryV2 -Name $dataFactoryName -ResourceGroupName $resourceGroupName
Related content
In this quickstart, you created an Azure Data Factory using an ARM template and validated the deployment. To learn more about Azure Data Factory and Azure Resource Manager, continue on to the articles below.
- Azure Data Factory documentation
- Learn more about Azure Resource Manager
- Get other Azure Data Factory ARM templates