Encoding video and audio with Media Services

Media Services logo v3


Warning

Azure Media Services will be retired June 30th, 2024. For more information, see the AMS Retirement Guide.

Tip

Want to generate thumbnails, stitch two videos together, subclip a video or rotate it (among other things)? You can find Media Services sample code on the Samples page.

The term encoding in Media Services applies to the process of converting files containing digital video and/or audio from one standard format to another, with the purpose of (a) reducing the size of the files, and/or (b) producing a format that's compatible with a broad range of devices and apps. This process is also referred to as video compression, or transcoding. See the Data compression and the What Is Encoding and Transcoding? for further discussion of the concepts.

Videos are typically delivered to devices and apps by progressive download or through adaptive bitrate streaming.

transforms and jobs

Important

Media Services does not bill for canceled or jobs that throw errors. For example, a job that has reached 50% progress and is canceled is not billed at 50% of the job minutes. You are only charged for finished jobs.

  • To deliver by progressive download, you can use Azure Media Services to convert a digital media file (mezzanine) into an MP4 file, which contains video that's been encoded with the H.264 codec, and audio that's been encoded with the AAC codec. This MP4 file is written to an Asset in your storage account. You can use the Azure Storage APIs or SDKs (for example, Storage REST API or .NET SDK) to download the file directly. If you created the output Asset with a specific container name in storage, use that location. Otherwise, you can use Media Services to list the asset container URLs.
  • To prepare content for delivery by adaptive bitrate streaming, the mezzanine file needs to be encoded at multiple bitrates (high to low). To ensure graceful transition of quality, the resolution of the video is lowered as the bitrate is lowered. This results in a so-called encoding ladder–a table of resolutions and bitrates (see auto-generated adaptive bitrate ladder or use the content aware encoding preset). You can use Media Services to encode your mezzanine files at multiple bitrates. In doing so, you'll get a set of MP4 files and associated streaming configuration files written to an Asset in your storage account. You can then use the Dynamic Packaging capability in Media Services to deliver the video via streaming protocols like MPEG-DASH and HLS. This requires you to create a Streaming Locator and build streaming URLs corresponding to the supported protocols, which can then be handed off to devices/apps based on their capabilities.

Transforms and jobs

To encode with Media Services v3, you need to create a Transform and a Job. The transform defines a recipe for your encoding settings and outputs; the job is an instance of the recipe. For more information, see Transforms and Jobs.

When encoding with Media Services, you use presets to tell the encoder how the input media files should be processed. In Media Services v3, you use Standard Encoder to encode your files. For example, you can specify the video resolution and/or the number of audio channels you want in the encoded content.

You can get started quickly with one of the built-in presets based on industry best practices or you can choose to build a custom preset to target your specific scenario or device requirements.

Starting with January 2019, when encoding with the Standard Encoder to produce MP4 file(s), a new .mpi file is generated and added to the output Asset. This MPI file is intended to improve performance for dynamic packaging and streaming scenarios.

Note

You shouldn't modify or remove the MPI file, or take any dependency in your service on the existence (or not) of such a file.

Built-in presets

Media Services supports the following built-in encoding presets:

BuiltInStandardEncoderPreset

BuiltInStandardEncoderPreset is used to set a built-in preset for encoding the input video with the Standard Encoder.

The following built-in presets are currently supported:

  • EncoderNamedPreset.AACGoodQualityAudio: Produces a single MP4 file containing only stereo audio encoded at 192 kbps.

  • EncoderNamedPreset.AdaptiveStreaming: This supports H.264 adaptive bitrate encoding. For more information, see auto-generating a bitrate ladder.

  • EncoderNamedPreset.H265AdaptiveStreaming : Similar to the AdaptiveStreaming preset, but uses the HEVC (H.265) codec. Produces a set of GOP aligned MP4 files with H.265 video and stereo AAC audio. Auto-generates a bitrate ladder based on the input resolution, bitrate and frame rate. The auto-generated preset will never exceed the input resolution. For example, if the input is 720p, output will remain 720p at best.

  • EncoderNamedPreset.ContentAwareEncoding: Exposes a preset for H.264 content-aware encoding. Produces a set of GOP-aligned MP4s by using content-aware encoding. Given any input content, the service performs an initial lightweight analysis of the input content, and uses the results to determine the optimal number of layers, appropriate bitrate and resolution settings for delivery by adaptive streaming. This preset is particularly effective for low and medium complexity videos, where the output files will be at lower bitrates but at a quality that still delivers a good experience to viewers. The output will contain MP4 files with video and audio interleaved. This preset only produces output up to 1080P HD. If 4K output is required, you can configure the preset with PresetConfigurations by using the "maxHeight" property. For more information, see content-aware encoding.

  • EncoderNamedPreset.H265ContentAwareEncoding: Exposes a preset for HEVC (H.265) content-aware encoding. Produces a set of GOP-aligned MP4s by using content-aware encoding. Given any input content, the service performs an initial lightweight analysis of the input content, and uses the results to determine the optimal number of layers, appropriate bitrate and resolution settings for delivery by adaptive streaming. This preset is particularly effective for low and medium complexity videos, where the output files will be at lower bitrates but at a quality that still delivers a good experience to viewers. The output will contain MP4 files with video and audio interleaved. This preset produces output up to 4K HD. If 8K output is required, you can configure the preset with PresetConfigurations by using the "maxHeight" property.

  • EncoderNamedPreset.H264MultipleBitrate1080p: produces a set of eight GOP-aligned MP4 files, ranging from 6000 kbps to 400 kbps, and stereo AAC audio. Resolution starts at 1080p and goes down to 360p.

  • EncoderNamedPreset.H264MultipleBitrate720p: produces a set of six GOP-aligned MP4 files, ranging from 3400 kbps to 400 kbps, and stereo AAC audio. Resolution starts at 720p and goes down to 360p.

  • EncoderNamedPreset.H264MultipleBitrateSD: produces a set of five GOP-aligned MP4 files, ranging from 1600 kbps to 400 kbps, and stereo AAC audio. Resolution starts at 480p and goes down to 360p.

  • EncoderNamedPreset.H264SingleBitrate1080p: produces an MP4 file where the video is encoded with H.264 codec at 6750 kbps and a picture height of 1080 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps. If you desire lower bitrates for audio, you can build a custom encoding preset in your transform and adjust the sampling rate or channel count to get down to lower values for AAC-LC.

  • EncoderNamedPreset.H264SingleBitrate720p: produces an MP4 file where the video is encoded with H.264 codec at 4500 kbps and a picture height of 720 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps. If you desire lower bitrates for audio, you can build a custom encoding preset in your transform and adjust the sampling rate or channel count to get down to lower values for AAC-LC.

  • EncoderNamedPreset.H264SingleBitrateSD: produces an MP4 file where the video is encoded with H.264 codec at 2200 kbps and a picture height of 480 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps. If you desire lower bitrates for audio, you can build a custom encoding preset in your transform and adjust the sampling rate or channel count to get down to lower values for AAC-LC.

  • EncoderNamedPreset.H265SingleBitrate720P: produces an MP4 file where the video is encoded with HEVC (H.265) codec at 1800 kbps and a picture height of 720 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps.

  • EncoderNamedPreset.H265SingleBitrate1080p: produces an MP4 file where the video is encoded with HEVC (H.265) codec at 3500 kbps and a picture height of 1080 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps.

  • EncoderNamedPreset.H265SingleBitrate4K: produces an MP4 file where the video is encoded with HEVC (H.265) codec at 9500 kbps and a picture height of 2160 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps.

To see the most up-to-date presets list, see built-in presets to be used for encoding videos.

Custom presets

Media Services fully supports customizing all values in presets to meet your specific encoding needs and requirements.

StandardEncoderPreset

StandardEncoderPreset describes settings to be used when encoding the input video with the Standard Encoder. Use this preset when customizing Transform presets.

Considerations

When creating custom presets, the following considerations apply:

  • All values for height and width on AVC content must be a multiple of four.
  • In Azure Media Services v3, all of the encoding bitrates are in bits per second. This is different from the presets with our v2 APIs, which used kilobits/second as the unit. For example, if the bitrate in v2 was specified as 128 (kilobits/second), in v3 it would be set to 128000 (bits/second).

Preset schema

In Media Services v3, presets are strongly typed entities in the API itself. You can find the "schema" definition for these objects in Open API Specification (or Swagger). You can also view the preset definitions (like StandardEncoderPreset) in the REST API, .NET SDK (or other Media Services v3 SDK reference documentation).

Scaling encoding in v3

For accounts created with the 2020-05-01 or later version of the API or through the Azure portal, scaling and media reserved units are no longer required. Scaling will be automatic and handled by the service internally.

Billing

Media Services does not bill for canceled or errored jobs. For example, a job that has reached 50% progress and is canceled is not billed at 50% of the job minutes. You are only charged for finished jobs.

For more information, see pricing.

Encoding samples

See the extensive list of Encoding Samples.