What is HTTP Adaptive Streaming (HAS)?
Video streaming services use HTTP Adaptive Streaming (HAS). The keyword is adaptive: HAS lets a video player select the most suitable quality from the renditions that the back-end provides. For example, a smartphone on 3G should play a lower video quality than a desktop on gigabit Ethernet. HLS and MPEG-DASH are both HTTP Adaptive Streaming specifications.
HTTP Adaptive Streaming uses manifests. A manifest links to renditions. Each rendition represents a different quality. A rendition links to segments.
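As a rough illustration of this hierarchy, the sketch below models a manifest, its renditions, and their segments as plain Python data classes. The class names, field names, URLs, and bitrates are all hypothetical and chosen for readability; real HLS and MPEG-DASH manifests carry far more detail.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    url: str           # where the client downloads this chunk from
    duration_s: float  # playback duration of the chunk in seconds


@dataclass
class Rendition:
    name: str           # human-readable quality label, e.g. "720p"
    bandwidth_bps: int  # approximate bitrate needed to stream it
    segments: List[Segment] = field(default_factory=list)


@dataclass
class Manifest:
    renditions: List[Rendition] = field(default_factory=list)


def total_duration(rendition: Rendition) -> float:
    """Sum the duration of every segment in a rendition."""
    return sum(segment.duration_s for segment in rendition.segments)


# Two renditions of the same 20-second video, each split into two segments.
manifest = Manifest(renditions=[
    Rendition("360p", 800_000, [
        Segment("360p/seg0.ts", 10.0),
        Segment("360p/seg1.ts", 10.0),
    ]),
    Rendition("720p", 2_800_000, [
        Segment("720p/seg0.ts", 10.0),
        Segment("720p/seg1.ts", 10.0),
    ]),
])
```

Every rendition covers the same content, so their total durations match; only the quality (and therefore the bitrate) differs.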
Adapting to different conditions
A client may switch to a different rendition if the following conditions change:
- Network conditions.
- Viewport conditions. A device with a smaller viewport (e.g. a smartphone) doesn't need to download the same renditions as a device with a bigger viewport (e.g. a Smart TV).
- Platform conditions. Every platform supports a specific set of codecs and containers, so you can optimize per platform.
For example, if your manifest offers an H.264 1080p rendition and an H.265 1080p rendition, then a browser that can decode H.265 (such as Safari) should pick the H.265 rendition. H.265 encodes video more efficiently than H.264, so it needs less bandwidth for comparable quality. A browser that cannot decode H.265 (historically Chrome on many platforms) should pick the H.264 rendition.
Other conditions might apply, for example battery conditions (some renditions might consume less energy) or connectivity-type conditions (e.g. perhaps you're out of mobile data).
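The three main conditions above (network, viewport, platform) can be combined into a simple selection rule. The following is a minimal sketch, not how any particular player works: the `Rendition` fields, the `select_rendition` function, and the bitrate ladder are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Rendition:
    codec: str          # e.g. "h264" or "h265"
    height: int         # vertical resolution in pixels
    bandwidth_bps: int  # bitrate needed to play without stalling


def select_rendition(
    renditions: Iterable[Rendition],
    supported_codecs: Set[str],
    viewport_height: int,
    available_bps: int,
) -> Optional[Rendition]:
    # Platform conditions: drop renditions the device cannot decode.
    playable = [r for r in renditions if r.codec in supported_codecs]
    if not playable:
        return None

    # Viewport conditions: no need to go beyond the smallest height
    # that already covers the viewport.
    covering = [r.height for r in playable if r.height >= viewport_height]
    max_height = min(covering) if covering else max(r.height for r in playable)
    candidates = [r for r in playable if r.height <= max_height]

    # Network conditions: keep what the connection can sustain.
    affordable = [r for r in candidates if r.bandwidth_bps <= available_bps]
    if not affordable:
        # Nothing fits: fall back to the cheapest playable rendition.
        return min(candidates, key=lambda r: r.bandwidth_bps)

    # Prefer the highest resolution; at equal resolution prefer the
    # lower bitrate, i.e. the more efficient codec.
    return max(affordable, key=lambda r: (r.height, -r.bandwidth_bps))


LADDER = [
    Rendition("h264", 360, 800_000),
    Rendition("h264", 720, 2_800_000),
    Rendition("h264", 1080, 5_000_000),
    Rendition("h265", 1080, 3_000_000),
]

# A client that decodes both codecs picks the cheaper H.265 1080p rendition.
both_codecs = select_rendition(LADDER, {"h264", "h265"}, 1080, 10_000_000)
# A client limited to H.264 picks the H.264 1080p rendition instead.
h264_only = select_rendition(LADDER, {"h264"}, 1080, 10_000_000)
# A small viewport on a slow connection ends up on 360p.
phone = select_rendition(LADDER, {"h264"}, 360, 1_000_000)
```

Real players weigh more signals (buffer level, dropped frames, battery), but the filtering order shown here mirrors the list above.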
Concepts
Manifests
A manifest is a file. This file contains references to multimedia renditions (video, audio, ...).
A back-end creates a manifest and uploads it to a server. A client downloads a manifest and parses it.
An HLS manifest is a plain-text file with the .m3u8 extension.
An MPEG-DASH manifest is an XML file with the .mpd extension.
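To make the "client downloads a manifest and parses it" step concrete, here is a minimal sketch that reads the rendition list from a small HLS manifest. The sample playlist, its URIs, and the `parse_master_playlist` helper are made up for illustration. The `#EXTM3U` header and the `#EXT-X-STREAM-INF` tag with its `BANDWIDTH`, `RESOLUTION`, and `CODECS` attributes come from the HLS specification, which calls this kind of manifest a master playlist. A production player would use a complete parser rather than this simplified one.

```python
import re

# Hypothetical sample manifest with two renditions.
SAMPLE = """
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360,CODECS="avc1.4d401e,mp4a.40.2"
360p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720,CODECS="avc1.4d401f,mp4a.40.2"
720p/playlist.m3u8
"""

# Matches KEY=value pairs; quoted values may contain commas.
ATTRIBUTE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_master_playlist(text):
    """Return one dict per rendition listed in an HLS master playlist."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or lines[0] != "#EXTM3U":
        raise ValueError("not an HLS playlist: missing #EXTM3U header")

    renditions = []
    pending = None  # attributes of the most recent #EXT-X-STREAM-INF tag
    for line in lines[1:]:
        if line.startswith("#EXT-X-STREAM-INF:"):
            attribute_list = line.split(":", 1)[1]
            pending = {
                key: value.strip('"')
                for key, value in ATTRIBUTE.findall(attribute_list)
            }
        elif not line.startswith("#") and pending is not None:
            # The URI line that follows the tag points at the rendition.
            renditions.append({
                "uri": line,
                "bandwidth": int(pending["BANDWIDTH"]),
                "resolution": pending.get("RESOLUTION"),
                "codecs": pending.get("CODECS"),
            })
            pending = None
    return renditions


for rendition in parse_master_playlist(SAMPLE):
    print(rendition["resolution"], rendition["bandwidth"], rendition["uri"])
```

An MPEG-DASH manifest carries equivalent information as XML elements and attributes, so a client would use an XML parser instead of line-based parsing.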
Specification
A manifest adheres to a technical specification. This specification ('spec') dictates the rules and syntax. If the back-end does not follow the rules, the client may be unable to parse the manifest. If the client does not follow the rules, the stream may play incorrectly or not at all.
HLS and MPEG-DASH are examples of such specifications. Back-ends create HLS (or MPEG-DASH) manifests and upload them to a web server. Clients (and their video players) download a manifest and parse it.
Renditions
A manifest references multiple renditions of the same content. For example, there's a 360p version of the video, a 720p version, and so on. Depending on the conditions, some renditions are better suited to a given client than others.
Segments
The media in each rendition is split into one or more segments. If the network conditions change, a client can switch to the segments of another rendition.
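The sketch below shows what switching between renditions at segment boundaries could look like. It assumes the client measures throughput before each segment request and keeps a safety margin; the ladder, the 0.8 margin, the throughput samples, and the URL pattern are all hypothetical.

```python
# Hypothetical bitrate ladder: rendition name -> required bits per second.
LADDER = {"360p": 800_000, "720p": 2_800_000, "1080p": 5_000_000}


def pick_rendition(ladder, throughput_bps, safety=0.8):
    """Pick the highest-bitrate rendition that fits within a safety margin."""
    budget = throughput_bps * safety
    affordable = [name for name, bps in ladder.items() if bps <= budget]
    if not affordable:
        # Nothing fits: fall back to the lowest bitrate available.
        return min(ladder, key=ladder.get)
    return max(affordable, key=ladder.get)


def plan_segments(ladder, throughput_samples_bps):
    """Choose a rendition for each segment and build its URL."""
    urls = []
    for index, throughput in enumerate(throughput_samples_bps):
        name = pick_rendition(ladder, throughput)
        urls.append(f"{name}/seg{index}.ts")
    return urls


# Throughput measured before each of five segment requests: the network
# degrades in the middle and partially recovers at the end.
measured = [8_000_000, 7_000_000, 2_000_000, 900_000, 4_000_000]
plan = plan_segments(LADDER, measured)
print(plan)
```

Because every rendition is cut at the same positions, segment 2 of the 360p rendition continues exactly where segment 1 of the 1080p rendition ended, which is what makes the switch possible.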
Read more
This article offers basic information on HTTP Adaptive Streaming. The resources below pick up where we left off.
- StreamingMedia.com: https://www.streamingmedia.com/Articles/Editorial/What-Is-.../What-Is-Adaptive-Streaming-75195.aspx?utm_source=ottball
- Mozilla.org: https://developer.mozilla.org/en-US/docs/Web/Guide/Audio_and_video_delivery/Setting_up_adaptive_streaming_media_sources
- Toast UI: https://medium.com/@toastui/implementing-adaptive-http-streaming-using-the-web-e2c12d46a38f
- Bitmovin.com: https://bitmovin.com/adaptive-streaming/
Please contact us if you have an article to add to this list.
Bonus
You can interpret HTTP Adaptive Streaming quite literally.
- HTTP: the HTTP protocol is used to request data. This protocol is the 'universal protocol' across internet browsers and applications, and requires no additional support from plug-ins or firewalls.
  ❌ It should not deliver the stream through RTMP (or any other protocol besides HTTP), because the stream would no longer be HTTP.
- Adaptive: the multimedia content exists in multiple variations and the video stream can adapt to different conditions.
  ❌ It should not deliver only one rendition of the content. For example, the manifest should not only contain a 4K rendition of the video content, because the stream would no longer be adaptable.
- Streaming: segments of data are being sent (i.e. streamed) to the viewer. The server (e.g. a web server) is streaming data (e.g. video chunks of 10 seconds) to a client (e.g. a browser).
  ❌ It should not deliver 1 media file per rendition, because changing renditions would become challenging.
This article is a work-in-progress.