This article is part 6 of the "21 Thoughts on Video Streaming in 2021" series.

John Deutscher (Program Manager at Microsoft) on his focus for 2021:

For me, it's mostly around low latency, at scale, and in an "Enterprise" environment, and not so much in an "OTT" environment. For Enterprise, it's all about keeping the latency low enough to match Q&A and chat experiences. For OTT, it's all about matching broadcast latency. We need to get this LL-HLS and LL-DASH stuff to work at scale!

When asked about his hopes for 2021:

We need better ingest protocols!
- RTMP lacks the standardization to work around it.
- SRT needs to make sure that there's a spec, written down, and accurately maintained outside of Haivision.
- RIST is also out there, as well as the DASH ingest protocol.

All have their merits, but one thing that is missing from "all" ingest protocols is some way to track ingest telemetry better. Our biggest problem with multiple 3rd-party encoders is knowing what is failing, where, and why -- and pushing that info to the customer to solve things early. There's no work on standardizing ingest telemetry between encoders and services... Akamai designed some protocols, but they never took off.

Keep reading to get 1) context and 2) my take.

Context

Regarding enterprise latency:

So what's broadcast latency and what is enterprise latency?

When you want to match broadcast latency, you want to match the "broadcast happening in the real world". There are two approaches to achieve this:

  1. Reduce the latency of your OTT stream as much as possible, for example by using "low latency streaming protocols".
  2. Synchronize your broadcast stream with your OTT stream, for example by adding a delay to the broadcast signal that non-OTT viewers are seeing, and using timed metadata which allows your OTT stream to be "in sync" with your broadcast stream. This isn't straightforward, because different platforms support different types of streams (in different ways) and different types of metadata. (There's still the possibility that "real" people might have access to the broadcast event without any delay by being physically there, for example at a sports game.) A rough sketch of the sync math follows below.
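
To make the second approach more concrete, here is a minimal TypeScript sketch of the sync math, assuming a fixed broadcast delay and a hypothetical timed-metadata cue that carries the wall-clock time at which a frame was broadcast. The field names and the helper function are my own, not an existing player API:

```typescript
// Hypothetical sketch: estimate how far the OTT viewer is behind the
// (intentionally delayed) broadcast signal, using a wall-clock timestamp
// carried as timed metadata in the OTT stream.

interface TimedMetadataCue {
  broadcastTimeMs: number;   // wall-clock time (ms since epoch) when this frame was broadcast
  streamPositionSec: number; // playback position (s) in the OTT stream where the cue sits
}

const BROADCAST_DELAY_MS = 10_000; // delay added to the non-OTT broadcast signal (assumption)

/**
 * Positive result: the OTT stream is that many milliseconds behind the delayed
 * broadcast; chat, Q&A or quiz widgets can be shifted by the same amount.
 */
function computeSyncOffsetMs(
  cue: TimedMetadataCue,
  currentStreamPositionSec: number,
  nowMs: number = Date.now(),
): number {
  // Wall-clock moment at which the frame currently shown to the OTT viewer was broadcast.
  const displayedFrameBroadcastMs =
    cue.broadcastTimeMs + (currentStreamPositionSec - cue.streamPositionSec) * 1000;

  // Wall-clock moment that the delayed broadcast signal is currently showing.
  const broadcastSignalShowingMs = nowMs - BROADCAST_DELAY_MS;

  return broadcastSignalShowingMs - displayedFrameBroadcastMs;
}

// Example with made-up numbers: the player parsed a cue at 120 s, is now at 123.5 s,
// and 14 s of wall-clock time have passed since the cue's broadcast timestamp.
const offsetMs = computeSyncOffsetMs(
  { broadcastTimeMs: 1_700_000_000_000, streamPositionSec: 120 },
  123.5,
  1_700_000_000_000 + 14_000,
);
console.log(`OTT stream is ${offsetMs} ms behind the delayed broadcast`); // 500 ms
```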

I would say that enterprise latency is similar to the second approach. But, on top of syncing your "broadcast stream" and your "OTT stream", you might also need to synchronize chats, quizzes, multiple streams (e.g. a web conference), ...

Regarding ingest protocols:

In the context of live video on a website, you could argue that an ingest stream is the "input" of that live video. "Something" will take that ingest stream, process it, and transform it into a "final" stream.

When you're watching a live stream on Twitch or YouTube, you are not watching the ingest stream. The ingest stream comes out of a camera, and gets fed to a studio/encoder/transcoder/packager. This step takes the ingest stream, transforms it, and creates the "final" video stream you watch on Twitch/YouTube. This final stream is often an "OTT stream", such as HLS or DASH.

Why can't the ingest stream be the final stream?

Ingest streams don't scale. You need to process them before you can send them to millions of viewers. This usually involves re-encoding the content, putting it in a different kind of format, and encrypting it.
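
To make that processing step a bit more tangible, here is an illustrative TypeScript sketch of what typically goes in and comes out: one high-bitrate contribution stream in, a re-encoded ABR ladder repackaged as HLS/DASH and optionally encrypted out. The types and the function are made up for this article, not a real transcoder or packager API:

```typescript
// Illustrative only: made-up types, not a real transcoder or packager API.

interface IngestStream {
  protocol: "RTMP" | "SRT" | "RIST";
  url: string;          // e.g. an RTMP/SRT/RIST endpoint with a stream key
  videoCodec: string;
  bitrateKbps: number;  // usually a single high-bitrate contribution encode
}

interface Rendition {
  width: number;
  height: number;
  bitrateKbps: number;  // re-encoded to a lower, delivery-friendly bitrate
}

interface DeliveryStream {
  format: "HLS" | "DASH";                       // repackaged into an OTT format that CDNs can cache
  renditions: Rendition[];
  drm?: "Widevine" | "PlayReady" | "FairPlay";  // optional encryption
  manifestUrl: string;
}

// One contribution feed in, an adaptive-bitrate ladder out.
function processIngest(ingest: IngestStream): DeliveryStream {
  const ladder: Rendition[] = [
    { width: 1920, height: 1080, bitrateKbps: 5000 },
    { width: 1280, height: 720, bitrateKbps: 3000 },
    { width: 854, height: 480, bitrateKbps: 1500 },
    { width: 640, height: 360, bitrateKbps: 800 },
  ];
  return {
    format: "HLS",
    renditions: ladder,
    drm: "Widevine",
    manifestUrl: "https://cdn.example.com/live/master.m3u8",
  };
}

// Usage: a single SRT contribution feed becomes a cacheable OTT stream.
const ott = processIngest({
  protocol: "SRT",
  url: "srt://ingest.example.com:9000",
  videoCodec: "h264",
  bitrateKbps: 12000,
});
console.log(ott.renditions.length, "renditions at", ott.manifestUrl);
```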

RTMP is the "oldest" ingest protocol. SRT and RIST are newer types of ingest protocols. An RTMP/SRT/RIST stream follows the RTMP/SRT/RIST protocol.

My take

This is difficult stuff.

  • Doing low latency correctly for a broadcast stream is hard. Doing low latency + syncing for an enterprise stream is really hard.
  • Ingest telemetry isn't the flashiest topic (see the sketch below).
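
To illustrate what "standardized ingest telemetry" could mean in practice, here is a hypothetical TypeScript sketch of a telemetry event an encoder could push to a streaming service. The schema and field names are my own invention, not an existing spec:

```typescript
// Hypothetical sketch of a standardized ingest telemetry event -- these field
// names are invented for illustration, not taken from an existing spec.

type IngestEventSeverity = "info" | "warning" | "error";

interface IngestTelemetryEvent {
  timestamp: string;              // ISO 8601
  encoderId: string;              // which 3rd-party encoder is pushing
  sessionId: string;              // which ingest session
  protocol: "RTMP" | "SRT" | "RIST";
  severity: IngestEventSeverity;
  code: string;                   // machine-readable, e.g. "CONNECTION_DROPPED"
  message: string;                // human-readable: what is failing, where, and why
  metrics?: {
    bitrateKbps?: number;
    droppedFrames?: number;
    roundTripTimeMs?: number;
  };
}

// A service could surface events like this to the customer as soon as they happen.
const example: IngestTelemetryEvent = {
  timestamp: new Date().toISOString(),
  encoderId: "vendor-x-encoder-01",
  sessionId: "abc123",
  protocol: "SRT",
  severity: "warning",
  code: "BITRATE_DROP",
  message: "Contribution bitrate fell below the configured minimum",
  metrics: { bitrateKbps: 900, droppedFrames: 42, roundTripTimeMs: 180 },
};
console.log(JSON.stringify(example, null, 2));
```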

Most streaming efforts are focused on the B2C market, and not so much on a niche B2B market. It'll be tough for John to find a community that wants to collaborate.