Media Session API: the input of Chrome's Global Media Controls

One of the most underused Web APIs must be the Media Session API. Spotify and YouTube are avid users of this API. Chrome's (recently launched) Global Media Controls feature relies on it.

The Global Media Controls feature in Chrome

Let's define two terms before we're in too deep:

  1. A Platform UI is a (native) interface that is provided by the platform and can be accessed outside of the application (e.g. the lock-screen/notification panels on smartphones/desktops).
  2. A Media Session happens whenever an (app) user is consuming multimedia content on a (web) app. (e.g. the user listening to a specific podcast on your website through their smartphone).

The Media Session API links a platform UI to a media session.

  1. As an app user you can control the playback of the media session and see its metadata (e.g. song title, artwork, etc.) through the platform UI.
  2. As a developer you can configure metadata and action handlers of media sessions. For example, you can configure a song title as metadata, and configure what happens when a user clicks a button in the platform UI.

Let's visualize how it looks to the user.

Media Session API Comparison

The Media Session API is supported on Chrome and Edge (Chromium), and is in development on Safari and Firefox Preview.

Chrome recently enabled Global Media Controls, as further explained by TechCrunch. Did you notice that new, little music icon in your toolbar when you're on YouTube? That's the "Global Media Controls". YouTube uses the Media Session API, and Chrome renders a Platform UI based on the configured Media Session metadata.

Use-cases

This API could be a nifty feature in the following product use-cases:

  • An audio application (e.g. Spotify) where you want to enable your listeners to switch between songs/podcast/audiobooks, and play/pause the current asset, without opening up the app.
  • A video application (e.g. YouTube) where your users are (temporarily) only consuming the audio of your video content. (This scenario is more common than you think with (live) sports content, music videos, (talk) shows and interviews.) Similar to the above item, you want your users to control playback outside of your app.

If you have a website or app with audio content, you should consider using this easy-to-implement API.

Background

The Media Session API is a standardization project led by the W3C, and has been in development since 2015. Chrome was one of the first browsers to support this API in February 2017.

Implementation

The Media Session API is supported on some platforms, and its implementation is relatively straightforward. Let's first discuss some Media Session concepts, then review which platforms support the API, and how you'd implement it.

Concepts

The Editor's Draft has the following introduction on the Media Session API:

Media is used extensively today, and the Web is one of the primary means of consuming media content. Many platforms can display media metadata, such as title, artist, album and album art on various UI elements such as notification, media control center, device lockscreen and wearable devices. This specification aims to enable web pages to specify the media metadata to be displayed in platform UI, and respond to media controls which may come from platform UI or media keys, thereby improving the user experience.

As a developer, you can configure the metadata, action handlers, the playback state and the position state of a media session.

Sketchy model diagram of the Media Session API

Metadata

You can configure the metadata of the active Media Session. For example:

navigator.mediaSession.metadata = new MediaMetadata(
    {
        title: "Media Session API",
        artist: "Thijs Lowette",
        album: "Web API",
        artwork: [
            {
                src: "msa128.jpg", 
                sizes: "128x128", 
                type: "image/jpeg"
            },
            {
                src: "msa256.jpg",
                sizes: "256x256"
            },
            {
                src: "msa1024.jpg",
                sizes: "1024x1024",
                type: "image/jpeg"
            },
            {
                src: "msa128.png",
                sizes: "128x128", 
                type: "image/png"
            },
            {
                src: "msa256.png",
                sizes: "256x256",
                type: "image/png"
            },
            {
                src: "msa.ico",
                sizes: "128x128 256x256",
                type: "image/x-icon"
            }
        ]
    }
);
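
The metadata has to be rebuilt whenever the track changes. Here is a minimal sketch of how that could look. The track object (title, artist, album, coverBase) and the "<coverBase><size>.png" file naming are assumptions made for this example, loosely following the msa128.png files above.

```javascript
// Hypothetical helper: builds the plain object that the MediaMetadata
// constructor accepts. The `track` shape and file naming are assumptions.
function buildMetadataInit(track, sizes = [128, 256, 1024]) {
    return {
        title: track.title,
        artist: track.artist,
        album: track.album,
        artwork: sizes.map(size => ({
            src: `${track.coverBase}${size}.png`,
            sizes: `${size}x${size}`,
            type: "image/png"
        }))
    };
}

// Only touches the API where it exists, so unsupported browsers are unaffected.
function applyMetadata(track) {
    if (typeof navigator === "undefined" || !("mediaSession" in navigator)) {
        return false;
    }
    navigator.mediaSession.metadata = new MediaMetadata(buildMetadataInit(track));
    return true;
}
```

You would call applyMetadata(track) each time your player starts a new asset.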

Action Handlers

You can configure action handlers to respond to the intent of users. An action can be visualized by the platform UI as a control (icon/button) in the lock-screen/notification window.

navigator.mediaSession.setActionHandler("play", _ => audioPlayer.play());

The following actions are part of the specification.

Action         Added       Description
play           < 2017      Clicking this control indicates that the user wants to play/resume playback
pause          < 2017      Clicking this control indicates that the user wants to pause playback
seekbackward   < 2017      Clicking this control indicates that the user wants to jump backward in time by a short amount
seekforward    < 2017      Clicking this control indicates that the user wants to jump forward in time by a short amount
previoustrack  < 2017      Clicking this control indicates that the user wants to play the previous asset (in the playlist)
nexttrack      < 2017      Clicking this control indicates that the user wants to play the next asset (in the playlist)
skipad         2018-10-24  Clicking this control indicates that the user wants to skip the active advertisement
stop           2019-03-08  Clicking this control indicates that the user wants to stop playback (and reset the state)
seekto         2019-06-20  Clicking this control indicates that the user wants to seek to a specific position in the timeline
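
Because the actions in the table above were added over several years, a browser may not know all of them. As far as I know, setActionHandler throws a TypeError for an action name the browser does not recognize, so wrapping each registration in try/catch seems prudent. A minimal sketch follows. The function name is hypothetical, and the session is passed in as a parameter so the function can be exercised outside a browser.

```javascript
// Registers every handler in `handlers` on `session` and returns the names of
// the actions that were rejected. In a page, pass navigator.mediaSession.
function registerActionHandlers(session, handlers) {
    const unsupported = [];
    for (const [action, handler] of Object.entries(handlers)) {
        try {
            session.setActionHandler(action, handler);
        } catch (error) {
            // An unknown action name is expected to surface as an exception here.
            unsupported.push(action);
        }
    }
    return unsupported;
}

// Possible usage in a page (audioPlayer is the media element used above):
// registerActionHandlers(navigator.mediaSession, {
//     play: () => audioPlayer.play(),
//     pause: () => audioPlayer.pause(),
//     skipad: () => { /* skip the ad in your own player */ }
// });
```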

In the callback functions of some actions, you'll have access to callback parameters. The MediaSessionActionDetails information mentions the following parameters:

Action         Parameters
seekto         seekTime, fastSeek
seekbackward   seekOffset
seekforward    seekOffset

These parameters may be necessary – for example to do a correct programmatic adjustment of the playhead position.

navigator.mediaSession.setActionHandler("seekto", function(details) {
    audioPlayer.currentTime = details.seekTime;
});
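
The other parameters can be handled in a similar way. The sketch below is one possible approach, not the only one: it falls back to a 10-second skip when no seekOffset is supplied (that default is my assumption), clamps the result to the timeline, and only uses the media element's fastSeek method where it exists.

```javascript
// Assumption: 10 seconds is a fallback chosen for this sketch.
const DEFAULT_SKIP_SECONDS = 10;

// Keeps a target time inside [0, duration]; unknown durations are not capped.
function clampTime(player, time) {
    const max = Number.isFinite(player.duration) ? player.duration : Infinity;
    return Math.min(Math.max(time, 0), max);
}

// Returns handlers keyed by action name for a media element-like `player`.
function createSeekHandlers(player) {
    return {
        seekbackward(details = {}) {
            const offset = details.seekOffset || DEFAULT_SKIP_SECONDS;
            player.currentTime = clampTime(player, player.currentTime - offset);
        },
        seekforward(details = {}) {
            const offset = details.seekOffset || DEFAULT_SKIP_SECONDS;
            player.currentTime = clampTime(player, player.currentTime + offset);
        },
        seekto(details) {
            // fastSeek trades precision for speed, e.g. while the user scrubs.
            if (details.fastSeek && typeof player.fastSeek === "function") {
                player.fastSeek(details.seekTime);
                return;
            }
            player.currentTime = clampTime(player, details.seekTime);
        }
    };
}

// Possible usage in a page:
// const handlers = createSeekHandlers(audioPlayer);
// navigator.mediaSession.setActionHandler("seekforward", handlers.seekforward);
```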

Some actions do not yet have an implementation on platforms. For example, the skipad action is currently not visualized on Chrome for Android – the platform with arguably the most thorough implementation of the Media Session API.

Personal opinion: in the future, there should be some mechanism in the Media Session API to indicate that an ad is currently playing (e.g. through the playback state). Without such a mechanism, the platform UI would always show a skip-ad button, which isn't great for obvious reasons.

Playback State

You can configure the playback state, which shifts between none, playing and paused. This extension was added on 2016-12-15. Please refer to https://w3c.github.io/mediasession/#example-set-playbackState to understand possible use-cases.
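
As a rough illustration, the state could be kept in sync with a media element by listening to its play and pause events. This is a sketch under assumptions: the function name is hypothetical, and the session is passed in as a parameter (in a page it would be navigator.mediaSession). Browsers can often work out the state on their own, so treat this as a hint for cases where they cannot.

```javascript
// Mirrors the paused flag of `media` into session.playbackState.
function syncPlaybackState(media, session) {
    const update = () => {
        session.playbackState = media.paused ? "paused" : "playing";
    };
    media.addEventListener("play", update);
    media.addEventListener("pause", update);
}

// Possible usage in a page:
// syncPlaybackState(audioPlayer, navigator.mediaSession);
```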

Position State

You can configure the position state. This allows the platform UI to visualize the current playhead position and duration of the active media session. This extension was added on 2019-05-03.

navigator.mediaSession.setPositionState({
    duration: 60,
    playbackRate: 2,
    position: 10
});
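
In practice these values would come from the media element rather than being hard-coded. The sketch below derives them from a player object. It assumes, based on my reading of the specification, that a NaN or negative duration, a playback rate of zero, or a position beyond the duration would be rejected, so it skips or clamps those cases. The function names are hypothetical.

```javascript
// Returns a position state for `media`, or null when the values are not usable yet.
function positionStateFor(media) {
    const { duration, currentTime, playbackRate } = media;
    if (typeof duration !== "number" || Number.isNaN(duration) || duration < 0) {
        return null; // e.g. metadata has not loaded yet
    }
    if (playbackRate === 0) {
        return null;
    }
    return {
        duration,
        playbackRate,
        position: Math.min(Math.max(currentTime, 0), duration)
    };
}

// Applies the state if the session supports setPositionState; returns success.
function updatePositionState(media, session) {
    if (typeof session.setPositionState !== "function") {
        return false;
    }
    const state = positionStateFor(media);
    if (state === null) {
        return false;
    }
    session.setPositionState(state);
    return true;
}

// Possible usage in a page, e.g. after a seek or a rate change:
// updatePositionState(audioPlayer, navigator.mediaSession);
```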

Supported platforms

          Chrome   Safari                Edge   Firefox
Mobile    Yes      In Development        No     In Development
Desktop   Yes      Under Consideration   Yes    Under Consideration

This feature is supported on Chrome for Desktop and Chrome for Android according to https://developer.mozilla.org/en-US/docs/Web/API/Media_Session_API and https://caniuse.com/#feat=mdn-api_mediasession. Chrome's team is still maintaining this API, as demonstrated with a new feature being added to Chrome 78.

It's also supported on Microsoft Edge (Chromium).

It's in development by Safari. Between 2015 and 2018, it seemed like no progress was made, but as of November 2019 this task seems to be re-assigned to someone new. Perhaps a sign that they're finally thinking of putting it into the world?

It also seems to be in development by Firefox. Apparently the Mozilla team is targeting Fenix AKA Firefox Preview.

It should be noted that platform implementations of the Media Session API can differ. For example, Chrome for Desktop is not able to render the artwork contained in the MediaMetadata, whereas Chrome for Android can.
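
Given these differences, it seems sensible to feature-detect before touching the API. A minimal sketch: the function name and the shape of the returned object are my own, and the scope is passed in as a parameter (in a page it would be window) so the check can also run outside a browser.

```javascript
// Reports which parts of the Media Session API appear to be available on `scope`.
function detectMediaSessionSupport(scope) {
    const nav = scope.navigator;
    const session = nav ? nav.mediaSession : undefined;
    return {
        mediaSession: Boolean(session),
        metadata: typeof scope.MediaMetadata === "function",
        positionState: Boolean(session) && typeof session.setPositionState === "function"
    };
}

// Possible usage in a page:
// const support = detectMediaSessionSupport(window);
// if (support.mediaSession) { /* configure metadata and action handlers */ }
```

Note that this only tells you whether the API is present, not what the platform UI will actually render.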

Browser

So, based on the above information, you can only really use it on Chrome at the moment. That's OK though, because Chrome has more than 60% of the market share of the mobile browsers according to https://gs.statcounter.com/browser-market-share/mobile/worldwide, and close to 70% on Desktop.

Code examples

An in-house code example is available at https://github.com/ottball/media-session-api, with a demo available at https://ottball.github.io/media-session-api/sample-1.html. This sample implementation of the Media Session API uses THEOplayer and ID3 tags to update the Media Session's metadata. (Note: bugs are tracked in the README.md.)

  1. The W3C specification contains multiple examples at https://w3c.github.io/mediasession/#examples.
  2. Google did a great job providing a tutorial on the Media Session API at https://developers.google.com/web/updates/2017/02/media-session, and I encourage you to check it out.
  3. MDN's example at https://developer.mozilla.org/en-US/docs/Web/API/Media_Session_API is fairly limited, but it's something.

Android

You can achieve a Media Session-like experience on native Android applications, as described by https://developer.android.com/guide/topics/media-apps/working-with-a-media-session, which seems to be loosely based on the Media Session standard. Compared to web, fewer things are predefined, and you use MediaStyle Notifications to render (custom) actions in the lock-screen.

Conclusion

Implementing the Media Session API is a nice boost to the UX of your application. Your viewers and listeners can see which songs and videos are playing, and they can manage playback outside of your application.

The Media Session API allows you to get creative. For example, you could try to emulate a video in the platform UI by continuously updating the artwork of the media session's metadata.
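
A rough sketch of that idea, purely as an experiment: the helper below cycles through a list of pre-rendered still images and produces a fresh metadata object for each frame. The function name, the frame URLs and the 256x256 size are assumptions, and I have not verified how often platforms actually refresh the artwork.

```javascript
// Returns a function that yields metadata for the next frame on every call.
// `frames` is a hypothetical list of still-image URLs.
function createArtworkCycler(baseInit, frames) {
    let index = 0;
    return function nextInit() {
        const src = frames[index];
        index = (index + 1) % frames.length; // wrap around at the end
        return {
            ...baseInit,
            artwork: [{ src, sizes: "256x256", type: "image/jpeg" }]
        };
    };
}

// Possible usage in a page:
// const next = createArtworkCycler({ title: "Live match" }, ["frame0.jpg", "frame1.jpg"]);
// setInterval(() => {
//     navigator.mediaSession.metadata = new MediaMetadata(next());
// }, 1000);
```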

The API is still in development, and seems to be back on the radar of platform developers like Safari and Firefox – and was never out of sight of Google.

Let the author know if something should be added to this article.