
How to add external WebVTT subtitles into HTTP Live Stream on iOS client

We have videos encoded via bitmovin.com and served as HTTP Live Streams (FairPlay HLS). The subtitles are in WebVTT format, but each one is exposed separately as a direct URL to the whole file rather than as individual segments, and they are not part of the HLS m3u8 playlist.

I am looking for a way to include an external .vtt file, downloaded separately, in the HLS stream so that it is available as a subtitle in AVPlayer.

I know Apple recommends including segmented VTT subtitles in the HLS playlist, but I can't change the server implementation right now, so I want to find out whether it is even possible to give AVPlayer a subtitle to play along with the HLS stream.

The only valid post I found claiming this is possible is "Subtitles for AVPlayer/MPMoviePlayerController". However, its sample code loads a local mp4 file from the bundle, and I am struggling to make it work for an m3u8 playlist via AVURLAsset. Specifically, I can't get the video track from the remote m3u8 stream: asset.tracks(withMediaType: AVMediaTypeVideo) returns an empty array. Can this approach work for a real HLS stream? Or is there another way to play a separate WebVTT subtitle with an HLS stream without including it in the HLS playlist on the server? Thanks.

func playFpsVideo(with asset: AVURLAsset, at context: UIViewController) {

    let composition = AVMutableComposition()

    // Video
    let videoTrack = composition.addMutableTrack(withMediaType: AVMediaTypeVideo, preferredTrackID: kCMPersistentTrackID_Invalid)

    do {

        let tracks = asset.tracks(withMediaType: AVMediaTypeVideo)

        // ==> The code breaks here, tracks is an empty array
        guard let track = tracks.first else {
            Log.error("Can't get first video track")
            return
        }

        try videoTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, asset.duration), of: track, at: kCMTimeZero)

    } catch {

        Log.error(error)
        return
    }


    // Subtitle, some test from the bundle..
    guard let subsUrl = Bundle.main.url(forResource: "subs", withExtension: "vtt") else {
        Log.error("Can't load subs.vtt from bundle")
        return
    }

    let subtitleAsset = AVURLAsset(url: subsUrl)

    let subtitleTrack = composition.addMutableTrack(withMediaType: AVMediaTypeText, preferredTrackID: kCMPersistentTrackID_Invalid)

    do {

        let subTracks = subtitleAsset.tracks(withMediaType: AVMediaTypeText)

        guard let subTrack = subTracks.first else {
            Log.error("Can't get first subs track")
            return
        }

        try subtitleTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, asset.duration), of: subTrack, at: kCMTimeZero)

    } catch {

        Log.error(error)
        return
    }


    // Prepare item and play it
    let item = AVPlayerItem(asset: composition)

    let player = AVPlayer(playerItem: item)

    let playerViewController = AVPlayerViewController()
    playerViewController.player = player

    self.playerViewController = playerViewController

    context.present(playerViewController, animated: true) {
        playerViewController.player?.play()
    }
}
asked Apr 01 '17 19:04 by Martin Koles


2 Answers

I figured this out. It took forever and I hated it. I'm putting my explanation and source code on GitHub, but I'll put stuff here too in case the link dies for whatever reason: https://github.com/kanderson-wellbeats/sideloadWebVttToAVPlayer

I'm dropping this explanation here to try to save some future people a lot of pain. Lots of stuff I found online was wrong, or left out confusing pieces, or had a bunch of extra irrelevant information, or a mixture of all three. On top of that, I saw lots of people asking for help and trying to do the same thing with nobody providing any clear answers.

So to begin I'll describe what I'm trying to do. My backend server is Azure Media Services, and it's been really great for streaming different resolution video as needed, but it just doesn't really support WebVtt. Yeah, you can host a file on there, but it seems it cannot give us a master playlist that includes a reference to the subtitles playlist (as Apple requires). It seems both Apple and Microsoft decided what they were going to do with subtitles back in like 2012 and haven't touched it since. At that time they either didn't talk to each other or deliberately went opposite directions, but either way they have poor intercompatibility, and now devs like us are forced to bridge the gap between the behemoths. Many of the resources online covering this topic address things like optimized caching of arbitrary streamed data, but I found those resources to be more confusing than helpful. All I want to do is add subtitles to on-demand videos played in AVPlayer, served by Azure Media Services over the HLS protocol, when I have a hosted WebVtt file. Nothing more, nothing less. I'll start by describing everything in words, then I'll put the actual code at the end.

Here is the extremely condensed version of what you need to do:

  1. Intercept the requests for the master playlist and return an edited version of it that references the subtitle playlists (multiple for multiple languages, or just one for one language)
  2. Select a subtitle to show (well documented on https://developer.apple.com/documentation/avfoundation/media_playback_and_selection/selecting_subtitles_and_alternative_audio_tracks )
  3. Intercept requests to the subtitle playlists that will come through (after you've selected a subtitle to show) and return playlists you've built on the fly that reference the WebVtt files on the server

That's it. Not too much, except there are many complications that get in the way that I had to discover myself. I'll describe them each first briefly and then in greater detail.

Brief complication explanations:

  1. Many requests will be coming through, but you should only (and can only) handle a couple of them yourself; the others need to be allowed to pass through untouched. I will describe which ones need handling, which ones don't, and how to handle them.
  2. Apple decided a simple HTTP request was not good enough and decided to obscure things by translating it into a weird double-identity AVAssetResourceLoadingRequest thing that has a DataRequest property (AVAssetResourceLoadingDataRequest) and a ContentInformationRequest property (AVAssetResourceLoadingContentInformationRequest). I still don't understand why this was necessary or what benefit it brings, but what I've done here with them is working. Some promising blogs/resources seem to suggest you have to mess with the ContentInformationRequest, but I find that you can simply ignore it, and in fact messing with it more often than not just breaks things.
  3. Apple suggests you segment your VTT file into small pieces. You can't do this client-side (Apple disallows it), but luckily it seems you don't actually have to: it's merely a suggestion.

INTERCEPTING REQUESTS

To intercept requests, you have to subclass/extend AVAssetResourceLoaderDelegate; the method of interest is ShouldWaitForLoadingOfRequestedResource. To make use of the delegate, instantiate your AVPlayer by handing it an AVPlayerItem, and hand the AVPlayerItem an AVUrlAsset whose ResourceLoader you assign the delegate to. All the requests will come through the ShouldWaitForLoadingOfRequestedResource method, so that's where all the business will happen, except for one sneaky complication: the method will only be invoked if requests begin with something other than http/https. My advice is to stick a constant string at the front of the Url you're using to create your AVUrlAsset, which you can then just shave off after the request comes in to your delegate. Let's call that "CUSTOMSCHEME". This part is described in a couple of places online, but it can be super frustrating if you don't know you have to do it, because it will seem like nothing is happening at all.
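For readers working in Swift rather than Xamarin, the setup described above might look roughly like the sketch below. The class name PlaylistInterceptor, the function makePlayer, and the queue label are made up for illustration; the delegate body is only a placeholder that logs each intercepted URL.

```swift
import AVFoundation

// Hypothetical prefix; any string that turns "https" into a non-http(s) scheme works.
let interceptPrefix = "CUSTOMSCHEME"

// Placeholder delegate: it only logs what comes through and declines to load it.
final class PlaylistInterceptor: NSObject, AVAssetResourceLoaderDelegate {
    func resourceLoader(_ resourceLoader: AVAssetResourceLoader,
                        shouldWaitForLoadingOfRequestedResource loadingRequest: AVAssetResourceLoadingRequest) -> Bool {
        print("intercepted:", loadingRequest.request.url?.absoluteString ?? "nil")
        return false // a real delegate would redirect or answer the request here
    }
}

// Builds a player whose asset URL carries the prefix, so the delegate gets called.
func makePlayer(streamURL: String, delegate: PlaylistInterceptor) -> AVPlayer? {
    guard let url = URL(string: interceptPrefix + streamURL) else { return nil }
    let asset = AVURLAsset(url: url)
    // The resource loader does not retain its delegate, so the caller must keep it alive.
    asset.resourceLoader.setDelegate(delegate, queue: DispatchQueue(label: "playlist.interceptor"))
    return AVPlayer(playerItem: AVPlayerItem(asset: asset))
}
```

Keep a strong reference to the delegate (for example as a property on the view controller) for as long as the player is in use, otherwise the callbacks silently stop.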

INTERCEPTING - TYPE A) redirecting

Ok so now we're intercepting requests, but you don't want to (/can't) handle them all yourself. Some of the requests you just want to allow to pass through. You do this by doing the following:

  1. create a new NSUrlRequest to the CORRECTED Url (shave off that "CUSTOMSCHEME" part from earlier) and set it to the Redirect property on the LoadingRequest
  2. create a new NSHttpUrlResponse with that same corrected Url and a 302 code and set it to the Response property on the LoadingRequest
  3. call FinishLoading on the LoadingRequest
  4. return true

With those steps you can add in breakpoints and stuff to debug and inspect all the requests that will come through, but they'll proceed normally so you won't break anything. However, this approach isn't just for debugging, it's also a necessary thing to do for several requests even in the finished project.
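In Swift, the four redirect steps could be sketched as follows. This is a minimal illustration, assuming the asset URL was created by prepending the literal string "CUSTOMSCHEME" to the real URL; the function name redirectToRealURL is hypothetical.

```swift
import AVFoundation

// Assumption: the same prefix that was prepended when the AVURLAsset was created.
let redirectPrefix = "CUSTOMSCHEME"

/// TYPE A: let the request proceed normally by redirecting it to the real URL.
func redirectToRealURL(_ loadingRequest: AVAssetResourceLoadingRequest) -> Bool {
    guard let intercepted = loadingRequest.request.url?.absoluteString,
          intercepted.hasPrefix(redirectPrefix),
          let realURL = URL(string: String(intercepted.dropFirst(redirectPrefix.count))),
          let response = HTTPURLResponse(url: realURL, statusCode: 302,
                                         httpVersion: nil, headerFields: nil)
    else {
        return false // unexpected URL: tell AVFoundation we will not load it
    }
    loadingRequest.redirect = URLRequest(url: realURL) // step 1: corrected request
    loadingRequest.response = response                 // step 2: fake 302 response
    loadingRequest.finishLoading()                     // step 3
    return true                                        // step 4
}
```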

INTERCEPTING - TYPE B) editing/faking response

When some requests come in, you'll want to do a request of your own so the response to your request (with some tweaking) can be used to fulfill the LoadingRequest. So do the following:

  1. create an NSUrlSession and call the CreateDataTask method on the session (with a corrected URL - remove the "CUSTOMSCHEME")
  2. call Resume on the DataTask (outside of the callback on the DataTask)
  3. return true
  4. up in the DataTask's callback you'll have data, so (after doing your edits) you call Respond on the LoadingRequest's DataRequest property with that (edited) data, followed by calling FinishLoading on the LoadingRequest
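A Swift version of this fetch, edit, and respond flow might look like the sketch below. It assumes the same "CUSTOMSCHEME" prefix as before; the function name fetchEditAndRespond and the edit closure are illustrative, not part of any Apple API.

```swift
import AVFoundation

// Assumption: the same prefix that was prepended when the AVURLAsset was created.
let fetchPrefix = "CUSTOMSCHEME"

/// TYPE B: fetch the real resource ourselves, edit the text, and answer with the result.
func fetchEditAndRespond(_ loadingRequest: AVAssetResourceLoadingRequest,
                         session: URLSession = .shared,
                         edit: @escaping (String) -> String) -> Bool {
    guard let intercepted = loadingRequest.request.url?.absoluteString,
          intercepted.hasPrefix(fetchPrefix),
          let realURL = URL(string: String(intercepted.dropFirst(fetchPrefix.count)))
    else { return false }

    let task = session.dataTask(with: realURL) { data, _, error in
        guard let data = data, error == nil,
              let text = String(data: data, encoding: .utf8) else {
            loadingRequest.finishLoading(with: error)
            return
        }
        // Step 4: hand the edited bytes to the data request, then finish.
        loadingRequest.dataRequest?.respond(with: Data(edit(text).utf8))
        loadingRequest.finishLoading()
    }
    task.resume()  // step 2: started outside the completion handler
    return true    // step 3: we promise to finish the request asynchronously
}
```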

INTERCEPTING - which requests get which type of treatment

Lots of requests will come in, some need to be redirected, some need to be given manufactured/altered data responses. Here are the types of requests you'll see in the order they'll come in and what to do with each:

  1. a request to the master playlist, but the DataRequest's RequestedLength is 2 - just redirect (TYPE A)
  2. a request to the master playlist, but the DataRequest's RequestedLength matches the (unedited) length of the master playlist - do your own request to the master playlist so you can edit it and return the edited result (TYPE B)
  3. a request to the master playlist, but the DataRequest's RequestedLength is humongous - do the same thing as you did for the previous one (TYPE B)
  4. lots of requests will come through for fragments of audio and video - all these requests need to be redirected (TYPE A)
  5. once you get the master playlist edited correctly (and a subtitle selected) a request will come through for the subtitle playlist - edit this one to return a manufactured subtitle playlist (TYPE B)
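The routing decision in the list above is plain logic, so it can be sketched in Swift without touching AVFoundation. The enum and function names are hypothetical, and the default suffix and prefix are the Azure Media Services and "CUSTOMSCHEME" values used in this answer; adjust them for another backend.

```swift
import Foundation

enum Treatment { case redirect, editMasterPlaylist, buildSubtitlePlaylist }

/// Decides how an intercepted request should be handled.
func treatment(forURL url: String,
               requestedLength: Int,
               masterSuffix: String = ".ism/manifest(format=m3u8-aapl)",
               subtitlePrefix: String = "CUSTOMSCHEMESubtitlePlaylist") -> Treatment {
    // Request for one of the subtitle playlists we advertised in the edited master.
    if url.hasPrefix(subtitlePrefix) { return .buildSubtitlePlaylist }
    let isMaster = url.lowercased().hasSuffix(masterSuffix)
    // Fragments, and the initial 2-byte probe of the master playlist, pass through.
    if !isMaster || requestedLength == 2 { return .redirect }
    return .editMasterPlaylist
}
```

In the delegate, the arguments would come from loadingRequest.request.url and loadingRequest.dataRequest?.requestedLength.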

HOW TO EDIT THE PLAYLISTS - master playlist

The master playlist is easy to edit. The change is two things:

  1. each video resource has its own line and they all need to be told about the subtitle group (for each line that starts with #EXT-X-STREAM-INF I'm adding ,SUBTITLES="subs" on the end)
  2. new lines need to be added for each subtitle language/type, all belonging to the subtitle group with their own URL (so for each type, add a line like #EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",LANGUAGE="!!!yourLanguageHere!!!",NAME="!!!yourNameHere!!!",AUTOSELECT=YES,URI="!!!yourCustomUrlHere!!!")

The !!!yourCustomUrlHere!!! you use in step 2 will have to be detected by you when it's used for a request so you can return the manufactured subtitle playlist as part of the response, so set it to something unique. That Url will also have to use the "CUSTOMSCHEME" thing so that it comes to the delegate. You can also check out this streaming example to see how the manifest should look: https://developer.apple.com/streaming/examples/basic-stream-osx-ios5.html (sniff the network traffic with the browser debugger to see it).
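The two master playlist edits can be sketched in Swift as a pure string transformation. The SubtitleTrack type and the addSubtitles function are hypothetical names; the output is joined with "\n" and blank lines are dropped, which is an assumption that your player accepts LF line endings.

```swift
import Foundation

struct SubtitleTrack {
    let language: String
    let name: String
    let playlistURI: String // must use the custom scheme so the delegate sees it
}

/// Tags every variant stream with the subtitle group and appends one
/// #EXT-X-MEDIA line per subtitle track.
func addSubtitles(to master: String, tracks: [SubtitleTrack], groupID: String = "subs") -> String {
    var lines = master.split(whereSeparator: \.isNewline).map(String.init)
    for i in lines.indices where lines[i].hasPrefix("#EXT-X-STREAM-INF:") {
        lines[i] += ",SUBTITLES=\"\(groupID)\""
    }
    for track in tracks {
        lines.append("#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID=\"\(groupID)\","
            + "LANGUAGE=\"\(track.language)\",NAME=\"\(track.name)\","
            + "AUTOSELECT=YES,URI=\"\(track.playlistURI)\"")
    }
    return lines.joined(separator: "\n")
}
```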

HOW TO EDIT THE PLAYLISTS - subtitle playlist

The subtitle playlist is a little more complicated. You have to make the whole thing yourself. The way I've done it is to actually grab the WebVtt file myself inside the DataTask callback, then parse the thing down to find the end of the very last timestamp sequence, convert that to an integer number of seconds, and then insert that value in a couple places in a big string. Again, you can use the example listed above and sniff network traffic to see a real example for yourself. So it looks like this:

#EXTM3U
#EXT-X-TARGETDURATION:!!!thatLengthIMentioned!!!
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-PLAYLIST-TYPE:VOD
#EXTINF:!!!thatLengthIMentioned!!!
!!!absoluteUrlToTheWebVttFileOnTheServer!!!
#EXT-X-ENDLIST

Note that the playlist does NOT segment the vtt file as Apple recommends because this can't be done client-side (source: https://developer.apple.com/forums/thread/113063?answerId=623328022#623328022 ). Also note that I do NOT put a comma at the end of the "EXTINF" line even though Apple's example here says to do that, because it seems to break it: https://developer.apple.com/videos/play/wwdc2012/512/
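As a rough Swift sketch of the same idea: find the end time of the last cue, then drop it into the template above. Both function names are hypothetical. Unlike the C# further down, this version rounds the duration up rather than truncating it, which is my own choice so that the target duration is never shorter than the file. It also follows this answer in omitting the comma after the EXTINF duration, even though RFC 8216 writes the tag as #EXTINF:<duration>,[<title>].

```swift
import Foundation

/// End time of the last cue in whole seconds (rounded up), or nil if no
/// "-->" timing line is found.
func lastCueEndSeconds(inVTT vtt: String) -> Int? {
    guard let timing = vtt.split(whereSeparator: \.isNewline).last(where: { $0.contains("-->") }),
          let arrow = timing.range(of: "-->"),
          // The end timestamp is the first token after the arrow; cue settings may follow it.
          let stamp = timing[arrow.upperBound...]
              .split(whereSeparator: { $0 == " " || $0 == "\t" }).first
    else { return nil }

    // Accepts "hh:mm:ss.ttt" as well as "mm:ss.ttt".
    var seconds = 0.0
    for field in stamp.split(separator: ":") {
        guard let value = Double(field) else { return nil }
        seconds = seconds * 60 + value
    }
    return Int(seconds.rounded(.up))
}

/// One-segment VOD playlist that points at the whole, unsegmented .vtt file.
func subtitlePlaylist(vttURL: String, durationSeconds: Int) -> String {
    return [
        "#EXTM3U",
        "#EXT-X-TARGETDURATION:\(durationSeconds)",
        "#EXT-X-VERSION:3",
        "#EXT-X-MEDIA-SEQUENCE:0",
        "#EXT-X-PLAYLIST-TYPE:VOD",
        "#EXTINF:\(durationSeconds)", // no trailing comma, as reported above
        vttURL,
        "#EXT-X-ENDLIST",
    ].joined(separator: "\n")
}
```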

Now the actual code:

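using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using AVFoundation;
using Foundation;

public class CustomResourceLoaderDelegate : AVAssetResourceLoaderDelegate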
{
    public const string LoaderInterceptionWorkaroundUrlPrefix = "CUSTOMSCHEME"; // a scheme other than http(s) needs to be used for AVUrlAsset's URL or ShouldWaitForLoadingOfRequestedResource will never be called
    private const string SubtitlePlaylistBoomerangUrlPrefix = LoaderInterceptionWorkaroundUrlPrefix + "SubtitlePlaylist";
    private const string SubtitleBoomerangUrlSuffix = "m3u8";
    private readonly NSUrlSession _session;
    private readonly List<SubtitleBundle> _subtitleBundles;

    public CustomResourceLoaderDelegate(IEnumerable<WorkoutSubtitleDto> subtitles)
    {
        _subtitleBundles = subtitles.Select(subtitle => new SubtitleBundle {SubtitleDto = subtitle}).ToList();
        _session = NSUrlSession.FromConfiguration(NSUrlSessionConfiguration.DefaultSessionConfiguration);
    }

    public override bool ShouldWaitForLoadingOfRequestedResource(AVAssetResourceLoader resourceLoader,
        AVAssetResourceLoadingRequest loadingRequest)
    {
        var requestString = loadingRequest.Request.Url.AbsoluteString;
        var dataRequest = loadingRequest.DataRequest;

        if (requestString.StartsWith(SubtitlePlaylistBoomerangUrlPrefix))
        {
            var uri = new Uri(requestString);
            var targetLanguage = uri.Host.Split(".").First();
            var targetSubtitle = _subtitleBundles.FirstOrDefault(s => s.SubtitleDto.Language == targetLanguage);

            Debug.WriteLine("### SUBTITLE PLAYLIST " + requestString);
            if (targetSubtitle == null)
            {
                loadingRequest.FinishLoadingWithError(new NSError());
                return true;
            }
            var subtitlePlaylistTask = _session.CreateDataTask(NSUrlRequest.FromUrl(NSUrl.FromString(targetSubtitle.SubtitleDto.CloudFileURL)),
                (data, response, error) =>
                {
                    if (error != null)
                    {
                        loadingRequest.FinishLoadingWithError(error);
                        return;
                    }
                    if (data == null || !data.Any())
                    {
                        loadingRequest.FinishLoadingWithError(new NSError());
                        return;
                    }
                    MakePlaylistAndFragments(targetSubtitle, Encoding.UTF8.GetString(data.ToArray()));

                    loadingRequest.DataRequest.Respond(NSData.FromString(targetSubtitle.Playlist));
                    loadingRequest.FinishLoading();
                });
            subtitlePlaylistTask.Resume();
            return true;
        }

        if (!requestString.ToLower().EndsWith(".ism/manifest(format=m3u8-aapl)") || // lots of fragment requests will come through, we're just going to fix their URL so they can proceed normally (getting bits of video and audio)
            (dataRequest != null && 
             dataRequest.RequestedOffset == 0 && // this catches the first (of 3) master playlist requests. the thing sending out these requests and handling the responses seems unable to be satisfied by our handling of this (just for the first request), so that first request is just let through. if you mess with request 1 the whole thing stops after sending request 2. although this means the first request doesn't get the same edited master playlist as the second or third, apparently that's fine.
             dataRequest.RequestedLength == 2 &&
             dataRequest.CurrentOffset == 0))
        {
            Debug.WriteLine("### REDIRECTING REQUEST " + requestString);
            var redirect = new NSUrlRequest(new NSUrl(requestString.Replace(LoaderInterceptionWorkaroundUrlPrefix, "")));
            loadingRequest.Redirect = redirect;
            var fakeResponse = new NSHttpUrlResponse(redirect.Url, 302, null, null);
            loadingRequest.Response = fakeResponse;
            loadingRequest.FinishLoading();
            return true;
        }

        var correctedRequest = new NSMutableUrlRequest(new NSUrl(requestString.Replace(LoaderInterceptionWorkaroundUrlPrefix, "")));
        if (dataRequest != null)
        {
            var headers = new NSMutableDictionary();
            foreach (var requestHeader in loadingRequest.Request.Headers)
            {
                headers.Add(requestHeader.Key, requestHeader.Value);
            }
            correctedRequest.Headers = headers;
        }

        var masterPlaylistTask = _session.CreateDataTask(correctedRequest, (data, response, error) =>
        {
            Debug.WriteLine("### REQUEST CARRIED OUT AND RESPONSE EDITED " + requestString);
            if (error == null)
            {
                var dataString = Encoding.UTF8.GetString(data.ToArray());
                var stringWithSubsAdded = AddSubs(dataString);

                dataRequest?.Respond(NSData.FromString(stringWithSubsAdded));

                loadingRequest.FinishLoading();
            }
            else
            {
                loadingRequest.FinishLoadingWithError(error);
            }
        });
        masterPlaylistTask.Resume();
        return true;
    }

    private string AddSubs(string dataString)
    {
        var tracks = dataString.Split("\r\n").ToList();
        for (var ii = 0; ii < tracks.Count; ii++)
        {
            if (tracks[ii].StartsWith("#EXT-X-STREAM-INF"))
            {
                tracks[ii] += ",SUBTITLES=\"subs\"";
            }
        }

        tracks.AddRange(_subtitleBundles.Select(subtitle => "#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID=\"subs\",LANGUAGE=\"" + subtitle.SubtitleDto.Language + "\",NAME=\"" + subtitle.SubtitleDto.Title + "\",AUTOSELECT=YES,URI=\"" + SubtitlePlaylistBoomerangUrlPrefix + "://" + subtitle.SubtitleDto.Language + "." + SubtitleBoomerangUrlSuffix + "\""));

        var finalPlaylist = string.Join("\r\n", tracks);
        return finalPlaylist;
    }

    private void MakePlaylistAndFragments(SubtitleBundle subtitle, string vtt)
    {
        var noWhitespaceVtt = vtt.Replace(" ", "").Replace("\n", "").Replace("\r", "");
        var arrowIndex = noWhitespaceVtt.LastIndexOf("-->");
        var afterArrow = noWhitespaceVtt.Substring(arrowIndex);
        var firstColon = afterArrow.IndexOf(":");
        var period = afterArrow.IndexOf(".");
        var timeString = afterArrow.Substring(firstColon - 2, period - (firstColon - 2)); // "hh:mm:ss" of the last cue's end time, up to but not including the period
        var lastTime = (int)TimeSpan.Parse(timeString).TotalSeconds;

        var resultLines = new List<string>
        {
            "#EXTM3U",
            "#EXT-X-TARGETDURATION:" + lastTime,
            "#EXT-X-VERSION:3",
            "#EXT-X-MEDIA-SEQUENCE:0",
            "#EXT-X-PLAYLIST-TYPE:VOD",
            "#EXTINF:" + lastTime,
            subtitle.SubtitleDto.CloudFileURL,
            "#EXT-X-ENDLIST"
        };

        subtitle.Playlist = string.Join("\r\n", resultLines);
    }

    private class SubtitleBundle
    {
        public WorkoutSubtitleDto SubtitleDto { get; set; }
        public string Playlist { get; set; }
    }

    public class WorkoutSubtitleDto
    {
        public int WorkoutID { get; set; }
        public string Language { get; set; }
        public string Title { get; set; }
        public string CloudFileURL { get; set; }
    }
}
answered Nov 12 '22 15:11 by SomeXamarinDude


If using a streaming service where you can edit the streaming manifest and upload other files where your encoded media is, then with a little bit of manual work (which could be scripted out), you can put the subtitles in the manifest in the way that iOS expects it to be. I was able to get this to work with Azure Media Services, although it is a little hacky.

Since Azure Media Services—which I'll call AMS from now on—streaming endpoints create the streaming manifest on the fly, I couldn't just add the necessary changes to a file. Instead, I created a new master playlist based off of AMS' generated playlist. @SomeXamarinDude explains in his answer the changes that are needed in the master playlist, but I'm going to include an example for completeness.

Let's say the AMS generated master playlist from a streaming endpoint with the URL:

https://mediaservicename-use2.streaming.media.azure.net/d36754c2-c8cf-4f0f-b73f-dafd21fff50f/YOUR-ENCODED-ASSET.ism/manifest\(format\=m3u8-aapl\)

Looks like this:

#EXTM3U
#EXT-X-VERSION:4
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",NAME="aac_eng_2_128079_2_1",LANGUAGE="eng",DEFAULT=YES,AUTOSELECT=YES,URI="QualityLevels(128079)/Manifest(aac_eng_2_128079_2_1,format=m3u8-aapl)"
#EXT-X-STREAM-INF:BANDWIDTH=623543,RESOLUTION=320x180,CODECS="avc1.640015,mp4a.40.2",AUDIO="audio"
QualityLevels(466074)/Manifest(video,format=m3u8-aapl)
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=623543,RESOLUTION=320x180,CODECS="avc1.640015",URI="QualityLevels(466074)/Manifest(video,format=m3u8-aapl,type=keyframes)"
#EXT-X-STREAM-INF:BANDWIDTH=976825,RESOLUTION=480x270,CODECS="avc1.64001e,mp4a.40.2",AUDIO="audio"
QualityLevels(811751)/Manifest(video,format=m3u8-aapl)
...

Then the manually created playlist, which I'll name manually-created-master-playlist.m3u8, will need to look like this:

#EXTM3U
#EXT-X-VERSION:4
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",LANGUAGE="en",AUTOSELECT=YES,URI="https://mediaservicename-use2.streaming.media.azure.net/d36754c2-c8cf-4f0f-b73f-dafd21fff50f/subtitle-playlist.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",NAME="aac_eng_2_128079_2_1",LANGUAGE="eng",DEFAULT=YES,AUTOSELECT=YES,URI="YOUR-ENCODED-ASSET.ism/QualityLevels(128079)/Manifest(aac_eng_2_128079_2_1,format=m3u8-aapl)"
#EXT-X-STREAM-INF:SUBTITLES="subs",BANDWIDTH=623543,RESOLUTION=320x180,CODECS="avc1.640015,mp4a.40.2",AUDIO="audio"
YOUR-ENCODED-ASSET.ism/QualityLevels(466074)/Manifest(video,format=m3u8-aapl)
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=623543,RESOLUTION=320x180,CODECS="avc1.640015",URI="YOUR-ENCODED-ASSET.ism/QualityLevels(466074)/Manifest(video,format=m3u8-aapl,type=keyframes)"
#EXT-X-STREAM-INF:SUBTITLES="subs",BANDWIDTH=976825,RESOLUTION=480x270,CODECS="avc1.64001e,mp4a.40.2",AUDIO="audio"
YOUR-ENCODED-ASSET.ism/QualityLevels(811751)/Manifest(video,format=m3u8-aapl)
...

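Note the path changes I had to make to the various bitrate playlists: because the manual playlist no longer lives under the .ism path, each relative URI is now prefixed with YOUR-ENCODED-ASSET.ism/.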
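Since this manual step "could be scripted out", here is a small Swift sketch of the path rewrite on its own. The function name prefixRelativeURIs is hypothetical, and it assumes each tag line carries at most one URI="..." attribute and that absolute URIs start with http:// or https://. Adding SUBTITLES="subs" and the #EXT-X-MEDIA subtitle line would be a separate step.

```swift
import Foundation

/// Prefixes every relative URI in a playlist (bare URI lines and URI="..."
/// attributes) so the playlist still resolves after being moved up a level.
func prefixRelativeURIs(in playlist: String, with prefix: String) -> String {
    func isAbsolute(_ s: String) -> Bool {
        return s.hasPrefix("http://") || s.hasPrefix("https://")
    }
    return playlist.split(whereSeparator: \.isNewline).map { raw -> String in
        let line = String(raw)
        if !line.hasPrefix("#") {
            // A bare URI line, such as a variant playlist path.
            return isAbsolute(line) ? line : prefix + line
        }
        // A tag line: only touch it if it carries a URI attribute.
        guard let marker = line.range(of: "URI=\"") else { return line }
        let rest = String(line[marker.upperBound...])
        if isAbsolute(rest) { return line }
        return String(line[..<marker.upperBound]) + prefix + rest
    }.joined(separator: "\n")
}
```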

This manual playlist will then need to be uploaded to the same Azure Storage Container that contains the rest of your encoded media assets.

I also had to create and upload a file called subtitle-playlist.m3u8 and a transcript.vtt to the same Azure Storage Container. My subtitle playlist looked like this:

#EXTM3U
#EXT-X-TARGETDURATION:61
#EXT-X-ALLOW-CACHE:YES
#EXT-X-PLAYLIST-TYPE:VOD
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:1
#EXTINF:61.061000
https://mediaservicename-use2.streaming.media.azure.net/d36754c2-c8cf-4f0f-b73f-dafd21fff50f/transcript.vtt
#EXT-X-ENDLIST

Note that some of the subtitle playlist values depend on the length of the WebVTT file.

At this point, you should be able to point an HLS player to the following URL and be able to enable closed captions:

https://mediaservicename-use2.streaming.media.azure.net/d36754c2-c8cf-4f0f-b73f-dafd21fff50f/manually-created-master-playlist.m3u8

I hope this helps someone. Apparently there is a ticket in the works for fixing this on AMS' side.

Thank you to @SomeXamarinDude for your answer; I would have been totally lost with this issue if it weren't for all the groundwork you put in.

answered Nov 12 '22 17:11 by zakinator123