JavaScript parser for RDF/JSON from WebVTT

Good evening.

Straight to the point - I need a script that grabs the RDF/JSON structure for a specific time interval in a WebVTT file. Does such a thing exist?

RDF/JSON is a file structure specified by Talis that looks like this:

{ "S" : { "P" : [ O ] } }

The WebVTT file carries this structure in its cues like this:

0
00:00:00,000 --> 00:00:46,119
{ "S" : { "P" : [ O ] } }

1
00:00:48,000 --> 00:00:50,211
{ "S" : { "P" : [ O ] } }

...

I would use such a file while viewing video files: when I click on some part of the timeline, a script fetches the corresponding RDF/JSON code (I'm able to do this already, since a WebVTT parser exists), and then a parser extracts the requested information from the objects in the RDF/JSON structure.
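For illustration only, here is a minimal sketch of the lookup described above: it splits a file laid out like the sample into cues and returns the JSON payload covering a given time. The function names (`timestampToSeconds`, `parseMetadataCues`, `findCueAt`) are made up for this example, it assumes a payload never contains a blank line, and it is not a complete WebVTT parser.

```javascript
// Convert "HH:MM:SS,mmm" (as in the sample) or "HH:MM:SS.mmm" to seconds.
function timestampToSeconds(stamp) {
  var match = /^(\d+):(\d{2}):(\d{2})[.,](\d{3})$/.exec(stamp.trim());
  if (!match) {
    throw new Error("Unrecognised timestamp: " + stamp);
  }
  return Number(match[1]) * 3600 + Number(match[2]) * 60 +
    Number(match[3]) + Number(match[4]) / 1000;
}

// Split the file into blank-line-separated blocks and keep those that
// contain a timing line. Assumption: a JSON payload has no blank lines.
// JSON.parse throws if a payload is not valid JSON.
function parseMetadataCues(fileText) {
  var blocks = fileText.replace(/\r\n?/g, "\n").split(/\n{2,}/);
  var cues = [];
  blocks.forEach(function (block) {
    var lines = block.trim().split("\n");
    for (var i = 0; i < lines.length; i++) {
      if (lines[i].indexOf("-->") === -1) continue;
      var timing = lines[i].split("-->");
      cues.push({
        start: timestampToSeconds(timing[0]),
        // Ignore any cue settings that follow the end timestamp.
        end: timestampToSeconds(timing[1].trim().split(/\s+/)[0]),
        data: JSON.parse(lines.slice(i + 1).join("\n"))
      });
      break; // one timing line per block
    }
  });
  return cues;
}

// Return the first cue whose interval contains the given time, or null.
function findCueAt(cues, seconds) {
  for (var i = 0; i < cues.length; i++) {
    if (seconds >= cues[i].start && seconds < cues[i].end) {
      return cues[i];
    }
  }
  return null;
}
```

With the cues parsed once, a click on the timeline would only need `findCueAt(cues, video.currentTime)`, where `video` stands for the page's video element.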

I was really happy when I saw that jQuery implements getJSON, but it seems to work only for "normal" JSON files.

The best thing would probably be to just write the script myself, but my time and knowledge are very limited, so I would like to hear any suggestions or solutions that anybody might know of.

3mpetri asked Aug 01 '11 22:08


2 Answers

I've written a WebVTT parser for my <track>/HTML5 video captioning polyfill Captionator.

Feel free to pick apart the source of the development branch, which has the best WebVTT compliance and is probably a better reference than the stable branch.

The parser code starts here: https://github.com/cgiffard/Captionator/blob/captioncrunch/js/captionator.js#L1686

Ultimately though, what you're describing roughly matches the intended use case for the metadata track type, as described in the WHATWG's TimedTextTrack spec. You can use Captionator to provide support for it. (I'd love to recommend another library as well, but I'm not aware of anything else that implements the TimedTextTrack JS API you'll need and doesn't come bundled with an entire video player.) The TextTrack.oncuechange event and the TextTrack.activeCues list let you listen for changes to cues when the user seeks within the video timeline. You can then get the text of each cue (less the cue metadata and header) and parse it as JSON. Just set up a track like the one below:

<video id="video" src="myvideo.webm" poster="poster.jpg" width="512" height="288">
    <track kind="metadata" src="meta.webvtt" type="text/webvtt" srclang="en" label="Metadata Track" default />
</video>

Then include the Captionator library, initialise it as per the documentation, select your track and set up an event handler. You can access the text of an individual cue like so:

var cueText = document.getElementById("video").tracks[0].activeCues[0].getCueAsSource();

Then just:

var RDFData = JSON.parse(cueText);
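To tie those two lines into an event handler, something along these lines might work. This is a sketch rather than Captionator's documented usage: `parseCuePayload` and `handleCueChange` are hypothetical helper names, and the only Captionator calls assumed are the ones already mentioned above (`tracks`, `activeCues`, `getCueAsSource`, `oncuechange`).

```javascript
// Parse a cue's text as JSON; return null instead of throwing on bad input.
function parseCuePayload(cueText) {
  try {
    return JSON.parse(cueText);
  } catch (e) {
    return null;
  }
}

// Walk the currently active cues of a track and hand each successfully
// parsed payload to the callback.
function handleCueChange(track, onData) {
  var cues = track.activeCues || [];
  for (var i = 0; i < cues.length; i++) {
    var data = parseCuePayload(cues[i].getCueAsSource());
    if (data !== null) {
      onData(data, cues[i]);
    }
  }
}

// Assumed browser wiring, after Captionator has been initialised:
// var track = document.getElementById("video").tracks[0];
// track.oncuechange = function () {
//   handleCueChange(track, function (data) { console.log(data); });
// };
```

Returning null for unparseable cues keeps one malformed cue from breaking the handler for the rest of the video.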

Good luck :)

Christopher answered Sep 28 '22 00:09


It seems that RDF/JSON is in fact just a complex, nested JSON structure with arrays, so standard JSON parsing (JSON.parse, or the parsing that jQuery's getJSON performs) will successfully read the data once it's fetched from the WebVTT timed structure.
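As a rough sketch of what reading the parsed structure could look like: the helpers below walk the `{ "S" : { "P" : [ O ] } }` shape from the question. The names `getObjects` and `getValues` are made up for this example, the sample URIs are placeholders, and the `value`/`type` keys on each object are what I understand the Talis RDF/JSON layout to use.

```javascript
// Return the array of object entries for a subject/predicate pair,
// or an empty array if either key is missing.
function getObjects(graph, subject, predicate) {
  var predicates = graph[subject];
  if (!predicates || !Array.isArray(predicates[predicate])) {
    return [];
  }
  return predicates[predicate];
}

// Convenience wrapper: only the "value" of each object entry.
function getValues(graph, subject, predicate) {
  return getObjects(graph, subject, predicate).map(function (object) {
    return object.value;
  });
}

// Example cue payload (placeholder URIs, not taken from a real file).
var cueText = JSON.stringify({
  "http://example.org/scene/1": {
    "http://purl.org/dc/terms/title": [
      { "value": "Opening scene", "type": "literal" }
    ]
  }
});

var graph = JSON.parse(cueText);
var titles = getValues(
  graph,
  "http://example.org/scene/1",
  "http://purl.org/dc/terms/title"
);
console.log(titles); // [ 'Opening scene' ]
```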

3mpetri answered Sep 27 '22 22:09