YouTube v3 API captions downloading

I'm trying to download captions from some videos on YouTube using their NuGet package. Here's some code:

var request = _youtube.Search.List("snippet,id");
request.Q = "Bill Gates";
request.MaxResults = 50;
request.Type = "video";
var results = request.Execute();
foreach (var result in results.Items)
{
    var captionListRequest = _youtube.Captions.List("id,snippet", result.Id.VideoId);
    var captionListResponse = captionListRequest.Execute();
    var russianCaptions =
        captionListResponse.Items.FirstOrDefault(c => c.Snippet.Language.ToLower() == "ru");
    if (russianCaptions != null)
    {
        var downloadRequest = _youtube.Captions.Download(russianCaptions.Id);
        downloadRequest.Tfmt = CaptionsResource.DownloadRequest.TfmtEnum.Srt;
        var ms = new MemoryStream();
        downloadRequest.Download(ms);
    }
}

When the Download method is called, I get a weird Newtonsoft.Json exception that says:

    Newtonsoft.Json.JsonReaderException: 'Unexpected character encountered while parsing value: T. Path '', line 0, position 0.'
   at Newtonsoft.Json.JsonTextReader.ParseValue()

I've read some other threads on caption downloading problems and have tried changing my authorization workflow: first I tried using just the API key, then I also tried OAuth. Here's how it looks now:

var credential = GoogleWebAuthorizationBroker.AuthorizeAsync(
    new ClientSecrets
    {
        ClientId = "CLIENT_ID",
        ClientSecret = "CLIENT_SECRET"
    },
    new[] { YouTubeService.Scope.YoutubeForceSsl },
    "user",
    CancellationToken.None,
    new FileDataStore("Youtube.CaptionsCrawler")).Result;

_youtube = new YouTubeService(new BaseClientService.Initializer
{
    ApplicationName = "LKS Captions downloader",
    HttpClientInitializer = credential
});

So, is it even possible to do what I'm trying to achieve?

P.S. I dug into the YouTube NuGet package, and as far as I can see, the actual message I get back (the one Newtonsoft.Json is trying to deserialize, huh!) is "The permissions associated with the request are not sufficient to download the caption track. The request might not be properly authorized, or the video order might not have enabled third-party contributions for this caption."

So, do I have to be the video owner to download captions? But if so, how do other programs like Google2SRT work?

Asked Sep 16 '17 16:09 by Daniel Vygolov


1 Answer

I found this post: How to get "transcript" in youtube-api v3

You can get the captions with a GET request to: http://video.google.com/timedtext?lang={LANG}&v={VIDEOID}

Example: http://video.google.com/timedtext?lang=en&v=-osCkzoL53U

Note that the video must have subtitles that were added manually; this will not work if the captions are auto-generated.
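
A minimal C# sketch of the GET request described above, in case it helps. The TimedText class and its BuildUrl and FetchAsync methods are hypothetical names made up for illustration, not part of the YouTube NuGet package, and the sketch assumes the endpoint still responds the way this answer describes. It needs no API key or OAuth credential, only the language code and the video ID.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not part of Google.Apis.YouTube.v3.
public static class TimedText
{
    // Endpoint taken from the answer above; assumed to still be reachable.
    private const string BaseUrl = "http://video.google.com/timedtext";

    // Builds the request URL, escaping both values for use in a query string.
    public static string BuildUrl(string lang, string videoId)
    {
        return BaseUrl
            + "?lang=" + Uri.EscapeDataString(lang)
            + "&v=" + Uri.EscapeDataString(videoId);
    }

    // Fetches the caption track body as text. Returns null when the server
    // answers with a non-success status or an empty body, which is treated
    // here as "no captions for that language".
    public static async Task<string> FetchAsync(HttpClient http, string lang, string videoId)
    {
        using (var response = await http.GetAsync(BuildUrl(lang, videoId)))
        {
            if (!response.IsSuccessStatusCode)
            {
                return null;
            }

            var body = await response.Content.ReadAsStringAsync();
            return string.IsNullOrWhiteSpace(body) ? null : body;
        }
    }
}
```

Inside the loop from the question, the call could look like `var captions = await TimedText.FetchAsync(http, "ru", result.Id.VideoId);` with one shared HttpClient instance named http; a null result would mean the video has no manually added Russian track.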

Answered Sep 28 '22 09:09 by Janis S.