
View all comments on a YouTube video

Tags: java, youtube

I am trying to get all the comments on a YouTube video using a Java program. I cannot get them, though, because the page shows a "Show More" link instead of all the comments. I'm looking for a way to get either all the comments at once or pages of comments that I can iterate through. I already have the video id; I just need the comments.

I have tried using all_comments instead of watch in the URL, but it still doesn't show all the comments and redirects back to watch.

I have also looked at the YouTube API, but I can only find how to get a comment by its id, whereas I need to get all the comments for a video id.

If anyone knows how to do this please tell me.

I have added a 50 rep bounty for whoever can give me a good answer to this.

Walshy asked Feb 16 '16 19:02



2 Answers

Try this. It can download all the comments for a given video, and I have tested it.

https://github.com/egbertbouman/youtube-comment-downloader

python downloader.py --youtubeid YcZkCnPs45s --output OUT 
Downloading Youtube comments for video: YcZkCnPs45s 
Downloaded 1170 comment(s) 
Done!

The output is in JSON format:

{
  "text": "+Tony Northrup many thanks for the prompt reply - I'll try that.",
  "time": "1 day ago",
  "cid": "z13nfbog0ovqyntk322txzjamuensvpch.1455717946638546"
}
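Since the question asks for Java, here is a minimal sketch of reading that output file from a Java program. It is an assumption-laden illustration: the class name `CommentIdReader` is hypothetical, the file name `OUT` is taken from the command above, and a regular expression is used only because the `cid` values shown contain no escaped characters. For the `text` field, a real JSON parser would be the safer choice.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of youtube-comment-downloader.
public class CommentIdReader {

    // Matches "cid": "<value>" where <value> contains no double quotes.
    private static final Pattern CID =
            Pattern.compile("\"cid\"\\s*:\\s*\"([^\"]*)\"");

    /** Returns every cid value found in the given JSON text, in order. */
    public static List<String> extractCids(String json) {
        List<String> cids = new ArrayList<>();
        Matcher matcher = CID.matcher(json);
        while (matcher.find()) {
            cids.add(matcher.group(1));
        }
        return cids;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the "OUT" file name used in the command above.
        Path output = Paths.get(args.length > 0 ? args[0] : "OUT");
        String json = new String(Files.readAllBytes(output), StandardCharsets.UTF_8);
        System.out.println("Comments found: " + extractCids(json).size());
    }
}
```

Because the whole file is scanned as one string, the sketch does not depend on whether the downloader writes one object per line or pretty-printed objects.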
samsamara answered Sep 29 '22 07:09



You need to issue a comment threads list request for your video and then page forward using the next page token from each response:

import com.google.api.client.auth.oauth2.Credential;
import com.google.api.services.samples.youtube.cmdline.Auth;
import com.google.api.services.youtube.YouTube;
import com.google.api.services.youtube.model.Comment;
import com.google.api.services.youtube.model.CommentThread;
import com.google.api.services.youtube.model.CommentThreadListResponse;
import com.google.api.services.youtube.model.CommentThreadReplies;
import com.google.common.collect.Lists;

import java.util.List;

// The class name is only illustrative; Auth comes from the api-samples project linked below.
public class CommentThreadsExample {

    private static int counter = 0;
    private static YouTube youtube;

    public static void main(String[] args) throws Exception {
        // For Auth details consider:
        // https://github.com/youtube/api-samples/blob/master/java/src/main/java/com/google/api/services/samples/youtube/cmdline/Auth.java
        // Also don't forget secrets https://github.com/youtube/api-samples/blob/master/java/src/main/resources/client_secrets.json
        List<String> scopes = Lists.newArrayList("https://www.googleapis.com/auth/youtube.force-ssl");
        Credential credential = Auth.authorize(scopes, "commentthreads");
        youtube = new YouTube.Builder(Auth.HTTP_TRANSPORT, Auth.JSON_FACTORY, credential).build();

        String videoId = "video_id";

        // Get the first page of video comment threads
        CommentThreadListResponse commentsPage = prepareListRequest(videoId).execute();

        while (true) {
            handleCommentsThreads(commentsPage.getItems());

            // A null token means this was the last page
            String nextPageToken = commentsPage.getNextPageToken();
            if (nextPageToken == null)
                break;

            // Get the next page of video comment threads
            commentsPage = prepareListRequest(videoId).setPageToken(nextPageToken).execute();
        }

        System.out.println("Total: " + counter);
    }

    // Builds a list request for up to 100 published threads per page, as plain text
    private static YouTube.CommentThreads.List prepareListRequest(String videoId) throws Exception {

        return youtube.commentThreads()
                      .list("snippet,replies")
                      .setVideoId(videoId)
                      .setMaxResults(100L)
                      .setModerationStatus("published")
                      .setTextFormat("plainText");
    }

    private static void handleCommentsThreads(List<CommentThread> commentThreads) {

        for (CommentThread commentThread : commentThreads) {
            List<Comment> comments = Lists.newArrayList();
            comments.add(commentThread.getSnippet().getTopLevelComment());

            // Note: the replies part may hold only a subset of a thread's replies;
            // use comments.list with the parent comment id to fetch them all.
            CommentThreadReplies replies = commentThread.getReplies();
            if (replies != null)
                comments.addAll(replies.getComments());

            System.out.println("Found " + comments.size() + " comments.");

            // Do your comments logic here
            counter += comments.size();
        }
    }
}

See the api-samples repository (https://github.com/youtube/api-samples) if you need a skeleton project.
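The next-page-token loop can also be tried out without credentials. The following is a minimal sketch that isolates the paging logic from the YouTube client. `TokenPager`, `Page`, and `fetchAll` are hypothetical names introduced for illustration, and the fake three-page source in `main` merely stands in for a real list request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper illustrating token-based paging; not part of the YouTube client library.
public class TokenPager {

    /** One page of results plus the token of the next page (null on the last page). */
    public static final class Page<T> {
        final List<T> items;
        final String nextPageToken;

        public Page(List<T> items, String nextPageToken) {
            this.items = items;
            this.nextPageToken = nextPageToken;
        }
    }

    /**
     * Calls fetchPage repeatedly, starting with a null token, and collects
     * all items until a page comes back without a next page token.
     */
    public static <T> List<T> fetchAll(Function<String, Page<T>> fetchPage) {
        List<T> all = new ArrayList<>();
        String token = null;
        do {
            Page<T> page = fetchPage.apply(token);
            all.addAll(page.items);
            token = page.nextPageToken;
        } while (token != null);
        return all;
    }

    public static void main(String[] args) {
        // Fake source with three pages, standing in for a real API call.
        Function<String, Page<String>> fakeSource = token -> {
            if (token == null) return new Page<>(Arrays.asList("c1", "c2"), "p2");
            if (token.equals("p2")) return new Page<>(Arrays.asList("c3"), "p3");
            return new Page<>(Arrays.asList("c4"), null);
        };
        System.out.println(fetchAll(fakeSource)); // prints [c1, c2, c3, c4]
    }
}
```

With the real client, the function passed to `fetchAll` would build the list request, set the page token when it is non-null, execute it, and wrap the returned items and next page token in a `Page`.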


Update

Failing to get all the comments can also be caused by the quota limits (at least, that is what I ran into):

  • units/day 50,000,000
  • units/100seconds/user 300,000

These rules are not specific to Java, Python, JavaScript, or any other language. If you need more than the quota allows, you can apply for a higher quota. I would start by controlling your throughput, though: it is very easy to exceed the 100seconds/user quota.
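One way to control throughput is to track the units spent in a sliding window on the client side and wait before sending a request that would not fit. The sketch below is only an illustration: `QuotaThrottle` and `tryAcquire` are hypothetical names, the window figures come from the limits listed above, and the per-request cost in `main` is a placeholder you would replace with the real cost of your calls.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical client-side throttle; not part of the YouTube client library.
public class QuotaThrottle {

    private final long maxUnitsPerWindow;
    private final long windowMillis;
    // Each entry is {timestampMillis, units}, oldest first.
    private final Deque<long[]> spent = new ArrayDeque<>();
    private long unitsInWindow = 0;

    public QuotaThrottle(long maxUnitsPerWindow, long windowMillis) {
        this.maxUnitsPerWindow = maxUnitsPerWindow;
        this.windowMillis = windowMillis;
    }

    /**
     * Tries to spend the given units at time nowMillis. Returns 0 if the units
     * fit in the current window (and records them); otherwise returns how many
     * milliseconds to wait until the oldest entry expires, after which the
     * caller should try again.
     */
    public synchronized long tryAcquire(long units, long nowMillis) {
        if (units > maxUnitsPerWindow) {
            throw new IllegalArgumentException("Request can never fit in one window");
        }
        // Forget entries that have left the sliding window.
        while (!spent.isEmpty() && nowMillis - spent.peekFirst()[0] >= windowMillis) {
            unitsInWindow -= spent.pollFirst()[1];
        }
        if (unitsInWindow + units <= maxUnitsPerWindow) {
            spent.addLast(new long[] {nowMillis, units});
            unitsInWindow += units;
            return 0;
        }
        return spent.peekFirst()[0] + windowMillis - nowMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        // 300,000 units per 100 seconds, as in the per-user limit above.
        QuotaThrottle throttle = new QuotaThrottle(300_000L, 100_000L);
        long assumedCostPerRequest = 1_000L; // placeholder, not an official figure

        long wait;
        while ((wait = throttle.tryAcquire(assumedCostPerRequest, System.currentTimeMillis())) > 0) {
            Thread.sleep(wait);
        }
        System.out.println("Budget reserved; the API request can be sent now.");
    }
}
```

Passing the current time in as a parameter keeps the class easy to test with made-up timestamps.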

Michael Cheremuhin answered Sep 29 '22 07:09
