
How to crawl all comments of a single YouTube video, beyond 100 pages

I need to crawl all of the comments (more than 2,600,000 comments, over 5,000 pages) on PSY's Gangnam Style video on YouTube, see: http://www.youtube.com/all_comments?v=9bZkp7q19f0

The problem is:

1) If I use the GData service, Google provides no more than 1,000 comments through the comment feed.

2) If I crawl the HTML directly from

http://www.youtube.com/all_comments?v=9bZkp7q19f0&page=$(page)

by incrementing the page parameter, it fails after page 101: from that point on, no comments are displayed on the page.

So, how can I get around this problem?

P.S. My crawler is implemented as a Chrome extension in JavaScript. It reads the comment tags of the loaded page and then loads the next page.

Robin Hsiang asked Nov 12 '22 18:11



1 Answer

You may be able to extract the data by crawling the pages and working around each problem as you run into it, but that is not the proper way.

You should use the YouTube API for this, and check the other developer resources related to it.
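As a rough illustration of the API route, here is a minimal sketch that assumes the YouTube Data API v3 `commentThreads.list` endpoint and an API key of your own. Neither is mentioned in the question, so treat both as assumptions. It follows `nextPageToken` from one response to the next instead of incrementing a page number. The names `fetchAllComments`, `getJson`, and `maxPages` are hypothetical helpers made up for this example. It collects top-level comments only (not replies), and API quota limits still apply.

```javascript
// Sketch only: assumes the YouTube Data API v3 commentThreads.list endpoint.
const API_URL = "https://www.googleapis.com/youtube/v3/commentThreads";

// Default JSON loader (hypothetical helper); uses the global fetch available
// in browsers, Chrome extensions, and recent Node.js versions.
async function defaultGetJson(url) {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
}

// Collects top-level comments by following nextPageToken until it runs out
// or until maxPages responses have been read. getJson is injectable so the
// paging logic can be tried without network access.
async function fetchAllComments(videoId, apiKey, getJson = defaultGetJson, maxPages = Infinity) {
  const comments = [];
  let pageToken = "";
  let pages = 0;
  do {
    const params = new URLSearchParams({
      part: "snippet",
      videoId: videoId,
      maxResults: "100", // assumed per-request maximum for this endpoint
      textFormat: "plainText",
      key: apiKey,
    });
    if (pageToken) params.set("pageToken", pageToken);

    const data = await getJson(`${API_URL}?${params}`);
    for (const item of data.items || []) {
      const s = item.snippet.topLevelComment.snippet;
      comments.push({
        author: s.authorDisplayName,
        text: s.textDisplay,
        publishedAt: s.publishedAt,
      });
    }
    pageToken = data.nextPageToken || "";
    pages += 1;
  } while (pageToken && pages < maxPages);
  return comments;
}
```

It could be called as `fetchAllComments("9bZkp7q19f0", "YOUR_API_KEY")` from the extension's script. For a video with millions of comments, it is probably wiser to pass a `maxPages` limit and store results incrementally rather than hold everything in memory.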

mtk answered Nov 15 '22 09:11
