
Scrape YouTube videos from a specific channel and search by title?

I am using this code to get the video URLs of a YouTube channel. It works fine, but I would like to add an option to search the channel for a video with a specific title and get the URL of the first video whose title matches the search phrase.

from bs4 import BeautifulSoup
import requests

url="https://www.youtube.com/feeds/videos.xml?user=LinusTechTips"
html = requests.get(url)
soup = BeautifulSoup(html.text, "lxml")

for entry in soup.find_all("entry"):
    for link in entry.find_all("link"):
        print(link["href"])

Kaz25 asked Dec 14 '22 08:12



1 Answer

My previous answer collected all the video titles in the given YouTube channel, which is what you were looking for. In the comments, though, you said you want to run the script from a cron job. That takes a bit more work, so I am adding another answer.

from bs4 import BeautifulSoup
from lxml import etree
import urllib.request
import requests
import sys

def fetch_titles(url):
    video_titles = []
    html = requests.get(url)
    soup = BeautifulSoup(html.text, "lxml")
    for entry in soup.find_all("entry"):
        for link in entry.find_all("link"):
            youtube = etree.HTML(urllib.request.urlopen(link["href"]).read()) 
            video_title = youtube.xpath("//span[@id='eow-title']/@title") 
            if len(video_title)>0:
                video_titles.append({"title":video_title[0], "url":link.attrs["href"]})
    return video_titles

def main():
    if len(sys.argv) == 1:
        print("Error: you should specify a keyword")
        print("e.g.: python3 ./main.py KEYWORD")
        return

    url="https://www.youtube.com/feeds/videos.xml?user=LinusTechTips"
    keyword = sys.argv[1]

    video_titles = fetch_titles(url)
    for video in video_titles:
        if keyword in video["title"]:
            print(video["url"])
            break  # stop after the first match; remove this line to print every match


if __name__ == "__main__":
    main()

When you call the script from the terminal, pass the keyword as an argument, like this:

$ python3 ./main.py Mac

Here, Mac is the keyword and main.py is the filename of the Python script.

Output:

https://www.youtube.com/watch?v=l_IHSRPVqwQ
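
As a possible shortcut, each entry in the channel feed normally carries its own title element next to the link, so the titles can be searched without opening every video page. Below is a minimal sketch of that idea, not a tested replacement for the script above. The names first_match and SAMPLE_FEED, the sample titles, and the video IDs are hypothetical, and the case-insensitive comparison is my own assumption about what is wanted.

```python
import xml.etree.ElementTree as ET

# Atom namespace used by the feed's <entry>, <title> and <link> elements.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def first_match(feed_xml, keyword):
    """Return the URL of the first entry whose title contains keyword, else None."""
    root = ET.fromstring(feed_xml)
    # Case-insensitive comparison (an assumption, not part of the original script).
    needle = keyword.casefold()
    for entry in root.findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        link = entry.find("atom:link", ATOM)
        if link is not None and needle in title.casefold():
            return link.get("href")
    return None


# Hypothetical two-entry feed standing in for the real response body.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Building a budget PC</title>
    <link rel="alternate" href="https://www.youtube.com/watch?v=EXAMPLE_ID_1"/>
  </entry>
  <entry>
    <title>Is the new Mac worth it?</title>
    <link rel="alternate" href="https://www.youtube.com/watch?v=EXAMPLE_ID_2"/>
  </entry>
</feed>"""

if __name__ == "__main__":
    print(first_match(SAMPLE_FEED, "mac"))
```

To try it against the real feed, pass the bytes from requests.get(url).content to first_match in place of SAMPLE_FEED. This makes one request in total instead of one per video, which may matter when the script runs from cron.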

Peyman Majidi answered Jan 01 '23 06:01
