Extract individual links from a single youtube playlist link using python

I need a Python script that takes a link to a single YouTube playlist and returns a list of links to the individual videos in that playlist.

I realize the same question was asked a few years ago, but it was asked for Python 2.x and the code in those answers doesn't work reliably. It works most of the time but occasionally gives empty output (maybe some of the packages used there have been updated, I don't know). I've included one of those snippets below.

If you don't believe me, run this code several times: most of the time it breaks the playlist down correctly, but once in a while you'll get an empty list.

from bs4 import BeautifulSoup as bs
import requests

r = requests.get('https://www.youtube.com/playlist?list=PL3D7BFF1DDBDAAFE5')
page = r.text
soup=bs(page,'html.parser')
res=soup.find_all('a',{'class':'pl-video-title-link'})
for l in res:
    print(l.get("href"))

For some playlists the code doesn't work at all.

Also, if BeautifulSoup can't do the job, any other popular Python library will do.

asked Jun 12 '20 13:06 by theCursedPirate


1 Answer

It seems YouTube serves different versions of the page. Sometimes the HTML is organized as you expected, with links carrying the pl-video-title-link class:

<td class="pl-video-title">
   <a class="pl-video-title-link yt-uix-tile-link yt-uix-sessionlink  spf-link " dir="ltr" href="/watch?v=GtWXOzsD5Fw&amp;list=PL3D7BFF1DDBDAAFE5&amp;index=101&amp;t=0s" data-sessionlink="ei=TJbjXtC8NYri0wWCxarQDQ&amp;feature=plpp_video&amp;ved=CGoQxjQYYyITCNCSmqHD_OkCFQrxtAodgqIK2ij6LA">
   Android Application Development Tutorial - 105 - Spinners and ArrayAdapter
   </a>
   <div class="pl-video-owner">
      de <a href="/user/thenewboston" class=" yt-uix-sessionlink      spf-link " data-sessionlink="ei=TJbjXtC8NYri0wWCxarQDQ&amp;feature=playlist&amp;ved=CGoQxjQYYyITCNCSmqHD_OkCFQrxtAodgqIK2ij6LA" >thenewboston</a>
   </div>
   <div class="pl-video-bottom-standalone-badge">
   </div>
</td>

Sometimes the data is embedded in a JavaScript variable and loaded dynamically:

window["ytInitialData"] = { .... very big json here .... };

For the second version, you will need a regex to pull the data out of the JavaScript, unless you want to use a tool like Selenium to grab the content after the page has loaded.
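
As a rough illustration of the regex route, the sketch below locates the `window["ytInitialData"] = ` assignment and lets the `json` module parse the object that follows. The helper names `extract_initial_data` and `find_video_ids` are made up for this example, the sample page is an invented miniature, and the assumption that video IDs sit under keys named `videoId` somewhere in the JSON is exactly that, an assumption about a page layout that YouTube can change at any time.

```python
import json
import re

# Matches the assignment shown above, up to the opening brace of the JSON.
ASSIGNMENT = re.compile(r'window\["ytInitialData"\]\s*=\s*')


def extract_initial_data(html):
    """Return the embedded object as a dict, or None if the page lacks it."""
    match = ASSIGNMENT.search(html)
    if match is None:
        return None
    # raw_decode parses one JSON value starting at the given index and
    # ignores the text after it, so no regex is needed for the closing brace.
    data, _end = json.JSONDecoder().raw_decode(html, match.end())
    return data


def find_video_ids(node, found=None):
    """Collect strings stored under a "videoId" key (assumed key name),
    in document order and without duplicates."""
    if found is None:
        found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "videoId" and isinstance(value, str):
                if value not in found:
                    found.append(value)
            else:
                find_video_ids(value, found)
    elif isinstance(node, list):
        for value in node:
            find_video_ids(value, found)
    return found


# Invented miniature page; the real JSON is far larger and shaped differently.
sample_page = '''<script>window["ytInitialData"] = {"contents": [
  {"item": {"videoId": "SUOWNXGRc6g", "title": "a"}},
  {"item": {"videoId": "857zrsYZKGo", "title": "b"}}
]};</script>'''

video_ids = find_video_ids(extract_initial_data(sample_page))
print(video_ids)
```

Returning `None` when the variable is missing makes it easy to tell which version of the page was served, which is the situation that produces the occasional empty list in the question.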

The best way is to use the official API, which makes fetching the playlist items straightforward:

  • Go to the Google Developer Console, search for YouTube Data API, and enable YouTube Data API v3

  • Click on Create Credentials / YouTube Data API v3 / Public data

  • Alternatively, to create the credentials, go to Credentials / Create Credentials / API key

  • Install the Google API client for Python:

      pip3 install --upgrade google-api-python-client
    

Use the API key in the script below. The script fetches the items of the playlist with ID PL3D7BFF1DDBDAAFE5, uses pagination to get all of them, and rebuilds each link from the videoId and the playlist ID:

import googleapiclient.discovery
from urllib.parse import parse_qs, urlparse

# Extract the playlist ID from the "list" query parameter of the URL.
url = 'https://www.youtube.com/playlist?list=PL3D7BFF1DDBDAAFE5'
query = parse_qs(urlparse(url).query, keep_blank_values=True)
playlist_id = query["list"][0]

print(f'get all playlist items links from {playlist_id}')
youtube = googleapiclient.discovery.build("youtube", "v3", developerKey="YOUR_API_KEY")

# Request up to 50 items per page.
request = youtube.playlistItems().list(
    part="snippet",
    playlistId=playlist_id,
    maxResults=50
)

# Execute each page exactly once; list_next() returns None after the last page.
playlist_items = []
while request is not None:
    response = request.execute()
    playlist_items += response["items"]
    request = youtube.playlistItems().list_next(request, response)

print(f"total: {len(playlist_items)}")
print([
    f'https://www.youtube.com/watch?v={t["snippet"]["resourceId"]["videoId"]}&list={playlist_id}&t=0s'
    for t in playlist_items
])

Output:

get all playlist items links from PL3D7BFF1DDBDAAFE5
total: 195
[
    'https://www.youtube.com/watch?v=SUOWNXGRc6g&list=PL3D7BFF1DDBDAAFE5&t=0s', 
    'https://www.youtube.com/watch?v=857zrsYZKGo&list=PL3D7BFF1DDBDAAFE5&t=0s', 
    'https://www.youtube.com/watch?v=Da1jlmwuW_w&list=PL3D7BFF1DDBDAAFE5&t=0s',
    ...........
    'https://www.youtube.com/watch?v=1j4prh3NAZE&list=PL3D7BFF1DDBDAAFE5&t=0s', 
    'https://www.youtube.com/watch?v=s9ryE6GwhmA&list=PL3D7BFF1DDBDAAFE5&t=0s'
]
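
For readers who want to see what the pagination amounts to, here is a minimal offline sketch that follows `nextPageToken` by hand instead of calling `list_next()`. The names `playlist_id_from_url`, `iter_video_ids`, `watch_url`, and `FAKE_PAGES` are hypothetical, and `fetch_page` is a stand-in callable that returns one response dict per page token; the fake pages only imitate the `items` / `snippet` / `resourceId` / `videoId` fields used in the script above.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def playlist_id_from_url(url):
    """Return the "list" query parameter, or raise ValueError if absent."""
    query = parse_qs(urlparse(url).query)
    if "list" not in query:
        raise ValueError(f"no 'list' parameter in {url!r}")
    return query["list"][0]


def iter_video_ids(fetch_page):
    """Yield the video ID of every item on every page.

    fetch_page(token) is a hypothetical callable returning one response
    dict; token is None for the first page.
    """
    token = None
    while True:
        response = fetch_page(token)
        for item in response.get("items", []):
            yield item["snippet"]["resourceId"]["videoId"]
        # A response without nextPageToken is the last page.
        token = response.get("nextPageToken")
        if token is None:
            break


def watch_url(video_id, playlist_id):
    """Rebuild a watch link from a video ID and a playlist ID."""
    params = urlencode({"v": video_id, "list": playlist_id})
    return "https://www.youtube.com/watch?" + params


def _item(video_id):
    # Smallest item shape this sketch needs (fake data, not an API response).
    return {"snippet": {"resourceId": {"videoId": video_id}}}


# Two fake pages keyed by page token, standing in for the API.
FAKE_PAGES = {
    None: {"items": [_item("SUOWNXGRc6g"), _item("857zrsYZKGo")],
           "nextPageToken": "PAGE2"},
    "PAGE2": {"items": [_item("Da1jlmwuW_w")]},
}

playlist_id = playlist_id_from_url(
    "https://www.youtube.com/playlist?list=PL3D7BFF1DDBDAAFE5")
links = [watch_url(v, playlist_id)
         for v in iter_video_ids(FAKE_PAGES.__getitem__)]
print(links)
```

With the real client, `fetch_page` would presumably wrap a `youtube.playlistItems().list(...)` call that passes the token as `pageToken` and then calls `execute()`; that wiring is left out here so the sketch runs without an API key.
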
answered Sep 30 '22 14:09 by Bertrand Martel