I would like to know how to extract a YouTube video ID or playlist ID, depending on the URL, using a single regular expression. The regex should also ensure that the domain is youtube.com. Here are some of the results I need:
Extract Playlist ID For
https://www.youtube.com/playlist?list=PLuC2HflhhpLGQ4RgqA76_Gv52fGA0909r
www.youtube.com/playlist?list=PLuC2HflhhpLGQ4RgqA76_Gv52fGA0909r
http://www.youtube.com/playlist?list=PLuC2HflhhpLGQ4RgqA76_Gv52fGA0909r
https://www.youtube.com/embed/videoseries?list=PLuC2HflhhpLGQ4RgqA76_Gv52fGA0909r
Extract Video ID For
https://www.youtube.com/watch?v=fqMfRi2gJok&index=1&list=PLuC2HflhhpLGQ4RgqA76_Gv52fGA0909r
https://www.youtube.com/watch?v=fqMfRi2gJok
http://youtu.be/cCnrX1w5luM
http://youtube.com/embed/cCnrX1w5luM
http://youtube.com/v/cCnrX1w5luM
https://www.youtube.com/v/cCnrX1w5luM
www.youtube.com/v/cCnrX1w5luM
youtube.com/v/cCnrX1w5luM
These are example URLs only. I need to extract the respective IDs for all possible YouTube link structures.
In short: extract the video ID and, if it is absent, obtain the playlist ID.
Your problem explicitly has two patterns.
The first:
^.*?(?:v|list)=(.*?)(?:&|$)
For any URL that has an explicit attribute, that is, one that contains an = symbol.
Explanation
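^.*?(?:v|list)=
: Any string up to v= or list=. Within the alternation v is tried before list, but because .*? is lazy, whichever of the two appears first in the URL is the one that matches,
(.*?)(?:&|$)
: Any string terminated by an & symbol or by the end-of-line anchor $, where & is tried before $.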
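As a rough illustration of the first pattern, here is a minimal Python sketch. The names QUERY_ID and id_from_query are made up for this example and are not part of the answer; the regex itself is copied unchanged.

```python
import re

# First pattern from the answer, unchanged. Group 1 is the value after v= or list=.
QUERY_ID = re.compile(r"^.*?(?:v|list)=(.*?)(?:&|$)")

def id_from_query(url):
    """Return the value of the first v= or list= attribute, or None if there is none."""
    m = QUERY_ID.match(url)
    return m.group(1) if m else None

print(id_from_query("https://www.youtube.com/watch?v=fqMfRi2gJok&index=1&list=PLuC2HflhhpLGQ4RgqA76_Gv52fGA0909r"))
print(id_from_query("https://www.youtube.com/playlist?list=PLuC2HflhhpLGQ4RgqA76_Gv52fGA0909r"))
print(id_from_query("http://youtu.be/cCnrX1w5luM"))  # no "=" in the URL, so no match
```

Note that this pattern alone does not check the domain, and it returns whichever of v= or list= comes first in the URL.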
The second:
^(?:(?!=).)*\/(.*)$
For any URL that has no attribute, that is, one with no = symbol.
Explanation
^(?:(?!=).)*\/
: Any string that contains no = symbol (enforced by the negative lookahead (?!=)) up to a / symbol,
(.*)$
: Any string up to the end of the line.
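A small Python sketch of the second pattern follows. PATH_ID and id_from_path are hypothetical names for this example. One thing to be aware of: the lookahead only guards the part before the slash, so on its own the pattern would also match a URL such as .../watch?v=... and capture everything after the last slash that precedes the = symbol. The sketch therefore adds an explicit check for "=", which is an assumption of this example and not part of the answer.

```python
import re

# Second pattern from the answer, unchanged. Group 1 is everything after the chosen slash.
PATH_ID = re.compile(r"^(?:(?!=).)*\/(.*)$")

def id_from_path(url):
    """Return the last path segment of a URL that has no '=' in it, else None."""
    # Guard added for this sketch: the regex alone does not reject '=' after the slash.
    if "=" in url:
        return None
    m = PATH_ID.match(url)
    return m.group(1) if m else None

print(id_from_path("http://youtu.be/cCnrX1w5luM"))
print(id_from_path("youtube.com/v/cCnrX1w5luM"))
print(id_from_path("https://www.youtube.com/watch?v=fqMfRi2gJok"))  # rejected by the guard
```

When the two patterns are joined with | and the attribute pattern comes first, this guard is not needed, because a URL containing v= or list= is consumed by the first alternative.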
Combining them into one regex, we get
^(?:https?:\/\/)?(?:www\.)?youtu\.?be(?:\.com)?.*?(?:v|list)=(.*?)(?:&|$)|^(?:https?:\/\/)?(?:www\.)?youtu\.?be(?:\.com)?(?:(?!=).)*\/(.*)$
Here,
(?:https?:\/\/)?(?:www\.)?youtu\.?be(?:\.com)?
is added to handle the various forms of the www.youtube.com URL (including youtu.be), and this should give you what you want.
see: DEMO
IMPORTANT NOTE: In this question, the asker wants to extract the ID from www.youtube.com and prefers the video ID over the playlist ID.
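To show how the combined regex might be used, here is a minimal Python sketch. COMBINED and extract_id are names invented for this example. Since the regex has two alternatives, the ID ends up in group 1 for attribute-style URLs and in group 2 for path-style URLs, so the helper returns whichever group took part in the match.

```python
import re

# Combined regex from the answer, split across two string literals for readability only.
COMBINED = re.compile(
    r"^(?:https?:\/\/)?(?:www\.)?youtu\.?be(?:\.com)?.*?(?:v|list)=(.*?)(?:&|$)"
    r"|^(?:https?:\/\/)?(?:www\.)?youtu\.?be(?:\.com)?(?:(?!=).)*\/(.*)$"
)

def extract_id(url):
    """Return the video or playlist ID from a YouTube URL, or None if the URL does not match."""
    m = COMBINED.match(url)
    if m is None:
        return None
    # Group 1: attribute-style URL (v= or list=). Group 2: path-style URL.
    return m.group(1) if m.group(1) is not None else m.group(2)

for url in [
    "https://www.youtube.com/watch?v=fqMfRi2gJok&index=1&list=PLuC2HflhhpLGQ4RgqA76_Gv52fGA0909r",
    "www.youtube.com/playlist?list=PLuC2HflhhpLGQ4RgqA76_Gv52fGA0909r",
    "http://youtu.be/cCnrX1w5luM",
    "http://example.com/v/abc",
]:
    print(url, "->", extract_id(url))
```

The helper does not say whether the result is a video ID or a playlist ID; if that matters, the two alternatives would have to be matched separately.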
https://regex101.com/r/mI3qY9/4
This regex assumes you are giving it a legitimate YouTube link. It grabs all the v and list values together:
/(?:(?:\?|&)(?:v|list)=|embed\/|v\/|youtu\.be\/)((?!videoseries)[a-zA-Z0-9_]*)/g
Breakdown:
/
(?: //non-capturing group
(?:\?|&)(?:v|list)= //? or & followed by v= or list=
| //or
embed\/ //embed/
| //or
v\/ //v/
| //or
youtu\.be\/ //youtu.be/
)
(
(?!videoseries) //will not capture "videoseries"
[a-zA-Z0-9_]* //capture any letters, digits, or underscores that follow
)
/g //global
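For illustration, the global match can be reproduced in Python with re.findall, which returns the captured group of every match. ANY_ID and all_ids are names made up for this sketch; the pattern is the one broken down above, without the /.../g delimiters.

```python
import re

# Same pattern as above; re.findall plays the role of the g (global) flag.
ANY_ID = re.compile(
    r"(?:(?:\?|&)(?:v|list)=|embed\/|v\/|youtu\.be\/)((?!videoseries)[a-zA-Z0-9_]*)"
)

def all_ids(url):
    """Return every v or list value found in the URL, in order of appearance."""
    return ANY_ID.findall(url)

print(all_ids("https://www.youtube.com/watch?v=fqMfRi2gJok&index=1&list=PLuC2HflhhpLGQ4RgqA76_Gv52fGA0909r"))
print(all_ids("https://www.youtube.com/embed/videoseries?list=PLuC2HflhhpLGQ4RgqA76_Gv52fGA0909r"))
```

The character class is kept exactly as in the answer. YouTube IDs can also contain hyphens, so an ID with a hyphen would be cut short at that character.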
But you may not be able to tell which is v and which is list, so:
This only grabs the v:
/(?:(?:\?|&)v=|embed\/|v\/|youtu\.be\/)((?!videoseries)[a-zA-Z0-9_]*)/g
This only grabs the list:
/(?:(?:\?|&)list=)((?!videoseries)[a-zA-Z0-9_]*)/g
This only grabs v from YouTube URLs:
/(?:youtube\.com.*(?:\?|&)(?:v)=|youtube\.com.*embed\/|youtube\.com.*v\/|youtu\.be\/)((?!videoseries)[a-zA-Z0-9_]*)/g
This only grabs list from YouTube URLs:
/(?:youtube\.com.*(?:\?|&)(?:list)=)((?!videoseries)[a-zA-Z0-9_]*)/g
These are basically the same, but with youtube\.com.* added to the regex. They won't grab, e.g., http://example.com/v/abc
https://regex101.com/r/mI3qY9/5
Explanation:
youtube\.com.* //Matches youtube.com followed by any number of characters
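Putting the two YouTube-only patterns together, one possible way to get "the video ID, else the playlist ID" is sketched below in Python. VIDEO, PLAYLIST, and classify are hypothetical names for this example; the patterns are copied from above without the /.../g delimiters.

```python
import re

# YouTube-only patterns from the answer, unchanged.
VIDEO = re.compile(
    r"(?:youtube\.com.*(?:\?|&)(?:v)=|youtube\.com.*embed\/|youtube\.com.*v\/|youtu\.be\/)"
    r"((?!videoseries)[a-zA-Z0-9_]*)"
)
PLAYLIST = re.compile(
    r"(?:youtube\.com.*(?:\?|&)(?:list)=)((?!videoseries)[a-zA-Z0-9_]*)"
)

def classify(url):
    """Return ("video", id) if a video ID is present, else ("playlist", id), else None."""
    m = VIDEO.search(url)
    if m and m.group(1):  # an empty capture is treated as "no ID"
        return ("video", m.group(1))
    m = PLAYLIST.search(url)
    if m and m.group(1):
        return ("playlist", m.group(1))
    return None

print(classify("https://www.youtube.com/watch?v=fqMfRi2gJok&index=1&list=PLuC2HflhhpLGQ4RgqA76_Gv52fGA0909r"))
print(classify("https://www.youtube.com/embed/videoseries?list=PLuC2HflhhpLGQ4RgqA76_Gv52fGA0909r"))
print(classify("http://example.com/v/abc"))
```

Trying the video pattern first mirrors the preference stated in the question, and keeping the patterns separate makes it possible to tell which kind of ID was found.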