
RegEx pattern to get the YouTube video ID from any YouTube URL

Let's take these URLs as an example:

  1. http://www.youtube.com/watch?v=8GqqjVXhfMU&feature=youtube_gdata_player
  2. http://www.youtube.com/watch?v=8GqqjVXhfMU

This PHP function fails to extract the ID in case 1, but works in case 2. Case 1 is very common: ANYTHING can follow the YouTube ID.

/**
 * get YouTube video ID from URL
 *
 * @param string $url
 * @return string|false YouTube video ID, or FALSE if none found.
 */
function youtube_id_from_url($url) {
    $pattern = 
        '%^# Match any YouTube URL
        (?:https?://)?  # Optional scheme. Either http or https
        (?:www\.)?      # Optional www subdomain
        (?:             # Group host alternatives
          youtu\.be/    # Either youtu.be,
        | youtube\.com  # or youtube.com
          (?:           # Group path alternatives
            /embed/     # Either /embed/
          | /v/         # or /v/
          | /watch\?v=  # or /watch\?v=
          )             # End path alternatives.
        )               # End host alternatives.
        ([\w-]{10,12})  # Allow 10-12 for 11 char YouTube id.
        $%x'
        ;
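    // preg_match() returns 1 on a match, 0 on no match, and false on error,
    // so only a result of 1 guarantees that $matches[1] is set.
    $result = preg_match($pattern, $url, $matches);
    if (1 === $result) {
        return $matches[1];
    }
    return false;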
}

What I'm thinking is that there must be a way to just look for the "v=", wherever it sits in the URL, and take the characters after it. That way, no complex RegEx would be needed. Is this off base? Any ideas for starting points?
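
The "just look for v=" idea can be sketched without a large pattern by letting PHP split the URL first. This is only a minimal sketch under stated assumptions: the function name youtube_id_from_query is hypothetical, the host is not checked, and it covers only URLs that carry the ID in a v query parameter (so not youtu.be/ or /embed/ links). The 10-12 character check is borrowed from the pattern above.

```php
<?php
/**
 * Hypothetical helper (not from the original post): read the "v" query
 * parameter instead of matching the whole URL with a single regex.
 *
 * @param string $url
 * @return string|false The value of "v" if it looks like a video ID, else false.
 */
function youtube_id_from_query($url) {
    // parse_url() yields null when there is no query string
    // and false when the URL is seriously malformed.
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return false;
    }

    // parse_str() decodes the query string into an associative array.
    parse_str($query, $params);
    if (!isset($params['v']) || !is_string($params['v'])) {
        return false;
    }

    // Assumption: an ID is 10-12 characters of letters, digits, "_" or "-".
    if (preg_match('/^[\w-]{10,12}$/D', $params['v']) !== 1) {
        return false;
    }
    return $params['v'];
}
```

Because the query string is parsed rather than pattern-matched, it does not matter whether v comes first or whether other parameters such as feature=youtube_gdata_player follow it.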

Shackrock asked Mar 07 '12 02:03



2 Answers

if (preg_match('/youtube\.com\/watch\?v=([^\&\?\/]+)/', $url, $id)) {
  $values = $id[1];
} else if (preg_match('/youtube\.com\/embed\/([^\&\?\/]+)/', $url, $id)) {
  $values = $id[1];
} else if (preg_match('/youtube\.com\/v\/([^\&\?\/]+)/', $url, $id)) {
  $values = $id[1];
} else if (preg_match('/youtu\.be\/([^\&\?\/]+)/', $url, $id)) {
  $values = $id[1];
} else if (preg_match('/youtube\.com\/verify_age\?next_url=\/watch%3Fv%3D([^\&\?\/]+)/', $url, $id)) {
  $values = $id[1];
} else {
  // not a YouTube video; set $values explicitly so it is never undefined
  $values = false;
}

This is what I use to extract the ID from a YouTube URL. I think it works in all common cases.

Note that at the end, $values holds the ID of the video.

user1236048 answered Sep 20 '22 21:09


I have used the following patterns because YouTube has a youtube-nocookie.com domain too:

'@youtube(?:-nocookie)?\.com/watch[#\?].*?v=([^"\& ]+)@i',
'@youtube(?:-nocookie)?\.com/embed/([^"\&\? ]+)@i',
'@youtube(?:-nocookie)?\.com/v/([^"\&\? ]+)@i',
'@youtube(?:-nocookie)?\.com/\?v=([^"\& ]+)@i',
'@youtu\.be/([^"\&\? ]+)@i',
'@gdata\.youtube\.com/feeds/api/videos/([^"\&\? ]+)@i',
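
One way such a list might be applied is to try each pattern in order and return the first captured ID. The sketch below is an assumption about usage, not something stated in this answer: the function name youtube_id_from_patterns and the loop are hypothetical, while the patterns themselves are copied unchanged from the list above.

```php
<?php
/**
 * Hypothetical wrapper: try each pattern in turn, return the first match.
 *
 * @param string $url
 * @return string|false The captured video ID, or false if no pattern matches.
 */
function youtube_id_from_patterns($url) {
    $patterns = array(
        '@youtube(?:-nocookie)?\.com/watch[#\?].*?v=([^"\& ]+)@i',
        '@youtube(?:-nocookie)?\.com/embed/([^"\&\? ]+)@i',
        '@youtube(?:-nocookie)?\.com/v/([^"\&\? ]+)@i',
        '@youtube(?:-nocookie)?\.com/\?v=([^"\& ]+)@i',
        '@youtu\.be/([^"\&\? ]+)@i',
        '@gdata\.youtube\.com/feeds/api/videos/([^"\&\? ]+)@i',
    );

    foreach ($patterns as $pattern) {
        // preg_match() returns 1 only when the pattern matched,
        // which is the only case where $matches[1] is guaranteed to be set.
        if (preg_match($pattern, $url, $matches) === 1) {
            return $matches[1];
        }
    }
    return false;
}
```

The capture groups stop at the first &, ?, space or double quote, so trailing parameters after the ID are dropped without needing an end-of-string anchor.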

In your case, that only means extending the existing expressions with an optional (?:-nocookie)? for the regular YouTube.com URL, like so:

if (preg_match('/youtube(?:-nocookie)?\.com\/watch\?v=([^\&\?\/]+)/', $url, $id)) {

If you remove the final $ from your proposed expression, it should work as you intended, because the pattern then no longer requires the URL to end right after the ID. I added the -nocookie as well.

/**
 * get YouTube video ID from URL
 *
 * @param string $url
 * @return string|false YouTube video ID, or FALSE if none found.
 */
function youtube_id_from_url($url) {
    $pattern = 
        '%^# Match any YouTube URL
        (?:https?://)?  # Optional scheme. Either http or https
        (?:www\.)?      # Optional www subdomain
        (?:             # Group host alternatives
          youtu\.be/    # Either youtu.be,
        | youtube(?:-nocookie)?\.com  # or youtube.com / youtube-nocookie.com
          (?:           # Group path alternatives
            /embed/     # Either /embed/
          | /v/         # or /v/
          | /watch\?v=  # or /watch\?v=
          )             # End path alternatives.
        )               # End host alternatives.
        ([\w-]{10,12})  # Allow 10-12 for 11 char YouTube id.
        %x'
        ;
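    // preg_match() returns 1 on a match, 0 on no match, and false on error,
    // so only a result of 1 guarantees that $matches[1] is set.
    $result = preg_match($pattern, $url, $matches);
    if (1 === $result) {
        return $matches[1];
    }
    return false;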
}

teezee answered Sep 21 '22 21:09