parse youtube video id using preg_match [duplicate]

I am attempting to parse the video ID out of a YouTube URL using preg_match. I found a regular expression on this site that appears to work:

(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+

My PHP is as follows, but it doesn't work (gives Unknown modifier '[' error)...

<?
 $subject = "http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1";

 preg_match("(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+", $subject, $matches);

 print "<pre>";
 print_r($matches);
 print "</pre>";

?>

Cheers

J.C asked May 29 '10 20:05



8 Answers

This regex grabs the ID from all of the URL variants I could find. There may be more out there, but I couldn't find any reference to them. If you come across a URL this doesn't match, please leave a comment with it and I'll try to update the regex.

if (preg_match('%(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/\s]{11})%i', $url, $match)) {
    $video_id = $match[1];
}

Here is a sample of the URLs this regex matches (any content after the URL shown is ignored):

  • http://youtu.be/dQw4w9WgXcQ ...
  • http://www.youtube.com/embed/dQw4w9WgXcQ ...
  • http://www.youtube.com/watch?v=dQw4w9WgXcQ ...
  • http://www.youtube.com/?v=dQw4w9WgXcQ ...
  • http://www.youtube.com/v/dQw4w9WgXcQ ...
  • http://www.youtube.com/e/dQw4w9WgXcQ ...
  • http://www.youtube.com/user/username#p/u/11/dQw4w9WgXcQ ...
  • http://www.youtube.com/sandalsResorts#p/c/54B8C800269D7C1B/0/dQw4w9WgXcQ ...
  • http://www.youtube.com/watch?feature=player_embedded&v=dQw4w9WgXcQ ...
  • http://www.youtube.com/?feature=player_embedded&v=dQw4w9WgXcQ ...

It also works on youtube-nocookie.com URLs in all of the forms above.

It will also pull the ID from the URL inside an embed code (both iframe and object tags).
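As a rough sketch of how the pattern above might be packaged for reuse, the snippet below wraps it in a helper and runs it over a few of the sample URLs. The function name extract_youtube_id is an assumption made for illustration, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper name; the pattern is copied unchanged from this answer.
function extract_youtube_id(string $url): ?string
{
    $pattern = '%(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/\s]{11})%i';

    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $url, $match) === 1 ? $match[1] : null;
}

$samples = [
    'http://youtu.be/dQw4w9WgXcQ',
    'http://www.youtube.com/embed/dQw4w9WgXcQ',
    'http://www.youtube.com/watch?feature=player_embedded&v=dQw4w9WgXcQ',
    'http://www.youtube.com/user/username#p/u/11/dQw4w9WgXcQ',
];

foreach ($samples as $url) {
    var_dump(extract_youtube_id($url));
}
```

Returning null instead of leaving $video_id undefined makes the "no ID found" case explicit for the caller.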

Benjam answered Oct 07 '22 19:10


It is better to use parse_url and parse_str to parse the URL and its query string; the video ID is then available as $query['v']:

$subject = "http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1";
$url = parse_url($subject);
parse_str($url['query'], $query);
var_dump($query);
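
A slightly more defensive sketch of the same idea follows. The helper name youtube_id_from_query is hypothetical; it only covers URLs that carry the ID in a v query parameter, so short youtu.be links and /embed/ paths would need separate handling.

```php
<?php
// Hypothetical helper: reads the "v" query parameter, if there is one.
function youtube_id_from_query(string $url): ?string
{
    // With PHP_URL_QUERY, parse_url returns the query string,
    // null when the URL has none, or false for a malformed URL.
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;
    }

    parse_str($query, $params);

    // Guard against a missing "v" and against array input such as v[]=x.
    return isset($params['v']) && is_string($params['v']) ? $params['v'] : null;
}

var_dump(youtube_id_from_query('http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1'));
var_dump(youtube_id_from_query('http://youtu.be/z_AbfPXTKms'));
```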

Gumbo answered Oct 07 '22 17:10


I had to deal with this for a PHP class I wrote a few weeks ago and ended up with a regex that handles all of these kinds of strings: with or without a URL scheme, with or without the www subdomain, youtube.com URLs, youtu.be URLs, and any ordering of query parameters. You can check it out on GitHub or simply copy and paste the code block below:

/**
 *  Check if input string is a valid YouTube URL
 *  and try to extract the YouTube Video ID from it.
 *  @author  Stephan Schmitz <[email protected]>
 *  @param   $url   string   The string that shall be checked.
 *  @return  mixed           Returns YouTube Video ID, or (boolean) false.
 */        
function parse_yturl($url) 
{
    $pattern = '#^(?:https?://)?(?:www\.)?(?:youtu\.be/|youtube\.com(?:/embed/|/v/|/watch\?v=|/watch\?.+&v=))([\w-]{11})(?:.+)?$#x';
    preg_match($pattern, $url, $matches);
    return (isset($matches[1])) ? $matches[1] : false;
}

To explain the regex, here's a split-up version:

/**
 *  Check if input string is a valid YouTube URL
 *  and try to extract the YouTube Video ID from it.
 *  @author  Stephan Schmitz <[email protected]>
 *  @param   $url   string   The string that shall be checked.
 *  @return  mixed           Returns YouTube Video ID, or (boolean) false.
 */        
function parse_yturl($url) 
{
    $pattern = '#^(?:https?://)?';    # Optional URL scheme. Either http or https.
    $pattern .= '(?:www\.)?';         #  Optional www subdomain.
    $pattern .= '(?:';                #  Group host alternatives:
    $pattern .=   'youtu\.be/';       #    Either youtu.be,
    $pattern .=   '|youtube\.com';    #    or youtube.com
    $pattern .=   '(?:';              #    Group path alternatives:
    $pattern .=     '/embed/';        #      Either /embed/,
    $pattern .=     '|/v/';           #      or /v/,
    $pattern .=     '|/watch\?v=';    #      or /watch?v=,    
    $pattern .=     '|/watch\?.+&v='; #      or /watch?other_param&v=
    $pattern .=   ')';                #    End path alternatives.
    $pattern .= ')';                  #  End host alternatives.
    $pattern .= '([\w-]{11})';        # 11 characters (Length of Youtube video ids).
    $pattern .= '(?:.+)?$#x';         # Optional other ending URL parameters.
    preg_match($pattern, $url, $matches);
    return (isset($matches[1])) ? $matches[1] : false;
}

eyecatchUp answered Oct 07 '22 19:10


I refined the regex from the top answer. It still grabs the ID from all of the various URL forms, but it is stricter about what it accepts.

if (preg_match('%(?:youtube(?:-nocookie)?\.com/(?:[\w\-?&!#=,;]+/[\w\-?&!#=/,;]+/|(?:v|e(?:mbed)?)/|[\w\-?&!#=,;]*[?&]v=)|youtu\.be/)([\w-]{11})(?:[^\w-]|\Z)%i', $url, $match)) {
    $video_id = $match[1];
}

It also correctly rejects invalid IDs that are longer than 11 characters, for example:

http://www.youtube.com/watch?v=0zM3nApSvMgDw3qlxF
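
To illustrate the difference, here is a small sketch that runs the stricter pattern against a valid URL and against the over-long ID above. The wrapper name strict_youtube_id is an assumption made for this example.

```php
<?php
// Hypothetical wrapper around the stricter pattern from this answer.
function strict_youtube_id(string $url): ?string
{
    $pattern = '%(?:youtube(?:-nocookie)?\.com/(?:[\w\-?&!#=,;]+/[\w\-?&!#=/,;]+/|(?:v|e(?:mbed)?)/|[\w\-?&!#=,;]*[?&]v=)|youtu\.be/)([\w-]{11})(?:[^\w-]|\Z)%i';

    return preg_match($pattern, $url, $match) === 1 ? $match[1] : null;
}

// Exactly 11 ID characters, followed by "&": accepted.
var_dump(strict_youtube_id('http://www.youtube.com/watch?v=dQw4w9WgXcQ&feature=related'));

// 18 ID characters: the 12th character is still an ID character,
// so the trailing (?:[^\w-]|\Z) check fails and nothing is returned.
var_dump(strict_youtube_id('http://www.youtube.com/watch?v=0zM3nApSvMgDw3qlxF'));
```

A pattern that does not check what follows the 11 characters would instead return the first 11 characters of the over-long ID.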

Modder answered Oct 07 '22 18:10


Use

 preg_match("#(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+#", $subject, $matches);

Dogbert answered Oct 07 '22 19:10


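The pattern needs delimiters. With # as the delimiter, escaping the slash is not strictly required (it only matters when / itself is the delimiter), but it is harmless. So this one should do the job: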

preg_match("#(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]\/)[^&\n]+|(?<=v=)[^&\n]+#", $subject, $matches);

Novan Adrian answered Oct 07 '22 18:10


This also parses the start-time parameter, for use in a BBcode tag (see https://developers.google.com/youtube/player_parameters#start).

example: [yt]http://www.youtube.com/watch?v=G059ou-7wmo#t=58[/yt]

PHP regex:

'#\[yt\]https?://(?:[0-9A-Z-]+\.)?(?:youtu\.be/|youtube\.com(?:/embed/|/v/|/watch\?v=|/ytscreeningroom\?v=|/feeds/api/videos/|/user\S*[^\w\-\s]|\S*[^\w\-\s]))([\w\-]{11})[?=\#&+%\w-]*(t=(\d+))?\[/yt\]#Uim'

replace:

'<iframe id="ytplayer" type="text/html" width="639" height="360" src="http://www.youtube.com/embed/$1?rel=0&vq=hd1080&start=$3" frameborder="0" allowfullscreen></iframe>'
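
As a hedged sketch of how the pattern and replacement might be combined with preg_replace: the function name render_yt_bbcode is hypothetical, and ~ is used as the delimiter here so that the literal # inside the character class cannot be mistaken for the end of the pattern.

```php
<?php
// Hypothetical helper: turns [yt]...[/yt] tags into an iframe embed.
function render_yt_bbcode(string $text): string
{
    // Same pattern as above, with ~ as the delimiter instead of #.
    // The U modifier makes quantifiers lazy, which is what lets the
    // optional (t=(\d+)) group capture the start time.
    $pattern = '~\[yt\]https?://(?:[0-9A-Z-]+\.)?(?:youtu\.be/|youtube\.com(?:/embed/|/v/|/watch\?v=|/ytscreeningroom\?v=|/feeds/api/videos/|/user\S*[^\w\-\s]|\S*[^\w\-\s]))([\w\-]{11})[?=#&+%\w-]*(t=(\d+))?\[/yt\]~Uim';

    // $1 is the video ID, $3 the start time in seconds (empty if absent).
    $replace = '<iframe id="ytplayer" type="text/html" width="639" height="360" src="http://www.youtube.com/embed/$1?rel=0&vq=hd1080&start=$3" frameborder="0" allowfullscreen></iframe>';

    return preg_replace($pattern, $replace, $text);
}

echo render_yt_bbcode('[yt]http://www.youtube.com/watch?v=G059ou-7wmo#t=58[/yt]'), "\n";
```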

Fixer answered Oct 07 '22 18:10


I didn't see anyone directly address the PHP error, so I'll try to explain.

The reason for the "Unknown modifier '['" error is that you forgot to wrap your regex in delimiters. PHP just takes the first character as a delimiter, so long as it's a non-alphanumeric, non-whitespace ASCII character. So in your regex:

preg_match("(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+", $subject, $matches);

PHP thinks you meant ( as an opening delimiter. It then finds what it thinks is your closing delimiter, the next ), and assumes that what follows are pattern modifiers. However, the first "modifier" it finds, the character right after that first ), is [. Since [ is not a valid pattern modifier, you get the error that you do.

The solution is simply to wrap your regex in delimiters and make sure that any occurrence of the delimiter character that you want to match literally inside the regex is escaped. I like to use ~ as the delimiter, because you rarely need to match a literal ~ in a regex.
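
A minimal sketch of the before and after, using the pattern and subject from the question. The variable names are illustrative, and the @ operator is used only to silence the expected warning in this demonstration.

```php
<?php
$subject = 'http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1';

// Without delimiters: "(" and the first ")" are taken as the delimiters,
// so "[" is read as a modifier. preg_match warns and returns false.
$broken = @preg_match('(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+', $subject);
var_dump($broken);

// With ~ delimiters the same pattern compiles. Single quotes pass the
// backslash through, and PCRE itself interprets \n in the character class.
$pattern = '~(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+~';

if (preg_match($pattern, $subject, $matches) === 1) {
    print_r($matches);
}
```

When a pattern is assembled from user-supplied text, preg_quote($text, '~') escapes both regex metacharacters and the chosen delimiter.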

m4olivei answered Oct 07 '22 17:10