PHP Regex HTML - Extract URL

Tags: html, regex, php

I am trying to extract multiple URLs from an HTML file with regex. There are other URLs in the file, so the only pattern I have is "tableentries." and ""

HTML code example:

<tr class="tableentries2">
  <td>
    <a href="http://example.com/all-files/files/00000000789/">Click Here</a>
  </td>

PHP I wrote:

$html = "value of the code above";
// The "/" in "</td>" is escaped because "/" is also the regex delimiter
if (preg_match_all('/<td>.*<\/td>/', $html, $match)) {
    foreach ($match[0] as $x) {
        echo $x . "<br>";
    }
}
Rajesh Muntari asked May 21 '26 15:05


2 Answers

Why not just look for href values? (Updated because the edited code now has quotation marks.)

preg_match_all('/href="([^\s"]+)/', $html, $match);

Then the URIs would be in $match[1], with the first one in $match[1][0].
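If only the links inside the "tableentries" rows are wanted, the same idea can be anchored on the row's class. This is only a rough sketch: the second row and its URL are made up for illustration, and it assumes each matching row contains a link and that the class attribute is double-quoted and comes first in the tag.

```php
<?php
// Sample markup modeled on the question's snippet; the "other" row is invented.
$html = <<<'HTML'
<table>
<tr class="tableentries2">
  <td>
    <a href="http://example.com/all-files/files/00000000789/">Click Here</a>
  </td>
</tr>
<tr class="other">
  <td>
    <a href="http://example.com/unrelated/">Skip me</a>
  </td>
</tr>
</table>
HTML;

// Match a <tr> whose class starts with "tableentries", then lazily skip ahead
// (the s flag lets . cross newlines) to the first href and capture its value.
$pattern = '/<tr\s+class="tableentries[^"]*"[^>]*>.*?href="([^\s"]+)"/s';

preg_match_all($pattern, $html, $match);

// $match[1] holds one captured URL per matching row.
foreach ($match[1] as $url) {
    echo $url, "\n";
}
```

Note that this captures only the first link after each matching row tag, and a matching row without any link would make the lazy match run on into a later row, which is one reason a real parser is the safer choice.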

sdleihssirhc answered May 23 '26 06:05


You really shouldn't use regex to parse HTML. DOMDocument is actually very easy to use for this type of thing. Here is a simple example.

<?php
error_reporting(E_ALL);
$html = "
<table>
    <tr>
        <td>
            <a href='http://www.test1-1.com'>test1-1</a>
        </td>
        <td>
            <a href='http://www.test1-2.com'>test1-2</a>
        </td>
        <td>
            <a href='http://www.test1-3.com'>test1-3</a>
        </td>
    </tr>
    <tr>
        <td>
            <a href='http://www.test2-1.com'>test2-1</a>
        </td>
        <td>
            <a href='http://www.test2-2.com'>test2-2</a>
        </td>
        <td>
            <a href='http://www.test2-3.com'>test2-3</a>
        </td>
    </tr>
</table>";

$DOM = new DOMDocument();
//load the html string into the DOMDocument
$DOM->loadHTML($html);
//get a list of all <A> tags
$a = $DOM->getElementsByTagName('a');
//loop through all <A> tags
foreach($a as $link){
    //echo out the href attribute of the <A> tag.
    echo $link->getAttribute('href').'<br />';
}
?>

This would output:

http://www.test1-1.com
http://www.test1-2.com
http://www.test1-3.com
http://www.test2-1.com
http://www.test2-2.com
http://www.test2-3.com
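To pick up only the links in the question's "tableentries" rows rather than every link, DOMXPath can filter on the class attribute. The following is a sketch under assumptions: the markup is a made-up extension of the question's snippet (the "other" row is invented), and it assumes the class value starts with "tableentries".

```php
<?php
// Hypothetical markup based on the question's snippet; the second row is invented.
$html = <<<'HTML'
<table>
  <tr class="tableentries2">
    <td><a href="http://example.com/all-files/files/00000000789/">Click Here</a></td>
  </tr>
  <tr class="other">
    <td><a href="http://example.com/unrelated/">Skip me</a></td>
  </tr>
</table>
HTML;

$dom = new DOMDocument();
// Collect parser warnings internally instead of printing them for imperfect HTML.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// Select <a> elements that have an href and sit inside a <tr>
// whose class attribute starts with "tableentries".
$links = $xpath->query('//tr[starts-with(@class, "tableentries")]//a[@href]');

$urls = [];
foreach ($links as $link) {
    $urls[] = $link->getAttribute('href');
}

print_r($urls);
```

With this sample input, $urls should contain only the URL from the "tableentries2" row. If the class attribute can list several classes, contains(@class, "tableentries") may be a better fit than starts-with.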
Jonathan Kuhn answered May 23 '26 05:05


