Extract link attributes from string of HTML

Tags:

php

What's the best way to extract the href value of a link from the HTML in $var?

Example of $var:

$var = '<a href="http://stackoverflow.com/">Stack Overflow</a>';

I want:

$var2 = "http://stackoverflow.com/";

One option is preg_match(). What else is there?

alexus asked Jan 15 '10 01:01



2 Answers

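Instead of crafting one long, complicated regex, do it in steps:

$str = '<a href="http://stackoverflow.com/"> Stack Overflow</a>';
// Step 1: strip everything up to and including the opening href="
$str = preg_replace('/.*<a\s+href="/', '', $str);
// Step 2: strip the closing "> and everything after it
print preg_replace('/">.*/', '', $str);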
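
If a single call is preferred after all, preg_match() with a capture group (the function the question mentions) is one option. This is only a minimal sketch, assuming the href value is quoted. The name firstHref is a hypothetical helper, not part of the original answer, and the pattern is not a real HTML parser (it would also match an attribute such as data-href).

```php
<?php
// Hypothetical helper (not from the original answer):
// returns the first quoted href value inside an <a ...> tag, or null if none is found.
function firstHref(string $html): ?string
{
    // <a, whitespace, any other attributes, then href = "value" or href = 'value'.
    // Group 1 is the opening quote, group 2 the value, \1 the matching closing quote.
    $pattern = '/<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1/is';
    if (preg_match($pattern, $html, $m) === 1) {
        return $m[2];
    }
    return null;
}

$var = '<a href="http://stackoverflow.com/">Stack Overflow</a>';
$var2 = firstHref($var);
echo $var2, "\n";
```

Returning null when nothing matches lets the caller tell "no link" apart from an empty href. The value comes back exactly as written in the markup, so entities such as &amp; are not decoded here.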

A non-regex alternative, using explode():

$str = '<a href="http://stackoverflow.com/"> Stack Overflow</a>';
$s = explode('href="',$str);
$t = explode('">',$s[1]);
print $t[0];
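
The explode() approach above assumes the string contains href=" and that href is the last attribute before the closing ">. A slightly more defensive sketch follows. The name hrefByExplode is a hypothetical helper, and it still assumes a double-quoted href value.

```php
<?php
// Hypothetical helper (not from the original answer):
// explode-based extraction that returns null when no href="..." is present.
function hrefByExplode(string $html): ?string
{
    // Split once at the first href=" marker.
    $parts = explode('href="', $html, 2);
    if (count($parts) < 2) {
        return null; // no href=" marker found
    }
    // Cut at the closing quote of the attribute rather than at '">',
    // so attributes that follow href do not end up in the result.
    $value = explode('"', $parts[1], 2);
    return $value[0];
}

echo hrefByExplode('<a href="http://stackoverflow.com/" class="ext">Stack Overflow</a>'), "\n";
var_dump(hrefByExplode('<a name="top">Top</a>'));
```

Checking count($parts) avoids reading an undefined array offset when the input has no href at all.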
ghostdog74 answered Oct 04 '22 00:10



If the string is valid HTML, the loadHTML() method of PHP's DOMDocument class will parse it, and you can then navigate the resulting structure very easily. This is a good way to do it if you have a lot of HTML to work with.

$doc = new DOMDocument();
$doc->loadHTML('<a href="http://stackoverflow.com/">Stack Overflow</a>');
$anchors = $doc->getElementsByTagName('a');
foreach($anchors as $node) {
    echo $node->textContent;
    if ($node->hasAttributes()) {
        foreach($node->attributes as $a) {
            echo ' | '.$a->name.': '.$a->value;
        }
    }
}

This produces the following:

Stack Overflow | href: http://stackoverflow.com/ 
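
If only the href is needed, as in the question, the same DOMDocument approach can read that one attribute directly. This is a minimal sketch, assuming the DOM extension is available. The name extractHrefs is a hypothetical helper, and the libxml_use_internal_errors() call is an addition so that imperfect markup does not emit PHP warnings.

```php
<?php
// Hypothetical helper (not from the original answer):
// collects the href of every <a> element found in $html.
function extractHrefs(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml parse errors internally instead of raising PHP warnings,
    // then discard them and restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($doc->getElementsByTagName('a') as $node) {
        // Skip anchors without an href, such as <a name="top">.
        if ($node->hasAttribute('href')) {
            $hrefs[] = $node->getAttribute('href');
        }
    }
    return $hrefs;
}

$var = '<a href="http://stackoverflow.com/">Stack Overflow</a>';
$var2 = extractHrefs($var)[0] ?? null;
echo $var2, "\n";
```

Because the parser handles quoting and attribute order, this version does not depend on where href appears inside the tag.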
zombat answered Oct 04 '22 00:10
