
PHP XPath: Get all hrefs that contain "letter"

Say I have loaded an HTML file and I run this query:

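$url = 'http://www.fangraphs.com/players.aspx';
$html = file_get_contents($url);
$myDom = new DOMDocument;
$myDom->formatOutput = true;
@$myDom->loadHTML($html);    // @ suppresses warnings from malformed HTML
$xpath = new DOMXPath($myDom);    // must exist before query() is called
$anchor = $xpath->query('//a[contains(@href,"letter")]');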

That gives me a list of anchors that look like the following:

<a href="players.aspx?letter=Aa">Aa</a>

But I need a way to only get "players.aspx?letter=Aa".

I thought I could try:

$anchor = $xpath->query('//a[contains(@href,"letter")]/@href');

But that gives me a PHP error saying it couldn't append the node when I try the following:

$xpath = new DOMXPath($myDom);
$newDom = new DOMDocument;
$j = 0;
while( $myAnchor = $anchor->item($j++) ){
   $node = $newDom->importNode( $myAnchor, true );    // import node
   $newDom->appendChild($node);
}

Any idea how to obtain just the value of the href attributes that the first query selects? Thanks!

sfgiants2010 asked Oct 22 '22 23:10


1 Answer

Use:

//a/@href[contains(., 'letter')]

This selects the href attribute of every a element where the attribute's string value contains the string "letter".
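
To get the plain strings, it should be enough to read each matched attribute node's nodeValue instead of importing it into a second document. An attribute node cannot be appended as a child of a document, which is most likely what triggers the append error in the question. Here is a minimal sketch, assuming a small inline HTML string (hypothetical sample data) in place of the downloaded page:

```php
<?php
// Hypothetical sample markup standing in for the page fetched in the question.
$html = '<html><body>'
      . '<a href="players.aspx?letter=Aa">Aa</a>'
      . '<a href="players.aspx?letter=Ab">Ab</a>'
      . '<a href="about.aspx">About</a>'
      . '</body></html>';

$myDom = new DOMDocument;
// Suppress parser warnings, as real-world HTML is often not well-formed.
@$myDom->loadHTML($html);
$xpath = new DOMXPath($myDom);

// Each match is a DOMAttr node; nodeValue holds the attribute's text.
$hrefs = [];
foreach ($xpath->query("//a/@href[contains(., 'letter')]") as $attr) {
    $hrefs[] = $attr->nodeValue;
}

print_r($hrefs);
```

If you would rather keep the original //a[contains(@href,"letter")] query, calling getAttribute('href') on each returned element should give the same strings.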

Dimitre Novatchev answered Nov 09 '22 13:11