Say I have loaded an HTML file and I run this query:
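$url = 'http://www.fangraphs.com/players.aspx';
$html = file_get_contents($url);
$myDom = new DOMDocument;
$myDom->formatOutput = true;
@$myDom->loadHTML($html); // @ suppresses warnings about malformed HTML
$xpath = new DOMXPath($myDom); // must exist before calling query()
$anchor = $xpath->query('//a[contains(@href,"letter")]');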
That gives me a list of anchors that look like the following:
<a href="players.aspx?letter=Aa">Aa</a>
But I need a way to only get "players.aspx?letter=Aa".
I thought I could try:
$anchor = $xpath->query('//a[contains(@href,"letter")]/@href');
But that gives me a PHP error saying it couldn't append the node when I try the following:
$xpath = new DOMXPath($myDom);
$newDom = new DOMDocument;
$j = 0;
while( $myAnchor = $anchor->item($j++) ){
    $node = $newDom->importNode( $myAnchor, true ); // import node
    $newDom->appendChild($node);
}
Any idea how to obtain just the value of the href attributes that the first query selects? Thanks!
Use:
//a/@href[contains(., 'letter')]
This selects any href attribute of any a element whose string value (of the attribute) contains the string "letter".
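To get just the strings, you can read nodeValue from each attribute node the query returns instead of importing the nodes into a new document. The append in the question most likely fails because an attribute node cannot be added as a child of a document. Below is a minimal sketch of that approach. The inline $html string and the $hrefs array are assumptions made for illustration; in the question, the HTML would come from file_get_contents($url).

```php
<?php
// Hypothetical inline HTML standing in for the fetched page.
$html = '<html><body>'
      . '<a href="players.aspx?letter=Aa">Aa</a>'
      . '<a href="players.aspx?letter=Ab">Ab</a>'
      . '<a href="about.aspx">About</a>'
      . '</body></html>';

$myDom = new DOMDocument;
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$myDom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($myDom);

// Each result is a DOMAttr node; its nodeValue is the href string.
$hrefs = [];
foreach ($xpath->query("//a/@href[contains(., 'letter')]") as $attr) {
    $hrefs[] = $attr->nodeValue;
}

print_r($hrefs);
```

If you do need the values inside a new document, one option is to create an element in $newDom for each string and append that element, rather than appending the attribute node itself.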