Parse HTML in PHP

Tags: html, php, parsing

I've read the other posts here about this topic, but I can't seem to get what I want.

This is the original HTML:

<div class="add-to-cart"><form class=" ajax-cart-form ajax-cart-form-kit" id="uc-product-add-to-cart-form-20" method="post" accept-charset="UTF-8" action="/product/rainbox-river-lodge-guides-salomon-selection">
<div><div class="attributes"><div class="attribute attribute-1 odd"><div id="edit-attributes-1-wrapper" class="form-item">
 <label for="edit-attributes-1">Color: </label>
 <select id="edit-attributes-1" class="form-select" name="attributes[1]"><option value="4">Blue</option><option selected="selected" value="2">Brown</option><option value="1">Tan</option></select>
</div>
</div><div class="attribute attribute-2 even"><div id="edit-attributes-2-wrapper" class="form-item">
 <label for="edit-attributes-2">Rod Weight: </label>
 <select id="edit-attributes-2" class="form-select" name="attributes[2]"><option selected="selected" value="5">5</option><option value="6">6</option><option value="7">7</option></select>
</div>
</div></div><input type="hidden" value="1" id="edit-qty" name="qty">
<input type="submit" add_to_cart="{ &quot;qty&quot;: 1, &quot;nid&quot;: &quot;20&quot; }" class="form-submit node-add-to-cart ajax-submit-form" value="Add to cart" id="edit-submit-20" name="op">
<input type="hidden" value="form-688be703b34b0a9b0bb5bd98577ea203" id="form-688be703b34b0a9b0bb5bd98577ea203" name="form_build_id">
<input type="hidden" value="42cf9b00fa3c367125d06cbd4e033531" id="edit-uc-product-add-to-cart-form-20-form-token" name="form_token">
<input type="hidden" value="uc_product_add_to_cart_form_20" id="edit-uc-product-add-to-cart-form-20" name="form_id">
<input type="hidden" value="20" id="edit-pnid" name="pnid">

</div></form>
</div>

I only want to extract the two <select> tags and their contents.

This is what I've got at the moment:

$dom = new DOMDocument();
$dom->loadHTML($node->content['add_to_cart']['#value']);  // this loads the html above
$selects = $dom->getElementsByTagName('select');

$tempDom = new DOMDocument();
$tempImported = $tempDom->importNode($selects, true);
$tempDom->appendChild($tempImported);
$output = $tempDom->saveHTML();
var_dump($output);

But I'm getting an empty $output.

Here is the working code:

$dom = new DOMDocument();
$dom->loadHTML($node->content['add_to_cart']['#value']);
$selects = $dom->getElementsByTagName('select');

$tempDom = new DOMDocument();
foreach ($selects as $select) {
  $tempImported = $tempDom->importNode($select, true);
  $tempDom->appendChild($tempImported);
}

$output = $tempDom->saveHTML();
print('<div class="attributes">'. $output .'</div>');

deckerdev asked Apr 08 '10 18:04


1 Answer

$dom->getElementsByTagName() returns a DOMNodeList (a collection of nodes), not a single node, so...

$tempImported = $tempDom->importNode($selects, true);

at this point, $selects is a whole list of nodes, which importNode() can't import because it expects a single DOMNode. You'll have to loop over it and import each element (the result nodes) separately.
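
A minimal sketch of that loop, using a slightly different technique from the one in the question: instead of importing into a second document, it passes each node to saveHTML(), which I believe is supported from PHP 5.3.6 onward. The $html string is a hypothetical, shortened stand-in for $node->content['add_to_cart']['#value'].

```php
<?php
// Hypothetical stand-in for $node->content['add_to_cart']['#value'].
$html = '<div class="attributes">'
      . '<select id="edit-attributes-1" name="attributes[1]"><option value="4">Blue</option></select>'
      . '<select id="edit-attributes-2" name="attributes[2]"><option value="5">5</option></select>'
      . '</div>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// getElementsByTagName() returns a DOMNodeList, not a single DOMNode.
$selects = $dom->getElementsByTagName('select');

$output = '';
foreach ($selects as $select) {
    // Passing a node to saveHTML() serializes only that subtree.
    $output .= $dom->saveHTML($select);
}

echo '<div class="attributes">' . $output . '</div>';
```

Serializing node by node also avoids the doctype and wrapper markup that saveHTML() may add when it dumps a whole document.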

Marc B answered Sep 22 '22 02:09