Let's say I have a bit of JavaScript code that is passed a string from PHP containing an entire HTML page. I write the string to the current document and then alter one of the elements it contains. Something like this:
<script type="text/javascript">
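var foo = <?php echo json_encode($html_document); ?>; // json_encode emits a valid JS string literal (assumes $html_document holds the raw HTML)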
document.open();
document.write(foo);
document.close();
document.getElementById("some_id_within_html_document").innerHTML = "some stuff";
</script>
This gives me my desired output and everything looks great, except when you view the source of the page. If I wanted to scrape this page later and do the same thing, it displays the JavaScript instead of the HTML as interpreted by the browser. Using this method, how could I scrape the desired HTML instead of the JavaScript generating it? I have already worked around this issue by processing the string in PHP instead, but I am still curious whether it is possible to display the interpreted HTML this way when viewing the source or scraping the page.
Edit: Great responses across the board, I learned a lot about what is actually going on here and what practices I should stay away from. The simplest solution that would take the least effort in relation to my original problem was given by Justin Wood.
I'm not exactly sure what you are trying to do, but you can see the HTML equivalent of the generated/modified DOM using something like:
document.documentElement.innerHTML
or:
document.getElementById("some_id").innerHTML
See DEMO.
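One thing to keep in mind: innerHTML on document.documentElement leaves out the <html> tag itself and the doctype. If you want something closer to a full page source, a rough sketch like the one below may help. The function name serializeDocument is made up for this example, and it takes the document as a parameter only so it can be tried outside a browser as well.

```javascript
// Hypothetical helper (not part of any library): builds an HTML string
// from the current state of a DOM document, including the doctype.
function serializeDocument(doc) {
  // doc.doctype is null when the page has no <!DOCTYPE ...> declaration
  const doctype = doc.doctype ? "<!DOCTYPE " + doc.doctype.name + ">\n" : "";
  // outerHTML includes the <html> element itself, unlike innerHTML
  return doctype + doc.documentElement.outerHTML;
}

// In a browser, log the serialized DOM as it currently stands
if (typeof document !== "undefined") {
  console.log(serializeDocument(document));
}
```

This reflects the DOM at the moment you call it, so run it after your document.write and innerHTML changes have been applied.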
You can create a bookmarklet that includes this code:
alert(document.documentElement.innerHTML);
to see the HTML of the DOM modified by JavaScript on every page that you view.
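If you are not sure how to turn that line into a bookmarklet, here is a minimal sketch. A bookmarklet is just a bookmark whose URL starts with javascript:, so the helper below (makeBookmarklet is a name I made up for illustration) wraps the code in a function and URL-encodes it. Paste the resulting string into the URL field of a new bookmark.

```javascript
// Hypothetical helper: turns a snippet of JavaScript into a javascript: URL
// that can be saved as a bookmark.
function makeBookmarklet(source) {
  // Wrap in an immediately invoked function so the snippet's variables
  // stay out of the page's global scope and the expression evaluates to
  // undefined (the browser will not try to navigate to a return value).
  const wrapped = "(function(){" + source + "})();";
  // Encode characters that are not safe inside a URL
  return "javascript:" + encodeURIComponent(wrapped);
}

const bookmarklet = makeBookmarklet("alert(document.documentElement.innerHTML);");
console.log(bookmarklet);
```

For big pages an alert gets unwieldy, so you may prefer to swap it for console.log and read the output in the developer console.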
If you want to do Web scraping on your server, where you download an external Web page, execute its JavaScript, and then see the HTML that corresponds to the DOM after the JavaScript has run (with the document.write calls and all that), then try using Zombie.js or PhantomJS. See also Mink for a PHP tool that supports Zombie.
More generally, search for a headless browser with a JavaScript engine.
Contrary to what people write in other answers here, it is actually possible.
Don't pass your PHP variable into the JavaScript. Just output the variable itself, then use JavaScript to edit whatever it is that you want to edit:
<?php
$html = "<html><head><title></title></head><body><p id='p'>Something</p></body></html>";
echo $html;
?>
<script type="text/javascript">
document.getElementById("p").innerHTML = "blah";
</script>
Something like that should work for you.
NOTE: I have only tested this in Chrome, Firefox, and Safari.