Is there a way to retrieve the fully rendered HTML of a page after its JavaScript has run? If I use curl, it simply retrieves the base HTML, without the post-load rendering of iframes, JavaScript processing, etc.
What would be the best way to accomplish this?
As no-one else has answered (except the comment above, but I'll come to that later), I'll try to help as much as possible.
There's no "simple" answer. PHP can't process JavaScript or navigate the DOM natively, so you need something that can.
Your options as I see them:
If you are after a screen grab (which is what I'm hoping, as you also want Flash to load), I suggest you use one of the commercial APIs that are out there for doing this. You can find some in this list: http://www.programmableweb.com/apitag/?q=thumbnail, for example http://www.programmableweb.com/api/convertapi-web2image
Otherwise you need to run something yourself that can handle JavaScript and the DOM on, or connected to, your server. For this you'd need an automated browser that you can run server-side to get the information you need. Work through the list in Bergi's comment above and test for a suitable solution. The main one, Selenium, is great for "unit testing" on a known website, but I'm not sure how I'd script it to handle random sites, for example. As you would (presumably) only have one "automated browser" and you don't know how long each page will take to load, you'd need to queue the requests and handle them one at a time. You'd also need to ensure pop-up alert()s are handled, install all the third-party libraries (you say you want Flash?!), and handle redirects, timeouts and potential memory hogs (if running this non-stop, you'll periodically want to kill your browser and restart it to clean out the memory!). You'd also have to handle virus attacks, pop-up windows and requests to close the browser completely.
Thirdly, VB has a web-browser component. I used it for a project a long time ago to do something similar-ish, but on a known site. Whether it's possible with .NET (to me, it's a huge security risk), and how you program for unknowns (e.g. pop-ups and Flash), I have no idea. But if you're desperate, an adventurous .NET developer may be able to suggest more.
In summary - if you want more than a screen grab and so can't choose option 1, good luck ;)
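To make the "queue the requests and restart the browser now and then" idea from the second option a bit more concrete, here is a minimal Python sketch. It is not tied to Selenium or any real browser: `RenderQueue`, `browser_factory`, `render()` and `quit()` are hypothetical names, and you would have to wrap whatever automated browser you end up choosing so that it exposes those two methods and enforces the timeout itself.

```python
from collections import deque


class RenderQueue:
    """Feed URLs to a single automated browser, one at a time.

    `browser_factory` is a hypothetical callable returning an object with
    `render(url, timeout_s)` and `quit()`; adapt it to your real browser.
    """

    def __init__(self, browser_factory, restart_every=50, timeout_s=30.0):
        self._factory = browser_factory
        self._restart_every = restart_every  # pages served before a restart
        self._timeout_s = timeout_s
        self._pending = deque()
        self._browser = None
        self._served = 0

    def submit(self, url):
        self._pending.append(url)

    def _discard_browser(self):
        # Kill the current browser (if any) so its memory is released.
        if self._browser is not None:
            self._browser.quit()
            self._browser = None

    def _fresh_browser(self):
        self._discard_browser()
        self._browser = self._factory()
        self._served = 0

    def run(self):
        """Process the queue in order; return a list of (url, html_or_error)."""
        results = []
        while self._pending:
            url = self._pending.popleft()
            if self._browser is None or self._served >= self._restart_every:
                self._fresh_browser()
            try:
                html = self._browser.render(url, self._timeout_s)
            except Exception as exc:
                # A page that hangs or crashes may leave the browser in a bad
                # state, so throw it away instead of reusing it.
                results.append((url, exc))
                self._discard_browser()
            else:
                results.append((url, html))
                self._served += 1
        self._discard_browser()
        return results
```

You would call `submit()` for each incoming request and `run()` from a single worker process, so only one page is ever loading at a time. Pop-ups, redirects and plugins are still the wrapped browser's problem; this only covers the queueing and restarting part.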
If you're looking for something scriptable with no GUI you could use a headless browser. I've used PhantomJS for similar tasks.
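As a rough sketch of what that can look like: the Python wrapper below writes a tiny PhantomJS script to a temporary file, runs it against a URL and returns whatever the script prints. It assumes a `phantomjs` binary is on your PATH; the helper names (`build_command`, `fetch_rendered_html`) and the 30 second timeout are made up for the example.

```python
import os
import subprocess
import tempfile

# PhantomJS script: open the URL given as the first argument and print the
# DOM as it stands once the page reports that it has loaded.
PHANTOM_SCRIPT = """
var system = require('system');
var page = require('webpage').create();
page.open(system.args[1], function (status) {
    if (status !== 'success') {
        phantom.exit(1);
    } else {
        console.log(page.content);
        phantom.exit(0);
    }
});
"""


def build_command(script_path, url, binary="phantomjs"):
    """Return the argv list used to run the script (binary name is assumed)."""
    return [binary, script_path, url]


def fetch_rendered_html(url, timeout_s=30.0, binary="phantomjs"):
    """Run PhantomJS on `url` and return the rendered HTML it prints."""
    with tempfile.NamedTemporaryFile("w", suffix=".js", delete=False) as fh:
        fh.write(PHANTOM_SCRIPT)
        script_path = fh.name
    try:
        done = subprocess.run(
            build_command(script_path, url, binary),
            capture_output=True,
            text=True,
            timeout=timeout_s,  # raises TimeoutExpired if the page hangs
            check=True,         # raises CalledProcessError on a failed load
        )
        return done.stdout
    finally:
        os.unlink(script_path)
```

Something like `html = fetch_rendered_html("http://example.com/")` should then give you the post-JavaScript markup. Note that `page.content` is read as soon as the page load finishes, so content fetched later by AJAX may need an extra delay in the script before printing.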