Looking for a Linux application (or Firefox extension) that will allow me to scrape an HTML mockup and keep the page's integrity. Firefox does an almost perfect job but doesn't grab images referenced in the CSS. The Scrapbook extension for Firefox gets everything, but flattens the directory structure. I wouldn't terribly mind if all folders became children of the index page.
See Website Mirroring With wget
wget --mirror -w 2 -p --html-extension --convert-links http://www.yourdomain.com
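If it helps to see what those switches do, here is the same command spelled out with long options and comments, as I understand the flags (yourdomain.com is just a placeholder):

# --mirror          recursive download with timestamping and unlimited depth
# --wait=2          pause 2 seconds between requests to be polite to the server
# --page-requisites also fetch images, CSS, and other files needed to render each page
# --html-extension  save pages with an .html suffix
# --convert-links   rewrite links so the local copy can be browsed offline
wget --mirror --wait=2 --page-requisites --html-extension --convert-links http://www.yourdomain.com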
Have you tried wget? wget -r does what you want, and if not, there are plenty of flags to configure it. See man wget.
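For your particular case (keeping CSS-referenced images and getting a browsable local copy), something along these lines is a reasonable starting point; example.com stands in for the real site:

# -r  recurse into linked pages
# -p  also fetch page requisites (images, CSS, scripts)
# -k  convert links for local viewing
# -E  add .html extensions where needed
wget -r -p -k -E http://www.example.com/

One caveat: if I recall correctly, only relatively recent wget releases parse url() references inside stylesheets, so images pulled in from CSS may still be missed on older versions.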
Another option is curl, which is even more powerful. See http://curl.haxx.se/.
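Note that curl doesn't recurse through a site on its own, but for grabbing individual files it is very flexible. A minimal sketch (example.com is a placeholder):

# follow redirects and save the page under a local filename
curl -L -o index.html http://www.example.com/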