
How to Archive a Dynamic (PHP) Website as Static HTML? [closed]

We're in the process of shutting down The Conversations Network (including the IT Conversations podcast). The plan is to render a static-HTML version of our websites for permanent hosting at the Internet Archive.

What's the easiest way to generate static HTML from the roughly 5,000 pages currently generated dynamically by PHP?

I know we could tweak the code to capture the PHP output, write it to files, and then walk the sitemaps to generate every page. But I wonder if there are other options we should consider. Are there any tools that do this by scraping the HTML as-is? (Something other than Acrobat Pro?)

Unfortunately, we also have a fair number of Ajax calls, which are going to make this more difficult. I imagine we'll have to un-Ajax them first.
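To make the sitemap-walking idea concrete, here is a minimal sketch in Python rather than a finished tool. It assumes the sitemaps follow the standard sitemaps.org XML format; the naming scheme in `static_path` (dropping `.php`, folding the query string into the file name) and the `archive` helper are hypothetical choices for illustration, not anything the site already does.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from urllib.parse import urlsplit
from urllib.request import urlopen

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(sitemap_xml):
    """Return every <loc> URL found in a sitemap document (str or bytes)."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def static_path(url):
    """Map a dynamic URL to a relative .html path (hypothetical naming scheme)."""
    parts = urlsplit(url)
    path = parts.path.strip("/") or "index"
    if path.endswith(".php"):
        path = path[: -len(".php")]
    if parts.query:
        # Fold the query string into the file name so distinct pages do not collide.
        safe = "".join(c if c.isalnum() else "_" for c in parts.query)
        path = path + "_" + safe
    return path + ".html"


def archive(sitemap_url, out_dir):
    """Fetch each page listed in the sitemap and save it under out_dir."""
    with urlopen(sitemap_url) as resp:
        urls = sitemap_urls(resp.read())
    for url in urls:
        target = Path(out_dir) / static_path(url)
        target.parent.mkdir(parents=True, exist_ok=True)
        with urlopen(url) as page:
            target.write_bytes(page.read())
```

Note that this saves the HTML exactly as served: links inside the pages still point at the original `.php` URLs, and images, CSS and scripts are not downloaded, so some link rewriting would still be needed afterwards.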

Asked by Doug Kaye, Sep 26 '12 19:09




1 Answer

It might not be what you are looking for, but HTTrack will crawl your website by following its links and save an HTML version of it. This mirror will include all static content that is linked, such as images, CSS and JavaScript.

The only problem I can think of is if your AJAX scripts are pulling vital data from a server, but perhaps HTTrack has a setting for that.
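To get a rough idea of which mirrored pages depend on AJAX, a quick scan of the saved HTML can help. The following is only a sketch: the pattern list is an assumption about how the calls are written (raw `XMLHttpRequest` or jQuery helpers), it only inspects the text it is given, and it will miss calls made from external script files.

```python
import re

# Heuristic markers for client-side requests; this list is an assumption
# and will not catch every way a page can load data at run time.
AJAX_HINTS = re.compile(r"XMLHttpRequest|\$\.(?:ajax|get|post|getJSON)\s*\(")


def uses_ajax(html):
    """Return True if the HTML text contains an obvious AJAX call."""
    return AJAX_HINTS.search(html) is not None
```

Running `uses_ajax` over each saved file gives a list of pages to inspect by hand before the original server goes away.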

Answered by Zar, Sep 22 '22 08:09