I have a site with 2000 pages and I want to iterate through each page to generate a sitemap, using the file_get_html()
function and regular expressions.
Obviously this can't be completed in one server-side execution, as it will run out of time due to the maximum execution time. I guess it needs to perform smaller actions, save the progress to the database, and then queue the next task. Any suggestions?
One important aspect of PHP is that, by default, a script may run for at most 30 seconds (the max_execution_time setting). The exact limit varies by hosting company, but it is typically between 30 and 60 seconds.
You can raise the limit by adding php_value max_execution_time 300 to your .htaccess file. Save the change and upload the edited file to your server to overwrite the old one. On a WordPress site you can also increase the max_execution_time value by editing the wp-config.php file.
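The .htaccess line mentioned above looks like this (it only works on Apache with mod_php; the 300-second value is just an example):

```apache
# Allow scripts in this directory to run for up to 300 seconds
php_value max_execution_time 300
```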
When you run PHP from the command line, there is no maximum execution time (it defaults to 0).
You can also call set_time_limit(0); in the script itself, if your hosting provider allows it.
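As a minimal sketch, the call goes at the top of the long-running script; note that some shared hosts lock this setting down, in which case the call has no effect:

```php
<?php
// Remove the execution time limit for this script only.
// Equivalent to ini_set('max_execution_time', 0); other scripts
// on the server are unaffected.
set_time_limit(0);
```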
I can't tell whether your IP address will get banned; that depends on the security policy of the server you send your requests to.
Another solution
You can fetch one (or a few) page(s) per run and search the source code for new URLs. You then queue these in a database, and on the next run you process the queue.
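A minimal sketch of one such batch run, assuming a SQLite-backed queue table; the crawl_queue table, its url/done columns, and the INSERT OR IGNORE syntax (which needs a UNIQUE index on url) are illustrative assumptions, and the regex mirrors the approach described in the question rather than a robust HTML parser:

```php
<?php
// Extract href attribute values from a chunk of HTML.
// A regex is fragile for real-world HTML, but matches the
// regular-expression approach described in the question.
function extract_links(string $html): array {
    preg_match_all('/href\s*=\s*["\']([^"\']+)["\']/i', $html, $m);
    return array_values(array_unique($m[1]));
}

// Process a small batch of queued URLs, then stop so the script
// stays well under the execution time limit. Returns the number
// of URLs processed; 0 means the queue is drained.
function process_batch(PDO $db, int $batchSize = 10): int {
    $rows = $db->query(
        "SELECT url FROM crawl_queue WHERE done = 0 LIMIT $batchSize"
    )->fetchAll(PDO::FETCH_COLUMN);

    foreach ($rows as $url) {
        $html = @file_get_contents($url); // or file_get_html() from Simple HTML DOM
        foreach (extract_links((string) $html) as $link) {
            $db->prepare(
                "INSERT OR IGNORE INTO crawl_queue (url, done) VALUES (?, 0)"
            )->execute([$link]);
        }
        $db->prepare("UPDATE crawl_queue SET done = 1 WHERE url = ?")
           ->execute([$url]);
    }
    return count($rows);
}
```

Each invocation (via cron, or a self-refreshing page) handles one batch and marks it done, so progress survives between runs and no single execution approaches the time limit.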
Set max_execution_time to 0 in your php.ini. This affects every script you run on the server, but if you're looking for a server-level fix, it will do the job.
http://php.net/manual/en/info.configuration.php#ini.max-execution-time
max_execution_time = 0