I need to scrape a large HTML file (e.g. http://www.indianrail.gov.in/mail_express_trn_list.html) using Simple HTML DOM. I started with a simple script:
<?php
require "simple_html_dom.php";
echo file_get_html('http://www.indianrail.gov.in/mail_express_trn_list.html')->plaintext;
?>
which shows nothing, just a blank page, with the following in the Apache error.log file:
PHP Notice: Trying to get property of non-object in /var/www/index.php on line 3
PHP Notice: Trying to get property of non-object in /var/www/index.php on line 3
At the same time, all other pages (e.g. http://www.indianrail.gov.in/special_trn_list.html) work fine with the same script.
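The issue appears to be the MAX_FILE_SIZE constant defined in simple_html_dom. When the downloaded page is larger than that limit, file_get_html() does not return a DOM object, which would explain the "non-object" notice on ->plaintext.
You can adjust the limit by editing the define('MAX_FILE_SIZE', 600000); line in the simple_html_dom.php file.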
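Whatever the limit is set to, it may help to check the result before reading ->plaintext, so that an oversized or unreachable page gives a clear message instead of a blank page. Below is a minimal sketch of such a guard. plaintext_or_null() is a hypothetical helper, not part of simple_html_dom, and the two stand-in values only imitate what file_get_html() might hand back, so the snippet runs without the library.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): returns the page text,
// or null when the parser result is not an object, i.e. the load failed.
function plaintext_or_null($dom)
{
    return is_object($dom) ? $dom->plaintext : null;
}

// Stand-in values so this sketch runs on its own.
$failed = false;                                  // assumed result of a failed file_get_html() call
$parsed = (object) ['plaintext' => 'Train list']; // stand-in for a parsed document

foreach ([$failed, $parsed] as $result) {
    $text = plaintext_or_null($result);
    if ($text === null) {
        // In the real script, this is where the size limit would be the first suspect.
        echo "Could not parse the page (too large, empty, or unreachable?)\n";
    } else {
        echo $text, "\n";
    }
}
```

In the script from the question, the call would look like plaintext_or_null(file_get_html($url)), followed by the same null check.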