I want to download some CSV files from a webpage using wget (the page is http://sinca.mma.gob.cl/index.php/region/index/id/II). However, when I run wget I only get some cgi-bin files and files in other formats, which I suppose are what the site uses to build a CSV file. Given that I have no knowledge of JavaScript or whatever else is required to build the CSV files, is there a way to get those CSV files directly with wget?
This is the log file after running wget:
--10:30:06-- http://sinca.mma.gob.cl/index.php/region/index/id/II
=> `sinca.mma.gob.cl/index.php/region/index/id/II'
Resolving sinca.mma.gob.cl... 190.215.49.125
Connecting to sinca.mma.gob.cl[190.215.49.125]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
0K .......... .......... .......... .......... .......... 28.17 KB/s
50K .......... .......... .......... .......... .......... 226.24 KB/s
100K . 1.44 MB/s
Last-modified header missing -- time-stamps turned off.
10:30:09 (50.81 KB/s) - `sinca.mma.gob.cl/index.php/region/index/id/II.html' saved [103911]
Removing sinca.mma.gob.cl/index.php/region/index/id/II.html since it should be rejected.
FINISHED --10:30:09--
Downloaded: 103,911 bytes in 1 files
Converted 0 files in 0.00 seconds.
You need to give wget the full URL that generates the file you want, for example:
wget -O test.csv "http://sinca.mma.gob.cl/cgi-bin/APUB-MMA/apub.tsindico2.cgi?outtype=xcl&macro=./RII/237/Cal/PM25//PM25.diario.diario.ic&from=13060100&to=15110323&path=/usr/airviro/data/CONAMA/&lang=esp&rsrc=&macropath="
I tested the above and got exactly the same CSV file as when I click the link on the site. The link runs some JavaScript that generates the URL used above. To get that URL, I clicked the link and then copied the address that appeared in the address bar.
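If you need the same export for other date ranges, a small shell function along these lines might save retyping the URL. This is only a sketch: `sinca_url` is a hypothetical helper name, the query parameters are copied from the command above (assuming the parameter names are `macro` and `macropath`), and reading `from` and `to` as YYMMDDHH timestamps is a guess from the example values, not something the site documents here.

```shell
#!/bin/sh
# sinca_url FROM TO
# Hypothetical helper: prints the CSV export URL for one date range.
# All parameter names and values are copied from the wget command above;
# the YYMMDDHH reading of FROM and TO is an assumption.
sinca_url() {
    from=$1
    to=$2
    base='http://sinca.mma.gob.cl/cgi-bin/APUB-MMA/apub.tsindico2.cgi'
    macro='./RII/237/Cal/PM25//PM25.diario.diario.ic'
    path='/usr/airviro/data/CONAMA/'
    # Single-quoted format string, so the shell never treats '&' as an operator.
    printf '%s?outtype=xcl&macro=%s&from=%s&to=%s&path=%s&lang=esp&rsrc=&macropath=\n' \
        "$base" "$macro" "$from" "$to" "$path"
}

# Example call (commented out so that sourcing this file does not hit the network):
# wget -O test.csv "$(sinca_url 13060100 15110323)"
```

Keep the double quotes around the URL when you pass it to wget. Without them the shell splits the command at each `&` and wget only receives the part of the URL before the first one.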