The following URL:
http://www.cbs.gov.il/ts/ID40d250e0710c2f/databank/series_func_e_v1.html?level_1=31&level_2=1&level_3=7
provides a data generator from the Israeli government that limits each extraction to a maximum of 50 series. Is it possible (and if so, how) to write a web scraper (in your favorite language/software) that follows the clicks at each step, so that I can retrieve all of the series in a specific topic?
Thanks.
Take a look at WWW::Mechanize and WWW::HtmlUnit.
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
my $m = WWW::Mechanize->new;
# Get the first page
$m->get("http://www.cbs.gov.il/ts/ID40d250e0710c2f/databank/series_func_e_v1.html?level_1=31&level_2=1&level_3=7");

# Submit the form on the first page
$m->submit_form(
    with_fields => {
        name_tatser => 2,    # Orders for export
    }
);

# Now that we have the second page, submit the form on it
$m->submit_form(
    with_fields => {
        name_ser => 1576,    # Number of companies that answered
    }
);

# And so on...

# Printing the source HTML is a good way
# to find out what you need to do next
print $m->content;
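If reading the raw HTML gets tedious, you can also list each form's fields and the values they accept. Here is a minimal sketch using HTML::Form, the module WWW::Mechanize uses for its form objects. The HTML snippet, the "Orders for local market" option, the example.com base URL and the list_fields helper are all made up for illustration; the real page's markup may differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical stand-in for one step of the databank page;
# the real markup may differ.
my $html = <<'HTML';
<form action="/next" method="post">
  <select name="name_tatser">
    <option value="1">Orders for local market</option>
    <option value="2">Orders for export</option>
  </select>
  <input type="submit" value="Continue">
</form>
HTML

# Return one hash per named form field: its name, type and allowed values.
# (list_fields is an illustrative helper, not part of any module.)
sub list_fields {
    my ($html, $base_url) = @_;
    my @fields;
    for my $form (HTML::Form->parse($html, $base_url)) {
        for my $input ($form->inputs) {
            next unless defined $input->name;    # skip unnamed submit buttons
            push @fields, {
                name   => $input->name,
                type   => $input->type,
                values => [ $input->possible_values ],
            };
        }
    }
    return @fields;
}

# Print each field with the values it can take
for my $field (list_fields($html, "http://example.com/")) {
    print "$field->{name} ($field->{type}): @{ $field->{values} }\n";
}
```

Against the live site you would skip the parsing step and loop over $m->forms instead, since that returns the same kind of HTML::Form objects for the current page. Once you know the allowed values of a field, you can loop over them and call submit_form once per value to walk through every series.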