I am trying to load some air pollution background data directly into R as a data.frame using the RCurl package.
The website in question has 3 dropdown boxes in which you choose options before downloading the .csv file.
I want to select a value in each of the 3 dropdown boxes and then pull the data behind the "Download CSV" button directly into R as a data.frame.
Ideally I want to download several combinations of years and pollutants for a specific site.
In other posts on StackOverflow I have come across the getForm function from the RCurl package, but I don't understand how to control the 3 dropdown boxes with it.
The URL for the data source is: http://uk-air.defra.gov.uk/data/laqm-background-maps?year=2011
For this website you can construct a URL and submit a GET request to download the CSV directly:
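library(httr)
# Endpoint that serves the CSV; the query parameters mirror the form fields
baseURL <- "http://uk-air.defra.gov.uk/data/laqm-background-maps.php"
queryList <- parse_url(baseURL)
# build_url() percent-encodes the values, so write the space literally rather than as "+"
queryList$query <- list("bkgrd-la" = 359, "bkgrd-pollutant" = "no2", "bkgrd-year" = 2011,
                        action = "data", year = 2011, submit = "Download CSV")
# Save the response body to temp.csv (overwrite = TRUE allows re-running)
res <- GET(build_url(queryList), write_disk("temp.csv", overwrite = TRUE))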
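The GET call only saves the file to disk, so you still need to read it in to get a data.frame. A minimal sketch, assuming the file starts with five lines of metadata before the header row (the same skip = 5 used in the read.csv one-liner in this thread); read_laqm_csv is a hypothetical helper name, not something from httr:

```r
# Hypothetical helper: read a downloaded background-map CSV into a data.frame.
# Assumption: the first `skip` lines are metadata, not data.
read_laqm_csv <- function(path, skip = 5) {
  read.csv(path, skip = skip, stringsAsFactors = FALSE)
}

# Usage, once temp.csv has been written to disk:
# dat <- read_laqm_csv("temp.csv")
# str(dat)
```

If the header does not line up, open the file in a text editor and adjust skip to match the number of lines before the column names.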
You can get the codes for the form by parsing the original page:
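library(XML)
doc <- htmlParse("http://uk-air.defra.gov.uk/data/laqm-background-maps?year=2011")
# Each <option> under the local authority dropdown becomes a one-row data.frame
councils <- doc["//*[@id='bkgrd-la']/option", fun = function(x){
  data.frame(value = xmlGetAttr(x, "value"), council = xmlValue(x))
}]
# Stack the one-row data.frames into a single lookup table
councils <- do.call(rbind.data.frame, councils)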
> head(councils)
  value                      council
1   359        Aberdeen City Council
2   360        Aberdeenshire Council
3     1        Adur District Council
4     2    Allerdale Borough Council
5     4 Amber Valley Borough Council
6   401      Anglesey County Council
# Same approach for the pollutant dropdown
pollutants <- doc["//*[@id='bkgrd-pollutant']/option", fun = function(x){
  data.frame(value = xmlGetAttr(x, "value"), pollutant = xmlValue(x))
}]
pollutants <- do.call(rbind.data.frame, pollutants)
> head(pollutants)
  value pollutant
1   no2       NO2
2   nox       NOx
3  pm10      PM10
4  pm25     PM2.5
5   no2       NO2
6   nox       NOx
etc...
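Since the same extraction is repeated for each dropdown, it could be wrapped in a function. This is only a sketch: get_options and the label column name are made up for illustration, and it assumes every option carries a value attribute. It uses xpathSApply from the XML package and drops repeated rows, because head(pollutants) shows the same options coming back more than once.

```r
library(XML)

# Hypothetical helper: return the <option> values and labels of the element
# with the given id as a data.frame.
get_options <- function(doc, id) {
  xpath <- sprintf("//*[@id='%s']/option", id)
  out <- data.frame(
    value = xpathSApply(doc, xpath, xmlGetAttr, "value"),
    label = xpathSApply(doc, xpath, xmlValue),
    stringsAsFactors = FALSE
  )
  unique(out)  # drop repeats if the same options appear more than once
}

# Usage with the parsed page:
# councils   <- get_options(doc, "bkgrd-la")
# pollutants <- get_options(doc, "bkgrd-pollutant")
```

The year dropdown can presumably be read the same way, but its id is not shown here, so check the page source before relying on a guess.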
You can do it in one line:
read.csv("http://uk-air.defra.gov.uk/data/laqm-background-maps.php?bkgrd-la=359&bkgrd-pollutant=no2&bkgrd-year=2011&action=data&year=2011&submit=Download+CSV",
skip=5)
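To cover the "multiple years and multiple pollutants" part of the question, the same URL can be generated for each combination. A rough sketch: laqm_url and read_all are hypothetical names, the year 2012 and pollutant pm10 are just illustrative choices (check which years the site actually offers), and it assumes bkgrd-year and year should always carry the same value, as they do in the URL above.

```r
# Hypothetical helper: build the download URL for one authority/pollutant/year.
# Assumption: bkgrd-year and year are always set to the same value.
laqm_url <- function(la, pollutant, year) {
  sprintf(
    paste0("http://uk-air.defra.gov.uk/data/laqm-background-maps.php",
           "?bkgrd-la=%s&bkgrd-pollutant=%s&bkgrd-year=%s",
           "&action=data&year=%s&submit=Download+CSV"),
    la, pollutant, year, year
  )
}

# All combinations of the pollutants and years of interest for one site
combos <- expand.grid(pollutant = c("no2", "pm10"), year = 2011:2012,
                      stringsAsFactors = FALSE)
urls <- laqm_url(359, combos$pollutant, combos$year)

# Downloading needs network access, so it is wrapped in a function here
read_all <- function(urls) lapply(urls, read.csv, skip = 5)
# results <- read_all(urls)
# names(results) <- paste(combos$pollutant, combos$year, sep = "_")
```

Each element of results would be one data.frame. Whether they can be combined with rbind depends on the columns being the same across pollutants, which I have not checked.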