
Use R to fill an HTML form and download the resulting file

I spent the day scouring the internet for examples of how to do this, but I'm still going in circles and could use a little direction. I am very new to HTML, have basic R coding experience, and have minimal experience with any other programming language.

I have a list of 500+ (potentially more) weather stations for which I would like to download data in the FW13 format from this website (https://fam.nwcg.gov/fam-web/kcfast/html/wxhmenu.htm). In a browser, you fill out the form and submit it, and the FW13 file starts downloading to the default downloads folder.

My goal is to use R to fill out the HTML form, submit it, and then save the resulting file to a defined location. The form itself consists of text fields and radio buttons. Here is an example of a single query:

Station ID: 020207

Start Date: 2000-01-01

End Date: 2017-12-31

Observation Type: Hourly

Schedule Option: Run it now

I went down the rabbit holes of the RCurl and rvest packages, and even started trying out RSelenium. Most examples I have seen scrape information straight off a web page, but I just want to accept the download of the resulting file.

If I can just submit a single request, and download a single file, I believe I can figure out how to loop that with a list of station IDs to achieve what I need.

I apologize for not having any example code here. All of my trials were blind shots in the dark and I'm not even sure if I'm using the right packages for this task. Any help or direction is much appreciated!

asked Jan 28 '23 22:01 by shindig



1 Answer

library(httr)
library(tidyverse)

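# Send the request as JSON to the REST endpoint; no browser automation is needed.
res <- POST(
  url = "https://fam.nwcg.gov/FAMCognosRESTServices/rest/cognos/anonymous/reports/FW13",
  encode = "json",
  body = list(
    p_end_date = "2017-12-31",
    p_obs_type = "Hourly",
    p_start_date = "2000-01-01",
    p_station_id = "170701",
    reportName = "",
    reportSched = FALSE,
    reportTime = ""
  )
)

# Stop with an error if the server returned an HTTP failure status.
stop_for_status(res)

# Parse the response; the FW13 text is in the `report` element.
out <- content(res)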

# `out$report` is a multi-line string, so readr reads it as literal data, not a path.
readr::read_fwf(
  file = out$report,
  fwf_widths(
    widths = c(3, 6, 8, 4, 1, 1, 3, 3, 3, 3, 2, 3, 3, 3, 3, 2, 5, 1, 2, 2, 1, 1, 1, 4, 3, 3, 1),
    col_names = c("w13", "sta_id", "obs_dt", "obs_tm", "obs_type", "sow", "dry_temp", "rh", 
                  "wind_dir", "wind_sp", "fuel_10hr", "temp_max", "temp_min", "rh_max", 
                  "rh_min", "pp_dur", "pp_amt", "wet", "grn_gr", "grn_sh", "moist_tpe", 
                  "meas_type", "season_cd", "solar_radiation", "wind_dir_peak",
                  "wind_speed_peak", "snow_flg")
  ),
  skip = 1
) %>% 
  glimpse()
## Observations: 1,059
## Variables: 27
## $ w13             <chr> "W13", "W13", "W13", "W...
## $ sta_id          <int> 170701, 170701, 170701,...
## $ obs_dt          <int> 20000102, 20000103, 200...
## $ obs_tm          <int> 1300, 1300, 1300, 1300,...
## $ obs_type        <chr> "O", "O", "O", "O", "O"...
## $ sow             <int> 3, 1, 4, 0, 1, 5, 1, 7,...
## $ dry_temp        <int> 44, 40, 48, 26, 25, 37,...
## $ rh              <int> 66, 50, 100, 51, 53, 93...
## $ wind_dir        <int> 230, 360, 230, 360, 120...
## $ wind_sp         <int> 2, 10, 13, 14, 5, 0, 9,...
## $ fuel_10hr       <int> NA, NA, NA, NA, NA, NA,...
## $ temp_max        <int> 48, 52, 51, 54, 29, 37,...
## $ temp_min        <int> 10, 36, 28, 23, 8, 22, ...
## $ rh_max          <int> 100, 100, 100, 100, 80,...
## $ rh_min          <int> 60, 50, 100, 51, 38, 93...
## $ pp_dur          <int> 0, 0, 13, 8, 0, 8, 0, 0...
## $ pp_amt          <int> 0, 0, 150, 1020, 0, 100...
## $ wet             <chr> "N", "N", "Y", "Y", "N"...
## $ grn_gr          <chr> NA, NA, NA, NA, NA, NA,...
## $ grn_sh          <chr> NA, NA, NA, NA, NA, NA,...
## $ moist_tpe       <int> 2, 2, 2, 2, 2, 2, 2, 2,...
## $ meas_type       <int> 1, 1, 1, 1, 1, 1, 1, 1,...
## $ season_cd       <chr> NA, NA, NA, NA, NA, NA,...
## $ solar_radiation <chr> NA, NA, NA, NA, NA, NA,...
## $ wind_dir_peak   <chr> NA, NA, NA, NA, NA, NA,...
## $ wind_speed_peak <chr> NA, NA, NA, NA, NA, NA,...
## $ snow_flg        <chr> "N", "N", "N", "N", "N"...
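
To cover the rest of the question (saving each file to a chosen location and looping over a list of station IDs), something along these lines might work. It is only a sketch built on the request above: the helper names `fw13_body()`, `fw13_download()` and `fw13_download_all()`, the `fw13` output folder, the `.fw13` file extension and the one-second pause are all made up for illustration, and it assumes every station returns a `report` element like the one above.

```r
library(httr)

# Endpoint copied from the request above.
fw13_url <- "https://fam.nwcg.gov/FAMCognosRESTServices/rest/cognos/anonymous/reports/FW13"

# Hypothetical helper: build the request body for one station.
# Field names are copied from the request above.
fw13_body <- function(station_id, start_date, end_date, obs_type = "Hourly") {
  list(
    p_end_date = end_date,
    p_obs_type = obs_type,
    p_start_date = start_date,
    p_station_id = station_id,
    reportName = "",
    reportSched = FALSE,
    reportTime = ""
  )
}

# Hypothetical helper: request one station and write the raw text to disk.
fw13_download <- function(station_id, start_date, end_date, out_dir = "fw13") {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  res <- POST(fw13_url, encode = "json",
              body = fw13_body(station_id, start_date, end_date))
  stop_for_status(res)
  report <- content(res)$report
  if (is.null(report)) {
    stop("No report returned for station ", station_id)
  }
  path <- file.path(out_dir, paste0(station_id, ".fw13"))  # assumed naming scheme
  writeLines(report, path)
  invisible(path)
}

# Hypothetical helper: loop over stations, pausing briefly between requests.
fw13_download_all <- function(station_ids, start_date, end_date, out_dir = "fw13") {
  for (id in station_ids) {
    fw13_download(id, start_date, end_date, out_dir)
    Sys.sleep(1)  # arbitrary pause to avoid hammering the server
  }
  invisible(file.path(out_dir, paste0(station_ids, ".fw13")))
}

# Example call (commented out because it sends real requests):
# fw13_download_all(c("020207", "170701"), "2000-01-01", "2017-12-31")
```

Keep the station IDs as character strings rather than numbers, otherwise a leading zero such as the one in 020207 is lost. The saved files contain whatever the service returns, including the first line that `skip = 1` drops when parsing.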
answered Feb 03 '23 22:02 by hrbrmstr
