I'm trying to scrape data from http://www.footballoutsiders.com/stats/snapcounts, but I can't change the values of the drop-down boxes on the site ("team", "week", "position", and "year"). My attempt to scrape the table for team = "ALL", week = "1", pos = "ALL", and year = "2015" with rvest is below.
library(rvest)

url <- "http://www.footballoutsiders.com/stats/snapcounts"
pgsession <- html_session(url)
# The drop-down boxes belong to the third form on the page
pgform <- html_form(pgsession)[[3]]
filled_form <- set_values(pgform,
                          "team" = "ALL",
                          "week" = "1",
                          "pos"  = "ALL",
                          "year" = "2015")
submit_form(session = pgsession, form = filled_form, POST = url)
# Read the page and extract the second table
y <- read_html("http://www.footballoutsiders.com/stats/snapcounts")
y <- y %>%
    html_nodes("table") %>%
    .[[2]] %>%
    html_table(header = TRUE)
This code returns the table associated with the default values in the drop-down boxes (team = "ALL", week = "20", pos = "QB", and year = "2015"), which is a data frame with only 11 observations. If the fields had actually been changed, it would have returned a data frame with 1,695 observations.
You can capture the session produced when the form is submitted and use that session as input to html_nodes:
d <- submit_form(session=pgsession, form=filled_form)
y <- d %>%
    html_nodes("table") %>%
    .[[2]] %>%
    html_table(header=TRUE)
dim(y)
#[1] 1695   11
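Otherwise, if you call read_html(url), you make a fresh request for the original page, independent of the form you submitted, so you get the table for the default drop-down values.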
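If you are on rvest 1.0.0 or later, html_session(), set_values(), and submit_form() are deprecated in favour of session(), html_form_set(), and session_submit(). Below is a minimal sketch of the same approach with the newer names. The helper scrape_snap_counts() is a hypothetical name introduced here for illustration, and the sketch assumes the page still exposes the drop-downs as the third form and the data as the second table, which may no longer be true.

```r
library(rvest)

# Hypothetical helper (not part of rvest): submit the drop-down form and
# parse the table from the response to that submission, not from a new request.
scrape_snap_counts <- function(url, team = "ALL", week = "1",
                               pos = "ALL", year = "2015") {
  sess <- session(url)                       # replaces html_session()
  form <- html_form(sess)[[3]]               # assumption: still the 3rd form
  filled <- html_form_set(form,              # replaces set_values()
                          team = team, week = week,
                          pos = pos, year = year)
  resp <- session_submit(sess, filled)       # replaces submit_form()
  tables <- html_elements(read_html(resp), "table")
  html_table(tables[[2]], header = TRUE)     # assumption: still the 2nd table
}
```

Calling scrape_snap_counts("http://www.footballoutsiders.com/stats/snapcounts") should then return the week 1 table, provided the site layout matches those assumptions.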