
Why 'Error: length(url) == 1 is not TRUE' with rvest web scraping

I'm trying to scrape web data, but the first step requires a login. I've successfully logged into other websites this way, but I get a weird error with this one.

library("rvest")
library("magrittr")    

research <- html_session("https://www.fitchratings.com/")

signin <- research %>%
  html_nodes("form") %>%
  extract2(1) %>%
  html_form() %>%
  set_values(
    'userName' = "abc",
    'password' = "1234"
  )

research <- research %>%
  submit_form(signin)

When I run the 'submit_form' line I get the following error:

> research <- research %>%
+ submit_form(signin)
Submitting with '<unnamed>'
Error: length(url) == 1 is not TRUE

Submitting with '<unnamed>' is correct because there is no name assigned to the sign-in button. Any help appreciated!

Hugo S. asked Mar 17 '15 03:03


1 Answer

I was having the same issue. I jumped through a few hoops to get the dev version of rvest running, and it's working smoothly now. Here's how I went about it:

First things first: if you're using Windows, you need to install Rtools. Make sure R is closed before you install it. Rtools can be downloaded here: https://cran.r-project.org/bin/windows/Rtools/. Installation instructions for Windows can be found here: github.com/stan-dev/rstan/wiki/Install-Rtools-for-Windows

Boot up R, then install libraries "httr" and "Rcpp" if you don't have them already.
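
A minimal sketch of that step, assuming a CRAN mirror is already configured. The helper name missing_packages is hypothetical (it is not part of any package), and the interactive() guard is my own addition so that sourcing the snippet from a script has no side effects:

```r
# Hypothetical helper: return the entries of `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(c("httr", "Rcpp"))

# Install only what is missing, and only from an interactive session.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

If both packages are already present, to_install is empty and nothing is downloaded.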

Install "devtools" and its accompanying GitHub installer. Full details are in the devtools repo, but here's a quick summary.

Windows:

install.packages("devtools")
library(devtools)
build_github_devtools()

#### Restart R before continuing ####
install.packages("devtools.zip", repos = NULL, type = "source")

# Remove the package after installation
unlink("devtools.zip")

Mac/Linux:

devtools::install_github("hadley/devtools")

Now run the final steps:

library(httr)
library(Rcpp)
library(devtools)
install_github("hadley/rvest")
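
To confirm which build R will actually load, it may help to check the installed version afterwards. This is only a sketch: the variable name rvest_version is my own, and the version number it reports depends on whatever the repository contained at install time. If rvest was already loaded in the session, restarting R first is the safest way to pick up the new build.

```r
# Report the installed rvest version as a string, or NA if rvest is absent.
rvest_version <- if (requireNamespace("rvest", quietly = TRUE)) {
  as.character(utils::packageVersion("rvest"))
} else {
  NA_character_
}

print(rvest_version)
```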

You should now be able to run submit_form(session, form) without hitting this error:

Submitting with 'xxxx'
Error: length(url) == 1 is not TRUE
robeot answered Oct 22 '22 17:10
