
Rvest error: type 'externalptr'

Tags: r, rvest

I am trying to use rvest to extract the date of birth for PGA golfers. Let's try Stuart Appleby. Here is his profile on the ESPN website http://espn.go.com/golf/player/_/id/11/stuart-appleby. Notice his DOB next to his headshot.

library("rvest")
url <- "http://espn.go.com/golf/player/_/id/11/stuart-appleby"
li_node <- url %>% html %>% html_nodes("li")

His DOB is contained in item 22 of li_node. Ideally, I wouldn't hard code [[22]] into my program, but even when I do, I run into errors.

li_node[[22]]

displays the info I want, but stuff like:

word(li_node[[22]], ...)
substr(li_node[[22]], ...)
pluck(li_node, 22)

all return an error:

> word(li_node[[22]], 1)
Error in rep(string, length = n) : 
  attempt to replicate an object of type 'externalptr'
> substr(li_node[[22]], 1, 2)
Error in as.vector(x, "character") : 
  cannot coerce type 'externalptr' to vector of type 'character'
> pluck(li_node, 22)
Error in FUN(X[[1L]], ...) : 
  object of type 'externalptr' is not subsettable

Is there a simple way for me to grab that DOB using rvest?

asked Feb 26 '15 17:02 by hossibley


1 Answer

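library("rvest")
library("stringr")
url <- "http://espn.go.com/golf/player/_/id/11/stuart-appleby"
url %>% 
  read_html() %>%   # read_html() replaces html(), deprecated since rvest 0.3.0
  html_nodes(xpath='//li[contains(.,"Age")]') %>%   # select the <li> by content, not by position
  html_text() %>%   # convert the node to a plain character string
  str_extract("[A-Z][a-z]{2,} [0-9]{1,2}, [0-9]{4}")   # pull out a date such as "Month D, YYYY"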

returns:

[1] "May 1, 1971"
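
The errors in the question most likely come from passing an XML node, rather than a character string, to string functions: li_node[[22]] is a node object, and html_text() is the step that turns it into text. The sketch below illustrates this without any network access. The HTML snippet is a made-up stand-in for the profile page (the real ESPN markup is an assumption and may differ), and the names page, li_nodes, li_text and dob are only illustrative.

```r
library(rvest)
library(stringr)

# Hypothetical stand-in for the profile page, so the example runs offline.
page <- read_html('<ul>
  <li>Some other field</li>
  <li>Birth Date May 1, 1971 (Age: 43)</li>
</ul>')

li_nodes <- html_nodes(page, "li")

# Each element is an XML node, not a string, so string functions
# such as substr() or word() cannot work on it directly.
is.character(li_nodes[[2]])   # FALSE

# html_text() converts the nodes to an ordinary character vector.
li_text <- html_text(li_nodes)

# Pick the item by its content rather than a hard-coded index,
# then extract the date with the same pattern as above.
dob <- str_extract(li_text[str_detect(li_text, "Age")],
                   "[A-Z][a-z]{2,} [0-9]{1,2}, [0-9]{4}")
dob
```

Once the nodes have been converted with html_text(), substr(), word() and similar functions should work on the result as usual.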
answered Sep 21 '22 03:09 by cory