R Error using readHTMLTable

Tags:

r

I am using the following code:

url  = "http://finance.yahoo.com/q/op?s=DIA&m=2013-07"

library(XML)
tabs = readHTMLTable(url, stringsAsFactors = FALSE)

I get the following error:

Error: failed to load external entity "http://finance.yahoo.com/q/op?s=DIA&m=2013-07"

When I open the URL in a browser it works fine. So, what am I doing wrong here?

Thanks

Zanam asked Jun 11 '13 13:06


1 Answer

It's difficult to know for sure since I can't replicate your error, but according to the package's author (see http://comments.gmane.org/gmane.comp.lang.r.mac/2284), XML's methods for getting web content are pretty minimalistic. A workaround is to use RCurl to get the content and XML to parse it:

library(XML)
library(RCurl)

url <- "http://finance.yahoo.com/q/op?s=DIA&m=2013-07"

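# Download the page as a single character string
tabs <- getURL(url)
# Parse the downloaded HTML instead of letting XML fetch the URL itself
tabs <- readHTMLTable(tabs, stringsAsFactors = FALSE)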

Or, if RCurl still throws an error, try the httr package:

library(httr)

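# Reuses `url` and library(XML) from the snippet above
tabs <- GET(url)
# The response body is a raw vector; convert it to text before parsing
tabs <- readHTMLTable(rawToChar(tabs$content), stringsAsFactors = FALSE)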
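Both workarounds follow the same pattern: download the page with one package, then hand the text to XML for parsing. Here is a minimal sketch of that split, assuming the XML and httr packages are installed. The helper names `fetch_html` and `parse_tables` and the sample table are made up for illustration; they are not part of either package and the table is not Yahoo data.

```r
library(XML)
library(httr)

# Hypothetical helper: download a page and return its body as one string.
fetch_html <- function(url) {
  resp <- GET(url)
  stop_for_status(resp)  # turn HTTP error statuses into R errors
  content(resp, as = "text", encoding = "UTF-8")
}

# Hypothetical helper: parse HTML that is already in memory.
parse_tables <- function(html_text) {
  # asText = TRUE tells XML the input is markup, not a file name or URL
  doc <- htmlParse(html_text, asText = TRUE)
  readHTMLTable(doc, stringsAsFactors = FALSE)
}

# Offline demonstration with an invented table
sample_html <- "<html><body><table id='opts'>
  <tr><th>Strike</th><th>Last</th></tr>
  <tr><td>150</td><td>2.10</td></tr>
  <tr><td>155</td><td>0.85</td></tr>
</table></body></html>"

tabs <- parse_tables(sample_html)
str(tabs)
```

For a live page, something like `parse_tables(fetch_html(url))` should work in the same way. Keeping the download separate also makes it easier to see whether a failure comes from the network request or from the parsing step.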
SchaunW answered Sep 21 '22 16:09