Simply put: if url exists, then
x <- read.csv(url)
will return the contents of that URL. A good example, if you want to try it, is "http://ichart.finance.yahoo.com/table.csv?s=IBM&a=00&b=1&c=2008&d=03&e=4&f=2014&g=d&ignore=.csv" . Assigning that address to url and running the line above loads a data.frame into x from the Yahoo website containing daily IBM stock data from 2008 into 2014.
But how can I tell beforehand whether a given URL will return a 404?
Something like:
is.404.or.not(url)
or maybe
status(connect.to(url))
Thanks!
A 404 status code means that the server could not find the resource the client requested. Variations of the error message include "404 Error," "404 Page Not Found" and "The requested URL was not found."
The typical trigger for a 404 is content (a page, file or image) that has been deleted or moved to another URL without the links pointing to it being updated.
You could use the RCurl package:
R> library(RCurl)
Loading required package: bitops
R> url.exists("http://google.com")
[1] TRUE
R> url.exists("http://csgillespie.org")
[1] FALSE
Alternatively, you could use the httr package:
R> library(httr)
R> http_status(GET("http://google.com"))
$category
[1] "success"
$message
[1] "success: (200) OK"
R> http_status(GET("http://csgillespie.org"))
$category
[1] "server error"
$message
[1] "server error: (503) Service Unavailable"
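To get something closer to the is.404.or.not(url) interface asked for in the question, the httr call above can be wrapped in a small helper. This is only a sketch: status_of() and is_404() are made-up names, the 10-second timeout is an arbitrary choice, and it assumes the server answers HEAD requests sensibly (some do not, in which case swapping in httr::GET may be needed).

```r
# Sketch only: status_of() and is_404() are hypothetical helpers built on
# httr, which must be installed for real requests to succeed.

# Return the HTTP status code for a URL, or NA when no response arrives
# (DNS failure, refused connection, timeout, ...).
status_of <- function(url, timeout_secs = 10) {
  tryCatch(
    httr::status_code(httr::HEAD(url, httr::timeout(timeout_secs))),
    error = function(e) NA_integer_
  )
}

# TRUE only when the server actually answered with 404 Not Found.
# A failed connection gives FALSE here, because that is not a 404.
is_404 <- function(url, ...) {
  identical(as.integer(status_of(url, ...)), 404L)
}
```

With these defined, one possible guard is if (!is_404(url)) x <- read.csv(url). Note that a URL can pass this check and still fail a moment later, so wrapping read.csv itself in tryCatch may still be worthwhile.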