Is there a way to decode tinyURL links in R so that I can see which web pages they actually refer to?
Below is a quick and dirty solution, but it should get the job done:
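library(RCurl)
# Return the address a short URL redirects to, or the URL itself if there is no redirect.
decode.short.url <- function(u) {
  # Ask for the headers only and do not follow the redirect, so the Location header stays visible.
  x <- try(getURL(u, header = TRUE, nobody = TRUE, followlocation = FALSE), silent = TRUE)
  # If the request failed, fall back to the original URL.
  if (inherits(x, "try-error")) return(u)
  # Take the text after the Location header name (some servers send it in lower case).
  loc <- strsplit(x, "[Ll]ocation: ")[[1]][2]
  if (is.na(loc)) return(u)  # no Location header, so this was not a redirect
  # Keep everything up to the end of that header line.
  strsplit(loc, "\r")[[1]][1]
}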
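To see what the Location step in the function above is doing, here is a minimal offline sketch of the header parsing on its own. The helper name extract.location and the sample header text are made up for illustration (nothing here was captured from a real server), and it uses only base R, so it runs without RCurl or a network connection.

```r
# Hypothetical helper: pull the Location value out of raw HTTP header text.
extract.location <- function(headers) {
  # Split the header block into lines; HTTP uses CRLF, but tolerate a bare LF.
  lines <- strsplit(headers, "\r?\n")[[1]]
  # Header names are case-insensitive, so ignore case when matching.
  hit <- grep("^location:", lines, ignore.case = TRUE, value = TRUE)
  # No Location header means the response was not a redirect.
  if (length(hit) == 0) return(NA_character_)
  # Drop the header name and surrounding whitespace; keep the first match.
  trimws(sub("^location:", "", hit[1], ignore.case = TRUE))
}

# Made-up example of a redirect response, for illustration only.
sample.headers <- paste0(
  "HTTP/1.1 301 Moved Permanently\r\n",
  "Location: http://www.example.com/some/long/page\r\n",
  "Content-Type: text/html\r\n\r\n"
)

extract.location(sample.headers)
extract.location("HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n")
```

The first call should give the address from the Location line, and the second should give NA because that response has no Location header.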
The variable 'u' below contains one shortened URL and one regular URL.
u <- c("http://tinyurl.com/adcd", "http://www.google.com")
You can then get the expanded results by doing the following.
sapply(u, decode.short.url)
I think the above should work for most URL-shortening services, not just tinyURL, since it relies only on the Location header of the HTTP redirect.
HTH
Tony Breyal