I have the following code, and I don't know why it raises an error:
rm(list=ls())
require("XML")
# <a href="/music/The+Beatles/Sgt.+Pepper%27s+Lonely+Hearts+Club+Band"
beatles = "http://www.last.fm/music/The+Beatles/"
beatles.albums.page = paste(sep="", beatles, "+albums")
lines = readLines(beatles.albums.page)
album.lines = grep(pattern="href.*link-reference", lines, value=TRUE)
album.names = sub(pattern=".*<h3>(.*)</h3>.*", replacement="\\1", x=album.lines)
album.names = gsub(pattern=" ", replacement="+", x=album.names)
album.names = gsub(pattern="'", replacement="%27", x=album.names)
for (album in album.names[1]) {
  print(album)
  album.link = paste(sep="", beatles, album)
  print(album.link)
  tables = readHTMLTable(album.link)
}
Any idea?
The line
readHTMLTable(album.link)
is causing the error. Try changing it to
tables = readHTMLTable(album.link, header = FALSE)
But it still gives you the warning:
Warning message:
In readLines(beatles.albums.page) :
incomplete final line found on 'http://www.last.fm/music/The+Beatles/+albums'
Which you can get rid of with
readLines(beatles.albums.page, warn = FALSE)
Also note that you're not saving the tables: the variable tables is overwritten on every pass through the loop, so only the last album's result survives. Maybe that's what you want.
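If you do want to keep the result for every album, one option is to collect them in a named list. This is only a rough sketch: collect_tables and fetch_tables are names made up for illustration, and fetch_tables stands in for whatever reads one page (for example function(link) readHTMLTable(link, header = FALSE)). The tryCatch is an optional extra so that one failing page does not stop the whole run.

```r
# Sketch: keep one result per link instead of overwriting a single variable.
# `collect_tables` and `fetch_tables` are hypothetical names, not part of XML.
collect_tables <- function(links, fetch_tables) {
  # lapply returns one list element per link, in the same order
  results <- lapply(links, function(link) {
    # A page that fails to parse yields NULL rather than aborting the loop
    tryCatch(fetch_tables(link), error = function(e) NULL)
  })
  # Name each element after its link so results can be looked up later
  names(results) <- links
  results
}
```

With the variables from the question, it could be called roughly like this:

```r
album.links = paste(sep="", beatles, album.names)
all.tables = collect_tables(album.links,
                            function(link) readHTMLTable(link, header = FALSE))
```

all.tables[[album.links[1]]] would then hold whatever readHTMLTable returned for the first album.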