Question

I have the following code, and I don't know why I receive this error:

rm(list=ls())
require("XML")
# <a href="/music/The+Beatles/Sgt.+Pepper%27s+Lonely+Hearts+Club+Band" 
beatles = "http://www.last.fm/music/The+Beatles/"

beatles.albums.page = paste(sep="", beatles, "+albums")
lines = readLines(beatles.albums.page)
album.lines = grep(pattern="href.*link-reference", lines, value=TRUE)
album.names = sub(pattern=".*<h3>(.*)</h3>.*", replacement="\\1", x=album.lines)
album.names = gsub(pattern=" ", replacement="+", x=album.names)
album.names = gsub(pattern="'", replacement="%27", x=album.names)

for (album in album.names[1]) {
  print(album)
  album.link = paste(sep="", beatles, album)
  print(album.link)
  tables = readHTMLTable(album.link)

}

Any idea?

Solution

The line

readHTMLTable(album.link)

is causing the error. Try changing it to

tables = readHTMLTable(album.link, header = FALSE)

But it still gives you the warning:

Warning message:
In readLines(beatles.albums.page) :
  incomplete final line found on 'http://www.last.fm/music/The+Beatles/+albums'

You can get rid of it with

readLines(beatles.albums.page, warn = FALSE) 
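The warning only means that the last line of the input does not end with a newline; the lines themselves are still read. A minimal offline sketch of that behaviour, using a temporary file in place of the last.fm page (the file and its contents are made up for illustration):

```r
# Stand-in for the web page: a temporary file whose last line
# has no trailing newline, which is what triggers the warning.
tmp = tempfile()
cat("first line\nsecond line", file = tmp)

# warn = FALSE suppresses the "incomplete final line" warning;
# both lines are still returned.
lines = readLines(tmp, warn = FALSE)
print(lines)

unlink(tmp)
```

So the warning is harmless here and suppressing it does not change what readLines returns.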

Also note that you're not saving the tables: the tables variable is overwritten on every iteration of the loop, so after the loop it only holds the result for the last album. Maybe that's what you want.
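If you do want to keep the tables for every album, one option is to collect them in a list indexed by album name. This is only a sketch: fetch.tables is a hypothetical stand-in for readHTMLTable(album.link, header = FALSE) so that the example runs without network access, and the two album names are made up for illustration.

```r
# Hypothetical stand-in for readHTMLTable(album.link, header = FALSE),
# so this sketch runs offline. Swap in the real call in your script.
fetch.tables = function(link) list(data.frame(link = link))

beatles = "http://www.last.fm/music/The+Beatles/"
album.names = c("Abbey+Road", "Revolver")  # illustrative values only

# Pre-allocate one list slot per album, named by album.
all.tables = vector("list", length(album.names))
names(all.tables) = album.names

for (album in album.names) {
  album.link = paste(sep="", beatles, album)
  # Store this album's tables instead of overwriting a single variable.
  all.tables[[album]] = fetch.tables(album.link)
}

print(names(all.tables))
```

After the loop, all.tables[["Revolver"]] holds the tables for that album, and the results for the other albums are still available.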

Licensed under: CC-BY-SA with attribution