
xmlValue vs XMLAttributeValue

Tags:

r

xml

I parsed an XML file using the following code and got the results as below:

library(XML)
url <- htmlTreeParse("http://www.appannie.com/app/ios/candy-crush-saga/", useInternalNodes = TRUE)
ItemList <- getNodeSet(url, "//li/a/@title")


>ItemList
[[1]]
           title 
"Angry Birds Star Wars HD" 
attr(,"class")
[1] "XMLAttributeValue"

[[2]]
           title 
"iShuffle Bowling 2" 
attr(,"class")
[1] "XMLAttributeValue"

 ....
[[15]]
           title 
"Angry Birds Star Wars Free" 
attr(,"class")
[1] "XMLAttributeValue"

attr(,"class")
[1] "XMLNodeSet"

I'd like to extract the names of the games from this result. So I tried this code (based on my experience with xmlValue):

IL <- lapply(ItemList, function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))

But it gives this error:

Error in UseMethod("xmlValue") : no applicable method for 'xmlValue' applied to an object of class "XMLAttributeValue"

I searched extensively but could not find how to deal with XMLAttributeValue objects. Can someone give me a hint and explain the difference between xmlValue and XMLAttributeValue?

user1486507 asked Jun 09 '26 10:06


1 Answer

Thanks for the updated question and added example URL!

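I think the /@title step in your XPath already selects the attributes themselves rather than the <a> nodes. The result is a set of XMLAttributeValue objects (plain character values), and xmlValue and xmlAttrs only have methods for nodes, which is why the call fails. Select the nodes instead and read the text and attributes from them, e.g.: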

> url <- htmlTreeParse("http://www.appannie.com/app/ios/candy-crush-saga/", useInternalNodes = TRUE)
> xpathSApply(url, "//li/a", function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))
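
To see the difference without depending on the live page, here is a minimal sketch on an inline HTML snippet. The snippet, its app names and links, and the variable names are made up for illustration. It shows that an XPath ending in /@title gives back attribute values that can be turned into a plain character vector, while an XPath ending at the node lets you ask for the attribute with xmlGetAttr:

```r
library(XML)

# Hypothetical stand-in for the real page, parsed from a string
html <- '<html><body><ul>
<li><a href="/app/1/" title="Angry Birds Star Wars HD">Angry Birds</a></li>
<li><a href="/app/2/" title="iShuffle Bowling 2">iShuffle</a></li>
</ul></body></html>'
doc <- htmlParse(html, asText = TRUE)

# XPath ending in an attribute: each element is an XMLAttributeValue,
# i.e. a character value, not a node
attrs <- getNodeSet(doc, "//li/a/@title")
is_attr_value <- inherits(attrs[[1]], "XMLAttributeValue")

# Convert the attribute values to a plain character vector
titles_from_attrs <- vapply(attrs, function(a) as.character(a), character(1))

# XPath ending at the node: read the attribute from each <a> node
titles_from_nodes <- xpathSApply(doc, "//li/a", xmlGetAttr, "title")
```

Both approaches should give the same titles. The node-based one is more convenient when you also need the link text or the href from the same element.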

Update: to filter your results, you might restrict the XPath to the "Customers Also Bought" div, so that xpathSApply only visits the links in that section:

> xpathSApply(url, "//div[@class='app_content_section']/ul/li/a", function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))
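
As a rough sketch of how the filtered result could be tidied up, the example below runs the same kind of XPath on a made-up inline page and turns the result into a data frame. The HTML, the "nav" div and the column names are assumptions for illustration. Only the app_content_section class comes from the XPath above.

```r
library(XML)

# Hypothetical page: one navigation list and one "also bought" section
html <- '<html><body>
<div class="nav"><ul><li><a href="/home/" title="Home">Home</a></li></ul></div>
<div class="app_content_section"><ul>
<li><a href="/app/1/" title="Angry Birds Star Wars HD">Angry Birds Star Wars HD</a></li>
<li><a href="/app/2/" title="iShuffle Bowling 2">iShuffle Bowling 2</a></li>
</ul></div>
</body></html>'
doc <- htmlParse(html, asText = TRUE)

# Only links inside the div with the given class are matched
xpath <- "//div[@class='app_content_section']/ul/li/a"

# One column per matched link, with rows "name" and "href"
m <- xpathSApply(doc, xpath, function(x) {
  c(name = xmlValue(x), href = xmlGetAttr(x, "href"))
})

# Transpose so that each link becomes one row
apps <- as.data.frame(t(m), stringsAsFactors = FALSE)
```

Note that xpathSApply simplifies its result like sapply does, so with a single match it returns a vector rather than a matrix. The transpose step assumes at least two matches.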
daroczig answered Jun 12 '26 10:06