I parsed the HTML of the FIFA World Cup website and want to extract all the matches:
library(XML)
wcup <- htmlTreeParse("http://www.fifa.com/worldcup/matches/", useInternalNodes=TRUE)
However, the class attribute for one country is 't-nText kern', while for the rest of the countries it is 't-nText ' (with a trailing space):
<span class="t-nText kern">Bosnia and Herzegovina</span>
Therefore, the following command misses 'Bosnia and Herzegovina':
xpathSApply(wcup, "//span[@class='t-nText ']", xmlValue)
Is there a way to match both class values, 't-nText ' and 't-nText kern', in a single query? Or is there another solution? I want to keep the matches in their original order.
XPath doesn't accept '||' as a logical OR:
xpathSApply(wcup, "//span[@class='t-nText ' || 't-nText kern']", xmlValue)
XPath error : Invalid expression
//span[@class='t-nText ' || 't-nText kern']
^
XPath error : Invalid expression
//span[@class='t-nText ' || 't-nText kern']
^
Error in xpathApply.XMLInternalDocument(doc, path, fun, ..., namespaces = namespaces, :
error evaluating xpath expression //span[@class='t-nText ' || 't-nText kern']
XPath spells the operator 'or', and each side must be a full comparison. Use 'or', or perhaps 'starts-with()':
wcup["//span[@class='t-nText kern' or @class='t-nText ']"]
wcup["//span[starts-with(@class, 't-nText ')]"]
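To get the team names rather than the nodes, the same expressions can be passed to xpathSApply with xmlValue. The sketch below runs against a small inline HTML snippet instead of the live page; the snippet, the extra team names in it, and the variable names are made up for illustration. The third query uses the common contains/normalize-space idiom for matching a single class token, which is an alternative not mentioned above.

```r
library(XML)

# Hypothetical stand-in for the FIFA page so the example runs offline.
html <- '<html><body>
  <span class="t-nText ">Brazil</span>
  <span class="t-nText ">Croatia</span>
  <span class="t-nText kern">Bosnia and Herzegovina</span>
  <span class="other">not a team</span>
</body></html>'
doc <- htmlParse(html, asText = TRUE)

# 1. Explicit 'or' between two complete comparisons.
teams_or <- xpathSApply(doc,
  "//span[@class='t-nText kern' or @class='t-nText ']", xmlValue)

# 2. Prefix match on the class attribute.
teams_sw <- xpathSApply(doc,
  "//span[starts-with(@class, 't-nText ')]", xmlValue)

# 3. Match 't-nText' as one class token, regardless of surrounding
#    whitespace or additional classes.
teams_tok <- xpathSApply(doc,
  "//span[contains(concat(' ', normalize-space(@class), ' '), ' t-nText ')]",
  xmlValue)

print(teams_or)
```

All three queries should return the spans in document order, so the order of the matches is kept. The token-based form is the most tolerant if the site changes the whitespace or adds further classes.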