I'm trying to extract data on invasive plant species locations from the CABI Invasive Species Compendium using the rvest package.
Having looked at a few tutorials, I have figured out that I should be able to scrape data from tables fairly easily. However, I keep running into difficulties.
Let's say I want location data for the species Brassica tournefortii. I should be able to use this code, which uses the techniques outlined here, to get details of the locations where the species has been recorded.
library(rvest)
isc <- read_html("http://www.cabi.org/isc/datasheet/50069")
isc %>%
  html_node("#toDistributionTable td:nth-child(1)") %>%
  html_text()
However, when I run this code I get the error:
Error: No matches
I am completely new to web scraping. Am I doing something horribly wrong?
First, I wish I could upvote you more. Finally a scraping question that is not $SPORTSBALL or $MONEY related! :-)
That site is evil. It uses embedded namespaces that need to be dealt with, which also means using the xml2 package:
library(rvest)
library(xml2)
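isc <- read_html("http://www.cabi.org/isc/datasheet/50069")
# Collect the namespaces declared in the document
ns <- xml_ns(isc)
# Text of the first cell in every row of the distribution table
xml_text(xml_find_all(isc, xpath = "//div[@id='toDistributionTable']/table/tbody/tr/td[1]", ns = ns))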
## [1] "ASIA" "Azerbaijan"
## [3] "Bhutan" "China"
## [5] "-Tibet" "India"
## [7] "-Delhi" "-Indian Punjab"
## [9] "-Rajasthan" "-Uttar Pradesh"
## [11] "Iran" "Iraq"
## [13] "Israel" "Jordan"
## [15] "Kuwait" "Lebanon"
## [17] "Oman" "Pakistan"
## [19] "Qatar" "Saudi Arabia"
## [21] "Syria" "Turkey"
## [23] "Turkmenistan" "United Arab Emirates"
## [25] "Uzbekistan" "Yemen"
## [27] "AFRICA" "Algeria"
## [29] "Egypt" "Libya"
## [31] "Morocco" "South Africa"
## [33] "Tunisia" "NORTH AMERICA"
## [35] "Mexico" "USA"
## [37] "-Arizona" "-California"
## [39] "-Nevada" "-New Mexico"
## [41] "-Texas" "-Utah"
## [43] "SOUTH AMERICA" "Chile"
## [45] "EUROPE" "Belgium"
## [47] "Cyprus" "Denmark"
## [49] "France" "Greece"
## [51] "Ireland" "Italy"
## [53] "Spain" "Sweden"
## [55] "UK" "-England and Wales"
## [57] "-Scotland" "OCEANIA"
## [59] "Australia" "-Australian Northern Territory"
## [61] "-New South Wales" "-Queensland"
## [63] "-South Australia" "-Tasmania"
## [65] "-Victoria" "-Western Australia"
## [67] "New Zealand"
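If you want something easier to work with than a flat character vector, here is a minimal sketch of one way to tidy it up. It runs the same XPath against a small inline HTML string that I made up to mimic the table's shape (so it works offline), and it assumes that a leading "-" marks a sub-region and that the continent headings are exactly the ones in the output above. The names `html`, `continents` and `result` are mine, not anything the site or the packages define.

```r
library(xml2)

# Hypothetical stand-in for the real page: same div > table > tbody > tr > td
# shape that the XPath above relies on.
html <- '<html><body>
<div id="toDistributionTable">
<table><tbody>
<tr><td>ASIA</td><td></td></tr>
<tr><td>China</td><td>Present</td></tr>
<tr><td>-Tibet</td><td>Present</td></tr>
<tr><td>NORTH AMERICA</td><td></td></tr>
<tr><td>USA</td><td>Present</td></tr>
<tr><td>-Arizona</td><td>Present</td></tr>
</tbody></table>
</div>
</body></html>'

doc <- read_html(html)

# Same XPath as above: the first cell of every row in the table
cells <- xml_find_all(doc, "//div[@id='toDistributionTable']/table/tbody/tr/td[1]")
locations <- xml_text(cells)

# Assumption: these are the only continent headings (taken from the output above).
# Matching on upper case alone would wrongly catch "USA" and "UK".
continents <- c("ASIA", "AFRICA", "NORTH AMERICA", "SOUTH AMERICA", "EUROPE", "OCEANIA")

# Assumption: sub-regions are the entries printed with a leading "-"
is_subregion <- startsWith(locations, "-")
is_continent <- locations %in% continents

result <- data.frame(
  location = sub("^-", "", locations),
  level = ifelse(is_continent, "continent",
                 ifelse(is_subregion, "subregion", "country")),
  stringsAsFactors = FALSE
)
print(result)
```

Other datasheets may use continent headings that are not in my list, so check the raw vector before trusting the `level` column.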