To clarify, this data shows information for active and historical fires in British Columbia. So far, I've successfully been able to pull all of the data out of the HTML table using the following code:
Interface_html <- html_nodes(webpage,'td:nth-child(1)')
Interface_data <- html_text(Interface_html)
head(Interface_data)
(...)
Geocoding_df<-data.frame(Fire_no = Fire_no_data, Geographic =
Geographic_data, Discovery = Discovery_Date_data, Status = Status_data,
Hectares = Hectares_data, Interface = Interface_data, Updatetime =
Updatetime_data, Updatetime_stg = Updatetime_data_stg)
However, in the first column some rows contain an image of a small house. This image acts as an indicator that the fire is an 'interface' fire, meaning that it is threatening structures.
Basically, I need a way to detect whether or not the image is present in each row (ideally the image alt text "Interface", but even a yes/no indicator would be fine for my purposes).
Is there a way to pull the image properties from this table by modifying the code that I've already got?
The main purpose is that I want to pull the entire table into SQL for some data visualization using PowerBI.
The website: http://bcfireinfo.for.gov.bc.ca/hprScripts/WildfireNews/Fires.asp?Mode=normal&AllFires=1&FC=0
The variable "Interface_html" is a list of the first-column cells (the td nodes) from the webpage, one per table row. So one method is to look at each node to see if it contains an img tag. html_node (without the s) always returns exactly one result per input node, whether or not a match is found.
In this case html_node(Interface_html, "img") will return NA for any cell where the img tag does not exist; otherwise it will return the html code of the image.
library(rvest)
url<-"http://bcfireinfo.for.gov.bc.ca/hprScripts/WildfireNews/Fires.asp?Mode=normal&AllFires=1&FC=0"
webpage<-read_html(url)
#list of all nodes
Interface_html <- html_nodes(webpage,'td:nth-child(1)')
#search each node in list to see if it contains an image tag and return node number.
withimage<- which(!is.na(html_node(Interface_html, "img")))
withimage
#[1] 109 145
#to add the column of True/Falses onto your dataframe use:
Interface = !is.na(html_node(Interface_html, "img"))
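Since the question ideally wants the alt text rather than just a yes/no flag, the same idea can probably be extended with html_attr. The sketch below is not run against the live site: it uses a small made-up table (the markup, file name and fire numbers are assumptions for illustration only) so that it is self-contained.

```r
library(rvest)

# Hypothetical stand-in for the real page: one row with the house image, one without
sample_page <- read_html('
<table>
  <tr><td><img src="house.gif" alt="Interface"></td><td>K50001</td></tr>
  <tr><td></td><td>K50002</td></tr>
</table>')

# First-column cells, one per row (same selector as in the question)
cells <- html_nodes(sample_page, "td:nth-child(1)")

# html_node() gives one result per cell; html_attr() yields NA where no <img> was found
alt_text <- html_attr(html_node(cells, "img"), "alt")

# Yes/no indicator derived from the alt text
is_interface <- !is.na(alt_text)
```

On the real page, Interface_html would take the place of cells, and alt_text could be passed as the Interface column of the data frame.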