How do I extract the text after the <br> tags in the following lines:
<div id='population'>
The Snow Leopard Survival Strategy (McCarthy <em>et al.</em> 2003, Table
II) compiled national snow leopard population estimates, updating the work
of Fox (1994). Many of the estimates are acknowledged to be rough and out
of date, but the total estimated population is 4,080-6,590, as follows:<br>
<br>
Afghanistan: 100-200?<br>
Bhutan: 100-200?<br>
China: 2,000-2,500<br>
India: 200-600<br>
Kazakhstan: 180-200<br>
Kyrgyzstan: 150-500<br>
Mongolia: 500-1,000<br>
Nepal: 300-500<br>
Pakistan: 200-420<br>
Russia: 150-200<br>
Tajikistan: 180-220<br>
Uzbekistan: 20-50
</div>
I got as far as:
xpathSApply(h, '//div[@id="population"]', xmlValue)
but I'm stuck now...
It helps to realize that text is a node too. All text in the div that follows a <br/> can be retrieved with:
//div[@id="population"]/text()[preceding-sibling::br]
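Plugged into the xpathSApply call from the question, that could look roughly like the sketch below. It assumes the page is parsed with htmlParse from the XML package; the html string is a shortened stand-in for the real div, and the trimming step is my own addition to tidy up the newlines around each entry.

```r
library(XML)  # provides htmlParse() and xpathSApply()

# Shortened stand-in for the div in the question (assumption: same structure).
html <- "<div id='population'>
Intro with <em>et al.</em> and more text, as follows:<br>
<br>
Afghanistan: 100-200?<br>
Bhutan: 100-200?<br>
Uzbekistan: 20-50
</div>"

h <- htmlParse(html, asText = TRUE)

# Select only the text nodes of the div that have a <br> sibling before them.
txt <- xpathSApply(h,
                   '//div[@id="population"]/text()[preceding-sibling::br]',
                   xmlValue)

# Strip surrounding whitespace and drop nodes that held nothing but a newline.
txt <- trimws(txt)
txt <- txt[nzchar(txt)]
print(txt)
```

The introductory paragraph is left out because its text nodes come before the first <br/>, so only the country lines should remain.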
Technically, between <br/> tags would mean:
//div[@id="population"]/text()[preceding-sibling::br and following-sibling::br]
... but I guess that's not what you want at this point, since it would skip the last entry (Uzbekistan), which has no <br/> after it.
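To see how the two expressions differ, here is a small comparison, again only a sketch: the html string is a cut-down stand-in for the div in the question, and clean_text is a hypothetical helper name I made up for the trimming.

```r
library(XML)  # provides htmlParse() and xpathSApply()

# Cut-down stand-in for the div in the question (assumption: same structure).
html <- "<div id='population'>
Intro text, as follows:<br>
<br>
Afghanistan: 100-200?<br>
Uzbekistan: 20-50
</div>"

h <- htmlParse(html, asText = TRUE)

# Hypothetical helper: run an XPath query, trim whitespace, drop empty strings.
clean_text <- function(xpath) {
  txt <- trimws(unlist(xpathSApply(h, xpath, xmlValue)))
  txt[nzchar(txt)]
}

# Text nodes with a <br> before them.
after_br <- clean_text(
  '//div[@id="population"]/text()[preceding-sibling::br]')

# Text nodes with a <br> both before and after them.
between_br <- clean_text(
  '//div[@id="population"]/text()[preceding-sibling::br and following-sibling::br]')

print(after_br)
print(between_br)
```

The second query leaves out the final entry, because nothing but the closing </div> follows it.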