I have recently downloaded the KML file from this map, and tried to use the package xml2
to extract the information of the campsites e.g. the geolocation, the facilities around the sites etc; but I got {xml_nodeset (0)}
at the end.
Belows are the codes I have used,
library(xml2)
campsites <- read_xml("file_path")
xml_find_all(campsites, ".//Placemark")
Here is the structure of the KML file (you may also try xml_structure(campsites)
),
> library(magrittr)
> campsites
{xml_document}
<kml>
[1] <Document>\n<description><![CDATA[powered by <a href="http://www.wordpress.org">WordPress</a> & <a href="https://www.mapsmarker.com">MapsMarker.com</a>]] ...
>
> campsites %>% xml_children %>% xml_children %>% xml_children
{xml_nodeset (55)}
[1] <IconStyle>\n <Icon>\n <href>http://www.mountaineering-lohas.org/wp-content/uploads/leaflet-maps-marker-icons/tents.png</href>\n </Icon>\n</IconStyle>
[2] <IconStyle>\n <Icon>\n <href>http://www.mountaineering-lohas.org/wp-content/uploads/leaflet-maps-marker-icons/tents-1.png</href>\n </Icon>\n</IconStyle>
[3] <IconStyle>\n <Icon>\n <href>http://www.mountaineering-lohas.org/wp-content/uploads/leaflet-maps-marker-icons/tents1.png</href>\n </Icon>\n</IconStyle>
[4] <name>香港營地 Hong Kong Camp Site</name>
[5] <Placemark id="marker-1">\n<styleUrl>#tents</styleUrl>\n<name>æµæ°´éŸ¿ç‡Ÿåœ° ( Lau Shui Heung Camp Site )</name>\n<TimeStamp><when>2013-02-21T04:02:29+08: ...
[6] <Placemark id="marker-2">\n<styleUrl>#tents</styleUrl>\n<name>鶴藪營地(Hok Tau Camp Site)</name>\n<TimeStamp><when>2013-02-21T04:02:18+08:00</when></Tim ...
[7] <Placemark id="marker-3">\n<styleUrl>#tents</styleUrl>\n<name>涌背營地(Chung Pui Camp Site)</name>\n<TimeStamp><when>2013-02-22T11:02:02+08:00</when></T ...
[8] <Placemark id="marker-4">\n<styleUrl>#tents</styleUrl>\n<name>æ±å¹³æ´²ç‡Ÿåœ° (Tung Ping Chau Campsite)</name>\n<TimeStamp><when>2013-02-22T11:02:39+08:00</ ...
[9] <Placemark id="marker-5">\n<styleUrl>#tents</styleUrl>\n<name>ç£ä»”å—營地(Wan Tsai Peninsula South Campsite)</name>\n<TimeStamp><when>2013-02-22T11:02:2 ...
[10] <Placemark id="marker-6">\n<styleUrl>#tents</styleUrl>\n<name>ç£ä»”西營地 (Wan Tsai Peninsula West Campsite)</name>\n<TimeStamp><when>2013-02-22T11:02:3 ...
...
As you can see there are nodes named as "Placemark", why I can't find the nodes using xml_find_all
? Did I make any mistakes in my codes?
Thanks!
It looks like you have a few namespaces. If you add the prefix to your xpath you can get the nodeset.
xml_ns(campsites)
# d1 <-> http://www.opengis.net/kml/2.2
# atom <-> http://www.w3.org/2005/Atom
# gx <-> http://www.google.com/kml/ext/2.2
xml_find_all(campsites, ".//d1:Placemark", xml_ns(campsites))
# {xml_nodeset (45)}
# [1] <Placemark id="marker-1">\n<styleUrl>#tents</styleUrl>\n<name>流水響營地 ( La ...
# [2] <Placemark id="marker-2">\n<styleUrl>#tents</styleUrl>\n<name>鶴藪營地(Hok T ...
# ...
To get the text in the cdata, you could use something like
xml_text(xml_find_all(campsites, "//d1:description", xml_ns(campsites)))
# or "//d1:description/text()"
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With