
Using pyKML to parse KML Document

I'm using the pyKML module for extracting coordinates from a given KML file.

My Python code is as follows:

from pykml import parser
fileobject = parser.fromstring(open('MapSource.kml', 'r').read())
root = parser.parse(fileobject).getroot()
print(xml.Document.Placemark.Point.coordinates)

However, on running this, I get the following error:

ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.

Looking for solutions, I came across http://twigstechtips.blogspot.in/2013/06/python-lxml-strings-with-encoding.html, from which I tried the following (I'm not sure it is the correct method):

from pykml import parser
from lxml import etree
from os import path
kml_file = open('MapSource.kml', 'r')
parser = etree.XMLParser(recover=True)
xml = etree.fromstring(kml_file, parser)
print(xml.Document.Placemark.Point.coordinates)

This gives me ValueError: can only parse strings. What is the correct way to parse the KML and get the coordinates from that structure?

Newtt asked Sep 27 '14 11:09


1 Answer

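In the example above, root = parser.parse(fileobject).getroot() calls parse() on the value returned by fromstring() on the previous line, which is the already-parsed file contents rather than a file object. parse() expects a file object, so the two functions should not be chained.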

There are two methods to parse a KML file using pyKML:

1: Using parser.parse() to parse the file.

from pykml import parser
# Open in binary mode so the parser can honor the encoding declared in the file.
with open('MapSource.kml', 'rb') as f:
    root = parser.parse(f).getroot()
print(root.Document.Placemark.Point.coordinates)

2: Using parser.fromstring() to parse the string contents.

from pykml import parser
# Read the file as bytes; str input with an encoding declaration is rejected.
with open('MapSource.kml', 'rb') as f:
    s = f.read()
root = parser.fromstring(s)
print(root.Document.Placemark.Point.coordinates)

Method #2 fails with the ValueError from the question if the KML file begins with an XML declaration (prolog) that specifies an encoding and the file is read as text ('r') rather than as bytes ('rb').
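The text-versus-bytes failure can be reproduced without a file on disk. The following is a minimal sketch that assumes lxml is installed and uses lxml.objectify directly, since pyKML is built on it; the KML_TEXT document and its coordinate values are made up for illustration.

```python
from lxml import objectify

# Hypothetical KML document held in memory as a str (illustrative values).
KML_TEXT = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<kml xmlns="http://www.opengis.net/kml/2.2">'
    "<Document><Placemark><Point>"
    "<coordinates>-65.83,18.38,0</coordinates>"
    "</Point></Placemark></Document>"
    "</kml>"
)

# Passing a str that carries an encoding declaration raises ValueError.
try:
    objectify.fromstring(KML_TEXT)
    text_error = None
except ValueError as exc:
    text_error = str(exc)

# The same document passed as bytes parses normally.
root = objectify.fromstring(KML_TEXT.encode("utf-8"))
coords = root.Document.Placemark.Point.coordinates.text

print(text_error)
print(coords)
```

Reading the file with 'rb' is the on-disk equivalent of the .encode() call here: the parser receives bytes and decodes them itself according to the declaration.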

Note that parsing can also fail if the encoding declared in the KML document does not match the encoding the file is actually saved in. The example below declares ISO-8859-1 because of the international and graphic characters in the name and description. For a file saved in that encoding, omitting the declaration or declaring "UTF-8" would make it an invalid XML file.

<?xml version="1.0" encoding="ISO-8859-1"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document>
  <Placemark>
    <name>Río Grande</name> 
    <description>
      Location: 18° 22′ 49″ N, 65° 49′ 53″ W
    </description>
    ...
</kml>
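The effect of a mismatched declaration can be checked with a short sketch. It uses the standard library's xml.etree.ElementTree instead of pyKML, purely to show the underlying XML rule; the parse_name helper and the trimmed-down document are illustrative assumptions, not part of pyKML.

```python
import xml.etree.ElementTree as ET

KML_NS = "{http://www.opengis.net/kml/2.2}"

# Trimmed-down document body containing one non-ASCII character.
BODY = (
    '<kml xmlns="http://www.opengis.net/kml/2.2">'
    "<Document><Placemark><name>Río Grande</name></Placemark></Document>"
    "</kml>"
)


def parse_name(declaration, file_encoding):
    """Encode the document as file_encoding, parse the bytes, return <name>."""
    data = (declaration + BODY).encode(file_encoding)
    root = ET.fromstring(data)
    return root.find(f"{KML_NS}Document/{KML_NS}Placemark/{KML_NS}name").text


# Declaration matches the stored bytes: the document parses.
matching = parse_name('<?xml version="1.0" encoding="ISO-8859-1"?>', "iso-8859-1")

# Declaration says UTF-8 but the bytes are ISO-8859-1: the parser rejects it.
try:
    parse_name('<?xml version="1.0" encoding="UTF-8"?>', "iso-8859-1")
    mismatch_error = None
except ET.ParseError as exc:
    mismatch_error = exc

print(matching)
print(mismatch_error)
```

The byte for "í" in ISO-8859-1 is not a valid UTF-8 sequence on its own, which is why the second call fails rather than silently producing a wrong character.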
CodeMonkey answered Oct 04 '22 07:10