Python lxml parsing svg file

Tags: python, svg, lxml

I'm trying to parse .svg files from http://kanjivg.tagaini.net/ , but I can't successfully extract the information inside.

Edit 1: the full file is available at http://www.filedropper.com/0f9ab

A part of 0f9ab.svg looks like this:

<svg xmlns="http://www.w3.org/2000/svg" width="109" height="109" viewBox="0 0 109 109">
<g id="kvg:StrokePaths_0f9ab" style="fill:none;stroke:#000000;stroke-width:3;stroke-linecap:round;stroke-linejoin:round;">
<g id="kvg:0f9ab" kvg:element="嶺">
    <g id="kvg:0f9ab-g1" kvg:element="山" kvg:position="top" kvg:radical="general">
        <path id="kvg:0f9ab-s1" kvg:type="㇑a" d="M53.26,9.38c0.99,0.99,1.12,2.09,1.12,3.12c0,0.67,0.06,8.38,0.06,13.01"/>
        <path id="kvg:0f9ab-s2" kvg:type="㇄a"
    </g>
</g>
</g>

My .py file:

import lxml.etree as ET

svg = ET.parse('0f9ab.svg')
print(svg)  # <lxml.etree._ElementTree object at 0x7f3a2f659ec8>

# AttributeError: 'lxml.etree._ElementTree' object has no attribute 'tag'
print(svg.tag)

# TypeError: 'lxml.etree._ElementTree' object is not subscriptable
print(svg[0])

# TypeError: 'lxml.etree._ElementTree' object is not iterable
for child in svg:
    print(child)

# None
print(svg.find("./svg"))

# []
print(svg.findall("//g"))

# []
print(svg.xpath("//g"))

Purpose

I tried every operation I could think of, but none of them returns any data from the .svg file. I want to extract the kanji (Japanese characters) stored in the kvg:element="kanji" attributes, which appear at different depths of the tree.

Question

Is lxml the wrong package for this?
If not, how do I extract information from my parsed .svg file?

1 Answer

.parse() returns an ElementTree, which represents the tree as a whole. To query individual nodes, you need an Element, most likely the root element of the tree.

Replace part of your code with this:

xml = ET.parse('0f9ab.svg')
svg = xml.getroot()
print(svg)  # <Element {http://www.w3.org/2000/svg}svg at 0x...>

and I think you'll have some success.
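To make the difference between the two objects concrete, here is a minimal sketch. It parses a small inline document instead of 0f9ab.svg, and the ids "outer" and "inner" are made up for illustration. The fallback import of xml.etree.ElementTree is only there so the sketch runs where lxml is not installed; the calls used here behave the same in both.

```python
import io

try:
    import lxml.etree as ET
except ImportError:  # fallback so the sketch also runs without lxml
    import xml.etree.ElementTree as ET

# Hypothetical stand-in for the real file; the ids are invented.
SAMPLE = b"""<svg xmlns="http://www.w3.org/2000/svg" width="109" height="109">
<g id="outer"><g id="inner"/></g>
</svg>"""

tree = ET.parse(io.BytesIO(SAMPLE))  # ElementTree: the document as a whole
root = tree.getroot()                # Element: the <svg> node itself

# An Element has a tag, can be indexed, and can be iterated over.
print(root.tag)            # {http://www.w3.org/2000/svg}svg
print(root[0].get("id"))   # outer

# Iterating yields direct children only, not deeper descendants.
children = [child.get("id") for child in root]
print(children)            # ['outer']
```

Note that the tag is reported with its namespace in braces, which is why a bare "g" or "svg" never matches.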

Note also that .findall() requires a relative path and, because the elements in your file live in the SVG namespace declared by xmlns="http://www.w3.org/2000/svg", a namespace qualifier:

print(svg.findall(".//{http://www.w3.org/2000/svg}g"))
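The same qualified-name idea applies to the kvg:element attributes you are after. The sketch below is one possible approach, not the only one: it walks every g element at any depth and collects the attribute value. It assumes the kvg prefix is bound to http://kanjivg.tagaini.net, so check the namespace declaration in your own file and adjust KVG_NS if it differs. The inline sample is a trimmed version of your snippet with that declaration added, and kanji_elements is a hypothetical helper name.

```python
try:
    import lxml.etree as ET
except ImportError:  # fallback so the sketch also runs without lxml
    import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Assumption: must match the URI that the kvg: prefix is bound to in your file.
KVG_NS = "http://kanjivg.tagaini.net"

# Trimmed sample with an explicit xmlns:kvg declaration added.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg" xmlns:kvg="http://kanjivg.tagaini.net">
<g id="kvg:StrokePaths_0f9ab">
<g id="kvg:0f9ab" kvg:element="嶺">
    <g id="kvg:0f9ab-g1" kvg:element="山" kvg:position="top"/>
</g>
</g>
</svg>""".encode("utf-8")


def kanji_elements(root):
    """Return the kvg:element value of every <g>, in document order."""
    attr = "{%s}element" % KVG_NS
    return [
        g.get(attr)
        for g in root.iter("{%s}g" % SVG_NS)  # visits <g> at every depth
        if g.get(attr) is not None            # skip groups without the attribute
    ]


root = ET.fromstring(SAMPLE)
print(kanji_elements(root))  # ['嶺', '山']
```

For a file on disk, pass ET.parse('0f9ab.svg').getroot() to the helper instead of the inline sample.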

answered Sep 21 '22 12:09

Robᵩ
