Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to use lxml in Python 3?

In my project I need to parse an XML document using lxml.etree. I'm novice in Python, so I can't understand, how to find all categories with some tag. Let's describe it more accurately.

I have an XML like:

<cinema>
  <name>BestCinema</name>
  <films>
    <categories>
      <category>Action</category>
      <category>Thriller</category>
      <category>Soap opera</category>
    </categories>
  </films>
</cinema>

Now I need to get the list of all categories. In this case it will be:

      <category>Action</category>
      <category>Thriller</category>
      <category>Soap opera</category>

I need to use:

tree = etree.parse(file)

Thank you, any help is welcome.

like image 498
PepeHands Avatar asked Apr 08 '26 16:04

PepeHands


1 Answers

it should be as simple as:

from lxml import etree
el = etree.parse('input.xml')
categories = el.xpath("//category")
print(categories)
...

Everything else you should find in the tutorial.

like image 191
mata Avatar answered Apr 11 '26 01:04

mata