
Extract value from tag python

Tags:

python

How can I extract the value of name4 from the following? The snippet below is only a sample. How can I do the same with xml.parsers.expat? I am using Python 2.4, which doesn't have xml.etree.

<test name1="" name2="" name3="0.0.0.0" name4="Linux">
</test>
Sandeep Krishnan asked Mar 06 '26 06:03

2 Answers

Using lxml.html:

import lxml.html as lh

doc=lh.fromstring('<test name1="" name2="" name3="0.0.0.0" name4="Linux"></test>')

# Select the name4 attribute with XPath; the result is a list of matches
doc.xpath('.//@name4')
# ['Linux']

Note 1: A regex would work for this simple example, but parsing XML/HTML with regexes is bad practice, and you should not get into the habit of doing so.

Note 2: If you are not keen on installing lxml, xml.etree.ElementTree is a good (lightweight?) alternative that ships with Python 2.5 and later, especially for simpler tasks.
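Since the question asks specifically about xml.parsers.expat (the poster's Python 2.4 has neither lxml nor xml.etree by default), here is a minimal sketch of that route. The helper name get_attribute and its parameters are made up for illustration; only ParserCreate, StartElementHandler and Parse come from the expat module itself.

```python
import xml.parsers.expat


def get_attribute(xml_text, tag, attr):
    """Collect the value of `attr` from every `tag` element in xml_text."""
    found = []

    def start_element(name, attrs):
        # expat calls this for each opening tag; attrs is a dict of attributes
        if name == tag and attr in attrs:
            found.append(attrs[attr])

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    # The second argument marks this as the final chunk of input
    parser.Parse(xml_text, True)
    return found


xml_text = '<test name1="" name2="" name3="0.0.0.0" name4="Linux"></test>'
values = get_attribute(xml_text, 'test', 'name4')
print(values)
```

The result is a list because a document may contain several matching elements; take values[0] if you expect exactly one.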

root answered Mar 08 '26 20:03

Sometimes it's really easy to use BeautifulSoup:

from BeautifulSoup import BeautifulSoup as bs

your_string = """<test name1="" name2="" name3="0.0.0.0" name4="Linux"></test>"""

soup = bs(your_string)
res = soup.findAll('test')
for i in res:
    print i.get('name4')

You can also find more examples on the documentation page.

Update: how to change the value of an attribute and print the whole XML:

from BeautifulSoup import BeautifulSoup as bs

your_string = """<test name1="" name2="" name3="0.0.0.0" name4="Linux"></test>"""

soup = bs(your_string)
s = soup.test
s['name4'] = 'Ubuntu'
print soup
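For comparison, the same read-and-update step can be sketched with the standard library's xml.etree.ElementTree. This assumes Python 3 (the encoding='unicode' argument to tostring is not available on Python 2), so it does not apply to the poster's Python 2.4 setup. The variable names old_value and updated are illustrative.

```python
import xml.etree.ElementTree as ET

your_string = '<test name1="" name2="" name3="0.0.0.0" name4="Linux"></test>'

# Parse the string; the root element here is <test>
root = ET.fromstring(your_string)

# Read the current attribute value, then overwrite it
old_value = root.get('name4')
root.set('name4', 'Ubuntu')

# Serialize the element back to a text string
updated = ET.tostring(root, encoding='unicode')
print(updated)
```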
Ishikawa Yoshi answered Mar 08 '26 20:03