
How to maintain case-sensitive tags in BeautifulSoup.BeautifulStoneSoup?

I am writing a script that edits an XML file with BeautifulStoneSoup, but the library converts all tag names to lowercase. Is there an option to preserve the original case?

import BeautifulSoup

xml = "<TestTag>a string</TestTag>"
soup = BeautifulSoup.BeautifulStoneSoup(xml, markupMassage=False)
print soup.prettify()  # or soup.renderContents()
# prints:
#   <testtag>a string</testtag>
# instead of the expected:
#   <TestTag>a string</TestTag>
TankorSmash asked Aug 28 '12 16:08



1 Answer

You could use Beautiful Soup 4 with its XML parser, which keeps tag names exactly as written (this requires the lxml library):

In [10]: from bs4 import BeautifulSoup

In [11]: xml = "<TestTag>a string</TestTag>"

In [12]: soup = BeautifulSoup(xml, "xml")

In [13]: print(soup)
<?xml version="1.0" encoding="utf-8"?>
<TestTag>a string</TestTag>
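
If installing lxml is not an option, a possible alternative (not part of the answer above, and using a different library than Beautiful Soup) is the standard library's xml.etree.ElementTree, which also leaves tag case untouched. A minimal sketch, assuming Python 3 and the same sample string as in the question; the replacement text is made up for illustration:

```python
import xml.etree.ElementTree as ET

xml = "<TestTag>a string</TestTag>"

# Parse the document; ElementTree keeps tag names exactly as written.
root = ET.fromstring(xml)

# Edit the element's text (hypothetical edit, just to show a round trip).
root.text = "an edited string"

# Serialize back to a str; encoding="unicode" returns text instead of bytes
# and omits the XML declaration.
result = ET.tostring(root, encoding="unicode")
print(result)
```

The tag comes back as TestTag rather than testtag, because an XML parser treats names as case-sensitive, unlike an HTML-oriented one.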

mzjn answered Oct 08 '22 04:10
