I'm trying to complete a simple task in Python and I'm new to the language (I come from C++). I hope someone can point me in the right direction.
Problem: I have an XML file (12 MB) full of data, and within the file there are start tags <xmltag> and end tags </xmltag> that mark the start and end of the data sections I would like to pull out.
I would like to loop through the open file, locate each start tag, and copy the data in that section to a new file until the matching end tag. I would then like to repeat this until the end of the file.
I'm comfortable with the file I/O, but I'm not sure of the most efficient way to loop over, search and extract the data.
I really like the look of the language and hopefully I'm going to get more involved so I can give back to the community.
Big thanks!
Check out BeautifulSoup:
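# BeautifulSoup 3 import and Python 2 print statement, as in the original answer
from BeautifulSoup import BeautifulSoup

# Parse the whole file while it is open
with open('bigfile.xml', 'r') as xml:
    soup = BeautifulSoup(xml)

# soup('xmltag') finds every <xmltag> element in the document
for xmltag in soup('xmltag'):
    print xmltag.contents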
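If you would rather avoid a third-party dependency, here is a rough sketch of the same idea using only the standard library's xml.etree.ElementTree on Python 3. It is a swapped-in alternative, not the BeautifulSoup approach above. It assumes the file is well-formed XML and that <xmltag> sections are not nested inside each other; the function names and the output file name are made up for illustration.

```python
import xml.etree.ElementTree as ET


def extract_sections(source, tag="xmltag"):
    """Yield each <tag> element in `source` as a serialized XML string.

    `source` may be a file name or a binary file object.
    Assumes the sections are not nested inside one another.
    """
    # iterparse reads the document incrementally instead of loading it at once
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            # Drop any text that follows the closing tag so that only
            # the section itself is serialized
            elem.tail = None
            yield ET.tostring(elem, encoding="unicode")
            # Release the element's children now that it has been copied
            elem.clear()


def copy_sections(in_path, out_path, tag="xmltag"):
    """Write every extracted section to `out_path`, one per line."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for section in extract_sections(in_path, tag):
            out.write(section + "\n")
            count += 1
    return count
```

Calling copy_sections('bigfile.xml', 'sections.xml') would then write each section to the (hypothetical) sections.xml and return how many were found. For a 12 MB file either approach should be fine; the incremental parse mainly helps if the files grow much larger.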