
Basic Python file searching and I/O

I'm trying to complete a simple task in Python and I'm new to the language (my background is C++). I hope someone can point me in the right direction.

Problem: I have an XML file (12 MB) full of data. Within the file, start tags 'xmltag' and end tags '/xmltag' mark the beginning and end of the data sections I would like to pull out.

I would like to navigate through this open file with a loop and for each instance locate a start tag and copy the data within the section to a new file until the end tag. I would then like to repeat this to the end of the file.

I'm comfortable with the file I/O, but I'm not sure of the most efficient way to loop over, search and extract the data.

I really like the look of the language and hopefully I'm going to get more involved so I can give back to the community.

Big thanks!

Joseph Darkins asked Oct 13 '22 22:10


1 Answer

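Check out BeautifulSoup:

# Assumes the bs4 package (BeautifulSoup 4) and lxml are installed.
from bs4 import BeautifulSoup

with open('bigfile.xml', 'r') as xml:
    soup = BeautifulSoup(xml, 'xml')  # the 'xml' parser requires lxml
    # soup('xmltag') is shorthand for soup.find_all('xmltag')
    for xmltag in soup('xmltag'):
        print(xmltag.contents)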
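
If you would rather stay in the standard library, here is a minimal sketch of the "find each section and copy it to a new file" loop using xml.etree.ElementTree.iterparse, which streams the input instead of building the whole tree first. The function name copy_sections and the file names are made up for illustration. It also assumes the 'xmltag' sections are not nested inside each other and that you only want their text content.

```python
import xml.etree.ElementTree as ET


def copy_sections(src_path, dst_path, tag="xmltag"):
    """Copy the text of every <tag> element in src_path to dst_path.

    Hypothetical helper: assumes <tag> sections are not nested and that
    only their text content is wanted. Returns the number of sections.
    """
    count = 0
    with open(dst_path, "w", encoding="utf-8") as dst:
        # An "end" event fires once an element's closing tag has been parsed,
        # so the element's content is complete at that point.
        for _event, elem in ET.iterparse(src_path, events=("end",)):
            if elem.tag == tag:
                # itertext() walks the element's text, including child text.
                dst.write("".join(elem.itertext()).strip() + "\n")
                count += 1
                # Drop the element's content to keep memory use down.
                elem.clear()
    return count


if __name__ == "__main__":
    # 'bigfile.xml' comes from the answer above; 'sections.txt' is assumed.
    print(copy_sections("bigfile.xml", "sections.txt"))
```

If you need the markup inside each section rather than just the text, ET.tostring(elem, encoding="unicode") could be written out instead of the joined text.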
eumiro answered Oct 18 '22 00:10