How to grab the lines AFTER a matched line in Python

Tags: python, text, file-io

I'm an amateur who has been using Python on and off for some time now. Sorry if this is a silly question, but I was wondering if anyone knows an easy way to grab a bunch of lines when the input file is formatted like this:

" Heading 1

Line 1

Line 2

Line 3

Heading 2

Line 1

Line 2

Line 3 "

I won't know how many lines are after each heading, but I want to grab them all. All I know is the name, or a regular expression pattern for the heading.

The only way I know to read a file is the "for line in file:" way, but I don't know how to grab the lines AFTER the line I'm currently on. Hope this makes sense, and thanks for the help!

*Thanks for all the responses! I have tried to implement some of the solutions, but my problem is that not all the headings are the same name, and I'm not sure how to work around it. I need a different regular expression for each... any suggestions?*

626

asked Jan 04 '11 15:01

toofly

2 Answers

Generator Functions

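def group_by_heading(some_source):
    """Yield one list per heading: the heading line followed by its lines."""
    group = []
    for line in some_source:
        if line.startswith("Heading"):
            # A new heading closes the previous group, if any.
            if group:
                yield group
            group = [line]
        else:
            # Note: lines before the first heading form a group with no heading.
            group.append(line)
    # Emit the last group; skip it when the source was empty.
    if group:
        yield group

with open("some_file", "r") as source:
    for heading_and_lines in group_by_heading(source):
        heading = heading_and_lines[0]
        lines = heading_and_lines[1:]
        # process away.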
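
The question's follow-up mentions that the headings do not all share one name. One possible way to handle that, as a rough sketch rather than part of the original answer, is to test each line against a single compiled regular expression that lists the alternatives. The function name `group_by_pattern` and the two patterns (`Heading \d+`, `Section [A-Z]`) are made up for illustration; substitute whatever matches your own headings.

```python
import re

# Hypothetical heading patterns; replace with the ones that fit your file.
HEADING_RE = re.compile(r"^(?:Heading \d+|Section [A-Z])\s*$")

def group_by_pattern(lines, heading_re=HEADING_RE):
    """Yield (heading, body_lines) pairs; lines before the first heading are dropped."""
    heading, body = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if heading_re.match(line):
            # A new heading closes the previous group, if there was one.
            if heading is not None:
                yield heading, body
            heading, body = line, []
        elif heading is not None:
            body.append(line)
    if heading is not None:
        yield heading, body

# Any iterable of lines works, including an open file object.
sample = ["Heading 1\n", "Line 1\n", "Line 2\n", "\n", "Section B\n", "Line 1\n"]
groups = list(group_by_pattern(sample))
```

Keeping all alternatives in one pattern means the loop itself does not change when a new kind of heading turns up; only the regular expression does.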

181

answered Nov 14 '22 23:11

S.Lott

You could use a variable to track which heading you are currently under and, once it is set, grab every line until you find another heading:

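data = {}
heading = None  # no heading seen yet
for line in file:  # file is an already-open file object
    line = line.strip()
    if not line:
        continue  # skip blank lines

    if line.startswith('Heading '):
        # start (or resume) collecting under this heading
        heading = line
        data.setdefault(heading, [])
        continue

    if heading is not None:  # ignore lines before the first heading
        data[heading].append(line)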

Here's a http://codepad.org snippet that shows how it works: http://codepad.org/KA8zGS9E
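
The same idea can also be tried without a file on disk. The sketch below wraps the loop in a function and feeds it sample text through `io.StringIO`; the function name `collect_by_heading`, its `prefix` parameter, and the sample text are assumptions added for illustration and do not come from the linked snippet.

```python
import io

def collect_by_heading(source, prefix="Heading "):
    """Map each heading line to the list of non-blank lines that follow it."""
    data = {}
    heading = None  # no heading seen yet
    for line in source:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(prefix):
            heading = line
            data.setdefault(heading, [])
        elif heading is not None:
            data[heading].append(line)
    return data

# io.StringIO behaves like an open text file, so no real file is needed.
sample = io.StringIO("Heading 1\nLine 1\nLine 2\n\nHeading 2\nLine 3\n")
result = collect_by_heading(sample)
print(result)
# {'Heading 1': ['Line 1', 'Line 2'], 'Heading 2': ['Line 3']}
```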

Edit: If you don't care about the actual heading values and just want a list at the end, you can use this:

data = []
for line in file:  # file is an already-open file object
    line = line.strip()
    if not line:
        continue  # skip blank lines

    if line.startswith('Heading '):
        continue  # skip heading lines

    data.append(line)

Basically, you don't need to track a variable for the heading; you can just filter out every line that matches the Heading pattern.
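
If the headings are better described by a regular expression than by a fixed prefix, the filtering can be written as a list comprehension. This is only a sketch: the pattern `^Heading \d+$` and the helper name `body_lines` are assumptions, not something the question specifies.

```python
import re

# Assumed heading pattern; adjust it to match your real headings.
heading_re = re.compile(r"^Heading \d+$")

def body_lines(source):
    """Return the non-blank lines of source that are not headings."""
    stripped = (line.strip() for line in source)
    return [line for line in stripped if line and not heading_re.match(line)]

lines = body_lines(["Heading 1\n", "Line 1\n", "\n", "Heading 2\n", "Line 2\n"])
```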

answered Nov 14 '22 23:11

Alex Vidal
