Nested loops iterating on a single file

Question

I want to delete some specific lines in a file. The part I want to delete is enclosed between two lines (that will be deleted too), named STARTING_LINE and CLOSING_LINE. If there is no closing line before the end of the file, then the operation should stop.

Example:

...blabla...
[Start] <-- # STARTING_LINE
This is the body that I want to delete
[End] <-- # CLOSING_LINE
...blabla...
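
For reference, the snippets below read the two markers from module-level constants. A minimal setup could look like the following sketch. The marker values are taken from the example above, while the `sample.txt` file name and the `write_sample` helper are assumptions made purely for illustration.

```python
from pathlib import Path

# Marker values taken from the example above.
STARTING_LINE = "[Start]"
CLOSING_LINE = "[End]"


def write_sample(path="sample.txt"):
    """Create a small test file (hypothetical helper, not part of the question)."""
    lines = [
        "...blabla...",
        STARTING_LINE,
        "This is the body that I want to delete",
        CLOSING_LINE,
        "...blabla...",
    ]
    Path(path).write_text("\n".join(lines) + "\n")
    return path
```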

I came up with three different ways to achieve this (plus one taken from tdelaney's answer below), and I am wondering which one is best. Please note that I am not looking for a subjective opinion: I would like to know whether there are concrete reasons to choose one method over another.

1. A lot of `if` conditions (just one `for` loop):

def delete_lines(filename):
    with open(filename, 'r+') as my_file:
        text = ''
        found_start = False
        found_end = False

        for line in my_file:
            if not found_start and line.strip() == STARTING_LINE.strip():
                found_start = True
            elif found_start and not found_end:
                # Inside the block: skip the line, and note the closing line
                if line.strip() == CLOSING_LINE.strip():
                    found_end = True
            else:
                text += line

        # Leave the file untouched if the block was never closed
        if found_start and not found_end:
            print("There is no closing line. Stopping")
            return False

        # Go to the top and write the new text
        my_file.seek(0)
        my_file.truncate()
        my_file.write(text)

2. Nested `for` loops on the open file:

def delete_lines(filename):
    with open(filename, 'r+') as my_file:
        text = ''
        for line in my_file:
            if line.strip() == STARTING_LINE.strip():
                # Skip lines until we reach the closing line.
                # Note: the inner `for` loop continues from the current
                # position in my_file (it does not restart from the first
                # line), so there is no need to handle StopIteration manually.
                for skipped_line in my_file:
                    if skipped_line.strip() == CLOSING_LINE.strip():
                        break
                else:
                    # The inner loop ended without `break`: no closing line
                    print("There is no closing line. Stopping")
                    return False
            else:
                text += line

        # Go to the top and write the new text
        my_file.seek(0)
        my_file.truncate()
        my_file.write(text)

3. `while True` and `next()` (with `StopIteration` exception)

def delete_lines(filename):
    with open(filename, 'r+') as my_file:
        text = ''
        for line in my_file:
            if line.strip() == STARTING_LINE.strip():
                # Skip lines until we reach the closing line
                while True:
                    try:
                        line = next(my_file)
                    except StopIteration:
                        # End of file: return instead of looping forever
                        print("There is no closing line. Stopping")
                        return False
                    if line.strip() == CLOSING_LINE.strip():
                        break
            else:
                text += line

        # Go to the top and write the new text
        my_file.seek(0)
        my_file.truncate()
        my_file.write(text)

4. `itertools` (from tdelaney's answer):

import itertools

def delete_lines_iter(filename):
    with open(filename, 'r+') as wrfile:
        with open(filename, 'r') as rdfile:
            # Write everything before the starting line
            wrfile.writelines(itertools.takewhile(lambda l: l.strip() != STARTING_LINE.strip(), rdfile))
            # Drop everything up to and including the closing line
            try:
                next(itertools.dropwhile(lambda l: l.strip() != CLOSING_LINE.strip(), rdfile))
            except StopIteration:
                pass
            # Write everything after the closing line
            wrfile.writelines(rdfile)
        wrfile.truncate()

It seems that these four implementations achieve the same result. So...

Question: which one should I use? Which one is the most Pythonic? Which one is the most efficient?

Is there a better solution instead?

Edit: I tried to evaluate the methods on a big file using timeit. To have the same file on each iteration, I removed the writing part of each function; this means that the evaluation mostly measures the reading (and file opening) work.

import timeit

t_if = timeit.Timer("delete_lines_if('test.txt')", "from __main__ import delete_lines_if")
t_for = timeit.Timer("delete_lines_for('test.txt')", "from __main__ import delete_lines_for")
t_while = timeit.Timer("delete_lines_while('test.txt')", "from __main__ import delete_lines_while")
t_iter = timeit.Timer("delete_lines_iter('test.txt')", "from __main__ import delete_lines_iter")

print(t_if.repeat(3, 4000))
print(t_for.repeat(3, 4000))
print(t_while.repeat(3, 4000))
print(t_iter.repeat(3, 4000))

Result:

# Using IF statements:
[13.85873354100022, 13.858520206999856, 13.851908310999988]
# Using nested FOR:
[13.22578497800032, 13.178281234999758, 13.155530822999935]
# Using while:
[13.254994718000034, 13.193942980999964, 13.20395484699975]
# Using itertools:
[10.547019549000197, 10.506679693000024, 10.512742852999963]
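
Because the write step was removed, the numbers above leave out part of the work. One way to time the complete functions would be to restore the file from a pristine copy before each call and to measure only the call itself. The sketch below assumes that approach; `time_with_restore` and its parameters are hypothetical names, and `func` stands for any of the `delete_lines` variants.

```python
import shutil
import time


def time_with_restore(func, pristine, work, repeat=3, number=100):
    """Time func(work), restoring `work` from `pristine` before every call.

    Only the call to `func` is measured; the copy is excluded.
    Returns one total (in seconds) per repeat, like timeit's repeat().
    """
    totals = []
    for _ in range(repeat):
        total = 0.0
        for _ in range(number):
            shutil.copyfile(pristine, work)  # reset the input file
            start = time.perf_counter()
            func(work)
            total += time.perf_counter() - start
        totals.append(total)
    return totals
```

It could be called as `time_with_restore(delete_lines_if, 'test_original.txt', 'test.txt')`, where `test_original.txt` is an assumed untouched copy of the test file.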

tdelaney · Accepted Answer

You can make it fancy with itertools. I'd be interested in how timing compares.

import itertools

def delete_lines(filename):
    with open(filename, 'r+') as wrfile:
        with open(filename, 'r') as rdfile:
            # write everything before startline
            wrfile.writelines(itertools.takewhile(lambda l: l.strip() != STARTING_LINE.strip(), rdfile))
            # drop everything before stopline.. and the stopline itself
            # (the None default avoids a StopIteration if there is no stopline)
            next(itertools.dropwhile(lambda l: l.strip() != CLOSING_LINE.strip(), rdfile), None)
            # include everything after
            wrfile.writelines(rdfile)
        wrfile.truncate()
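
If the file must stay untouched when the closing line is missing, it may be simpler to decide what to keep before writing anything. The following is a sketch of that idea and not the approach timed in the question; `strip_block` and `delete_block` are hypothetical names, the markers are passed in as arguments, and only the first block is removed.

```python
def strip_block(lines, start, end):
    """Return `lines` without the first start..end block (markers included).

    Returns the input unchanged (as a list) when there is no start line,
    and None when a start line has no matching end line.
    """
    lines = list(lines)
    stripped = [line.strip() for line in lines]
    try:
        first = stripped.index(start.strip())
    except ValueError:
        return lines  # nothing to delete
    try:
        last = stripped.index(end.strip(), first + 1)
    except ValueError:
        return None  # unterminated block: the caller should not write
    return lines[:first] + lines[last + 1:]


def delete_block(filename, start="[Start]", end="[End]"):
    """Rewrite `filename` unless a block was opened but never closed."""
    with open(filename) as handle:
        kept = strip_block(handle, start, end)
    if kept is None:
        return False
    with open(filename, "w") as handle:
        handle.writelines(kept)
    return True
```

Reading the whole file into a list costs memory on very large files, which is the trade-off for being able to bail out before any write happens.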

Tags: performance, python, for-loop, if-statement, while-loop

Kurt Bourbaki
