Python nested loop - get next N lines

Tags:

python

loops

nested

I'm new to Python and trying to do a nested loop. I have a very large file (1.1 million rows), and I'd like to use it to create a file that has each line along with the next N lines, for example with the next 3 lines:

1 2
1 3
1 4
2 3
2 4
2 5

Right now I'm just trying to get the loops working with row numbers instead of the strings, since that's easier to visualize. I came up with this code, but it's not behaving how I want it to:

with open('C:/working_file.txt', mode='r', encoding = 'utf8') as f:
    for i, line in enumerate(f):
        line_a = i
        lower_bound = i + 1
        upper_bound = i + 4
        with open('C:/working_file.txt', mode='r', encoding = 'utf8') as g:
            for j, line in enumerate(g):
                while j >= lower_bound and j <= upper_bound:
                    line_b = j
                    j = j+1
                    print(line_a, line_b)

Instead of the output I want (shown above), it gives me this:

990     991
990     992
990     993
990     994
990     992
990     993
990     994
990     993
990     994
990     994

As you can see the inner loop is iterating multiple times for each line in the outer loop. It seems like there should only be one iteration per line in the outer loop. What am I missing?

EDIT: My question was answered below. Here is the exact code I ended up using:

from collections import deque
from itertools import cycle
log = open('C:/example.txt', mode='w', encoding = 'utf8') 
try:
    xrange 
except NameError: # python3
    xrange = range

def pack(d):
    tup = tuple(d)
    return zip(cycle(tup[0:1]), tup[1:])

def window(seq, n=2):
    it = iter(seq)
    d = deque((next(it, None) for _ in range(n)), maxlen=n)
    yield pack(d)
    for e in it:
        d.append(e)
        yield pack(d)

for l in window(open('c:/working_file.txt', mode='r', encoding='utf8'),100):
    for a, b in l:
        print(a.strip() + '\t' + b.strip(), file=log)

asked Dec 10 '13 00:12

raspberry_door

2 Answers

Based on the window example from the old docs, you can use something like this:

from collections import deque
from itertools import cycle

try:
    xrange 
except NameError: # python3
    xrange = range

def pack(d):
    tup = tuple(d)
    return zip(cycle(tup[0:1]), tup[1:])

def window(seq, n=2):
    it = iter(seq)
    d = deque((next(it, None) for _ in xrange(n)), maxlen=n)
    yield pack(d)
    for e in it:
        d.append(e)
        yield pack(d)

Demo:

>>> for l in window([1,2,3,4,5], 4):
...     for l1, l2 in l:
...         print(l1, l2)
...
1 2
1 3
1 4
2 3
2 4
2 5

So you can pass your file object to window to get the desired result:

window(open('C:/working_file.txt', mode='r', encoding='utf8'), 4)
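
A caveat and a possible variation, offered as a sketch rather than a definitive version. window returns a generator, so the pairs still have to be consumed and written out. As written, it also stops once the last full window has been emitted, so the final lines are never paired with their successors (in the demo, 3 is not paired with 4 or 5), and a sequence shorter than n is padded with None. If you want those trailing pairs too, something like the following might work. The names pairs_with_next and write_pairs, and their path arguments, are made up for illustration and are not part of the code above.

```python
from collections import deque
from itertools import islice


def pairs_with_next(seq, n):
    """Yield (item, follower) for every item and each of the up-to-n items after it."""
    it = iter(seq)
    # Buffer holds the current item plus at most n followers.
    buf = deque(islice(it, n + 1))
    while buf:
        head = buf.popleft()
        for follower in buf:
            yield head, follower
        # Pull in at most one new item; near the end the buffer just shrinks.
        buf.extend(islice(it, 1))


def write_pairs(src_path, dst_path, n):
    """Stream src_path and write tab-separated pairs to dst_path (hypothetical helper)."""
    with open(src_path, mode='r', encoding='utf8') as src, \
         open(dst_path, mode='w', encoding='utf8') as dst:
        for a, b in pairs_with_next(src, n):
            print(a.strip() + '\t' + b.strip(), file=dst)


# Small in-memory check: each number with the next 3.
for a, b in pairs_with_next([1, 2, 3, 4, 5], 3):
    print(a, b)
```

Only n + 1 lines are held in memory at any time, which matters for a file with over a million rows.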

answered Oct 06 '22 01:10

alko

You can do this with slices. This is easiest if you read the whole file into a list first:

with open('C:/working_file.txt', mode='r', encoding = 'utf8') as f: 
    data = f.readlines()

for i, line_a in enumerate(data):
    for j, line_b in enumerate(data[i+1:i+5], start=i+1):
        print(i, j)

When you switch to printing the lines instead of the line numbers, you can drop the second enumerate and just write for line_b in data[i+1:i+5]. Note that a slice includes the item at the start index but not the item at the end index, so the end index needs to be one higher than your current upper bound.
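
A minimal sketch of that line-printing version, assuming the lines are already in a list. The function name pairs_by_slice and the parameter n (how many following lines to pair with) are illustrative, not from the code above. Slicing past the end of a list simply returns a shorter list, so the last few lines need no special handling.

```python
def pairs_by_slice(data, n):
    """Yield (line, follower) pairs for each line and the up-to-n lines after it."""
    for i, line_a in enumerate(data):
        # The slice starts just after line_a and stops before index i + 1 + n.
        for line_b in data[i + 1:i + 1 + n]:
            yield line_a.strip(), line_b.strip()


# Stand-in for f.readlines(); a real file would be read as in the code above.
data = ["a\n", "b\n", "c\n", "d\n"]
for a, b in pairs_by_slice(data, 2):
    print(a + "\t" + b)
```

This keeps the whole file in memory, which is usually acceptable for around a million short lines but is the trade-off of the slicing approach.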

answered Oct 06 '22 00:10

lvc
