I'm new to Python and trying to do a nested loop. I have a very large file (1.1 million rows), and I'd like to use it to create a file that has each line along with the next N lines, for example with the next 3 lines:
1 2
1 3
1 4
2 3
2 4
2 5
Right now I'm just trying to get the loops working with row numbers instead of the strings, since that's easier to visualize. I came up with this code, but it's not behaving how I want it to:
with open('C:/working_file.txt', mode='r', encoding='utf8') as f:
    for i, line in enumerate(f):
        line_a = i
        lower_bound = i + 1
        upper_bound = i + 4
        with open('C:/working_file.txt', mode='r', encoding='utf8') as g:
            for j, line in enumerate(g):
                while j >= lower_bound and j <= upper_bound:
                    line_b = j
                    j = j + 1
                    print(line_a, line_b)
Instead of the output I want like above, it's giving me this:
990 991
990 992
990 993
990 994
990 992
990 993
990 994
990 993
990 994
990 994
As you can see the inner loop is iterating multiple times for each line in the outer loop. It seems like there should only be one iteration per line in the outer loop. What am I missing?
EDIT: My question was answered below, here is the exact code I ended up using:
from collections import deque
from itertools import cycle

log = open('C:/example.txt', mode='w', encoding='utf8')

try:
    xrange
except NameError:  # python3
    xrange = range

def pack(d):
    tup = tuple(d)
    return zip(cycle(tup[0:1]), tup[1:])

def window(seq, n=2):
    it = iter(seq)
    d = deque((next(it, None) for _ in range(n)), maxlen=n)
    yield pack(d)
    for e in it:
        d.append(e)
        yield pack(d)

for l in window(open('c:/working_file.txt', mode='r', encoding='utf8'), 100):
    for a, b in l:
        print(a.strip() + '\t' + b.strip(), file=log)
When a file object is used as an iterator, typically in a for loop, its next() method is called repeatedly (in Python 3 the method is named __next__() and is normally invoked through the built-in next()). Each call returns the next input line, or raises StopIteration when EOF is hit.
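A small sketch of that behaviour in Python 3, where the built-in next() does the stepping. io.StringIO is used here as a stand-in for an open text file, which is an assumption made only so the snippet runs without touching disk:

```python
import io

# Stand-in for open('some_file.txt'); it iterates line by line like a real file.
f = io.StringIO("first\nsecond\n")

lines = []
lines.append(next(f))  # first line, trailing newline included
lines.append(next(f))  # second line

try:
    next(f)            # nothing left to read
    exhausted = False
except StopIteration:
    exhausted = True

# The two-argument form returns a default instead of raising.
sentinel = next(f, None)

print(lines, exhausted, sentinel)
```

The two-argument form is what the window function in this thread relies on when it calls next(it, None).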
When a loop is nested inside another loop, the inner loop runs many times inside the outer loop. In each iteration of the outer loop, the inner loop will be re-started. The inner loop must finish all of its iterations before the outer loop can continue to its next iteration.
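Applied to the code in the question, there are really three loops: the outer for, the inner for, and the while inside it. The while runs once for every j that falls inside the bounds, which is where the repeated rows come from. Here is a rough sketch using plain row numbers instead of files; the function names pairs_with_while and pairs_with_if are made up for illustration:

```python
def pairs_with_while(n_rows, width):
    """Mimic the question's loops, using row numbers only."""
    out = []
    for i in range(n_rows):
        lower_bound, upper_bound = i + 1, i + width
        for j in range(n_rows):          # restarts from 0 for every i
            while lower_bound <= j <= upper_bound:
                out.append((i, j))
                j += 1                   # the for loop resets j on its next pass
    return out


def pairs_with_if(n_rows, width):
    """Same two loops, but a plain test instead of a third loop."""
    out = []
    for i in range(n_rows):
        lower_bound, upper_bound = i + 1, i + width
        for j in range(n_rows):
            if lower_bound <= j <= upper_bound:
                out.append((i, j))
    return out


print(pairs_with_while(6, 3)[:6])
print(pairs_with_if(6, 3)[:3])
```

Swapping while for if gives one row per pair, but the inner loop still walks the whole input for every outer row. With 1.1 million rows that is on the order of 10^12 iterations, so a sliding window or a slice is likely the better route.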
Based on the window example from the old docs, you can use something like:
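from collections import deque
from itertools import cycle

try:
    xrange
except NameError:  # python3
    xrange = range

def pack(d):
    # Pair the first item in the window with each item that follows it.
    tup = tuple(d)
    return zip(cycle(tup[0:1]), tup[1:])

def window(seq, n=2):
    # Slide a window of n items over seq, yielding the pairs for each position.
    it = iter(seq)
    d = deque((next(it, None) for _ in xrange(n)), maxlen=n)
    yield pack(d)
    for e in it:
        d.append(e)  # maxlen=n drops the oldest item automatically
        yield pack(d)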
Demo (Python 2 shell; use print(l1, l2) on Python 3):
>>> for l in window([1,2,3,4,5], 4):
...     for l1, l2 in l:
...         print l1, l2
...
1 2
1 3
1 4
2 3
2 4
2 5
So you can pass your file to window to get the desired result (a window of 4 pairs each line with the next 3):
window(open('C:/working_file.txt', mode='r', encoding='utf8'), 4)
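One thing to be aware of: window stops as soon as the input runs out, so the last n-1 items never show up in the first column (in the demo, 3 4, 3 5 and 4 5 are not printed), and an input shorter than n is padded with None. If those tail pairs matter, something along these lines might work. This is only a sketch; pairs_with_next is a name invented here, and its n counts the following items rather than the window size:

```python
from collections import deque
from itertools import islice


def pairs_with_next(seq, n=3):
    """Yield (item, follower) for each item and up to n items after it.

    Accepts any iterable, including an open file, and holds at most
    n + 1 items in memory at a time.
    """
    it = iter(seq)
    buf = deque(islice(it, n + 1))   # current item plus up to n followers
    while buf:
        first = buf.popleft()
        for follower in buf:
            yield first, follower
        # Top the buffer back up; once the input is exhausted it just shrinks.
        for item in islice(it, 1):
            buf.append(item)


for a, b in pairs_with_next([1, 2, 3, 4, 5], n=3):
    print(a, b)
```

For the file case, the list would be replaced by the open file object, and a.strip() and b.strip() would remove the trailing newlines before writing.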
You can do this with slices. This is easiest if you read the whole file into a list first:
with open('C:/working_file.txt', mode='r', encoding='utf8') as f:
    data = f.readlines()

for i, line_a in enumerate(data):
    for j, line_b in enumerate(data[i+1:i+5], start=i+1):
        print(i, j)
When you change it to printing the lines instead of the line numbers, you can drop the second enumerate and just do for line_b in data[i+1:i+5]. Note that the slice includes the item at the start index, but not the item at the end index, so the end index needs to be one higher than your current upper bound.
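For reference, a sketch of what the line-printing version could look like. The hard-coded list stands in for the result of f.readlines(), and n_next is a name introduced here for the number of following lines:

```python
# Stand-in for data = f.readlines(); each entry keeps its trailing newline.
data = ["a\n", "b\n", "c\n", "d\n", "e\n", "f\n"]
n_next = 4  # how many following lines to pair with each line

pairs = []
for i, line_a in enumerate(data):
    # The slice end is exclusive, and slicing past the end of a list is safe:
    # it simply returns fewer items.
    for line_b in data[i + 1:i + 1 + n_next]:
        pairs.append(line_a.strip() + "\t" + line_b.strip())

print("\n".join(pairs))
```

Because slicing never raises past the end of the list, the last few lines are simply paired with fewer followers. Each slice copies up to n_next items, which should be negligible next to holding the whole file in memory.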