Don't understand Python's csv.reader object

I've come across a behavior in Python's built-in csv module that I'd never noticed before. Typically, when I read in a CSV file, I follow the docs pretty much verbatim: I open the file using 'with', then loop over the reader object with a 'for' loop. However, I recently tried iterating over the csv.reader object twice in a row, only to find that the second 'for' loop did nothing.

import csv

with open('smallfriends.csv','rU') as csvfile:
    readit = csv.reader(csvfile,delimiter=',')

    for line in readit:
        print line

    for line in readit:
        print 'foo'

Console Output:

Austins-iMac:Desktop austin$ python -i amy.py 
['Amy', 'James', 'Nathan', 'Sara', 'Kayley', 'Alexis']
['James', 'Nathan', 'Tristan', 'Miles', 'Amy', 'Dave']
['Nathan', 'Amy', 'James', 'Tristan', 'Will', 'Zoey']
['Kayley', 'Amy', 'Alexis', 'Mikey', 'Sara', 'Baxter']
>>>
>>> readit
<_csv.reader object at 0x1023fa3d0>
>>>

So the second 'for' loop basically does nothing. One thought I had is that the csv.reader object is being released from memory after being read once. That isn't the case, though, since it still retains its memory address. I found a post that mentions a similar problem. The reason given there is that once the object is read, the pointer stays at the end of the memory address, ready to write data to the object. Is this correct? Could someone go into greater detail about what is going on here? Is there a way to push the pointer back to the beginning of the memory address to reread it? I know it's bad coding practice to do that, but I'm mainly just curious and want to learn more about what goes on under Python's hood.

Thanks!

asked Dec 03 '14 06:12

Austin A

2 Answers

I'll try to answer your other questions about what the reader is doing and why reset() or seek(0) might help. In the most basic form, the csv reader might look something like this:

def csv_reader(it):
    for line in it:
        yield line.strip().split(',')

That is, it takes any iterator producing strings and gives you a generator. All it does is take an item from your iterator, process it, and return the result. Once the underlying iterator is exhausted, csv_reader stops. The reader has no idea where the iterator came from or how to make a fresh one, so it doesn't even try to reset itself. That is left to the programmer.
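To see that exhaustion in action, here is a minimal sketch (Python 3 syntax) that repeats the simplified csv_reader and consumes it twice. The sample rows are made up for illustration, and the real csv.reader does far more parsing than this stand-in.

```python
def csv_reader(it):
    # Simplified stand-in for csv.reader: split each line on commas
    for line in it:
        yield line.strip().split(',')

rows = ['1,2,3', '4,5,6']             # hypothetical sample data

reader = csv_reader(rows)
first_pass = list(reader)             # consumes the generator
second_pass = list(reader)            # generator is exhausted, nothing left
fresh_pass = list(csv_reader(rows))   # a new generator starts over

print(first_pass)
print(second_pass)
print(fresh_pass)
```

The second pass is empty because a generator cannot be rewound; only building a new one gives the rows again.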

We can either modify the iterator in place without the reader knowing or just make a new reader. Here are some examples to demonstrate my point.

import csv

data = open('data.csv', 'r')      # assumes data.csv has at least 7 lines
reader = csv.reader(data)

print(next(reader))               # Parse the first line
[next(data) for _ in range(5)]    # Skip the next 5 lines on the underlying iterator
print(next(reader))               # This will be the 7th line in data
print(reader.line_num)            # reader thinks this is the 2nd line
data.seek(0)                      # Go back to the beginning of the file
print(next(reader))               # gives first line again
data.close()                      # Done with the file

data = ['1,2,3', '4,5,6', '7,8,9']
reader = csv.reader(data)         # works fine on lists of strings too
print(next(reader))               # ['1', '2', '3']

In general, if you need a second pass, it's best to close and reopen your file and use a new csv reader. It's clean and ensures nice bookkeeping.
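A rough sketch of that reopen-and-reread approach, in Python 3 syntax. The helper name read_rows and the sample data are assumptions for illustration, and a temporary file is used only so the example is self-contained.

```python
import csv
import os
import tempfile

# Write hypothetical sample data to a temporary file
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False, newline='') as tmp:
    tmp.write('Amy,James,Nathan\nJames,Nathan,Tristan\n')
    path = tmp.name

def read_rows(path):
    """Open the file, parse it with a fresh csv.reader, and return all rows."""
    with open(path, newline='') as csvfile:
        return list(csv.reader(csvfile))

first_pass = read_rows(path)    # first pass: new file handle, new reader
second_pass = read_rows(path)   # second pass: reopened, so it starts at the top

os.remove(path)                 # clean up the temporary file

print(first_pass)
print(second_pass)
```

Because read_rows returns a list, the rows can also be looped over as many times as needed without touching the file again, at the cost of holding the whole file in memory.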

answered Sep 24 '22 08:09

kalhartt

Iterating over a csv.reader simply wraps iterating over the lines of the underlying file object. On each iteration, the reader gets the next line from the file, parses it, and returns the resulting row.

So iterating over a csv.reader follows the same conventions as iterating over files. That is, once the file has reached its end, you have to seek back to the start before iterating a second time.
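As a small illustration of that file convention on its own (Python 3 syntax), the sketch below uses io.StringIO as a stand-in for an open text file. The sample contents are made up.

```python
import io

# io.StringIO behaves like an open text file (assumed sample data)
f = io.StringIO('a,b\nc,d\n')

first = [line for line in f]    # reads to the end of the "file"
second = [line for line in f]   # position is already at the end
f.seek(0)                       # rewind to the start
third = [line for line in f]    # lines are available again

print(first)
print(second)
print(third)
```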

The below should do, though I haven't tested it:

import csv

with open('smallfriends.csv','rU') as csvfile:
    readit = csv.reader(csvfile,delimiter=',')

    for line in readit:
        print line

    # go back to the start of the file
    csvfile.seek(0)

    for line in readit:
        print 'foo'
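For anyone who wants to check the idea without a file on disk, here is a hedged Python 3 sketch of the same seek(0) pattern. It uses io.StringIO with made-up data in place of smallfriends.csv, so it is not the exact code above.

```python
import csv
import io

# In-memory stand-in for the CSV file (assumed sample data)
csvfile = io.StringIO('Amy,James\nNathan,Sara\n', newline='')
readit = csv.reader(csvfile, delimiter=',')

first_pass = list(readit)      # reads every row
exhausted = list(readit)       # nothing left without rewinding
csvfile.seek(0)                # go back to the start of the file
second_pass = list(readit)     # the same reader yields the rows again
line_count = readit.line_num   # the reader's line counter is not reset by seek

print(first_pass)
print(exhausted)
print(second_pass)
print(line_count)
```

Note that the reader's line_num keeps counting across the rewind, which is one reason a fresh reader can be the tidier choice.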

answered Sep 26 '22 08:09

sebastian

Tags: python, object, pointers, memory, csv
