
Reading first n lines of a CSV into a dictionary

I have a CSV file I'd like to read into a dictionary for subsequent insertion into a MongoDB collection entitled projects.

I accomplished this with the following:

with open('opendata_projects.csv') as f:
    records = csv.DictReader(f)
    projects.insert(records)

However, I found my poor sandbox account couldn't hold all the data. In turn, I'd like to read in the first n lines so I can play around with the data and get used to working with MongoDB.

First I checked the docs for the csv.DictReader class:

class csv.DictReader(csvfile, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds)

But the constructor doesn't seem to accept the number of rows I'd like as a parameter.

So I attempted to do so by writing the following code:

with open('opendata_projects.csv') as f:
    records = csv.DictReader(f)
    for i in records:
        if i <= 100:
            projects.insert(i)

Which was followed by the error:

TypeError: unorderable types: dict() <= int()

This prompted me to look into dictionaries further, and I found they are unordered. Nevertheless, it seems an example from the Python csv docs suggests I can iterate with csv.DictReader:

with open('names.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row['first_name'], row['last_name'])

Is there a way to accomplish what I'd like to do by using these functions?

Chuck asked Apr 28 '15 06:04


1 Answer

You can use itertools.islice, like this:

import csv, itertools

with open('names.csv') as csvfile:
    for row in itertools.islice(csv.DictReader(csvfile), 100):
        print(row['first_name'], row['last_name'])

islice creates an iterator from the iterable you pass and lets you iterate up to the limit you give as the second argument. If the file has fewer rows than the limit, iteration simply stops at the end of the file.
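Applied to the goal in the question, the same idea can be wrapped in a small helper that returns the first n rows as a list of dictionaries. This is only a sketch: the name first_n_rows and the sample file contents are made up for illustration, and the MongoDB insertion itself is left out because the exact method depends on your driver version.

```python
import csv
import itertools
import os
import tempfile


def first_n_rows(path, n):
    """Return at most the first n data rows of a CSV file as a list of dicts.

    Hypothetical helper for illustration; it is not part of the csv module.
    """
    with open(path, newline='') as f:
        # DictReader uses the header line for the keys; islice stops after
        # n rows (or earlier if the file is shorter) without parsing the rest.
        return list(itertools.islice(csv.DictReader(f), n))


# Build a tiny sample file so the sketch is self-contained (made-up data).
with tempfile.NamedTemporaryFile('w', suffix='.csv', newline='', delete=False) as tmp:
    tmp.write('first_name,last_name\nEric,Idle\nJohn,Cleese\nTerry,Jones\n')
    sample_path = tmp.name

sample_rows = first_n_rows(sample_path, 2)
os.remove(sample_path)

# sample_rows is now a plain list of row dictionaries, which could be handed
# to the collection's bulk insert in place of the DictReader from the question.
```

Materialising the rows with list() before the file is closed matters here: a DictReader is lazy, so it cannot be read after the with block ends.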


Alternatively, if you want to do the counting yourself, you can use the enumerate function, like this:

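import csv

with open('names.csv') as csvfile:
    for index, row in enumerate(csv.DictReader(csvfile)):
        # index counts rows from 0, so stop once 100 rows have been printed
        if index >= 100:
            break
        print(row['first_name'], row['last_name'])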
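For comparison, the loop in the question failed because each item yielded by csv.DictReader is a row (a dictionary), not a counter, so `i <= 100` compares a dict with an int. The sketch below shows the difference on in-memory data; the sample contents and the names limit, selected and comparison_failed are illustrative assumptions.

```python
import csv
import io

# In-memory stand-in for the CSV file; the contents are made up.
csvfile = io.StringIO('first_name,last_name\nEric,Idle\nJohn,Cleese\nTerry,Jones\n')

limit = 2
selected = []
for index, row in enumerate(csv.DictReader(csvfile)):
    # index is the integer counter; row is the dictionary for that line.
    if index >= limit:
        break
    selected.append(row)

# Comparing a row itself with a number is what raised the TypeError
# in the question's loop.
try:
    selected[0] <= 100
    comparison_failed = False
except TypeError:
    comparison_failed = True
```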
thefourtheye answered Oct 07 '22 13:10