
Python Looping through CSV files and their columns

So I've seen this done in other questions asked here, but I'm still a little confused. I've been learning Python 3 for the last few days and figured I'd start working on a project to really get my hands dirty. I need to loop through a number of CSV files and make edits to them. I'm having trouble with going to a specific column, and with for loops in Python in general. I'm used to the convention for (int i = 0; i < expression; i++), but in Python it's a little different. Here's my code so far, and I'll explain where my issue is.

import os
import csv

pathName = os.getcwd()

numFiles = []
fileNames = os.listdir(pathName)
for fileNames in fileNames:
    if fileNames.endswith(".csv"):
        numFiles.append(fileNames)

for i in numFiles:
    file = open(os.path.join(pathName, i), "rU")
    reader = csv.reader(file, delimiter=',')
    for column in reader:
        print(column[4])

My issue falls on this line:

for column in reader:
        print(column[4])

So the docs say that column is the loop variable and reader is what I'm looping through. But when I use index 4 I get this error:

IndexError: list index out of range

What does this mean? If I write 0 instead of 4, it prints out all of the values in column 0 of each CSV file. I basically need it to go through the first row of each CSV file, find a specific value, and then go through that entire column. Thanks in advance!

humbleCoder asked Aug 29 '17 20:08



2 Answers

It could be that at least one row in your .csv files has fewer than 5 columns.

Python is zero-indexed, which means it starts counting at 0: the first column is column[0], the second is column[1], and column[4] is the fifth.
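To make the indexing concrete, here is a minimal sketch with a made-up three-column row. The helper name cell is hypothetical, not something from the csv module; it only shows one way to check the length before indexing.

```python
row = ["id", "name", "price"]   # a row as csv.reader yields it: a list of strings

first = row[0]    # first column
third = row[2]    # third (and last) column

def cell(row, index, default=None):
    """Return row[index], or default when the row is too short."""
    return row[index] if index < len(row) else default

fifth = cell(row, 4)   # the row has only 3 columns, so this falls back to None
```

Writing row[4] directly on this row would raise the same IndexError as in the question.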

Also you may want to change your

for column in reader:

to

for row in reader:

because reader iterates through the rows, not the columns.

This code loops through each row and then each column in that row, allowing you to view the contents of each cell.

for i in numFiles:
    # newline="" is what the csv docs recommend; the "U" mode flag was removed in Python 3.11
    with open(os.path.join(pathName, i), newline="") as file:
        reader = csv.reader(file, delimiter=',')
        for row in reader:
            for column in row:
                print(column)
                if column == "SPECIFIC VALUE":
                    pass  # do stuff
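
For the second half of the question (find a value in the first row, then walk that whole column), one possible approach is to look up the position of the value in the header row and reuse that index for every following row. This is only a sketch: the function name column_values, the header name "price", and the sample data are made up for illustration.

```python
import csv
import io

def column_values(csv_file, wanted_header):
    """Yield every value below the header cell equal to wanted_header."""
    reader = csv.reader(csv_file)
    header = next(reader, None)          # first row; None if the file is empty
    if header is None or wanted_header not in header:
        return                           # nothing to yield for this file
    index = header.index(wanted_header)  # zero-based position of the column
    for row in reader:
        if index < len(row):             # skip short rows instead of raising IndexError
            yield row[index]

# Made-up sample data standing in for one of the CSV files.
sample = io.StringIO("id,name,price\n1,apple,0.50\n2,pear\n3,plum,0.80\n")
prices = list(column_values(sample, "price"))
print(prices)
```

With real files you would pass the open file object in place of sample. The short row ("2,pear") is skipped here; whether to skip or report such rows depends on what you want to do with them.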

Bigbob556677 answered Sep 22 '22 04:09


Welcome to Python! I suggest printing some debugging messages.

You could add this to your printing loop:

for row in reader:
    try:
        print(row[4])
    except IndexError:
        # row is too short to have a fifth field; report it instead of crashing
        print("ERROR: %s in file %s doesn't contain 5 columns" % (row, i))

This will print the bad rows (as lists, because that is how csv.reader represents them) so you can fix the CSV files.

Some notes:

  1. It is common to use snake_case in Python, not camelCase.
  2. Name your variables appropriately (csv_filename instead of i, row instead of column, etc.).
  3. Use the with statement to handle files.
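
Put together, the three notes might look something like the sketch below. The function name read_rows and the throwaway demo directory are assumptions made for illustration; in your script you would pass os.getcwd() instead.

```python
import csv
import os
import tempfile

def read_rows(directory):
    """Return {csv_filename: list of rows} for every .csv file in directory."""
    rows_by_file = {}
    for csv_filename in sorted(os.listdir(directory)):
        if not csv_filename.endswith(".csv"):
            continue
        csv_path = os.path.join(directory, csv_filename)
        # The with statement closes the file even if reading raises.
        with open(csv_path, newline="") as csv_file:
            rows_by_file[csv_filename] = list(csv.reader(csv_file))
    return rows_by_file

# Demo in a throwaway directory with one made-up CSV file and one non-CSV file.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "demo.csv"), "w", newline="") as f:
        csv.writer(f).writerows([["a", "b"], ["1", "2"]])
    with open(os.path.join(demo_dir, "notes.txt"), "w") as f:
        f.write("ignored")
    result = read_rows(demo_dir)
print(result)
```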

Enjoy!

Doron Cohen answered Sep 20 '22 04:09