
How do I open all files of a certain type in Python and process them?

Tags:

python

I'm trying to figure out how to make Python go through a directory full of CSV files, process each of the files, and spit out a text file with a trimmed list of values.

In this example, I'm iterating through a CSV with lots of different types of columns, but all I really want are the first name, last name, and keyword. I have a folder full of these CSVs with different columns (except they all share first name, last name, and keyword somewhere in the CSV). What's the best way to open that folder, go through each CSV file, and then spit it all out as either its own CSV file or just a text list, as I have in the example below?

import csv

reader = csv.reader(open("keywords.csv"))
rownum = 0
headnum = 0
F = open('compiled.txt', 'w')
for row in reader:
    if rownum == 0:
        # First row is the header: remember the index of each wanted column.
        header = row
        for col in row:
            if header[headnum] == 'Keyword':
                keywordnum = headnum
            elif header[headnum] == 'First Name':
                firstnamenum = headnum
            elif header[headnum] == 'Last Name':
                lastnamenum = headnum
            headnum += 1
    else:
        currentrow = row
        print(currentrow[keywordnum] + '\n' + currentrow[firstnamenum] + '\n' + currentrow[lastnamenum])
        F.write(currentrow[keywordnum] + '\n')

    rownum += 1
F.close()  # flush the compiled list to disk

Imran asked Dec 09 '22 18:12


2 Answers

The best way is probably to use the shell's globbing ability, or alternatively the glob module of Python.

Shell (Linux, Unix)

Shell:

python myapp.py folder/*.csv

myapp.py:

import sys
for filename in sys.argv[1:]:
    with open(filename) as f:
        pass  # do something with f

Windows (Or no shell available.)

import glob
for filename in glob.glob("folder/*.csv"):
    with open(filename) as f:
        pass  # do something with f

Note: on Python 2.5 the with statement is only available after from __future__ import with_statement; from Python 2.6 onward no import is needed.
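
The two variants above can be folded into one script that behaves the same whether or not the shell has already expanded the wildcard. What follows is only a minimal sketch: the helper name expand_patterns is made up for illustration and is not part of the glob module.

```python
import glob


def expand_patterns(args):
    """Expand wildcard patterns in args into a sorted list of unique paths.

    expand_patterns is a hypothetical helper, not a standard library function.
    """
    paths = set()
    for arg in args:
        matches = glob.glob(arg)
        # An existing plain filename matches itself. An argument that matches
        # nothing is kept unchanged so a later open() reports the missing file.
        paths.update(matches if matches else [arg])
    return sorted(paths)
```

In myapp.py the loop would then run over expand_patterns(sys.argv[1:]) instead of sys.argv[1:], so that python myapp.py folder/*.csv works on Windows as well.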

Georg Schölly answered May 16 '23 00:05


The "get all the CSV files" part of the question has been answered several times (including by the OP), but the "get the right named columns" hasn't yet: csv.DictReader makes it trivial -- the "process one CSV file" loop becomes just:

reader = csv.DictReader(open(thecsvfilename))
for row in reader:
    # str.join takes a single iterable, so pass the three values as a list
    print('\n'.join([row['Keyword'], row['First Name'], row['Last Name']]))
    F.write(row['Keyword'] + '\n')
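
Putting the file loop and csv.DictReader together, one possible end-to-end version might look like the sketch below. It assumes Python 3 and that every file really does contain the three column names from the question. The function name compile_columns and the WANTED list are made up for illustration.

```python
import csv
import glob
import os

# Column names taken from the question; adjust if your headers differ.
WANTED = ["Keyword", "First Name", "Last Name"]


def compile_columns(folder, out_path):
    """Copy the WANTED columns of every CSV in folder into one CSV at out_path.

    Hypothetical helper. Returns the number of data rows written.
    """
    # Build the input list first and skip the output file itself, in case
    # out_path lives inside the same folder.
    sources = [
        p for p in sorted(glob.glob(os.path.join(folder, "*.csv")))
        if os.path.abspath(p) != os.path.abspath(out_path)
    ]
    written = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(WANTED)
        for filename in sources:
            with open(filename, newline="") as f:
                # DictReader keys each row by header name, so column order
                # can differ from file to file.
                for row in csv.DictReader(f):
                    writer.writerow([row[name] for name in WANTED])
                    written += 1
    return written
```

A call such as compile_columns("folder", "compiled.csv") would then produce a single three-column file. A CSV that lacks one of the headers raises KeyError, which is probably what you want so that a malformed file is not skipped silently.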

Alex Martelli answered May 16 '23 01:05