Check for number of columns in each row of CSV

I have the following Python code:

import os
import csv
import sys

g = open('Consolidated.csv', "wb")
for root, dirs, files in os.walk('D:\\XXX\\YYY\\S1'):
    for filename in files:
        pathname = os.path.join(root, filename)
        symbol = filename.rpartition('_')[-1].rpartition('.')[0]
        reader = csv.reader(open(pathname, 'rU'))
        writer = csv.writer(g, delimiter='\t', quotechar='"', quoting=csv.QUOTE_ALL)

        for row in reader:
            row.insert(0, symbol.upper())
            if len(row[2]) == 3:
                row[2] = '0'+row[2]
            writer.writerow(row)

The basic idea is that I have a couple of CSV files in S1 that I need to merge into one large CSV. The files are named in a funny way, which is why the code needs the rpartition calls and the row manipulations.

This code works fine, but my question is: how does one check the number of columns in EACH row of the CSV file? For example, if an input CSV file is expected to have five columns and contains the row 1,2,3,4,5, the code writes "1" "2" "3" "4" "5" (separated by tabs) to the consolidated file. Now suppose that, for whatever reason, one row in the CSV file is 6,7,8, so it stops abruptly without all the columns filled in. In this case, I want the code to ignore the line and not write "6" "7" "8" to the consolidated file.

Could someone kindly provide code on how to do so? For each row in the input CSVs I want to check if it is a full row before manipulating it.

Any help would be massively appreciated.

Warm Regards.

genesis asked Aug 31 '25 01:08


1 Answer

len(row)

will give the number of columns in the row.
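As a quick illustration, here is a minimal sketch, assuming Python 3 and an in-memory sample instead of a real file (the names sample and column_counts are made up for this example). It shows that len(row) counts the parsed fields, so a comma inside a quoted field does not add a column:

```python
import csv
import io

# In-memory stand-in for a CSV file: a full row, a short row,
# and a row whose first field contains a quoted comma.
sample = io.StringIO('1,2,3,4,5\n6,7,8\n"a,b",c\n')

# csv.reader yields each row as a list of strings, so len(row)
# is the number of columns in that row.
column_counts = [len(row) for row in csv.reader(sample)]
print(column_counts)  # [5, 3, 2]
```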

You can do

for row in reader:
    if len(row) < desired_number_of_columns:
        continue  # skip rows that are too short
    # process the row here

For example, if your csv file looks like this

1,2,3,4,5
a,b,c,d,e
l1,l2
d,e,f,g,h

running

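import csv

# Open the file in a with block so it is closed automatically
with open("csvfile.csv", "r") as f:
    for row in csv.reader(f):
        if len(row) >= 5:  # keep only rows with at least five columns
            print(" ".join(row))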

will produce the output

1 2 3 4 5
a b c d e
d e f g h

ignoring the row with length 2.
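Applied to the merge loop from the question, the same check might look like the sketch below. It assumes Python 3, so the files are opened in text mode with newline="" rather than with "wb" and 'rU'. The names EXPECTED_COLUMNS, is_full_row, and consolidate are hypothetical and only serve the illustration. The length is tested before the symbol is inserted, so the count refers to the columns of the input file.

```python
import csv
import os

EXPECTED_COLUMNS = 5  # assumption: every complete input row has five columns


def is_full_row(row, expected=EXPECTED_COLUMNS):
    """Return True if the row has at least the expected number of columns."""
    return len(row) >= expected


def consolidate(source_dir, target_path, expected=EXPECTED_COLUMNS):
    """Merge every CSV under source_dir into target_path, skipping short rows.

    target_path should live outside source_dir so that os.walk does not
    pick up the output file. Returns the number of rows that were skipped.
    """
    skipped = 0
    with open(target_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t", quotechar='"',
                            quoting=csv.QUOTE_ALL)
        for root, dirs, files in os.walk(source_dir):
            for filename in files:
                pathname = os.path.join(root, filename)
                symbol = filename.rpartition("_")[-1].rpartition(".")[0]
                with open(pathname, newline="") as src:
                    for row in csv.reader(src):
                        # Check the row before touching it.
                        if not is_full_row(row, expected):
                            skipped += 1
                            continue
                        row.insert(0, symbol.upper())
                        if len(row[2]) == 3:
                            row[2] = "0" + row[2]
                        writer.writerow(row)
    return skipped
```

With the question's layout this would be called as consolidate('D:\\XXX\\YYY\\S1', 'Consolidated.csv'). If rows with too many columns should also be dropped, the comparison in is_full_row can be changed to len(row) == expected.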

Matthew answered Sep 02 '25 14:09