How to add a new column to the beginning of the rows of a CSV file?

I have a CSV file with 6 to 8 columns.
Ex:

ID Test Description file-name module view path1 path2 

I want to add a new column (Node) at the beginning.
Ex:

Node ID Test Description file-name module view path1 path2 
learner asked Feb 02 '11 08:02


2 Answers

It would be fairly easy to do using the csv module's DictReader and DictWriter classes. Here's an example that reads the old file and writes the new one in a single pass.

A DictReader instance returns each logical line or row of the file as a dictionary whose keys are the field names. You can explicitly specify the field names or they can be read from the first line of the file (as is done in the example below).

You must specify the desired field names when creating a DictWriter instance, and the order of the field names defines the order in which they appear on each line of the output file. In this case the new field name is simply added to the beginning of the list of names from the input file, whatever they may be.

import csv

with open('testdata.txt', 'r', newline='') as inf, \
     open('testdata2.txt', 'w', newline='') as outf:
    csvreader = csv.DictReader(inf)
    fieldnames = ['Node'] + csvreader.fieldnames  # Add column name to beginning.
    csvwriter = csv.DictWriter(outf, fieldnames)
    csvwriter.writeheader()
    for node, row in enumerate(csvreader, start=1):
        csvwriter.writerow(dict(row, Node='node %s' % node))

If this was the contents of the input file:

ID,Test Description,file-name,module,view,path1,path2
id 1,test 1 desc,test1file.txt,test1module,N,test1path1,test1path2
id 2,test 2 desc,test2file.txt,test2module,Y,test2path1,test2path2
id 3,test 3 desc,test3file.txt,test3module,Y,test3path1,test3path2
id 4,test 4 desc,test4file.txt,test4module,N,test4path1,test4path2
id 5,test 5 desc,test5file.txt,test5module,Y,test5path1,test5path2

This would be the contents of the resulting output file after running the script:

Node,ID,Test Description,file-name,module,view,path1,path2
node 1,id 1,test 1 desc,test1file.txt,test1module,N,test1path1,test1path2
node 2,id 2,test 2 desc,test2file.txt,test2module,Y,test2path1,test2path2
node 3,id 3,test 3 desc,test3file.txt,test3module,Y,test3path1,test3path2
node 4,id 4,test 4 desc,test4file.txt,test4module,N,test4path1,test4path2
node 5,id 5,test 5 desc,test5file.txt,test5module,Y,test5path1,test5path2
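
If the input file had no header line, the field names could be passed to DictReader explicitly instead of being read from the first line. The following is only a minimal sketch of that variation: the in-memory sample data and the two field names are made up for illustration, and io.StringIO stands in for real files.

```python
import csv
import io

# Hypothetical header-less input; the field names are supplied by hand below.
source = io.StringIO('id 1,test 1 desc\nid 2,test 2 desc\n')
result = io.StringIO()

# Explicit fieldnames: the first line is treated as data, not as a header.
reader = csv.DictReader(source, fieldnames=['ID', 'Test Description'])
writer = csv.DictWriter(result, ['Node'] + reader.fieldnames,
                        lineterminator='\n')
writer.writeheader()
for node, row in enumerate(reader, start=1):
    writer.writerow(dict(row, Node='node %s' % node))

print(result.getvalue())
```

The output then starts with a generated header line (Node,ID,Test Description) followed by one line per input row.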

Note that adding the data for a field to each row with dict(row, Node='node %s' % node) as shown only works when the field name is a valid keyword argument (i.e. a valid Python identifier), like Node.

Valid identifiers consist only of letters, digits, and underscores, must not start with a digit, and cannot be a language keyword such as class, for, return, global, pass, etc.

The workaround for this limitation is to update each row dictionary manually since the field name cannot be used as a keyword argument:

    fieldnames = ['Invalid-Identifier'] + csvreader.fieldnames  # Add column name.
    ...
    for node, row in enumerate(csvreader, 1):
        row['Invalid-Identifier'] = 'node %s' % node  # add new field and value
        csvwriter.writerow(row)
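
If the new column name is not known in advance, one way to decide between the two approaches is to test the name first. This is only a sketch: the helper name usable_as_keyword_argument is made up for illustration, while str.isidentifier() and keyword.iskeyword() are standard library calls.

```python
import keyword


def usable_as_keyword_argument(name):
    """Return True if name could be written as a keyword argument,
    as in dict(row, name=...)."""
    return name.isidentifier() and not keyword.iskeyword(name)


print(usable_as_keyword_argument('Node'))                # True
print(usable_as_keyword_argument('_node'))               # True
print(usable_as_keyword_argument('Invalid-Identifier'))  # False
print(usable_as_keyword_argument('2nd'))                 # False
print(usable_as_keyword_argument('class'))               # False
```

Assigning with row[name] = value works in every case, so the check only matters if you prefer the dict(row, ...) form.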
martineau answered Oct 17 '22 00:10


You can use the csv module to read in your CSV file and write out an edited version with an added column. Remember that adding a column means adding an extra entry to each row; for a column at the beginning, the new entry goes at the start of each row rather than the end.

An example of writing output with the csv module (http://docs.python.org/library/csv.html):

>>> import csv
>>> # Python 3: open in text mode with newline='' (not 'wb').
>>> with open('eggs.csv', 'w', newline='') as csvfile:
...     spamWriter = csv.writer(csvfile, delimiter=' ',
...                             quotechar='|', quoting=csv.QUOTE_MINIMAL)
...     spamWriter.writerow(['Spam'] * 5 + ['Baked Beans'])
...     spamWriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])
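
To put the new column at the beginning, a plain csv.reader and csv.writer pair is enough: each row is a list, so the new value is placed in front of it. The following is a minimal sketch that assumes the input has a header row; the helper name prepend_column and the in-memory sample data are illustrative only.

```python
import csv
import io


def prepend_column(infile, outfile, header, make_value):
    """Copy CSV rows from infile to outfile with one extra leading column.

    Assumes the first row of infile is a header row.
    """
    reader = csv.reader(infile)
    writer = csv.writer(outfile, lineterminator='\n')
    writer.writerow([header] + next(reader))  # header row gets the new name
    for number, row in enumerate(reader, start=1):
        writer.writerow([make_value(number)] + row)


# io.StringIO stands in for files opened with newline=''.
source = io.StringIO('ID,Test Description\nid 1,test 1 desc\nid 2,test 2 desc\n')
result = io.StringIO()
prepend_column(source, result, 'Node', lambda n: 'node %s' % n)
print(result.getvalue())
```

With real files, open the input with open(path, 'r', newline='') and the output with open(path, 'w', newline='') and pass those objects in place of the StringIO ones.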
sichinumi answered Oct 16 '22 23:10