I have a large comma-separated file (6 GB) and would like to add an index column to it. I'm looking for Unix-type solutions for efficiency. I'm using a Mac.
I have this:
V1  V2  V3
0.4625  0.9179  0.8384
0.9324  0.2486  0.1114
0.6691  0.7813  0.6705
0.1935  0.3303  0.4336
Would like to get this:
ID  V1  V2  V3
1   0.4625  0.9179  0.8384
2   0.9324  0.2486  0.1114
3   0.6691  0.7813  0.6705
4   0.1935  0.3303  0.4336
If you'd prefer a GUI route, you can load the CSV file into Excel with Power Query and add the index column there. To open a query, locate one previously loaded from the Power Query Editor, select a cell in the data, and then select Query > Edit. For more information, see Create, load, or edit a query in Excel (Power Query). Then select Add Column > Index Column.
Assuming the fields are tab-separated (as in the sample above), this awk one-liner will do it:
awk -F'\t' -v OFS='\t' '
  NR == 1 {print "ID", $0; next}
  {print (NR-1), $0}
' input.csv > output.csv
In awk, the NR variable is "the total number of input records seen so far", which in practice means "the current line number". So NR == 1 matches the first record, where we add the "ID" column header, and for the remaining lines we use NR-1 as the index.
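As a quick illustration of how NR counts lines (using seq, available on macOS, just to generate some input):

```shell
# NR increments once per input record (line), starting at 1
seq 3 | awk '{print NR, $0}'
# prints:
# 1 1
# 2 2
# 3 3
```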
The -F'\t' argument sets the input field separator, and -v OFS='\t' sets the output field separator.
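One caveat worth knowing: awk only rejoins $0 using OFS after a field has been modified, so setting OFS alone doesn't change the separators of an untouched line. A tiny demonstration (made-up sample data) that converts commas to tabs:

```shell
# Assigning $1 = $1 forces awk to rebuild the record with OFS,
# turning the comma-separated input into tab-separated output.
printf 'a,b,c\n' | awk -F',' -v OFS='\t' '{$1 = $1; print}'
# prints: a, b, c separated by tabs
```

This is the trick you'd need if the 6 GB file really is comma-separated and you want tab-separated output.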
Since no particular tool is specified in the original post, I've kept it simple here (the fancier Vim/Bash solutions are fine if you know what you're doing).
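For reference, here is the same command run end to end on a two-row tab-separated sample (file names input.csv/output.csv as in the answer):

```shell
# Build a small tab-separated sample file
printf 'V1\tV2\tV3\n0.4625\t0.9179\t0.8384\n' > input.csv

# Prepend the ID column
awk -F'\t' -v OFS='\t' '
  NR == 1 {print "ID", $0; next}
  {print (NR-1), $0}
' input.csv > output.csv

cat output.csv
# prints (tab-separated):
# ID  V1  V2  V3
# 1   0.4625  0.9179  0.8384
```

Because awk streams the file line by line, this uses constant memory regardless of the 6 GB input size.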

In the Excel route above, the Add Column > Index Column step will fill the new column with 1, 2, 3, and so on. You can then save the resulting worksheet as a CSV file again.