
Split csv file based on column from command line

I have some data in a CSV file of the form:

ID,DATE,EARNING
1,12 May 2018,5
1,13 May 2018,15
2,12 May 2018,25

I want to split this into multiple files such that file_1_May_report contains:

ID,DATE,EARNING
1,12 May 2018,5
1,13 May 2018,15

and another file, file_2_May_report, contains:

ID,DATE,EARNING
2,12 May 2018,25

I have tried:

awk -F, '{print >> $1}' input.csv 

However, I only get a single file named 1 containing one record, the last record of the input file. How do I split the input into multiple files based on ID?

user2759617 asked Jun 20 '26 11:06


1 Answer

You may use this awk command, which writes the header once per ID and then writes each row to the file for its ID:

awk -F, 'NR==1{hdr=$0; next} {fn="file_" $1 "_May_report"} !seen[$1]++{print hdr > fn} {print > fn}' input.csv

Or, in a more readable format:

awk -F, 'NR == 1 {
   hdr = $0                        # save the header line
   next
}
{
   fn = "file_" $1 "_May_report"   # output file for this row
}
!seen[$1]++ {
   print hdr > fn                  # first row for this ID: write the header
}
{
   print > fn
}' input.csv
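To try the approach end to end, here is a minimal sketch that recreates the sample input from the question in a scratch directory and runs the split. The scratch directory and the extra fourth data row (added so that IDs are interleaved) are assumptions for illustration only. The close() calls and the >> append are a variation, not part of the command above: they release each output file after every write, which may help on awk implementations that limit the number of simultaneously open files.

```shell
#!/bin/sh
# Work in a scratch directory (assumption: mktemp is available).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data from the question, plus one hypothetical extra row
# (ID 1 again after ID 2) to show that rows need not be grouped by ID.
cat > input.csv <<'EOF'
ID,DATE,EARNING
1,12 May 2018,5
1,13 May 2018,15
2,12 May 2018,25
1,14 May 2018,35
EOF

awk -F, '
NR == 1 { hdr = $0; next }                  # remember the header line
{ fn = "file_" $1 "_May_report" }           # output name for this row
!seen[$1]++ { print hdr > fn; close(fn) }   # first row for an ID: create file with header
{ print >> fn; close(fn) }                  # append the row, then release the file
' input.csv

# Show the results.
cat file_1_May_report
cat file_2_May_report
```

After the run, file_1_May_report should hold the header followed by the three rows with ID 1, and file_2_May_report the header followed by the single row with ID 2. Closing after every write is slower on large inputs, so it is probably only worth it when the number of distinct IDs is large.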
anubhava answered Jun 23 '26 10:06


