
Split file on the value of a certain column into separate files and also include the header

Tags:

bash

csv

awk

fullfile.csv:

animal,number
rabbit,1
fish,2
mouse,1
dog,1
lizard,2
cat,2

I want to split the file on the value in the second column, so I used this command:

awk 'BEGIN {FS = ","}; {print > ("file"$2".csv")}' fullfile.csv

Outputs:

file1.csv

rabbit,1
mouse,1
dog,1

file2.csv

fish,2
lizard,2
cat,2

However, there is no header in file1.csv or file2.csv, so I tried to add it like so:

awk 'BEGIN {FS = ","}; NR==1 { print } {print > ("file"$2".csv")}' fullfile.csv

But the header prints to the command line instead of going to each file. How do I get the header to be included in each file?

SonicProtein asked Oct 19 '25 12:10


1 Answer

You can also specify the field separator outside of the awk script with awk -F",".
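As a quick check of that equivalence, here is a minimal sketch that runs both forms on a small input and compares the results. The file names sample.csv, inside.txt and outside.txt are made up for this illustration and are not part of the question.

```shell
# Scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical three-row sample, no header.
printf 'rabbit,1\nfish,2\nmouse,1\n' > sample.csv

# Separator set inside the script, as in the question.
awk 'BEGIN {FS = ","} {print $2}' sample.csv > inside.txt

# Same separator given on the command line with -F.
awk -F',' '{print $2}' sample.csv > outside.txt

# cmp is silent and exits 0 when the two files match.
cmp inside.txt outside.txt && echo "identical"
```

Both invocations split each line on the comma, so the second field is the same either way.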

You could store the header in a variable when NR==1. Then track the values of the second column in an array, and write the header to a file only the first time its value appears, that is, when the value is NOT in the array yet. Once the value is in the array, each line is simply written to its respective file, as in your original command:

awk -F"," 'NR==1{header=$0}NR>1&&!a[$2]++{print header > ("file"$2".csv")}NR>1{print > ("file"$2".csv")}' fullfile.csv

Output:

file1.csv

animal,number
rabbit,1
mouse,1
dog,1

file2.csv

animal,number
fish,2
lizard,2
cat,2
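
The one-liner is dense, so here is the same idea spread over several lines with comments. This is only a sketch of the approach described above, not a different method. The variable names out and seen are chosen for this illustration, and it runs in a scratch directory created with mktemp so it does not overwrite existing files.

```shell
# Work in a scratch directory so the generated files do not clobber anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample input from the question.
cat > fullfile.csv <<'EOF'
animal,number
rabbit,1
fish,2
mouse,1
dog,1
lizard,2
cat,2
EOF

awk -F',' '
NR == 1 { header = $0; next }   # remember the header, write nothing yet
{
    out = "file" $2 ".csv"      # target file for this row
    if (!(out in seen)) {       # first row for this file: write the header first
        print header > out
        seen[out] = 1
    }
    print > out                 # then the data row itself
}
' fullfile.csv

cat file1.csv
```

The `next` on the header line stops it from being treated as a data row, which is what the `NR>1` conditions do in the one-liner. Building the file name in a variable first also avoids having to parenthesize the concatenation after `>`.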

isosceleswheel answered Oct 21 '25 00:10


