How can I delete multiple headers from a file? I tried the code below, which I found in "How can I delete duplicate lines in a file in Unix?":
awk '!x[$0]++' file.txt
It deletes every duplicate record in the file. In my case, I only need the duplicate headers removed, not the duplicate records. For example, I have a file with the data below:
column1, column2, column3, column4, column5
value11, value12, value13, value14, value14
value21, value22, value23, value24, value25
value31, value32, value33, value34, value35
value41, value42, value43, value44, value45
value51, value52, value53, value54, value55
value21, value22, value23, value24, value25
column1, column2, column3, column4, column5
value11, value12, value13, value14, value14
value21, value22, value23, value24, value25
column1, column2, column3, column4, column5
column1, column2, column3, column4, column5
I expect the following output:
column1, column2, column3, column4, column5
value11, value12, value13, value14, value14
value21, value22, value23, value24, value25
value31, value32, value33, value34, value35
value41, value42, value43, value44, value45
value51, value52, value53, value54, value55
value21, value22, value23, value24, value25
value11, value12, value13, value14, value14
value21, value22, value23, value24, value25
If you know that the first line contains the header, just delete all other instances of it.
awk 'FNR==1 { header = $0; print }
$0 != header' file
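A quick way to check this is to run it on a small sample. The sketch below assumes a throwaway file created with mktemp and made-up two-column rows (neither comes from the question). It runs the same awk program and keeps the result in a variable so it can be inspected.

```shell
# Hypothetical sample: a header, a duplicated data row, and a repeated header.
tmp=$(mktemp)
printf '%s\n' \
  'column1, column2' \
  'value11, value12' \
  'value11, value12' \
  'column1, column2' \
  'value21, value22' > "$tmp"

# First rule: on line 1, remember the header and print it.
# Second rule: print any line that differs from the header.
# Line 1 equals the header, so the second rule does not print it again.
result=$(awk 'FNR==1 { header = $0; print }
$0 != header' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Duplicate data rows survive because the only comparison is against the saved header, not against previously seen lines.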
If that won't work, please tell us how we can identify a header line. If it's just a static string, use
grep -vF 'that string'
or, if it matches a particular regex,
grep -v 'that regex'
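Note that grep -v on its own drops every matching line, including the first header. One possible way around that, sketched here on the assumption that the header is the file's first line, is to print that line once and then filter the whole file. The temporary file and its rows are made up for illustration.

```shell
# Hypothetical sample file with the header repeated in the middle and at the end.
tmp=$(mktemp)
printf '%s\n' \
  'column1, column2' \
  'value11, value12' \
  'column1, column2' \
  'value21, value22' \
  'column1, column2' > "$tmp"

# Take the header from the first line.
header=$(head -n 1 "$tmp")

# Print the header once, then every line that is not exactly the header.
# -F: fixed string, -x: match whole lines only, -v: invert the match.
result=$(printf '%s\n' "$header"; grep -vxF -e "$header" "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

The -x flag is there so that a data row which merely contains the header text as a substring is not removed.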
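This might work for you (GNU sed). The trailing $ makes the match exact, so blank lines are not deleted along with the repeated headers:
sed -r '1h;1!G;/^(.*)\n\1$/d;P;D' file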
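For readers unfamiliar with the hold space, here is a rough walk-through of that sed approach on a made-up sample, assuming GNU sed. It uses -E, which GNU sed accepts as a synonym for -r, and anchors the match with $ so that only an exact copy of the header is dropped.

```shell
# Hypothetical sample: header, data row, blank line, repeated header, data row.
tmp=$(mktemp)
printf '%s\n' \
  'column1, column2' \
  'value11, value12' \
  '' \
  'column1, column2' \
  'value21, value22' > "$tmp"

# 1h      : on line 1, copy the header into the hold space
# 1!G     : on every other line, append a newline plus the held header
# /.../d  : if the text before the newline equals the text after it,
#           the line is a copy of the header, so delete it
# P       : otherwise print up to the first newline (the current line)
# D       : delete up to the first newline; what remains is the appended
#           header, which the restarted cycle pairs with itself and deletes
result=$(sed -E '1h;1!G;/^(.*)\n\1$/d;P;D' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

On this sample the blank line is kept and only the second header line disappears.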