I want to combine columns 1 and 2 and add them as a new column in my data frame. Then I want to print all the old columns and the newly created column. I can combine the columns using the script below, but not sure how to print all columns, not only the combined:
awk ' { print $1 $2 "_" $NF } ' input_file
in
c1 c2 c3
12 1 12
4 4 57
out
c1 c2 c3 c4
12 1 12 12_1
4 4 57 4_4
Join columns using the Merge Cells add-in for Excel With the Merge Cells add-in, you can combine data from several cells using any separator you like (e.g. space, comma, carriage return or line break). You can join values row by row, column by column or merge data from the selected cells into one without losing it.
Here is a Generic solution in awk
. Just mention field numbers in awk
variable named fields
eg: 1,2,3,4,7,8
(example) and it will add all fields values to last column. Written and tested in GNU awk
should work in any awk
.
awk -v fields="1,2" '
BEGIN{
num=split(fields,arr,",")
for(i=1;i<=num;i++){
field[arr[i]]
}
}
FNR==1{
print
next
}
{
val=""
for(i=1;i<=NF;i++){
if(i in field){
val=(val?val "_":"")$i
}
}
print $0,val
}
' Input_file
$ awk '{print $0, (NR>1 ? $1"_"$2 : "c4")}' file
c1 c2 c3 c4
12 1 12 12_1
4 4 57 4_4
or to get tab-separated output if your input is tab-separated:
$ awk 'BEGIN{FS=OFS="\t"} {print $0, (NR>1 ? $1"_"$2 : "c4")}' file
c1 c2 c3 c4
12 1 12 12_1
4 4 57 4_4
or if it isn't:
$ awk -v OFS='\t' '{$(NF+1)=(NR>1 ? $1"_"$2 : "c4")} 1' file
c1 c2 c3 c4
12 1 12 12_1
4 4 57 4_4
If you want to print the _
between field 1 and 2, then the first output would be c1 c2 c3 c1_c2
instead of c1 c2 c3 c4
You can add a column at the end with the value of $1
and $2
and then print the whole line:
awk ' { $(NF+1) = $1"_"$2 }1' input_file
Output
c1 c2 c3 c1_c2
12 1 12 12_1
4 4 57 4_4
Or you can print the whole line followed by field $1
and $2
awk '{print $0, $1"_"$2}' input_file
Output
c1 c2 c3 c1_c2
12 1 12 12_1
4 4 57 4_4
Another awk which at FNR==1
uses the field name in $NF
to create the field name for the next field (c3
-> c4
, c
-> c1
, etc):
$ awk '{
printf "%s%s%s\n",
$0,
OFS,
(FNR>1?$1 "_" $2:(match($3,/[0-9]+$/)?substr($3,1,RSTART-1) substr($3,RSTART)+1:$3 1))
}' file
Output:
c1 c2 c3 c4
12 1 12 12_1
4 4 57 4_4
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With