Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Linux awk merge two files

Tags:

linux

merge

awk

I have below script to combine two files.

awk -F"\t" '
    {key = $1}
    !(key in result) {result[key] = $0; next;}
    { for (i=2; i <= NF; i++) result[key] = result[key] FS $i }
    END {
        PROCINFO["sorted_in"] = "@ind_str_asc"    # if using GNU awk
        for (key in result) print result[key]
    }
' $1 $2 > $3 

First column is key and both $1 and $2. But if $2 column has a key but $1 column does not have key.

then also it's combining except $1 row.

I want to combine only incase $1 key is existed. How can I simply combine this two files?

For example,

File 1

Key    Column1  Column2  Column3  
Test1    500     400     200               
Test2    499     400     200               
Test5    600     200     150               
Test6    600     199     150               
Test7    599     199     100               

File2

Key    Column4   Column5
Test1    Good     Good                    
Test2    Good     Good
Test3    Good     Good                    
Test4    Good     Good
Test5    Good     Good                    
Test6    Good     Good
Test7    Good     Good

Current Combine

Key    Column1  Column2  Column3  Column4   Column5
Test1    500     400     200       Good     Good     
Test2    499     400     200       Good     Good      
Test5    600     200     150       Good     Good          
Test6    600     199     150       Good     Good          
Test7    599     199     100       Good     Good  
Test3    Good    Good  
Test4    Good    Good  

Expected Combine.

Key    Column1  Column2  Column3  Column4   Column5
Test1    500     400     200       Good     Good     
Test2    499     400     200       Good     Good      
Test5    600     200     150       Good     Good          
Test6    600     199     150       Good     Good          
Test7    599     199     100       Good     Good 

Thank you!

like image 945
clear.choi Avatar asked Jun 05 '15 21:06

clear.choi


1 Answers

You're going about this wrong. What you are describing is a join operation and there is a perfectly good UNIX tool for that with a very obvious name:

$ join file1 file2 | column -t
Key    Column1  Column2  Column3  Column4  Column5
Test1  500      400      200      Good     Good
Test2  499      400      200      Good     Good
Test5  600      200      150      Good     Good
Test6  600      199      150      Good     Good
Test7  599      199      100      Good     Good

or if you insist on awk:

$ awk 'NR==FNR{m[$1]=$2" "$3; next} {print $0, m[$1]}' file2 file1 | column -t
Key    Column1  Column2  Column3  Column4  Column5
Test1  500      400      200      Good     Good
Test2  499      400      200      Good     Good
Test5  600      200      150      Good     Good
Test6  600      199      150      Good     Good
Test7  599      199      100      Good     Good
like image 177
Ed Morton Avatar answered Sep 24 '22 00:09

Ed Morton