
Join multiple tables by row names [duplicate]

Tags: bash, shell, join, awk

I would like to merge multiple tables by row name. The tables have different numbers of rows, and each contains both shared and unique rows; all of them should appear in the output. If possible I would like to solve the problem with awk, but I am also fine with other solutions.

table1.tab

a 5
b 5
d 9

table2.tab

a 1
b 2
c 8
e 11

I would like to obtain the following table:

table3.tab

a 5 1
b 5 2
d 9 0
c 0 8
e 0 11

I tried using join

join table1.tab table2.tab > table3.tab

but I get

table3.tab

a 5 1
b 5 2

Rows c, d and e are missing from the output.


asked Aug 25 '13 09:08

user2715173

2 Answers

You want to do a full outer join:

join -a1 -a2 -o 0 1.2 2.2 -e "0" table1.tab table2.tab

a 5 1
b 5 2
c 0 8
d 9 0
e 0 11
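
A self-contained sketch of the same command, with the options spelled out: -a1 and -a2 also print the unpairable lines from the first and second file, -o 0,1.2,2.2 selects the join field followed by the second field of each file, and -e 0 fills any field that is missing with 0. join expects both inputs to be sorted on the join field; the sample tables already are, so the sort step and the scratch directory below are precautions I added, not part of the command above.

```shell
# Use one collation order for both sort and join.
export LC_ALL=C

# Recreate the sample tables from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf 'a 5\nb 5\nd 9\n'       > "$tmpdir/table1.tab"
printf 'a 1\nb 2\nc 8\ne 11\n' > "$tmpdir/table2.tab"

# join needs its inputs sorted on the join field (column 1 here).
sort -k1,1 "$tmpdir/table1.tab" > "$tmpdir/table1.sorted"
sort -k1,1 "$tmpdir/table2.tab" > "$tmpdir/table2.sorted"

# Full outer join: keep unpaired rows from both files, fill gaps with 0.
join -a1 -a2 -e 0 -o 0,1.2,2.2 \
    "$tmpdir/table1.sorted" "$tmpdir/table2.sorted" > "$tmpdir/table3.tab"

cat "$tmpdir/table3.tab"
```

Because join works on sorted input, the rows come out in key order (a, b, c, d, e) rather than in the order shown in the question.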


answered Oct 04 '22 02:10

Clayton Stanley

This awk one-liner should work for your example:

awk 'NR==FNR{a[$1]=$2;k[$1];next}{b[$1]=$2;k[$1]}
END{for(x in k)printf"%s %d %d\n",x,a[x],b[x]}' table1.tab table2.tab

Test:

kent$  head f1 f2
==> f1 <==
a 5
b 5
d 9

==> f2 <==
a 1
b 2
c 8
e 11

kent$  awk 'NR==FNR{a[$1]=$2;k[$1];next}{b[$1]=$2;k[$1]}END{for(x in k)printf"%s %d %d\n",x,a[x],b[x]}'  f1 f2
a 5 1
b 5 2
c 0 8
d 9 0
e 0 11
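
The question mentions multiple tables, while the one-liner handles exactly two. Here is a rough sketch of how the same idea might be extended to any number of files, assuming each file has a row name in column 1 and a value in column 2. The function name merge_tables, the file names t1 to t3 and the contents of the third table are made up for illustration. The order of for (k in keys) is unspecified in awk, so the result is piped through sort to make it predictable.

```shell
export LC_ALL=C

# merge_tables FILE...: one output row per row name, one column per file,
# with 0 where a file has no entry for that row name.
# Caveat: an empty input file triggers no FNR == 1 and so gets no column.
merge_tables() {
    awk '
        FNR == 1 { nfiles++ }            # first line of the next input file
        {
            keys[$1]                     # remember every row name seen
            val[$1, nfiles] = $2         # value of this row in this file
        }
        END {
            for (k in keys) {
                line = k
                for (i = 1; i <= nfiles; i++)
                    line = line " " (((k, i) in val) ? val[k, i] : 0)
                print line
            }
        }
    ' "$@" | sort -k1,1
}

# Sample data: the two tables from the question plus a hypothetical third.
tmpdir=$(mktemp -d)
printf 'a 5\nb 5\nd 9\n'       > "$tmpdir/t1"
printf 'a 1\nb 2\nc 8\ne 11\n' > "$tmpdir/t2"
printf 'b 7\nf 3\n'            > "$tmpdir/t3"

merge_tables "$tmpdir/t1" "$tmpdir/t2" "$tmpdir/t3"
```

Unlike join, this does not need sorted input, but it holds all rows in memory, which may matter for very large tables.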

answered Oct 04 '22 02:10

Kent
