Subtracting lines in one file from another file

Question

I couldn't find an answer that truly subtracts one file from another.

My goal is to remove lines from one file that occur in another file. Multiple occurrences should be respected: for example, if a line occurs 4 times in file A and only once in file B, file C should contain 3 of those lines.

File A:

File B:

1
3
4

File C (desired output):

3
3
4

Thanks in advance

James Brown · Accepted Answer

In awk:

$ awk 'NR==FNR{a[$0]--;next} ($0 in a) && ++a[$0] > 0' f2 f1
3
3
4

Explained:

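NR==FNR {                  # while reading the first file on the command line (f2)
    a[$0]--                # count each line downward: a[line] goes 0, -1, -2, ...
    next                   # skip the test below for f2
}
($0 in a) && ++a[$0] > 0   # for f1: if the line occurred in f2, increment its count
                           # and print only once the count exceeds 0, i.e. after as
                           # many copies as f2 contained have been skipped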
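
The counting can be checked against the "4 times in A, once in B" case from the question. This is only a sketch: the file names a.txt and b.txt and their contents are made up for illustration and are not the asker's files.

```shell
# Hypothetical sample data, created in a throwaway directory.
tmp=$(mktemp -d)
printf '%s\n' apple apple apple apple pear > "$tmp/a.txt"   # file to filter
printf '%s\n' apple plum > "$tmp/b.txt"                     # file to subtract

# The file to subtract goes first, the file to filter second.
result=$(awk 'NR==FNR{a[$0]--;next} ($0 in a) && ++a[$0] > 0' "$tmp/b.txt" "$tmp/a.txt")
printf '%s\n' "$result"   # prints "apple" three times

rm -r "$tmp"
```

Note that pear is not printed: with the ($0 in a) test, only lines that also occur in the subtracted file are candidates for output.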

The command above prints only lines that also occur in f2. If you also want to print the lines of f1 that do not occur in f2 at all, drop the ($0 in a) && test:

$ echo 5 >> f1
$ awk 'NR==FNR{a[$0]--;next} (++a[$0] > 0)' f2 f1
3
3
4
5
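
For repeated use, this variant could be wrapped in a small shell function. The sketch below assumes a helper named subtract_lines (a made-up name) and made-up sample files. It takes the file to filter first and the file to subtract second, then swaps them for awk.

```shell
# subtract_lines FILE_TO_FILTER FILE_TO_SUBTRACT   (hypothetical helper name)
# Prints FILE_TO_FILTER minus one copy of a line per copy in FILE_TO_SUBTRACT.
subtract_lines() {
    awk 'NR==FNR{a[$0]--;next} (++a[$0] > 0)' "$2" "$1"
}

# Hypothetical sample data, created in a throwaway directory.
tmp=$(mktemp -d)
printf '%s\n' apple apple apple apple pear > "$tmp/a.txt"
printf '%s\n' apple plum > "$tmp/b.txt"

kept=$(subtract_lines "$tmp/a.txt" "$tmp/b.txt")
printf '%s\n' "$kept"   # prints apple, apple, apple, pear

rm -r "$tmp"
```

One caveat with the NR==FNR idiom: if the file to subtract is empty, NR==FNR is also true for every line of the second file, so nothing is printed. If that case can occur, it needs separate handling.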

Tags: unix, sed, awk, Hawk
