Bash set subtraction

Tags: bash

How to subtract a set from another in Bash?

This is similar to "Is there a "set" data structure in bash?", but it asks specifically how to perform the subtraction, with code.

  • set1: N lines as output by a filter
  • set2: M lines as output by a filter

How do I get:

  • set3: all lines in set1 that don't appear in set2

Robottinosino asked Aug 15 '12 03:08



3 Answers

comm -23 <(command_which_generate_N|sort) <(command_which_generate_M|sort)

With no options, comm prints three columns: (1) lines only in the first file, (2) lines only in the second file, (3) lines in both files. -23 suppresses the second and third columns, leaving only the lines unique to the first file.

$ cat > file1.list
A
B
C
$ cat > file2.list
A
C
D
$ comm file1.list file2.list 
        A
B
        C
    D
$ comm -12 file1.list file2.list # In both
A
C
$ comm -23 file1.list file2.list # Only in set 1
B
$ comm -13 file1.list file2.list # Only in set 2
D

Input files must be sorted.

GNU sort and comm depend on the locale: the output order may differ between locales, although the content is the same. To make the collation order explicit, force the C locale:

(export LC_ALL=C; comm -23 <(command_which_generate_N|sort) <(command_which_generate_M|sort))
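
The same pipeline can be wrapped in a small function. This is only a sketch: set_subtract is a hypothetical helper name, not something from the answer, and it assumes both inputs are files (or process substitutions) readable by sort. It uses sort -u so that duplicate lines within one input do not show up twice in the result.

```shell
#!/usr/bin/env bash
# set_subtract FILE_N FILE_M   (hypothetical helper name)
# Prints the lines of FILE_N that do not appear in FILE_M.
set_subtract() {
    # The LC_ALL=C prefix applies only to the command it precedes, so it
    # is repeated for each sort: sort and comm must agree on ordering.
    # sort -u also drops duplicate lines, so each input behaves like a set.
    LC_ALL=C comm -23 <(LC_ALL=C sort -u -- "$1") <(LC_ALL=C sort -u -- "$2")
}
```

With the files from the example above, set_subtract file1.list file2.list prints B. Command output can be passed the same way: set_subtract <(command_which_generate_N) <(command_which_generate_M).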
Nahuel Fouilleul answered Oct 02 '22 21:10


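uniq -u (manpage) is often the simplest tool for list subtraction, as long as the second list is a subset of the first and neither list contains duplicates; otherwise the result is the symmetric difference of the two lists: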

Usage

uniq [OPTION]... [INPUT [OUTPUT]] 
[...]
-u, --unique
    only print unique lines

Example: list files found in directory a but not in b

$ ls a
file1  file2  file3
$ ls b
file1  file3

$ { ls a; ls b; } | sort | uniq -u
file2
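
If b may contain entries that are missing from a, sort | uniq -u would print those as well. A common workaround, sketched here under the assumption that both lists are stored in files, is to feed the second list in twice so that none of its lines can ever be unique. subtract_with_uniq is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# subtract_with_uniq FILE_N FILE_M   (hypothetical helper name)
# Prints the lines of FILE_N that are absent from FILE_M.
subtract_with_uniq() {
    # FILE_M is read twice, so each of its lines occurs at least twice
    # and uniq -u never prints it. FILE_N goes through sort -u first so
    # that a line repeated inside FILE_N is not discarded by mistake.
    { sort -u -- "$1"; cat -- "$2" "$2"; } | sort | uniq -u
}
```

For directory listings the inputs could be supplied as subtract_with_uniq <(ls a) <(ls a), replacing the second argument with a regular file holding the listing of b, because the second input is read twice and a process substitution can only be read once.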
YSC answered Oct 02 '22 21:10


I've got a dead-simple 1-liner:

$ now=(ConfigQC DBScripts DRE DataUpload WFAdaptors.log)

$ later=(ConfigQC DBScripts DRE DataUpload WFAdaptors.log baz foo)

$ printf "%s\n" "${now[@]}" "${later[@]}" | sort | uniq -c | grep -vE '^ *2 ' | awk '{print $2}'
baz
foo

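By definition, two sets intersect if they have elements in common. Here there are two sets, so any element counted twice by uniq -c belongs to the intersection; grep -v simply "subtracts" those lines. This assumes neither array contains duplicates, and it prints every element found in only one of the arrays, so elements present only in now would be listed as well.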
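
For a one-directional subtraction that stays inside bash, a lookup table is an alternative. The following is a sketch, not the answer's method: array_subtract is a hypothetical helper name, it assumes bash 4.3 or newer (associative arrays and namerefs), and it assumes no array element is the empty string.

```shell
#!/usr/bin/env bash
now=(ConfigQC DBScripts DRE DataUpload WFAdaptors.log)
later=(ConfigQC DBScripts DRE DataUpload WFAdaptors.log baz foo)

# array_subtract A_NAME B_NAME   (hypothetical helper name)
# Prints the elements of array A_NAME that are not in array B_NAME.
# The arrays are passed by name, not by value.
array_subtract() {
    local -n minuend=$1 subtrahend=$2   # namerefs need bash 4.3+
    local -A seen=()
    local item
    # Record every element of the array being subtracted.
    for item in "${subtrahend[@]}"; do seen[$item]=1; done
    # Print the elements of the first array that were not recorded.
    for item in "${minuend[@]}"; do
        [[ -n ${seen[$item]+x} ]] || printf '%s\n' "$item"
    done
}

array_subtract later now   # prints baz, then foo
```

Unlike the counting approach, array_subtract now later prints nothing here, because every element of now also appears in later. The output keeps the order of the first array and needs no sorting.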

Christian Bongiorno answered Oct 02 '22 20:10