How can I find the missing integers in a unique and sequential list (one per line) in a unix terminal?

Tags: bash, unix, awk

Suppose I have a file as follows (a sorted, unique list of integers, one per line):

1
3
4
5
8
9
10

I would like the following output (i.e. the missing integers in the list):

2
6
7

How can I accomplish this within a bash terminal (using awk or a similar solution, preferably a one-liner)?

Jake Sebright asked Jul 20 '16 22:07


4 Answers

Using awk you can do this:

awk '{for(i=p+1; i<$1; i++) print i} {p=$1}' file

2
6
7

Explanation:

  • {p = $1}: stores the current record's value in p, so that on the next record p holds the previous value
  • {for ...}: loops from p+1 up to, but not including, the current record's value and prints each integer; these are the missing values
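
The one-liner above relies on p being empty (numerically 0) before the first record, so a list that starts above 1 also reports every integer from 1 up to its first value. A minimal sketch of a variant that only reports gaps between the first and last values, assuming that is the behaviour wanted (the missing_between function name and the sample data are made up for illustration):

```shell
# Hypothetical helper: print the integers missing between consecutive lines
# of a sorted, unique list stored in the file named by $1.
missing_between() {
    # NR > 1 skips the gap check on the first record, where p is still unset.
    awk 'NR > 1 { for (i = p + 1; i < $1; i++) print i } { p = $1 }' "$1"
}

# Sample data that does not start at 1 (illustrative only).
tmp=$(mktemp)
printf '%s\n' 3 4 7 > "$tmp"
result=$(missing_between "$tmp")
echo "$result"
rm -f "$tmp"
```

With the guard in place, the leading values 1 and 2 are not reported for this sample; only the gap between 4 and 7 is.
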
anubhava answered Nov 18 '22 19:11


Using seq and grep:

seq $(head -n1 file) $(tail -n1 file) | grep -vwFf file -

seq creates the full sequence from the first to the last value, and grep removes from it the lines that exist in the file.
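
To make the role of each grep flag concrete, here is a small sketch of the same pipeline run on the sample data from the question (the temporary file and the variable names are illustrative, not part of the original answer):

```shell
# Sample input in a temporary file (illustrative only).
tmp=$(mktemp)
printf '%s\n' 1 3 4 5 8 9 10 > "$tmp"

first=$(head -n1 "$tmp")   # smallest value, because the list is sorted
last=$(tail -n1 "$tmp")    # largest value

# -f FILE : read the patterns (one per line) from FILE
# -F      : treat the patterns as fixed strings, not regular expressions
# -w      : match whole words only, so the pattern 1 does not match 10
# -v      : invert the match, keeping sequence lines that are NOT in FILE
missing=$(seq "$first" "$last" | grep -vwFf "$tmp")
echo "$missing"
rm -f "$tmp"
```

Without -w, the pattern 1 would also match the line 10 as a substring, which is why the whole-word restriction matters here.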

choroba answered Nov 18 '22 21:11


perl -nE 'say for $a+1 .. $_-1; $a=$_' file
JJoao answered Nov 18 '22 19:11


Using Raku (formerly known as Perl 6):

raku -e 'my @a = lines.map: *.Int; say @a.Set (^) @a.minmax.Set;' 

Sample Input:

1
3
4
5
8
9
10

Sample Output:

Set(2 6 7)

I'm sure there's a Raku solution similar to @JJoao's clever Perl 5 answer, but in thinking about this problem my mind naturally turned to Set operations.

The code above reads lines into the @a array, mapping each line to an Int so that the elements are integers rather than strings. In the second statement, @a.Set on the left-hand side of the (^) operator converts the array to a Set. On the right-hand side, @a.minmax.Set builds a second Set; because minmax returns the Range from the smallest to the largest element, this Set contains every Int from min to max. Finally, (^) is the infix symmetric set-difference operator, which returns the elements found in exactly one of the two Sets. Since every element of @a also lies in the min-to-max Range, those elements are exactly the missing integers.
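
The same set-difference idea can be sketched with standard shell tools. This is not the Raku code, only an analogue built on sort and comm, with temporary file names chosen for illustration. comm needs both inputs sorted in the same collating order, so both sides are sorted as text under LC_ALL=C and the result is re-sorted numerically at the end:

```shell
# Temporary files for the input, the "have" set and the full min..max set.
tmp_in=$(mktemp); tmp_have=$(mktemp); tmp_all=$(mktemp)
printf '%s\n' 1 3 4 5 8 9 10 > "$tmp_in"

# Left-hand set: the distinct values present in the input.
LC_ALL=C sort -u "$tmp_in" > "$tmp_have"

# Right-hand set: every integer from the minimum to the maximum.
min=$(sort -n "$tmp_in" | head -n1)
max=$(sort -n "$tmp_in" | tail -n1)
seq "$min" "$max" | LC_ALL=C sort > "$tmp_all"

# comm -13 suppresses columns 1 and 2, leaving lines found only in the
# second file. The first set is a subset of the second, so this equals
# the symmetric difference of the two sets.
set_diff=$(LC_ALL=C comm -13 "$tmp_have" "$tmp_all" | sort -n)
echo "$set_diff"
rm -f "$tmp_in" "$tmp_have" "$tmp_all"
```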

To get an unordered whitespace-separated list of missing integers, replace the above say with put. To get a sequentially-ordered list of missing integers, add the explicit sort below:

~$ raku -e 'my @a = lines.map: *.Int; .put for (@a.Set (^) @a.minmax.Set).sort.map: *.key;' file
2
6
7

The advantage of all the Raku code above is that finding the missing integers requires neither sorted nor unique input. So hopefully this code will be useful for a wide variety of problems in addition to the explicit problem stated in the Question.
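
For the awk one-liner earlier in this thread, a similar tolerance for unsorted or duplicated input can be had by sorting first. A minimal sketch, assuming a sort that supports -n and -u together (the sample data and the gaps variable are illustrative):

```shell
# Unsorted sample input containing a duplicate (illustrative only).
tmp=$(mktemp)
printf '%s\n' 9 3 1 4 3 10 8 5 > "$tmp"

# sort -n orders numerically and -u drops duplicates, so awk receives a
# sorted, unique stream and the original gap-printing loop applies as is.
gaps=$(sort -nu "$tmp" | awk '{for(i=p+1; i<$1; i++) print i} {p=$1}')
echo "$gaps"
rm -f "$tmp"
```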

OTOH, Raku is a Perl-family language, so TMTOWTDI. Below, the @a.minmax Range is grepped so that only the values equal to none of the elements of @a are returned (a none junction):

~$ raku -e 'my @a = lines.map: *.Int;  .put for @a.minmax.grep: none @a;'  file
2
6
7

https://docs.raku.org/language/setbagmix
https://docs.raku.org/type/Junction
https://raku.org

jubilatious1 answered Nov 18 '22 20:11