
awk: read pattern from file, awk '$2 !~ /{newline delimited file}/ && $1 > 5000'

Tags: file, bash, list, awk

I have a command that pipes output like:

   1365 8.67.53.0.9
   2657 1.2.3.4
   5956 127.9.9.0
  10463 127.0.0.1
  15670 6.5.4.3.2
  17984 -

to:

awk '$2 !~ /-|127.0.0.1|6.5.4.3.2/ && $1 > 5000'

which should print:

   5956 127.9.9.0

or all the ones where $2 doesn't contain -, 127.0.0.1, or 6.5.4.3.2 and where $1 is greater than 5000.

I would like to keep all of the values that should be ignored in a newline delimited file like:

-
127.0.0.1
6.5.4.3.2

rather than within the regex /-|127.0.0.1|6.5.4.3.2/ because my list of these will be growing.

Ideally, this could be within a single command and not a function or awk program file. Also, if possible I would like the matching to be more exact (less greedy?). I think the current regex will also match something like 127.0.0.11 or 6.5.4.3.22.

Special Monkey asked Oct 25 '25 16:10


2 Answers

You can keep the values to be skipped in a file called skip, like this:

cat skip

-
127.0.0.1
6.5.4.3.2

Then run awk using both files as:

awk 'NR == FNR {omit[$1]; next} $1 > 5000 && !($2 in omit)' skip file

  5956 127.9.9.0

Here:

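  • While processing the first file, i.e. skip, we store each value as a key in the array omit.
  • Then, while processing the main file, we print a line only if $1 > 5000 and $2 is not a key in omit. Because this is an exact key lookup rather than a regex match, 127.0.0.11 is not treated as 127.0.0.1.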
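
Since the question pipes the data in rather than reading it from a file, the same command can take "-" as its second operand so that awk reads standard input after the skip file. The sketch below is only an illustration: the scratch directory, the produce_counts function, and the extra 127.0.0.11 row are assumptions added to show the exact matching, not part of the original data.

```shell
# Hypothetical scratch directory, created only so the sketch is self-contained.
workdir=$(mktemp -d)
printf '%s\n' '-' '127.0.0.1' '6.5.4.3.2' > "$workdir/skip"

# Stand-in for the real command whose output is piped to awk.
# The 127.0.0.11 row is an added example, not from the original data.
produce_counts() {
  printf '%s\n' \
    '   1365 8.67.53.0.9' \
    '   2657 1.2.3.4' \
    '   5956 127.9.9.0' \
    '   6000 127.0.0.11' \
    '  10463 127.0.0.1' \
    '  15670 6.5.4.3.2' \
    '  17984 -'
}

# "-" as the last operand makes awk read standard input after the skip file.
result=$(produce_counts |
  awk 'NR == FNR {omit[$1]; next} $1 > 5000 && !($2 in omit)' "$workdir/skip" -)
printf '%s\n' "$result"

rm -rf "$workdir"
```

One caveat worth keeping in mind: if skip is empty, NR == FNR stays true while the piped data is read, so nothing is printed.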
anubhava answered Oct 27 '25 05:10


Given a file named input:

127.0.0.1
6.5.4.3.2

and a file named file:

   1365 8.67.53.0.9
   2657 1.2.3.4
   5956 127.9.9.0
  10463 127.0.0.1
  15670 6.5.4.3.2
  17984 -

# read the input file, join its lines with | and escape each dot as [.]
$ ips=$(< input); ips=${ips//$'\n'/|}; ips=${ips//./[.]};
# build an anchored regex; note that - is added here rather than read from input
$ regex="^(-|${ips})$"
# pass regex to awk as variable and run logic
$ awk -v regex="$regex" '$2 !~ regex && $1 >5000' file
   5956 127.9.9.0
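
The parameter substitutions above rely on bash. As a rough sketch for other shells, the same anchored regex could be assembled with sed and paste instead; this is a swapped-in variant, not the method used above. It assumes the - entry is kept in the list file, that the dot is the only regex metacharacter that needs escaping, and it uses a scratch directory and an extra 127.0.0.11 row that are added purely for illustration.

```shell
# Hypothetical scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)
printf '%s\n' '-' '127.0.0.1' '6.5.4.3.2' > "$workdir/input"
# Shortened sample data; the 127.0.0.11 row is an added example.
printf '%s\n' \
  '   5956 127.9.9.0' \
  '   6000 127.0.0.11' \
  '  10463 127.0.0.1' \
  '  17984 -' > "$workdir/file"

# Escape each literal dot as [.], then join the lines with "|".
alternatives=$(sed 's/[.]/[.]/g' "$workdir/input" | paste -sd '|' -)
# Anchor both ends so that only whole-field matches are excluded.
regex="^(${alternatives})\$"
printf '%s\n' "$regex"

result=$(awk -v regex="$regex" '$2 !~ regex && $1 > 5000' "$workdir/file")
printf '%s\n' "$result"

rm -rf "$workdir"
```

With the anchors in place, 127.0.0.11 no longer matches the 127.0.0.1 alternative, so that row is kept.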
Paolo answered Oct 27 '25 05:10