I have a long log file of this form:
Processin SCRIPT10 file..
Submitted batch job 1715572
Processin SCRIPT100 file..
Processin SCRIPT1000 file..
Submitted batch job 1715574
Processin SCRIPT10000 file..
Processin SCRIPT10001 file..
Processin SCRIPT10002 file..
Submitted batch job 1715577
Processin SCRIPT10003 file..
Submitted batch job 1715578
Processin SCRIPT10004 file..
Submitted batch job 1715579
I want to find the jobs (script names) that were not submitted, that is, those whose "Processin" line is not immediately followed by a "Submitted batch job" line.
So far I have tried to do this using
pcregrep -M "Processin.*\n.*Processin" execScripts2.log | awk 'NR % 2 == 0'
But it does not properly handle the case where several consecutive scripts are not submitted. It outputs, surprisingly, only the SCRIPT1000 and SCRIPT10001 lines. Can you show me a better one-liner?
Ideally the output would be only the lines without 'Submitted' on the next line (or just the script names), that is:
SCRIPT100
SCRIPT10000
SCRIPT10001
Thanks.
This awk command can do the job:
awk -v s='Submitted' '$1 != s{if(p != "") print p; p=$2} $1 == s{p=""}' file
SCRIPT100
SCRIPT10000
SCRIPT10001
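To see how it works, here is a sketch of the same idea written out over several lines with comments. The variable p holds the most recent script name that has not yet been confirmed by a "Submitted" line. The temporary file, the shell variable names log and result, and the END rule are additions for illustration and are not part of the one-liner above. The END rule assumes that a script still pending at the end of the file should also be reported.

```shell
# Sample log copied from the question, written to a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
Processin SCRIPT10 file..
Submitted batch job 1715572
Processin SCRIPT100 file..
Processin SCRIPT1000 file..
Submitted batch job 1715574
Processin SCRIPT10000 file..
Processin SCRIPT10001 file..
Processin SCRIPT10002 file..
Submitted batch job 1715577
Processin SCRIPT10003 file..
Submitted batch job 1715578
Processin SCRIPT10004 file..
Submitted batch job 1715579
EOF

result=$(awk -v s='Submitted' '
  # A "Submitted" line confirms the pending script, so forget it.
  $1 == s { p = ""; next }
  # Any other line: a script still pending at this point was never submitted.
  { if (p != "") print p; p = $2 }
  # Assumption: a script still pending at end of file was not submitted either.
  END { if (p != "") print p }
' "$log")

printf '%s\n' "$result"
rm -f "$log"
```

On the sample log this prints the same three names as the one-liner. The two versions differ only when the file ends with a "Processin" line: the one-liner stays silent about that last script, while this variant reports it.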
Reference: Effective AWK Programming