I have a file like this:
bar 1
foo 1
how now
manchu 50
foo 2
brown cow
manchu 55
foo 3
the quick brown
manchu 1
bar 2
foo 1
fox jumped
manchu 8
foo 2
over the
manchu 20
foo 3
lazy dog
manchu 100
foo 4
manchu 5
foo 5
manchu 7
bar 3
bar 4
I want to search for 'manchu 55' and receive:
FOONUMBER=2
(The foo # above 'manchu 55')
BARNUMBER=1
(The bar # above that foo)
PHRASETEXT="brown cow"
(The text on the line above 'manchu 55')
So I can ultimately output:
brown cow, bar 1, foo 2.
Thus far I've accomplished this with some really ugly grep code like:
FOONUMBER=`grep -e "manchu 55" -e ^" foo" -e ^"bar" | grep -B 1 "manchu 55" | grep "foo" | awk '{print $2}'`
BARNUMBER=`grep -e ^" foo $FOONUMBER" -e ^"bar" | grep -B 1 "foo $FOONUMBER" | grep "bar" | awk '{print $2}'`
PHRASETEXT=`grep -B 1 "manchu 55" | grep -v "manchu 55"`
There are 3 problems with this code:
I suspected I could do this with sed, doing something like:
FOONUMBER=`sed -n '/foo/,/manchu 55/p' | grep foo | awk '{print $2}'`
Unfortunately sed is too greedy. I've been reading on AWK and state machines, which seems like it might be a better way to do this, but I still don't understand it well enough to set it up.
As you may have been able to determine by now, programming is not what I do for a living, but ultimately I have had this thrust upon me. I'm hoping to rewrite what I already have to be more efficient and hopefully not too complicated as some other poor sod without a programming degree will probably end up having to support any changes to it at some future date.
With awk:
awk -v nManchu=55 -v OFS=", " '
$1 == "bar" {bar = $0} # store the most recently seen "bar" line
$1 == "foo" {foo = $0} # store the most recently seen "foo" line
$1 == "manchu" && $2 == nManchu {print prev, bar, foo}
{prev = $0} # remember the previous line
' file
outputs
brown cow, bar 1, foo 2
Running with "nManchu=100" outputs
lazy dog, bar 2, foo 3
This has the advantage of only taking a single pass through the file, instead of parsing the file 3 times to get "bar", "foo" and the prev line.
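If the three values are needed as separate shell variables, as in the question, the same single-pass idea can be adapted. The following is only a sketch under a few assumptions: the data file is a temporary file created here from a shortened copy of the sample (the name `datafile` is made up for illustration), the phrase text contains no tab characters, and only the first matching line is wanted.

```shell
# Shortened copy of the sample data, written to a temporary file
# (illustration only; "datafile" is a hypothetical name).
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
bar 1
foo 1
how now
manchu 50
foo 2
brown cow
manchu 55
foo 3
the quick brown
manchu 1
bar 2
foo 1
fox jumped
manchu 8
EOF

# One pass: remember the latest bar/foo numbers and the previous line,
# then print them tab-separated when the requested manchu line appears.
result=$(awk -v nManchu=55 '
    $1 == "bar" {bar = $2}   # number of the most recently seen "bar" line
    $1 == "foo" {foo = $2}   # number of the most recently seen "foo" line
    $1 == "manchu" && $2 == nManchu {print foo "\t" bar "\t" prev; exit}
    {prev = $0}              # remember the previous line
' "$datafile")

# Split the tab-separated fields into the variable names used in the question.
IFS="$(printf '\t')" read -r FOONUMBER BARNUMBER PHRASETEXT <<EOF
$result
EOF

echo "$PHRASETEXT, bar $BARNUMBER, foo $FOONUMBER."
rm -f "$datafile"
```

The tab separator is a design choice so that phrases containing spaces, such as "brown cow", stay in one variable.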
I would suggest
sed -n '/foo/ { s/.*foo\s*//; h }; /manchu 55/ { x; p }' filename
This is very simple:
/foo/ { # if you see a line with "foo" in it,
s/.*foo\s*// # isolate the number
h # and put it in the hold buffer
}
/manchu 55/ { # if you see a line with "manchu 55" in it,
x # exchange hold buffer and pattern space
p # and print the pattern space.
}
This will then print the last number seen after a foo before the manchu 55 line. The bar number can be extracted essentially the same way, and for the phrase text you could use
sed -n '/manchu 55/ { x; p }; h'
to get the line held before manchu 55 is seen. Or possibly
sed -n '/manchu 55/ { x; p }; s/^\s*//; h'
to remove leading white spaces in such a line.
If you are certain that only one manchu 55 line exists in the file or you only want the first match, you can replace x; p with x; p; q. The q will then quit directly after the result is printed.
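Putting the pieces together, the three values from the question could be collected roughly like this. It is a sketch rather than a drop-in solution: it assumes the foo and bar lines start at the beginning of the line (hence the `^` anchors, which are not in the commands above), it uses a plain space instead of `\s` so it does not depend on GNU sed, and it writes a shortened copy of the sample to a temporary file whose name, `datafile`, is made up for illustration.

```shell
# Shortened copy of the sample data, written to a temporary file
# (illustration only; "datafile" is a hypothetical name).
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
bar 1
foo 1
how now
manchu 50
foo 2
brown cow
manchu 55
foo 3
the quick brown
manchu 1
bar 2
EOF

# Foo number: hold the number from each "foo" line; at the first
# "manchu 55" line swap it into the pattern space, print it and quit.
FOONUMBER=$(sed -n '/^foo/ { s/^foo *//; h; }; /manchu 55/ { x; p; q; }' "$datafile")

# Bar number: the same idea, keyed on "bar" lines instead.
BARNUMBER=$(sed -n '/^bar/ { s/^bar *//; h; }; /manchu 55/ { x; p; q; }' "$datafile")

# Phrase text: hold every line, so the hold buffer always contains
# the line just before the current one.
PHRASETEXT=$(sed -n '/manchu 55/ { x; p; q; }; h' "$datafile")

echo "$PHRASETEXT, bar $BARNUMBER, foo $FOONUMBER."
rm -f "$datafile"
```

Note that this reads the file three times, once per value, so it is no faster than the original grep pipeline on large files; the main gain is that each command is short and does one thing.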