I'm very new to awk and have been banging my head trying to get this to work. I'm trying to take a list of files in "image.list" and create an "info" file out of it. I need to grab the string matching a regex (a number 8-11 digits long) from the middle of each filename and print just that match into the designated spot in my "info" file. That last part is what I'm having trouble pulling off, and I would love some help fixing it.
Here is my test file list:
SURGERY0001275678image1.jpg
SURGERY11134900211image2.jpg
SURGERY19257012image3.jpg
SURGERY273142590image4.jpg
Here is my current code:
awk 'BEGIN {print "-----TEST TAG FILE\tENCOUNTERS-----";}
> {print "FILE: /tmp/imagetest/"$1,"\t","ENCOUNTER: ",($1~/^[0-9]{8,11}$/);}
> END{print "END REPORT";
> }' image.list > upload.tag
And here is my current output:
-----TEST TAG FILE ENCOUNTERS-----
FILE: /tmp/imagetest/SURGERY0001275678image1.jpg ENCOUNTER: 0
FILE: /tmp/imagetest/SURGERY11134900211image2.jpg ENCOUNTER: 0
FILE: /tmp/imagetest/SURGERY19257012image3.jpg ENCOUNTER: 0
FILE: /tmp/imagetest/SURGERY273142590image4.jpg ENCOUNTER: 0
END REPORT
What I need it to display after "ENCOUNTER:" is the 8-11 digit number from the middle of the file name. So far everything I've tried outputs either the whole filename or "0".
I'm probably way off course so I'd love to get some help from you experts!
Re-using your existing code:
$ awk '
BEGIN {
print "-----TEST TAG FILE\tENCOUNTERS-----";
}
match($0,/[^0-9]+([0-9]+)[^0-9]+/,ary) {
print "FILE: /tmp/imagetest/"$1,"\t","ENCOUNTER:"ary[1]
}
END {
print "END REPORT";
}' testfile
$ cat testfile
SURGERY0001275678image1.jpg
SURGERY11134900211image2.jpg
SURGERY19257012image3.jpg
SURGERY273142590image4.jpg
$ awk '
> BEGIN {
> print "-----TEST TAG FILE\tENCOUNTERS-----";
> }
> match($0,/[^0-9]+([0-9]+)[^0-9]+/,ary) {
> print "FILE: /tmp/imagetest/"$1,"\t","ENCOUNTER:"ary[1]
> }
> END {
> print "END REPORT";
> }' testfile
-----TEST TAG FILE ENCOUNTERS-----
FILE: /tmp/imagetest/SURGERY0001275678image1.jpg ENCOUNTER:0001275678
FILE: /tmp/imagetest/SURGERY11134900211image2.jpg ENCOUNTER:11134900211
FILE: /tmp/imagetest/SURGERY19257012image3.jpg ENCOUNTER:19257012
FILE: /tmp/imagetest/SURGERY273142590image4.jpg ENCOUNTER:273142590
END REPORT
As Ed Morton pointed out in the comments, this solution works only in GNU awk, because it relies on the third (array) argument to match().
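For awks other than GNU awk, a portable variant is possible, sketched here under some assumptions: it uses the two-argument match(), which sets RSTART and RLENGTH in any POSIX awk, and pulls the digits out with substr(). The 8-11 length check is done with RLENGTH rather than a {8,11} interval, since some older awks do not handle interval expressions. The sketch assumes the first run of digits in each name is the encounter number, which holds for the sample list; the variable name id is only illustrative.

```shell
# Recreate the sample list from the question.
printf '%s\n' \
    SURGERY0001275678image1.jpg \
    SURGERY11134900211image2.jpg \
    SURGERY19257012image3.jpg \
    SURGERY273142590image4.jpg > image.list

awk '
BEGIN { print "-----TEST TAG FILE\tENCOUNTERS-----" }
# match() sets RSTART and RLENGTH to the first (leftmost, longest) run of digits.
match($0, /[0-9]+/) {
    id = substr($0, RSTART, RLENGTH)
    # Assumption: a length check stands in for the {8,11} interval expression.
    if (RLENGTH >= 8 && RLENGTH <= 11)
        print "FILE: /tmp/imagetest/" $0 "\tENCOUNTER: " id
}
END { print "END REPORT" }
' image.list > upload.tag

cat upload.tag
```

As for why the original attempt printed 0: $1 ~ /regex/ is a comparison that evaluates to 1 or 0 rather than to the matched text, and the ^ and $ anchors required the entire field to consist of digits, which none of the file names do.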
sed -r -e 's#(.*)#FILE:\t/tmp/imagetest/\1#;s/([0-9]*)(i[^i]*)$/\1\2\tENCOUNTER:\1/;1i -----TEST TAG FILE ENCOUNTERS-----' -e '$aEND REPORT' file
-----TEST TAG FILE ENCOUNTERS-----
FILE: /tmp/imagetest/SURGERY0001275678image1.jpg ENCOUNTER:0001275678
FILE: /tmp/imagetest/SURGERY11134900211image2.jpg ENCOUNTER:11134900211
FILE: /tmp/imagetest/SURGERY19257012image3.jpg ENCOUNTER:19257012
FILE: /tmp/imagetest/SURGERY273142590image4.jpg ENCOUNTER:273142590
END REPORT