I'm learning awk and I have trouble passing a variable to the script AND using it as part of a regex search pattern.
The example is contrived but shows my problem.
My data is the following:
Eddy Smith 0600000000 1981-07-16 Los Angeles
Frank Smith 0611111111 1947-04-29 Chicago
Victoria McSmith 0687654321 1982-12-16 Los Angeles
Barbara Smithy 0633244321 1984-06-24 Boston
Jane McSmithy 0612345678 1947-01-15 Chicago
Grace Jones 0622222222 1985-10-07 Los Angeles
Bernard Jones 0647658763 1988-01-01 New York
George Jonesy 0623428948 1983-01-01 New York
Indiana McJones 0698732298 1952-01-01 Miami
Philip McJonesy 0644238523 1954-01-01 Miami
I want an awk script that I can pass a variable and then have the awk script do a regex for the variable. I've got this script now called "003_search_persons.awk".
#this awk script looks for a certain name, returns firstName, lastName and City
#print column headers
BEGIN {
printf "firstName lastName City\n";
}
#look for the name, print firstName, lastName and City
$2 ~ name {
printf $1 " " $2 " " $5 " " $6;
printf "\n";
}
I call the script like this:
awk -f 003_search_persons.awk name=Smith 003_persons.txt
It returns the following, which is good.
firstName lastName City
Eddy Smith Los Angeles
Frank Smith Chicago
Victoria McSmith Los Angeles
Barbara Smithy Boston
Jane McSmithy Chicago
But now I want to look for a certain prefix "Mc". I could of course hardcode this, but I want an awk script that is flexible. I wrote the following in 003_search_persons_prefix.awk.
#this awk script looks for a certain prefix to a name, returns firstName, lastName and City
#print column headers
BEGIN {
printf "firstName lastName City\n";
}
#look for the prefix, print firstName, lastName and City
/^prefix/{
printf $1 " " $2 " " $5 " " $6;
printf "\n";
}
I call the script like this:
awk -f 003_search_persons_prefix.awk prefix=Mc 003_persons.txt
But now it finds no records.
The problem is the search pattern "/^prefix/". I know I can replace that search pattern by a non-regex one, as in the first script, but suppose I want to do it with a regex, because I need the prefix to really be at the start of the lastName field, as it should be, being a prefix and all ;-)
How do I do this?
In awk, a regex literal such as /^prefix/ is fixed text. awk does not substitute variables inside the slashes, so that pattern looks for the literal characters "prefix" at the start of the record, which is why your second script finds nothing.
To build the pattern from a variable, use a dynamic regex: put a string expression on the right-hand side of the ~ operator, for example $2 ~ ("^" prefix). awk then treats the resulting string as the regular expression.
A regular expression enclosed in slashes ('/') is an awk pattern that matches every input record whose text belongs to that set. The simplest regular expression is a sequence of letters, numbers, or both. Such a regexp matches any string that contains that sequence.
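To make the difference concrete, here is a minimal shell sketch. The two sample records are made up for illustration (the second last name deliberately starts with the literal text "prefix"); they are not part of the data in the question.

```shell
# Two hypothetical records; the second last name literally starts with "prefix".
data='Jane McSmithy
John prefixson'

# Regex literal: "prefix" is taken as literal text, the variable is ignored.
literal=$(printf '%s\n' "$data" | awk -v prefix=Mc '$2 ~ /^prefix/ { print $2 }')

# Dynamic regex: the string "^" prefix is built at run time and used as the pattern.
dynamic=$(printf '%s\n' "$data" | awk -v prefix=Mc '$2 ~ ("^" prefix) { print $2 }')

echo "literal: $literal"
echo "dynamic: $dynamic"
```

With prefix=Mc, the literal form should only pick up "prefixson", while the dynamic form picks up "McSmithy".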
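You can try this. It reads the name=value argument from ARGV in the BEGIN block and builds the pattern as a string (note that it matches against the whole record, $0):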
BEGIN{
printf "firstName lastName City\n";
split(ARGV[1], n,"=")
prefix=n[2]
pat="^"prefix
}
$0 ~ pat{
print "found: "$0
}
Output:
$ awk -f test.awk name=Jane file
firstName lastName City
found: Jane McSmithy 0612345678 1947-01-15 Chicago
Look at the awk documentation for more. (and read it from start to finish!)
Change your script to:
BEGIN {
print "firstName", "lastName", "City"
ORS = "\n\n"
}
$2 ~ ("^" prefix) {
print $1, $2, $5, $6
}
and call it as
awk -v prefix="Mc" -f 003_search_persons_prefix.awk 003_persons.txt
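A side note on the two ways of passing the variable that appear in this thread: with -v the value is already set when the BEGIN block runs, whereas a var=value operand is only assigned when awk reaches that argument, after BEGIN. A small sketch, assuming a POSIX-style awk and a throwaway one-line input on stdin:

```shell
# -v: the variable is visible in BEGIN and in the main rules.
with_v=$(printf 'x\n' | awk -v prefix=Mc \
  'BEGIN { printf "BEGIN=[%s] ", prefix } { print "main=[" prefix "]" }')

# Operand assignment: still empty in BEGIN, set before the input is read.
as_operand=$(printf 'x\n' | awk \
  'BEGIN { printf "BEGIN=[%s] ", prefix } { print "main=[" prefix "]" }' prefix=Mc)

echo "$with_v"
echo "$as_operand"
```

For the script above either form works, because prefix is only used in the main rule. The difference matters once you need the value inside BEGIN.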