
bash: For each line in a txt file match a regex and save it to a variable array

I am trying to read each line of a text file, extract the name before the .tst extension, and store each match in an array. Here is an example of the text file:

    someTest.tst (/blah/blah/blah),
    someOtherfile.tst (/some/other/blah),
    hello.tst (/not/the/same/blah),
    hi.tst (/is/this/blah),

Each line begins with a run of leading whitespace.

I would like to extract the following values and store them in a variable array:

someTest
someOtherfile
hello
hi

I have tried using sed and awk, but I am not proficient with either and am having trouble getting the result I want. Any insight?

john asked Sep 13 '25 04:09



1 Answer

You don't need a regex for this at all.

arr=( )
while read -r name _; do
  [[ $name = *.tst ]] || continue # skip lines whose first field doesn't end in .tst
  arr+=( "${name%.tst}" )
done <input.txt

declare -p arr # print array contents

  • read accepts a list of destinations; fields (as determined by splitting input on the characters in IFS) are populated into variables as they're read, and the last destination receives all remaining content on a line (including whitespace). Thus, read -r name _ puts the first field into name, and all remaining contents on the input line into a variable named _.
  • [[ $name = *.tst ]] || continue skips all lines where the first field doesn't end in .tst.
  • "${name%.tst}" expands to the contents of "$name", with the suffix .tst removed if present.
  • The while read; do ...; done <inputfile pattern is described in more detail in BashFAQ #1.
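
To make the points above concrete, here is a minimal sketch that runs read and the suffix-removal expansion on a single line. The sample line is copied from the question; the variable names rest and other are only for illustration (the answer discards the remainder into _ instead).

```bash
#!/usr/bin/env bash
# Sample line taken from the question's input, leading whitespace included.
line='    someTest.tst (/blah/blah/blah),'

# With the default IFS, read trims leading whitespace and splits on whitespace:
# the first field lands in name, the remainder of the line lands in rest.
read -r name rest <<<"$line"

printf 'name=%s\n' "$name"             # name=someTest.tst
printf 'rest=%s\n' "$rest"             # rest=(/blah/blah/blah),
printf 'stripped=%s\n' "${name%.tst}"  # stripped=someTest

# The suffix is removed only when present; other values pass through unchanged.
other='README.md'
printf 'unchanged=%s\n' "${other%.tst}"  # unchanged=README.md
```

Because the default IFS already discards the leading whitespace, no separate trimming step is needed before the *.tst check.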

However, if you wanted to use a regex, that might look like this:

re='^[[:space:]]*([^[:space:]]+)[.]tst[[:space:]]'

arr=( )
while IFS= read -r line; do
  [[ $line =~ $re ]] && arr+=( "${BASH_REMATCH[1]}" )
done <input.txt

declare -p arr # print array contents

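Using [[ $string =~ $regex ]] evaluates $regex as a POSIX extended regular expression (ERE) and, if it matches, puts the entire matched text into BASH_REMATCH[0] and each capture group into BASH_REMATCH[1] and onward. Leave $regex unquoted on the right-hand side; quoting it makes bash match it as a literal string.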
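
As a small illustration of how BASH_REMATCH is populated, the sketch below applies the same regex to one line from the question. The variable names whole and base are hypothetical and exist only for this example.

```bash
#!/usr/bin/env bash
re='^[[:space:]]*([^[:space:]]+)[.]tst[[:space:]]'
line='    hello.tst (/not/the/same/blah),'

if [[ $line =~ $re ]]; then
  whole=${BASH_REMATCH[0]}  # everything the regex consumed, leading spaces included
  base=${BASH_REMATCH[1]}   # only the first parenthesized group
  printf 'whole match: [%s]\n' "$whole"  # whole match: [    hello.tst ]
  printf 'group 1:     [%s]\n' "$base"   # group 1:     [hello]
fi
```

The brackets in the output make the surrounding whitespace in BASH_REMATCH[0] visible, which is why the loop stores BASH_REMATCH[1] rather than the whole match.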

Charles Duffy answered Sep 15 '25 20:09
