
julia to regex match lines in a file like grep

Tags: julia

I would like to see a Julia code snippet that reads a file and returns the lines (as strings) that match a regular expression.

I welcome multiple techniques, but output should be equivalent to the following:

$> grep -E '^AB[AJ].*TO' 'webster-unabridged-dictionary-1913.txt'

ABACTOR
ABATOR
ABATTOIR
ABJURATORY

I'm using GNU grep 3.1 here, and the first line of each entry in the file is the all caps word on its own.

Merlin asked Dec 23 '22 12:12


1 Answer

You could also use the filter function to do this in one line.

filter(line -> occursin(r"^AB[AJ].*TO", line), readlines("webster-unabridged-dictionary-1913.txt"))

filter applies a function returning a Boolean to each element of an array and returns only those elements for which the function returns true. The function in this case is the anonymous function line -> occursin(r"^AB[AJ].*TO", line), which names each element of the array being filtered (each line, in this case) line and returns true if the regular expression matches it. On Julia 1.0 and later the matching function is occursin; earlier versions called it ismatch.
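To see how filter and the anonymous function work together without needing the dictionary file, here is a minimal sketch on a small in-memory array. It assumes Julia 1.0 or later, where the regex predicate is occursin, and the words array is a made-up sample rather than the real file contents.

```julia
# Assumes Julia 1.0+, where `occursin` tests a regex against a string.
# `words` is a made-up sample, not the real dictionary file.
words = ["ABACTOR", "ABANDON", "ABATOR", "ABATTOIR", "ABJURATORY", "ABLE"]

# Keep only the elements for which the anonymous function returns true.
matches = filter(line -> occursin(r"^AB[AJ].*TO", line), words)

println(matches)
```

ABANDON is dropped because nothing after the third letter contains TO, and ABLE is dropped because its third letter is neither A nor J.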

I think this might not be the best solution for very large files, since the entire file has to be loaded into memory before filtering, but for this example it seems to be just as fast as a for loop using eachline. Another difference is that this solution returns the results as an array rather than printing each one, which may be good or bad depending on what you want to do with the matches.
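If memory is a concern, one possible variation is to read the file line by line and collect only the matches. This is a sketch under assumptions, not a benchmarked replacement: it assumes Julia 1.0 or later, grep_lines is a hypothetical helper name, and the demo uses a small temporary file standing in for the dictionary.

```julia
# Assumes Julia 1.0+; `grep_lines` is a hypothetical helper name.
function grep_lines(pattern::Regex, path::AbstractString)
    matches = String[]
    # The do-block closes the file even if an error is thrown.
    open(path) do io
        # eachline yields one line at a time, so only the matching
        # lines are accumulated in memory.
        for line in eachline(io)
            occursin(pattern, line) && push!(matches, line)
        end
    end
    return matches
end

# Demo on a small temporary file standing in for the dictionary.
path, io = mktemp()
write(io, "ABACTOR\nsome definition text\nABATOR\nABLE\n")
close(io)

println(grep_lines(r"^AB[AJ].*TO", path))
```

Like the filter one-liner, this returns an array of strings, so the caller can still print the matches or process them further.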

Ian Marshall answered Dec 29 '22 18:12
