Joining regular expressions in julia

Tags:

regex

julia

x = r"abc"
y = r"def"
z = join([x,y], "|")

z # => r"r\"abc\"|r\"def\""

Is there a way to join (and, more generally, manipulate) Regex objects that deals only with the regex content, i.e. does not treat the r prefix as if it were part of the content? The desired output for z is:

z # => r"abc|def"
Sean Mackesey asked Dec 09 '13 19:12


1 Answer

macro p_str(s) s end
x = p"abc"
y = p"def"
z = Regex(join([x,y], "|"))

The r"quote" operator compiles the regular expression right away, which takes time. If you only have parts of a regular expression that you want to combine into a bigger one, store the parts as ordinary strings ("regular quotes") and compile once at the end.

But what about the escaping rules, you ask? In "regular quotes" every backslash has to be doubled, while an r"quote" passes backslashes through untouched. If you want the r"quote" escaping rules but don't want to compile a regular expression immediately, you can use a macro like:

macro p_str(s) s end

Now you have a p"quote" that escapes like an r"quote" but just returns a string.
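
A small sketch of the difference, assuming the p_str macro above. The fragment names num and word are hypothetical and only there for illustration:

```julia
# Same macro as above: returns the literal text without compiling anything.
macro p_str(s) s end

num  = p"\d+"    # backslash is kept as written
word = "\\w+"    # a regular string needs the backslash doubled

# Both fragments are plain strings, so join works on them directly.
combined = Regex(join([num, word], "|"))

println(combined.pattern)
```

Only the final Regex(...) call compiles anything; the fragments themselves stay cheap to build and pass around.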

Not to go off topic, but you might define a bunch of quotes for getting around tricky alphabets. Here are some convenient ones:

                                       # "baked\nescape"    -> baked\nescape
macro p_mstr(s) s end                  # p"""raw\nescape""" -> raw\\nescape
macro dq_str(s) "\"" * s * "\"" end    # dq"with quotes"    -> "with quotes"
macro sq_str(s) "'" * s * "'" end      # sq"with quotes"    -> 'with quotes'
macro s_mstr(s) strip(lstrip(s))  end  # s"""  "stripme" """-> "stripme"
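
As a rough illustration of how the quoting macros can feed a regex, here is a sketch using only dq_str and sq_str from the list above. The fragments "yes" and "no" are made up for the example. The _mstr variants are left out because, as far as I can tell, current Julia releases route triple-quoted literals to the same _str macro:

```julia
# Wrap the literal text in double or single quotes, as in the list above.
macro dq_str(s) "\"" * s * "\"" end
macro sq_str(s) "'" * s * "'" end

# Hypothetical fragments: match either "yes" (with double quotes)
# or 'no' (with single quotes).
quoted = [dq"yes", sq"no"]
re = Regex(join(quoted, "|"))

println(re.pattern)
```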

When you're done making fragments you can do your join and make a regex like:

myre = Regex(join([x, y], "|"))

Just like you thought.

If you want to learn more about what fields an object has (such as the pattern field of a Regex), try dump:

julia> dump(r"pat")
Regex 
  pattern: ASCIIString "pat"
  options: Uint32 33564672
  regex: Array(Uint8,(61,)) [0x45,0x52,0x43,0x50,0x3d,0x00,0x00,0x00,0x00,0x28  …   0x1d,0x70,0x1d,0x61,0x1d,0x74,0x72,0x00,0x09,0x00]
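
If the pieces are already compiled Regex objects, as in the question, the pattern field shown by dump can be used to recover their source text. This is a minimal sketch, assuming pattern holds the text exactly as written; note that any flags on the original regexes (such as the i in r"abc"i) are not carried over this way:

```julia
x = r"abc"
y = r"def"

# Join the source text of each regex, then compile the result once.
z = Regex(join([x.pattern, y.pattern], "|"))

println(z.pattern)
```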
Michael Fox answered Oct 29 '22 14:10