
How to do Erlang pattern matching using regular expressions?

When I write Erlang programs which do text parsing, I frequently run into situations where I would love to do a pattern match using a regular expression.

For example, I wish I could do something like this, where ~ is a "made up" regular expression matching operator:

my_function(String ~ ["^[A-Za-z]+[A-Za-z0-9]*$"]) ->
    ....

I know about the regular expression module (re), but AFAIK you cannot call arbitrary functions such as re:run in a pattern or in a guard; guards only allow a restricted set of built-in functions.

Also, I wish strings could be matched case-insensitively. This is handy when parsing HTTP headers, for example. I would love to write something like this, where "Str ~ {Pattern, Options}" means "match Str against the pattern Pattern using the options Options":

handle_accept_language_header(Header ~ {"Accept-Language", [case_insensitive]}) ->
    ...

Two questions:

  1. How do you typically handle this using just standard Erlang? Is there some mechanism or coding style that comes close to this in terms of conciseness and readability?

  2. Is there any work (an EEP?) going on in Erlang to address this?

Bruno Rijsman asked Nov 02 '09 11:11


3 Answers

You really don't have much choice other than to run your regexps in advance and then pattern match on the results. Here's a very simple example that approaches what I think you're after, but it does suffer from the flaw that you need to write each regexp twice: once in the list passed to multire and once in the clause head that handles it. You could make this less painful by using a macro to define each regexp in one place.

-module(multire).

-export([multire/2, test/1]).

%% Return the first regexp in the list that matches String, or nomatch.
multire([], _String) ->
    nomatch;
multire([RE | RegExps], String) ->
    case re:run(String, RE, [{capture, none}]) of
        match ->
            RE;
        nomatch ->
            multire(RegExps, String)
    end.

%% Run the regexps first, then dispatch on the result by pattern matching.
test(Foo) ->
    test2(multire(["^Hello", "world$", "^....$"], Foo), Foo).

%% One clause per regexp; the literal must equal the one passed to multire/2.
test2("^Hello", Foo) ->
    io:format("~p matched the hello pattern~n", [Foo]);
test2("world$", Foo) ->
    io:format("~p matched the world pattern~n", [Foo]);
test2("^....$", Foo) ->
    io:format("~p matched the four chars pattern~n", [Foo]);
test2(nomatch, Foo) ->
    io:format("~p failed to match~n", [Foo]).
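
The macro idea mentioned above might look roughly like the sketch below. The module name, macro names and returned atoms are made up for illustration; the only library call is re:run/3. Each regexp is written once as a macro and reused both in the list that is tried and in the case clauses.

```erlang
-module(multire_macros).
-export([classify/1]).

%% Each regexp is written exactly once (macro names are illustrative).
-define(RE_HELLO, "^Hello").
-define(RE_WORLD, "world$").
-define(RE_FOUR,  "^....$").

%% Return the first regexp in the list that matches String, or nomatch.
first_match([], _String) ->
    nomatch;
first_match([RE | Rest], String) ->
    case re:run(String, RE, [{capture, none}]) of
        match   -> RE;
        nomatch -> first_match(Rest, String)
    end.

%% Run the regexps first, then pattern match on which one succeeded.
classify(String) ->
    case first_match([?RE_HELLO, ?RE_WORLD, ?RE_FOUR], String) of
        ?RE_HELLO -> hello;
        ?RE_WORLD -> world;
        ?RE_FOUR  -> four_chars;
        nomatch   -> nomatch
    end.
```

Since the macros expand to plain string literals, they are legal in patterns, so a typo in a regexp can no longer make the list and the clause heads disagree.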

Rob Charlton answered Oct 26 '22 01:10


One possibility is to use Erlang Web-style annotations (macros) combined with the re module. An example is probably the best way to illustrate this.

This is what your final code will look like:

[...]
?MATCH({Regexp, Options}).
foo(_Args) ->
  ok.
[...]

The MATCH macro would be executed just before your foo function. The flow of execution will fail if the regexp pattern is not matched.

Your match function will be declared as follows:

?BEFORE.
match({Regexp, Options}, TgtMod, _TgtFun, TgtFunArgs) ->
    %% The string to check is looked up in the target function's arguments.
    String = proplists:get_value(string, TgtFunArgs),
    case re:run(String, Regexp, Options) of
        nomatch ->
            {error, {TgtMod, match_error, []}};
        {match, _Captured} ->
            {proceed, TgtFunArgs}
    end.

Please note that:

  • BEFORE says that the annotation will be executed before your target function (an AFTER macro is also available).
  • match_error is your error handler, specified in your module. It contains the code you want to execute when a match fails (maybe nothing, just blocking the execution flow).
  • This approach has the advantage of keeping the regexp syntax and options uniform with the re module, which avoids confusion.

More information about the Erlang Web annotations here:

http://wiki.erlang-web.org/Annotations

and here:

http://wiki.erlang-web.org/HowTo/CreateAnnotation

The software is open source, so you might want to reuse their annotation engine.
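
If pulling in the annotation engine is too much, the same "check first, then proceed" flow can be approximated in plain Erlang with a higher-order function. This is only a sketch and not part of Erlang Web: the module name match_guard, the function with_match/4 and its return tuples are all invented for illustration, and Options is assumed to hold only non-capture re options such as caseless.

```erlang
-module(match_guard).
-export([with_match/4]).

%% Call Fun(String) only if String matches Regexp under the given re options.
%% Returns {ok, Result} on a match and {error, nomatch} otherwise.
with_match(String, Regexp, Options, Fun) when is_function(Fun, 1) ->
    case re:run(String, Regexp, [{capture, none} | Options]) of
        match   -> {ok, Fun(String)};
        nomatch -> {error, nomatch}
    end.
```

A call such as match_guard:with_match(Header, "^accept-language", [caseless], fun handle/1), where handle/1 stands for your own function, keeps the regexp and options in the same form the re module expects.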

Roberto Aloi answered Oct 26 '22 02:10


  1. For strings, you could use the re module and then iterate over the result set. AFAIK there is no other way to do it; that is what regexes are for.

  2. For the HTTP headers, since there can be many, I would iterate over the result set rather than write one (potentially very long) expression.

  3. EEP work: I do not know.
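
As a rough sketch of points 1 and 2, a small table of header patterns can be tried one by one with re:run/3, using the caseless option for case-insensitive matching. The module name, the tags and the two patterns are assumptions chosen for illustration, not a complete HTTP header parser.

```erlang
-module(header_match).
-export([parse_header/1]).

%% Hypothetical table of header patterns; each one captures the header value.
patterns() ->
    [{accept_language, "^Accept-Language:\\s*(.*)$"},
     {content_type,    "^Content-Type:\\s*(.*)$"}].

%% Try each pattern in turn and return {Tag, Value} for the first match.
parse_header(Line) ->
    parse_header(Line, patterns()).

parse_header(_Line, []) ->
    unknown;
parse_header(Line, [{Tag, RE} | Rest]) ->
    %% caseless makes the match case-insensitive; the value comes back as a list.
    case re:run(Line, RE, [caseless, {capture, all_but_first, list}]) of
        {match, [Value]} -> {Tag, Value};
        nomatch          -> parse_header(Line, Rest)
    end.
```

The caller can then pattern match on the returned tuple, for example on {accept_language, Value}, which gets reasonably close to the notation asked for in the question.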

jldupont answered Oct 26 '22 01:10