
Erlang pattern matching performance

New to Erlang here; I am about to start writing some code. The solution I have in mind can go one of two ways: I can do a bunch of mathematical calculations, or I can code this up largely as pattern matching. When I say "pattern matching," I DO NOT mean regex or anything like that - I mean pattern matching in clause heads.

Performance is usually not a concern; however, in this app it is. I am not asking you which method would be faster - I'm sure you would say you don't know (it depends on many factors). What I am asking about is the general performance of Erlang pattern matching in clause heads. In other words, in Prolog the engine is optimized to do this sort of thing, so all other things being equal, you are "encouraged" to design a solution that takes advantage of pattern matching and unification in clause heads.

Is the same true of Erlang, i.e. is Erlang optimized for pattern matching in clause heads, similar to Prolog? Rather than just ask the question here, I actually tried to profile this myself and wrote a toy program that pattern matches on clause heads a couple million times vs. summing a list a couple million times. But the system would crash when set to run a couple million iterations, and at anything less than that the results came back too fast for me to learn anything about performance.

Thanks for any insights.

asked Jan 22 '11 by Ultranewb


2 Answers

It is basically as @(NOT VERY) CRAP ANSWERS :-) has said, and as his example shows.

Prolog doesn't really do pattern matching, which is unidirectional; it does unification, which is sort of like bidirectional pattern matching with logical variables. Pattern matching is much easier to optimize, and serious functional languages like Erlang and Haskell put quite a bit of work into an optimizing pattern-match compiler. This is especially noticeable with deep patterns.

So, yes, Erlang will do pattern matching faster than Prolog.

answered by rvirding


In general, pattern matching in functional languages is as fast as or faster than in Prolog. I would expect Erlang to perform within a factor of 2 of Prolog, faster or slower. Since a functional program is almost nothing but pattern matching, it is one of those areas you optimize heavily.

The internals usually have a pattern-match compiler which turns an advanced pattern match into a simpler series of checks, with the goal of minimizing the number of checks.
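As a rough illustration (the area functions below are made up for this answer, and the expansion is conceptual rather than literal compiler output), a set of clauses such as

area({rect, W, H}) -> W * H;
area({circle, R}) -> math:pi() * R * R;
area({triangle, B, H}) -> B * H / 2.

is compiled into something closer to a single dispatch on the tuple tag, so the tuple is inspected once rather than re-tested for every clause:

area_expanded(Shape) ->
    case element(1, Shape) of
        rect     -> {rect, W, H} = Shape, W * H;
        circle   -> {circle, R} = Shape, math:pi() * R * R;
        triangle -> {triangle, B, H} = Shape, B * H / 2
    end.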


Ok, the first gotcha: "Anything introduced in the shell is interpreted". So we compile modules instead:

-module(z).

-compile(export_all).

%% This pattern is highly uninteresting because it only matches
%% on a single value. Any decent system can do this quickly.
cl(0) -> 0;
cl(1) -> 0;
cl(2) -> 0;
cl(3) -> 0;
cl(4) -> 0;
cl(5) -> 0;
cl(6) -> 0;
cl(7) -> 0;
cl(8) -> 0;
cl(9) -> 0.

mcl(L) ->
    [cl(E) || E <- L].

This gives us a runner for your example.

cl2(a, 0) -> a0;
cl2(a, 1) -> a1;
cl2(a, 2) -> a2;
cl2(b, 0) -> b0;
cl2(b, 1) -> b1;
cl2(b, 2) -> b2;
cl2(c, 0) -> c0;
cl2(c, 1) -> c0;
cl2(c, 2) -> c0.

mcl2(L) ->
    [cl2(A, V) || {A, V} <- L].

A runner for a more interesting example. Here, we can exploit skips in the pattern: if (a, 0) fails to match on the a, we know we can skip straight to the (b, 0) case, because the negative match information can be propagated through the system. The pattern-match compiler usually makes this optimization.
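To make the skip concrete, the dispatch built for cl2/2 looks roughly like the following (only a conceptual sketch, not what the compiler literally emits): one test on the atom, then one test on the integer, so a failed test on a never revisits the a clauses.

cl2_expanded(A, V) ->
    case A of
        a -> case V of 0 -> a0; 1 -> a1; 2 -> a2 end;
        b -> case V of 0 -> b0; 1 -> b1; 2 -> b2 end;
        c -> case V of 0 -> c0; 1 -> c0; 2 -> c0 end
    end.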

test1() ->
    L1 = [X rem 10 || X <- lists:seq(1, 2000000)],
    %% A Core 2 Duo 2.4Ghz P8600 eats this in 132984 microseconds without HiPE
    %% c(z).
    %% With HiPE it is 91195 or in 0.6857591890753775 of the time the non-HiPE variant use
    %% c(z, [native, {hipe, [o3]}]).
    timer:tc(z, mcl, [L1]).

You must run this example yourself and evaluate whether it is fast enough for your use case. Note that some time is also spent in the mapping code, and quite a lot of time goes into pulling data from main memory through the CPU caches onto the CPU.

test2() ->
    random:seed(erlang:now()),
    L2 = [{case random:uniform(3) of
                   1 -> a;
                   2 -> b;
                   3 -> c
                   end, V rem 3} || V <- lists:seq(1, 2000000)],
    %% With HiPE this is 220937
    %% Without HiPE this is 296980
    timer:tc(z, mcl2, [L2]).

Naturally this example is slower, as we need to match more data before we have a hit. But it is a more interesting case because it gives some indication of the real speed of the match compiler.
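If you want to reproduce these numbers, one way to compile and run both benchmarks from the shell (assuming the module above is saved as z.erl; timer:tc/3 returns {Microseconds, Result}):

1> c(z).
2> {T1, _} = z:test1().   %% T1 = microseconds for the single-value match
3> {T2, _} = z:test2().   %% T2 = microseconds for the two-argument match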


Parallel versions were tried, but they are about 10 times slower in this case since the overhead of creating 2 million worker processes far dominates the actual processing :)
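For completeness, here is a minimal sketch of the kind of naive parallel map that was tried (pcl/1 is a made-up name, not the exact code that was benchmarked); spawning one worker process per element is what makes it roughly 10 times slower here:

pcl(L) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn(fun() -> Parent ! {Ref, cl(E)} end),
                Ref
            end || E <- L],
    [receive {Ref, Res} -> Res end || Ref <- Refs].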

answered by I GIVE CRAP ANSWERS