In Haskell, there is a language feature called "as"-operator (sometimes known as alias). The idea is the following: given you have a function that for instance takes as input a list and wants to return all tails, you can implement this as:
tails a@(_:xs) = a : tails xs
tails [] = [[]]
The @
ensures that you have a reference to both the entire argument as well as a reference to some parts of the structure of the parameter. This is intelligent performance-wise (it is more a performance hack, because reconstructing an array (x:xs)
) in the body of the first line, will - if not optimized by the compiler - result in allocating a new object, modifying fields, etc. See here for more information.
I was wondering if Prolog has something equivalent: for instance if you want to implement tails in Prolog, it can be done the following way:
tails([H|T],[[H|T]|TA]) :-
tails(T,TA).
tails([],[[]]).
But it could be more efficient if there was an "as"-operator like:
tails(L@[_|T],[L|TA]) :- %This does not compile
tails(T,TA).
tails([],[[]]).
Is there any such construct, or a language extension?
TL;DR: Good idea1! The speedup appears to be limited to ~20% (for most list sizes).
In this answer, we compare three different predicates that differ regarding @
-like data reuse:
list_tails([], [[]]). % (1) like `tails/2` given by the OP ... list_tails([E|Es], [[E|Es]|Ess]) :- % ....... but with a better name :-) list_tails(Es, Ess). list_sfxs1(Es, [Es|Ess]) :- % (2) "re-use, mutual recursion" aux_list_sfxs1(Es, Ess). % "sfxs" is short for "suffixes" aux_list_sfxs1([], []). aux_list_sfxs1([_|Es], Ess) :- list_sfxs1(Es, Ess). list_sfxs2([], [[]]). % (3) "re-use, direct recursion" list_sfxs2(Es0, [Es0|Ess]) :- Es0 = [_|Es], list_sfxs2(Es, Ess).
To measure runtime, we use the following code:
:-( dif(D,sicstus), current_prolog_flag(dialect,D) ; use_module(library(between)) ). run_benchs(P_2s, P_2, L, N, T_ms) :- between(1, 6, I), L is 10^I, N is 10^(8-I), length(Xs, L), member(P_2, P_2s), garbage_collect, call_walltime(run_bench_core(P_2,Xs,N), T_ms). run_bench_core(P_2, Xs, N) :- between(1, N, _), call(P_2, Xs, _), false. run_bench_core(_, _, _).
To measure wall-time2, we utilize call_walltime/2
—a variation of call_time/2
:
call_walltime(G, T_ms) :- statistics(walltime, [T0|_]), G, statistics(walltime, [T1|_]), T_ms is T1 - T0.
Let's put above code variations to the test...
L
...N
(for better accuracy).First, we use swi-prolog version 7.3.14 (64-bit):
?- run_benchs([list_sfxs1,list_sfxs2,list_tails], P_2, L, N, T_ms). P_2 = list_sfxs1, L*N = 10*10000000, T_ms = 7925 ; P_2 = list_sfxs2, L*N = 10*10000000, T_ms = 7524 ; P_2 = list_tails, L*N = 10*10000000, T_ms = 6936 ; P_2 = list_sfxs1, L*N = 100*1000000, T_ms = 6502 ; P_2 = list_sfxs2, L*N = 100*1000000, T_ms = 5861 ; P_2 = list_tails, L*N = 100*1000000, T_ms = 5618 ; P_2 = list_sfxs1, L*N = 1000*100000, T_ms = 6434 ; P_2 = list_sfxs2, L*N = 1000*100000, T_ms = 5817 ; P_2 = list_tails, L*N = 1000*100000, T_ms = 9916 ; P_2 = list_sfxs1, L*N = 10000*10000, T_ms = 6328 ; P_2 = list_sfxs2, L*N = 10000*10000, T_ms = 5688 ; P_2 = list_tails, L*N = 10000*10000, T_ms = 9442 ; P_2 = list_sfxs1, L*N = 100000*1000, T_ms = 10255 ; P_2 = list_sfxs2, L*N = 100000*1000, T_ms = 10296 ; P_2 = list_tails, L*N = 100000*1000, T_ms = 14592 ; P_2 = list_sfxs1, L*N = 1000000*100, T_ms = 6955 ; P_2 = list_sfxs2, L*N = 1000000*100, T_ms = 6534 ; P_2 = list_tails, L*N = 1000000*100, T_ms = 9738.
Then, we repeat the previous query3 using sicstus-prolog version 4.3.2 (64-bit):
?- run_benchs([list_sfxs1,list_sfxs2,list_tails], P_2, L, N, T_ms). P_2 = list_sfxs1, L*N = 10*10000000, T_ms = 1580 ; P_2 = list_sfxs2, L*N = 10*10000000, T_ms = 1610 ; P_2 = list_tails, L*N = 10*10000000, T_ms = 1580 ; P_2 = list_sfxs1, L*N = 100*1000000, T_ms = 710 ; P_2 = list_sfxs2, L*N = 100*1000000, T_ms = 750 ; P_2 = list_tails, L*N = 100*1000000, T_ms = 840 ; P_2 = list_sfxs1, L*N = 1000*100000, T_ms = 650 ; P_2 = list_sfxs2, L*N = 1000*100000, T_ms = 660 ; P_2 = list_tails, L*N = 1000*100000, T_ms = 740 ; P_2 = list_sfxs1, L*N = 10000*10000, T_ms = 620 ; P_2 = list_sfxs2, L*N = 10000*10000, T_ms = 650 ; P_2 = list_tails, L*N = 10000*10000, T_ms = 740 ; P_2 = list_sfxs1, L*N = 100000*1000, T_ms = 670 ; P_2 = list_sfxs2, L*N = 100000*1000, T_ms = 650 ; P_2 = list_tails, L*N = 100000*1000, T_ms = 750 ; P_2 = list_sfxs1, L*N = 1000000*100, T_ms = 12610 ; P_2 = list_sfxs2, L*N = 1000000*100, T_ms = 12560 ; P_2 = list_tails, L*N = 1000000*100, T_ms = 33460.
Summary:
Footnote 1: Why do the stunt of putting (@)/2
in the rule head?
To end up with non-idiomatic Prolog code?
Footnote 2: We are interested in the total runtime. Why? Because garbage-collection costs show with larger data sizes!
Footnote 3: The answer sequence has been post-processed for the sake of readability.
Footnote 4: Available since release 4.3.0. Current target architectures include IA-32 and AMD64.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With