Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Extract Facts as Matrix in Prolog

Suppose we have the following:

edge(a, 1, 10).
edge(b, 2, 20).
edge(c, 3, 30).
edge(d, 4, 40).

I want to extract a matrix representation (M) of these facts, such that

M = [[a,b,c,d],[1,2,3,4],[10,20,30,40]]

Here's a no-brainer solution:

edgeMatrix(M) :-
  findall(A, edge(A, _, _), As),
  findall(B, edge(_, B, _), Bs),
  findall(C, edge(_, _, C), Cs),
  M = [As, Bs, Cs].

There are some problems to this approach, however, viz:

  1. we traverse the database n times, where n is the number of arguments; and
  2. this doesn't generalize very well to an arbitrary n.

So the question is: what is the most idiomatic way to achieve this in Prolog?

like image 668
Hugo Sereno Ferreira Avatar asked Mar 11 '23 05:03

Hugo Sereno Ferreira


2 Answers

What about:

edgeMatrix(M) :-
    findall([A,B,C],edge(A,B,C),Trans),
    transpose(Trans,M).

Now you can simply import the transpose/2 matrix from clpfd module, or implement one yourself like in this answer (yeah I know that's quite lazy, but what is the point of reinventing the wheel?).

If I run this in swipl, I get:

?- edgeMatrix(M).
M = [[a, b, c, d], [1, 2, 3, 4], [10, 20, 30, 40]].

which looks exactly like you want.

You can of course say that there is still some computational overhead to calculate the transpose/2, but the collecting phase is done only once (and if these are not simply facts, but answers from clauses as well) which can be expensive as well, and furthermore I think a module will implement clauses probably very efficiently anyway.

like image 178
Willem Van Onsem Avatar answered Mar 19 '23 23:03

Willem Van Onsem


I don't think you'll find a solution that is both completely general and maximally efficient. Here's a simple solution for N = 3:

edges(Edges) :-
    Goal = edge(_A, _B, _C),
    findall(Goal, Goal, Edges).

edges_abcs_([], [], [], []).
edges_abcs_([edge(A,B,C)|Edges], [A|As], [B|Bs], [C|Cs]) :-
    edges_abcs_(Edges, As, Bs, Cs).

edges_abcs([As, Bs, Cs]) :-
    edges(Edges),
    edges_abcs_(Edges, As, Bs, Cs).

After adding 100,000 additional edge/3 facts, this performs as follows:

?- time(edges_abcs(M)).
% 200,021 inferences, 0.063 CPU in 0.065 seconds (97% CPU, 3176913 Lips)
M = [[a, b, c, d, 1, 2, 3, 4|...], [1, 2, 3, 4, 1, 2, 3|...], [10, 20, 30, 40, 1, 2|...]].

For comparison, here is the measurement for the implementation from the question:

?- time(edgeMatrix_orig(M)).
% 300,043 inferences, 0.061 CPU in 0.061 seconds (100% CPU, 4896149 Lips)
M = [[a, b, c, d, 1, 2, 3, 4|...], [1, 2, 3, 4, 1, 2, 3|...], [10, 20, 30, 40, 1, 2|...]].

And here is the more general solution based on transpose/2 by Willem:

?- time(edgeMatrix_transpose(M)).
% 700,051 inferences, 0.098 CPU in 0.098 seconds (100% CPU, 7142196 Lips)
M = [[a, b, c, d, 1, 2, 3, 4|...], [1, 2, 3, 4, 1, 2, 3|...], [10, 20, 30, 40, 1, 2|...]].

So my solution seems best in terms of number of inferences: 100,000 inferences for the findall/3 and 100,000 inferences for traversing the list. The solution from the question has 100,000 inferences for each findall/3, but nothing more. It is, however, slightly faster because it's more memory efficient: Everything that is allocated ends up in the final solution, whereas my program allocates a list of 100,000 edge/3 terms which must then be garbage collected. (In SWI-Prolog, you can see the garbage collections if you turn on the profiler and/or GC tracing.)

If I really needed this to be as fast as possible and to be generalizable to many different values of N, I'd write a macro that expands to something like the solution in the question.

Edit: If the "idiomatic" requirement is lifted, I would resort to storing the edge database as a list in an SWI-Prolog global variable. In that case, my single-pass implementation would work without the findall/3 overhead and without producing intermediate garbage.

like image 45
Isabelle Newbie Avatar answered Mar 19 '23 23:03

Isabelle Newbie