Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Prolog - handling binary data with DCGs

It seems to me one ought to be able to process binary data with DCGs on a list of bytes. To make it work generally, though, one has to use bitwise operations which means is/2 is involved, which means instantiation order is an issue, which may confound using DCGs to both parse and generate. The idea here is to serialize/deserialize binary data but I think this example is simple enough to illustrate the problem.

Let me illustrate with some code. Suppose I have a binary protocol. I want to read two 4-bit integers from a byte. My naive self tries this:

two_four_bit_ints(High, Low) -->
  [B],
  {
    High is B >> 4,
    Low  is B /\ 0xF
  }.

This seems to work for parsing:

?- phrase(two_four_bit_ints(H,L), [255]).
H = L, L = 15.

?- phrase(two_four_bit_ints(H,L), [0]).
H = L, L = 0.

?- phrase(two_four_bit_ints(H,L), [15]).
H = 0,
L = 15.

?- phrase(two_four_bit_ints(H,L), [240]).
H = 15,
L = 0.

But this is not going to generate:

?- phrase(two_four_bit_ints(15,15), [X]).
ERROR: is/2: Arguments are not sufficiently instantiated

?- phrase(two_four_bit_ints(15,15), X).
ERROR: is/2: Arguments are not sufficiently instantiated

Not sure what to do. I'm bracing for someone to shout "Use clpfd" but it does not appear to support bit shift operations, and I'd be concerned about the performance effect of invoking such a powerful system in the middle of low-level code.

Since I don't see a lot of helpers for binary around, is there some other more preferred way of doing binary extraction/encoding in Prolog? I'm only using SWI for now so I'm happy to accept suggestions that are not portable to ISO, but if it is portable that's nice too. I was rather hoping to find someone had ported something like Erlang's bit syntax but didn't have any luck searching.

like image 982
Daniel Lyons Avatar asked Oct 31 '14 21:10

Daniel Lyons


1 Answers

Better support for binary data would be a very nice to have feature in Prolog. However, the relational nature of Prolog makes a general solution quite difficult. So you are faced with a serious decision: Either, you map some library of other languages directly into Prolog, thereby ignoring Prolog's relational nature (and ideally warding off all borders with clean instantiation errors) or you opt for a more relational approach.

When opting for a more relational solution you might either use existing libraries as library(clfd) or implement the entire constraint mechanism yourself. With some clever restrictions you might get away with a much simpler approach, but I doubt that this will work out. The tradeoffs are in areas of correctness and efficiency. Note that the clpfd systems in SICStus or SWI needed literally decades to reach their level of quality.

No matter which way you will take, some remarks:

Efficiency of library(clpfd)

library(clpfd) in SWI-Prolog has been specifically optimized to be (in some cases) comparable in performance to the traditional (is)/2. To see this, compile the rule:

list_len([_|Es], N0) :- N0 #> 0, N1 #= N0-1, list_len(Es, N1).

and look at the generated code with listing(list_len):

list_len([_|C], A) :-
    (   integer(A)
    ->  A>=0+1
    ;   clpfd:clpfd_geq(A, 1)
    ),
    (   integer(A)
    ->  B is A+ -1
    ;   clpfd:clpfd_equal(B, A-1)
    ),
    list_len(C, B).

Effectively, the built-ins for evaluable expressions like (is)/2 and (>=)/2 are used for those cases that correspond to those primitive operations directly.

To simulate bitshift operations fully, you would, however, need (div)/2 which is currently only supported by SICStus' library(clpfd) but not SWI's. So some extra headache would await you here. But as long as you are using unsigned nonnegative values, no problem should arise. For general shifts you will need (^)/2 which is supported by SWI - but not by SICStus.

And here is the CLPFD-version:

two_four_bit_ints(High, Low) -->
  [B],
  { B in 0..255,
    Low in 0..15,
    High in 0..15,
    B #= Low + High*16
  }.

Note that your original program unintentionally defines behavior in rather unintended situations, like B = -1234, B = 1+1. You could add between(0, 255, B) but then you would get combinatorial enumerations (read: explosions) easily.

The current implementations of library(clpfd) might be improved significantly for such cases even further, but in order to improve them, one must use them!

I/O and pio

ISO Prolog supports elementary I/O operations on

  • bytes (get_byte/1),
  • codes (get_code/1), and
  • characters (get_char/1).

If you want to use DCGs, you will certainly want to use library(pio). Currently, SWI's library(pio) only supports codes.

like image 95
false Avatar answered Sep 19 '22 12:09

false