Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Datalog computational class?

Datalog is not Turing complete.

But what is its computational class?

Is it equivalent to Finite state machine or Pushdown machine (i.e. context free) ... or is it something in between?

like image 951
Mustafa Avatar asked Oct 16 '22 07:10

Mustafa


1 Answers

Let's assume we have access, whether built-in or defined in the language by us, to the following predicates:

Head(x, y) iff y is a list and x is the first element in the list
Tail(x, y) iff x and y are lists and x is the same as y but is missing y's first element
Equal(x, y) iff x and y are the the same thing

First off, I think it's clear that this language can accept all the regular languages. By the Myhill-Nerode theorem, all states in a minimal DFA for a regular language correspond to a unique equivalence class under the indistinguishability relation. It seems like we could have one predicate per equivalence class/state to represent whether a list corresponding to an input string belongs to that class, and then another predicate that is true only if one of the predicates corresponding to an accepting state is true. So, for the language over {a, b} with an even number of a and an odd number of b, a minimal DFA has four states:

 O
 |
 V
q0<---a--->q1
 ^          ^
 |          |
 b          b
 |          |
 V          V
q2<---a--->q3

Here, q2 is the only accepting state. Our DataLog program might look like:

Q0(()).
Q0(x) :- Head(y, x), Equal(y, 'a'), Tail(z, x), Q1(z).
Q0(x) :- Head(y, x), Equal(y, 'b'), Tail(z, x), Q2(z).
Q1(x) :- Head(y, x), Equal(y, 'a'), Tail(z, x), Q0(z).
Q1(x) :- Head(y, x), Equal(y, 'b'), Tail(z, x), Q3(z).
Q2(x) :- Head(y, x), Equal(y, 'a'), Tail(z, x), Q3(z).
Q2(x) :- Head(y, x), Equal(y, 'b'), Tail(z, x), Q0(z).
Q3(x) :- Head(y, x), Equal(y, 'a'), Tail(z, x), Q2(z).
Q3(x) :- Head(y, x), Equal(y, 'b'), Tail(z, x), Q1(z).
EvenAOddB(x) :- Q2(x).

Based on this example I think it's clear we can always encode transitions in this fashion and so any regular language can be accepted. Thus, DataLog is at least as powerful as deterministic finite automata.

We can define this:

// Last(x, y) iff x is the last element of y
Last(x, y) :- Head(x, y), Tail(z, y), Equal(z, ()).

// AllButLast(x, y) iff x and y are the same list but x is missing the last element of y
AllButLast((), (x)).
AllButLast(x, y) :- Head(w, x), Head(z, y), Equal(w, z),
                    Tail(w', x), Tail(z', y), AllButLast(w', z').

Now we can recognize lists corresponding to strings in the context-free language a^n b^n:

// ANBN(x) iff x is a list beginning with n 'a's followed by n 'b's
ANBN(()).
ANBN(x) :- Head(y, x), Equal(y, 'a'), Tail(z, x),
           Last(w, z), Equal(w, 'b'), AllButLast(z', z),
           ANBN(z').

It's easy to tweak that predicate to find the language of even-length palindromes, and from there it's easy to tweak to find the language of all palindromes. I feel confident we can also get it to accept languages like balanced parentheses, etc. Based on this experience, I am guessing that we can accept all the context-free languages.

Can we get a context-sensitive language? Let's try a^n b^n c^n. If we assume DataLog has built-in predicates like this for integer types:

Zero(x) iff x is equal to zero
Successor(x, y) iff x and y are integers and x = y + 1

Then I think we can, as follows:

ANBNCN(()).
ANBNCN(x) :- Zero(y), ANBNCNZ(x, y).

ANBNCNZ(x, y) :- BN(x, y).
ANBNCNZ(x, y) :- Head(w, x), Equal(w, 'a'),
                 Last(z, x), Equal(z, 'c'),
                 Tail(u, x), AllButLast(v, u),
                 Successor(r, y), ANBNCNZ(v, r).

BN(x, y) :- Head(w, x), Equal(w, 'b'),
            Successor(y, z), Tail(u, x),
            BN(u, z).

What the above says is the following:

  • the empty string is in a^n b^n c^n
  • otherwise, a string is in a^n b^n c^n if f(s, 0) is true
  • f(s, n) is true if s consists only of n instances of 'b'
  • f(s, n) is true if s starts with a, ends with c, and f(s', n + 1) is true of everything in the middle

This should work since each recursive call of f(s, n) strips one a and one c off the ends and remembers how many it has counted. It then counts off that many instances of b once all the a and c are gone.

My feeling based on this is that we can probably do some or all of the context-sensitive languages as well. Probably, the lack of unbounded execution is precisely what distinguishes the linear-bounded automata (productions in phrase-structure grammars must have a RHS no longer than the LHS) from general unrestricted grammars (whose intermediate forms can grow and shrink arbitrarily).

like image 199
Patrick87 Avatar answered Oct 21 '22 03:10

Patrick87