I'm very impressed by Prolog's DCG and how quickly I can produce all the possible structures that fit a particular grammar.
But I'd like to combine this search with other constraints. For example, define a complex grammar and ask Prolog to generate all sentences with not more than 10 words. Or all sentences that don't repeat the same word twice.
Is it possible to add extra constraints like this to a DCG grammer? Or do I basically have to translate the DCG back into normal Prolog clauses and start modifying them?
If you only want to see all sentences that are generated, it is very convenient to use the following:
?- length(Xs, N), phrase(mynonterminal, Xs).
Of course that generates all sentences. But it is very useful and it saves you the time to think of a concrete limit. If you want to restrict that further, add the goal between(0,10,N)
in front.
If you want to say within a grammar, that a certain non-terminal should take a certain length, it is best to say this explicitly:
seq([]) --> [].
seq([E|Es]) --> [E], seq(Es).
a --> {length(Es,10)}, seq(Es), {phrase(mynonterminal,Es)}.
If you are still not happy, then you want to express the intersection of two non-terminals. This is tantamount to asking the intersection of two context free languages which is in the general case undecidable. But much earlier, you will have problems with termination. So be aware of that in what follows:
:- op( 950, xfx, &).
(NT1 & NT2) -->
call(Xs0^Xs^(phrase(NT1,Xs0,Xs),phrase(NT2,Xs0,Xs))).
The following is only needed if you do not use library(lambda):
^(V0, Goal, V0, V) :-
call(Goal,V).
^(V, Goal, V) :-
call(Goal).
So this permits you now to express the intersection of two non-terminals. But please, be aware that termination is very brittle here. In particular, the termination of the first non-terminal does not necessarily limit the second.
well, you can always use {} and write any kind of prolog predicate in-between, for example:
foo(X)-->
{ valid(X) },
[a].
foo(X)-->
[b].
so you could add some sort of word counter. of course, if each token is a word you could simply write something like: length(L,N), N<11, start(L,[]).
on the other hand, perhaps it will be better, depending on the complexity of the constrains, to encode them in a different part. something like parser->semantic checker in compilers.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With