I'm playing with a formalisation of a certified regular expression matcher in Idris (I believe that the same problem holds in any type theory based proof assistant, such as Agda and Coq) and I'm stuck on how to define semantics of the complement operation. I have the following data type to represent semantics of regular expressions:
data InRegExp : List Char -> RegExp -> Type where
InEps : InRegExp [] Eps
InChr : InRegExp [ a ] (Chr a)
InCat : InRegExp xs l ->
InRegExp ys r ->
zs = xs ++ ys ->
InRegExp zs (Cat l r)
InAltL : InRegExp xs l ->
InRegExp xs (Alt l r)
InAltR : InRegExp xs r ->
InRegExp xs (Alt l r)
InStar : InRegExp xs (Alt Eps (Cat e (Star e))) ->
InRegExp xs (Star e)
InComp : Not (InRegExp xs e) -> InRegExp xs (Comp e)
My problem is to represent the type of InComp constructor since it has a non-strictly positive occurrence of InRegExp
due to the usage of Not
. Since such data types can be used to define non-terminating functions, they are rejected by terminations checker. I would like to define such semantics in a way that it is accepted by Idris termination checker.
Is there some way that could I represent semantics of complement operation without have negative occurrences of InRegExp
?
You can define InRegex
by recursion on Regex
. In that case, strict positivity is no issue, but we have to recurse structurally:
import Data.List.Quantifiers
data Regex : Type where
Chr : Char -> Regex
Eps : Regex
Cat : Regex -> Regex -> Regex
Alt : Regex -> Regex -> Regex
Star : Regex -> Regex
Comp : Regex -> Regex
InRegex : List Char -> Regex -> Type
InRegex xs (Chr x) = xs = x :: []
InRegex xs Eps = xs = []
InRegex xs (Cat r1 r2) = (ys ** (zs ** (xs = ys ++ zs, InRegex ys r1, InRegex zs r2)))
InRegex xs (Alt r1 r2) = Either (InRegex xs r1) (InRegex xs r2)
InRegex xs (Star r) = (yss ** (All (\ys => InRegex ys r) yss, xs = concat yss))
InRegex xs (Comp r) = Not (InRegex xs r)
We would need an inductive type for the Star
case if we wanted to use our old definition. The following obviously doesn't work:
InRegex xs (Star r) = InRegex (Alt Eps (Cat r (Star r)))
However, we can just simply change the definition and make finiteness explicit:
InRegex xs (Star r) = (yss ** (All (\ys => InRegex ys r) yss, xs = concat yss))
This has the intended meaning and I don't see any drawbacks to it.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With