Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Formalising regular expressions with a complement operation

I'm playing with a formalisation of a certified regular expression matcher in Idris (I believe that the same problem holds in any type theory based proof assistant, such as Agda and Coq) and I'm stuck on how to define semantics of the complement operation. I have the following data type to represent semantics of regular expressions:

data InRegExp : List Char -> RegExp -> Type where
 InEps : InRegExp [] Eps
 InChr : InRegExp [ a ] (Chr a)
 InCat : InRegExp xs l ->
         InRegExp ys r ->
         zs = xs ++ ys ->
         InRegExp zs (Cat l r)
 InAltL : InRegExp xs l ->
          InRegExp xs (Alt l r)
 InAltR : InRegExp xs r ->
          InRegExp xs (Alt l r)
 InStar : InRegExp xs (Alt Eps (Cat e (Star e))) ->
          InRegExp xs (Star e)
 InComp : Not (InRegExp xs e) -> InRegExp xs (Comp e)

My problem is to represent the type of InComp constructor since it has a non-strictly positive occurrence of InRegExp due to the usage of Not. Since such data types can be used to define non-terminating functions, they are rejected by terminations checker. I would like to define such semantics in a way that it is accepted by Idris termination checker.

Is there some way that could I represent semantics of complement operation without have negative occurrences of InRegExp?

like image 927
Rodrigo Ribeiro Avatar asked Dec 08 '22 23:12

Rodrigo Ribeiro


1 Answers

You can define InRegex by recursion on Regex. In that case, strict positivity is no issue, but we have to recurse structurally:

import Data.List.Quantifiers

data Regex : Type where
  Chr  : Char -> Regex
  Eps  : Regex
  Cat  : Regex -> Regex -> Regex
  Alt  : Regex -> Regex -> Regex
  Star : Regex -> Regex
  Comp : Regex -> Regex

InRegex : List Char -> Regex -> Type
InRegex xs (Chr x)     = xs = x :: []
InRegex xs Eps         = xs = []
InRegex xs (Cat r1 r2) = (ys ** (zs ** (xs = ys ++ zs, InRegex ys r1, InRegex zs r2)))
InRegex xs (Alt r1 r2) = Either (InRegex xs r1) (InRegex xs r2)
InRegex xs (Star r)    = (yss ** (All (\ys => InRegex ys r) yss, xs = concat yss))
InRegex xs (Comp r)    = Not (InRegex xs r)

We would need an inductive type for the Star case if we wanted to use our old definition. The following obviously doesn't work:

InRegex xs (Star r) = InRegex (Alt Eps (Cat r (Star r)))

However, we can just simply change the definition and make finiteness explicit:

InRegex xs (Star r) = (yss ** (All (\ys => InRegex ys r) yss, xs = concat yss))

This has the intended meaning and I don't see any drawbacks to it.

like image 138
András Kovács Avatar answered Dec 10 '22 13:12

András Kovács