Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Nullability (Regular Expressions)

In Brzozowski's "Derivatives of Regular Expressions" and elsewhere, the function δ(R) returning λ if a R is nullable, and ∅ otherwise, includes clauses such as the following:

δ(R1 + R2) = δ(R1) + δ(R2)
δ(R1 · R2) = δ(R1) ∧ δ(R2)

Clearly, if both R1 and R2 are nullable then (R1 · R2) is nullable, and if either R1 or R2 is nullable then (R1 + R2) is nullable. It is unclear to me what the above clauses are supposed to mean, however. My first thought, mapping (+), (·), or the Boolean operations to regular sets is nonsensical, since in the base case,

δ(a) = ∅ (for all a ∈ Σ)
δ(λ) = λ
δ(∅) = ∅

and λ is not a set (nor is a set the return type of δ, which is a regular expression). Furthermore, this mapping isn't indicated, and there is a separate notation for it. I understand nullability, but I'm lost on the definition of the sum, product, and Boolean operations in the definition of δ: how are λ or ∅ returned from δ(R1) ∧ δ(R2), for instance, in the definition off δ(R1 · R2)?

like image 551
emi Avatar asked Jan 02 '11 08:01

emi


2 Answers

I think you were right to map + and ^ to boolean or and and respectively. It looks like the two lines you cited deal with alternation (sum) and concatenation (product):

δ(R1 + R2) = δ(R1) + δ(R2)

The alternation of R1 and R2 is nullable if R1 is nullable, R2is nullable, or both R1 and R2 are nullable.

δ(R1 · R2) = δ(R1) ∧ δ(R2)

The concatenation of R1 and R2 is only nullable if both R1 and R2 are nullable.

See here for an Haskell implementation of these rules.

like image 175
Frédéric Hamidi Avatar answered Sep 22 '22 15:09

Frédéric Hamidi


I think you're getting caught out by the notational liberties taken by the author. The return type of δ(R) is most certainly a set, or rather a language. If you look at the definition:

alt text

you can see that there is an inconsistency in the return type, formally λ is an element, but ∅ is the empty language... What it should say is:

alt text

The fact that the author uses λ for both the empty string as well as the language containing only the empty string is further evidenced by his definition of the Kleene star operator:

alt text

Clearly, the last part should be alt text if we want to be pedantic.

Given that the return type of δ(R) is a set, or rather a language, the equations you give make perfect sense and express exactly what you described.

like image 35
wich Avatar answered Sep 22 '22 15:09

wich