Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

An infinite language can't be regular? What is a finite language?

I read this in a book on computability:

(Kleene's Theorem) A language is regular if and only if it can be obtained from finite languages by applying the three operations union, concatenation, repetition a finite number of times.

I am struggling with "finite languages".

Consider this language: L = a*

It is not finite. It is the set {0, a, aa, aaa, ...} which is clearly an infinite set (0 = the empty string).

So it is an infinite language, right? That is, "infinite set" means "infinite language", right?

Clearly, a* is a regular language. And it is an infinite language. Thus, by Kleene's Theorem it cannot be a regular language. Contradiction.

I'm confused. I guess that I don't know what "finite language" means.

like image 455
Roger Costello Avatar asked Jul 01 '13 19:07

Roger Costello


People also ask

What is a finite language?

Finite languages, those containing only a finite number of words. These are regular languages, as one can create a regular expression that is the union of every word in the language.

Is a language regular if it is finite?

The following theorem shows that any finite language is regular. We say a language is finite if it consists of a finite number of strings, that is, a finite language is a set of n strings for some natural number n. Theorem 2: A finite language is regular.

Is a language infinite or finite?

A language is a set of strings. It is finite if it has a finite number of strings in it.

Are there any infinite regular languages?

(An infinite language is a language with infinitely many strings in it. {an | n ≥ 0}, {ambn | m, n ≥ 0}, and {a, b}∗ are all infinite regular languages.) Lemma 1. If A is an infinite language, then for every natural number n ≥ 0, there exists a string w ∈ A such that |w| > n.


3 Answers

A finite language is a language containing a finite number of words. The simplest cases are those containing no words at all, the empty string, and a single string consisting of a single symbol (e.g. a in your example).

I think your confusion comes from misreading the rule you quote (as are some of those commenting on the question).

(Kleene's Theorem) A language is regular if and only if it can be obtained from finite languages by applying the three operations union, concatenation, repetition a finite number of times.

The passage is not talking about the number of operations on strings needed to create all the strings in a language, but about the number of operations on simpler languages needed to define the language being defined. The language you mention is built by starting with a finite language (the set {"a"}) and applying the repetition operator once.

A different and less direct way of putting the point would appeal not to languages and operations on languages, but to expressions denoting languages and more complex expressions combining them: a language is regular if and only if it can be denoted by a regular expression containing a finite number of operators.

Take an expression like a, denoting the finite language containing just the single word "a". We can add a single repetition operator to that expression, and we get a*, the infinite language containing every concatenation of zero or more words from the first language. Every finite expression E we can build by starting from expressions denoting finite languages and by combining one or two smaller expressions F and G using the patterns E = F | G, E = FG, or E = F* will denote a regular language. Expressions denoting finite languages (languages with a finite number of words) are the base case when the rule is stated in terms of expressions; finite languages are the base case when the rule is stated directly in terms of languages, without any detours to expression-land.

If we allow union, concatenation, and repetition to be applied infinitely many times (or equivalently if we allow infinite expressions using the rules for regular exressions), the resulting language is not guaranteed regular. This is the analogue at the regular-language level of the observation that infinitely large context-free grammars can define non-context-free languages, as illustrated by van Wijngaarden grammars.

like image 171
C. M. Sperberg-McQueen Avatar answered Sep 17 '22 20:09

C. M. Sperberg-McQueen


Very shortly, your statement is saying:

A language is regular if and only if there is a regular expression for it.


The "can be obtained from finite languages by applying the three operations union, concatenation, repetition a finite number of times" part is essentially a quick verbal definition of a regular expression. Usually, a regular expression (RE) is formally defined starting with the following base cases:

  • ∅ (the empty set) is an RE
  • ε (the empty string) is an RE
  • a is an RE, where a is in the alphabet

These are all finite languages. We then obtain other REs by applying the following three recursive rules a finite number of times:

  • A | B is an RE where both A and B are REs
  • AB is an RE where both A and B are REs
  • A* is an RE where A is an RE

In the end, you can create infinite languages using finite descriptions (a regular expression).

like image 24
DPenner1 Avatar answered Sep 21 '22 20:09

DPenner1


You are on the right track, it could be clearer. Kleen's theorem expresses the equivalence of three statements

A language is regular == A language can be expressed by a Regular Expression == A language can be expressed by a finite automata.

Your example is indeed a regular language. A finite language is what you would expect it to be, a language that can be listed in a finite amount of time.

When they are talking about repetition, they are talking about the Kleen Star operation, which is exactly what a* represents, the set {empty, a, aa, aaa, aaaa, ...}

EDIT:

I have found this link: Kleenes Theorem which helps quite a bit. It by 'repetition' they mean Kleen Star, then the original statement makes sense. a* is Kleen_Star(a)

like image 31
dakillakan Avatar answered Sep 21 '22 20:09

dakillakan