What is an example of a lexical error and is it possible that a language has no lexical errors?

For our compiler theory class, we are tasked with creating a simple interpreter for a programming language of our own design. I am using JFlex and CUP as my generators, but I'm a bit stuck on what a lexical error is. Also, is it recommended that I use the state feature of JFlex? It feels wrong, as it seems like the parser is better suited to handling that aspect. And do you recommend any other tools for creating the language? I'm sorry if I'm impatient, but it's due on Tuesday.

cesar asked Aug 14 '10 18:08



2 Answers

A lexical error is any input that can be rejected by the lexer. This generally results from token recognition falling off the end of the rules you've defined. For example (in no particular syntax):

[0-9]+   ===> NUMBER token
[a-zA-Z] ===> LETTERS token
anything else ===> error!

If you think about a lexer as a finite state machine that accepts valid input strings, then errors are going to be any input strings that do not result in that finite state machine reaching an accepting state.
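To make that concrete, here is a minimal hand-written sketch of the rules above in Java. It is not what JFlex would generate; the class name `TinyLexer`, the `LexicalError` exception and the token strings are all made up for illustration, and it assumes LETTERS was meant to match one or more letters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written lexer for the two rules above (not JFlex output).
class TinyLexer {
    // Raised when no rule matches at the current position: a lexical error.
    static class LexicalError extends RuntimeException {
        final int position;

        LexicalError(char c, int position) {
            super("Lexical error: unexpected '" + c + "' at offset " + position);
            this.position = position;
        }
    }

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            int start = i;
            char c = input.charAt(i);
            if (isDigit(c)) {
                // [0-9]+ ===> NUMBER
                while (i < input.length() && isDigit(input.charAt(i))) i++;
                tokens.add("NUMBER(" + input.substring(start, i) + ")");
            } else if (isLetter(c)) {
                // [a-zA-Z]+ ===> LETTERS (assumed to mean one or more letters)
                while (i < input.length() && isLetter(input.charAt(i))) i++;
                tokens.add("LETTERS(" + input.substring(start, i) + ")");
            } else {
                // anything else ===> error!
                throw new LexicalError(c, i);
            }
        }
        return tokens;
    }

    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}
```

With this sketch, `TinyLexer.tokenize("abc123")` yields a LETTERS token followed by a NUMBER token, while `TinyLexer.tokenize("12@ab")` throws `LexicalError` at offset 2 because no rule covers '@'. Note that even a space is an error here, since the rules define no whitespace token.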

The rest of your question was rather unclear to me. If you already have some tools you are using, then perhaps you're best off learning how to achieve what you want with those tools (I have no experience with either of the tools you mentioned).

EDIT: Having re-read your question, there's a second part I can answer. It is possible for a language to have no lexical errors: that is a language in which every input string is valid input to the lexer.
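One way to get such a lexer, sketched here under my own assumptions rather than taken from any tool, is to add a catch-all rule that turns every otherwise unmatched character into a token. The class name `TotalLexer` and the `OTHER` token are hypothetical. The lexer then never rejects anything, and it is left to the parser to complain about stray `OTHER` tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical lexer with a catch-all rule: every input string tokenizes.
class TotalLexer {
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            int start = i;
            char c = input.charAt(i);
            if (isDigit(c)) {
                while (i < input.length() && isDigit(input.charAt(i))) i++;
                tokens.add("NUMBER(" + input.substring(start, i) + ")");
            } else if (isLetter(c)) {
                while (i < input.length() && isLetter(input.charAt(i))) i++;
                tokens.add("LETTERS(" + input.substring(start, i) + ")");
            } else {
                // Catch-all: any single unmatched character becomes a token,
                // so there is no input the lexer can reject.
                tokens.add("OTHER(" + c + ")");
                i++;
            }
        }
        return tokens;
    }

    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}
```

Here `TotalLexer.tokenize("12@ab")` returns three tokens (NUMBER, OTHER, LETTERS) instead of failing. Whether that is a good design is a separate question; it only moves the error report from the lexer to the parser.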

Gian answered Sep 18 '22 20:09


A lexical error could be a character that the language does not accept in that position, like '@', which cannot appear in a Java identifier (it's reserved).

Lexical errors are the errors thrown by your lexer when it is unable to continue, which means that there is no way to recognise a lexeme as a valid token for your lexer. Syntax errors, on the other hand, are thrown by your parser when a sequence of already recognised valid tokens doesn't match the right-hand side of any of your grammar rules.

it feels wrong as it seems like the parser is better suited to handling that aspect

No. It seems that way because context-free languages include regular languages (meaning that a parser can do the work of a lexer). But consider that a parser is a pushdown (stack) automaton, so you would be spending extra computing resources (the stack) to recognise something that doesn't require a stack to be recognised (a regular expression). That would be a suboptimal solution.

NOTE: by regular expression, I mean... regular expression in the Chomsky Hierarchy sense, not a java.util.regex.* class.
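As a rough illustration of the "no stack needed" point, here is a small sketch of a finite automaton for `[0-9]+` in Java. The class name `NumberDfa` and the state numbering are my own choices for the example. The only memory it uses is a single `int` holding the current state, however long the input is.

```java
// Hypothetical DFA for the regular expression [0-9]+.
// States: 0 = start, 1 = at least one digit seen (accepting), 2 = dead.
class NumberDfa {
    static boolean accepts(String input) {
        int state = 0; // the automaton's entire memory: no stack involved
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            boolean digit = c >= '0' && c <= '9';
            switch (state) {
                case 0:
                case 1:
                    state = digit ? 1 : 2;
                    break;
                default:
                    state = 2; // once dead, always dead
            }
        }
        return state == 1;
    }
}
```

`NumberDfa.accepts("2010")` is true, while `NumberDfa.accepts("")` and `NumberDfa.accepts("20x")` are false. A parser recognising the same thing would carry a stack it never really needs.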

Martín Schonaker answered Sep 18 '22 20:09