For our compiler theory class, we are tasked with creating a simple interpreter for a programming language of our own design. I am using JFlex and CUP as my generators, but I'm a bit stuck on what a lexical error actually is. Also, is it recommended that I use the state feature of JFlex? It feels wrong, as it seems like the parser is better suited to handling that aspect. And do you recommend any other tools for creating the language? I'm sorry if I'm being impatient, but it's due on Tuesday.
Error recovery in the lexical analyzer can take several forms:
- Panic mode: successive characters are skipped until a well-formed token can begin again.
- Deleting one character from the remaining input.
- Inserting a missing character into the remaining input.
- Replacing a character with another character.
A lexical error is a sequence of characters that does not match the pattern of any token. Lexical-phase errors are detected during scanning (lexical analysis), not during execution of the program.
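For example, in JFlex the simplest panic-mode recovery is a lowest-priority catch-all rule that reports the offending character and resumes scanning. This is only a sketch: [^] is JFlex's pattern for "any single character", and the %line/%column directives enable position tracking.

/* In the options section of the JFlex spec: */
%line
%column

/* Last rule in the rules section: it matches a single character only when
   no other rule does, reports the lexical error, and resumes scanning
   at the next character (panic mode). */
[^]    { System.err.println("Lexical error at line " + (yyline + 1)
             + ", column " + (yycolumn + 1)
             + ": unexpected character '" + yytext() + "'"); }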
A lexical error is any input that can be rejected by the lexer. This generally results from token recognition falling off the end of the rules you've defined. For example (in no particular syntax):
[0-9]+ ===> NUMBER token
[a-zA-Z] ===> LETTERS token
anything else ===> error!
If you think about a lexer as a finite state machine that accepts valid input strings, then errors are going to be any input strings that do not result in that finite state machine reaching an accepting state.
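Written as an actual JFlex specification wired to CUP, that rule set might look like the sketch below. It assumes CUP has generated a sym class with NUMBER and LETTERS constants; the %cup directive makes the generated scanner return java_cup.runtime.Symbol objects.

%%
%class Lexer
%cup
%%
[0-9]+        { return new java_cup.runtime.Symbol(sym.NUMBER, yytext()); }
[a-zA-Z]+     { return new java_cup.runtime.Symbol(sym.LETTERS, yytext()); }
[ \t\r\n]+    { /* skip whitespace */ }
/* anything else: no rule matches, so this is a lexical error */
[^]           { throw new RuntimeException("Lexical error: illegal character '" + yytext() + "'"); }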
The rest of your question was rather unclear to me. If you already have some tools you are using, then perhaps you're best off learning how to achieve what you want with those tools (I have no experience with either of the tools you mentioned).
EDIT: Having re-read your question, there's a second part I can answer. It is possible for a language to have no lexical errors: it would be a language in which any input string at all is valid input.
A lexical error could be a character that is invalid or unacceptable in the language, like '#' in Java (outside a string literal or comment): no token pattern matches it, so the scanner reports it as an illegal character.
Lexical errors are the errors thrown by your lexer when it is unable to continue, meaning there is no way to recognise the next lexeme as a valid token for your lexer. Syntax errors, on the other hand, are thrown by your parser when a given sequence of already-recognised, valid tokens does not match the right-hand side of any of your grammar rules.
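For instance, with a CUP grammar fragment like the sketch below, input such as "1 + +" lexes into perfectly valid NUMBER and PLUS tokens, but the token sequence matches no production, so it is the parser, not the lexer, that reports the error.

terminal PLUS;
terminal NUMBER;
non terminal expr;

precedence left PLUS;

expr ::= expr PLUS expr
       | NUMBER
       ;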
"it feels wrong as it seems like the parser is better suited to handling that aspect"
No. It seems that way because context-free languages include regular languages (meaning that a parser can do the work of a lexer). But consider that a parser is a stack automaton, so you would be spending extra computing resources (the stack) to recognise something that does not require a stack to be recognised (a regular expression). That would be a suboptimal solution.
NOTE: by regular expression, I mean a regular expression in the Chomsky hierarchy sense, not a java.util.regex.* class.
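To make the resource argument concrete: [0-9]+ can be recognised by a finite automaton with no memory beyond its current state, but arbitrarily nested parentheses cannot, because something has to track the nesting depth, and that is exactly what the parser's stack is for. In a CUP grammar (assuming LPAREN and RPAREN are declared terminals), the nested case is a rule such as:

expr ::= LPAREN expr RPAREN
       | NUMBER
       ;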