
Return multiple tokens in ocamllex

Is there any way to return multiple tokens in OCamlLex?

I'm trying to write a lexer and parser for an indentation based language, and I would like my lexer to return multiple DEDENT tokens when it notices that the indentation level is less than it previously was. This will allow it to notify the parser when multiple blocks have ended.

With this approach, INDENT and DEDENT would act as drop-in replacements for BEGIN and END, since each INDENT implies a BEGIN and each DEDENT implies an END.

Joe Bloggs asked Aug 09 '10 06:08



1 Answer

Return a list of tokens from each lexer rule. If the parser cannot consume a list natively (ocamlyacc, for example, expects one token per call), just insert a cache between the lexer and the parser:

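let cache =
  (* Tokens already produced by the lexer but not yet handed to the parser. *)
  let l = ref [] in
  fun lexbuf ->
    match !l with
    | x::xs -> l := xs; x
    | [] ->
      (* Buffer is empty: ask the lexer for the next batch of tokens. *)
      match Lexer.tokens lexbuf with
      | [] -> failwith "lexer returned no tokens"
      | x::xs -> l := xs; x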
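To make the idea concrete, here is a minimal, self-contained sketch of the two pieces involved. It is only an illustration under assumptions: the `token` type, the `indents` stack, and the names `indentation_tokens` and `make_cache` are hypothetical and not part of ocamllex or of the question's code. The first function computes which tokens a new line's indentation should produce (possibly several DEDENTs); the second is a variant of the cache above that skips empty batches instead of failing.

```ocaml
(* Hypothetical token type; a real one would be generated from the grammar. *)
type token = INDENT | DEDENT | WORD of string | EOF

(* Stack of open indentation widths, innermost first; 0 is the top level. *)
let indents = ref [0]

(* Tokens to emit for a line starting at column [col]: one INDENT if it is
   deeper than the current block, one DEDENT per block it closes otherwise. *)
let indentation_tokens col =
  let rec close acc =
    match !indents with
    | top :: rest when col < top -> indents := rest; close (DEDENT :: acc)
    | _ -> acc
  in
  match !indents with
  | top :: _ when col > top -> indents := col :: !indents; [INDENT]
  | _ ->
    let toks = close [] in
    (* After closing blocks, [col] must match an enclosing level exactly. *)
    (match !indents with
     | top :: _ when top <> col -> failwith "inconsistent dedent"
     | _ -> toks)

(* Adapt a list-returning lexer to the one-token-per-call interface.
   Unlike the cache above, an empty batch is skipped, so the underlying
   lexer must eventually return a non-empty list (e.g. [EOF]). *)
let make_cache (tokens : Lexing.lexbuf -> token list) =
  let pending = ref [] in
  let rec next lexbuf =
    match !pending with
    | x :: xs -> pending := xs; x
    | [] -> pending := tokens lexbuf; next lexbuf
  in
  next
```

In an ocamllex rule, the action for a newline followed by leading whitespace could call `indentation_tokens` with the width of that whitespace and return the resulting list, while the parser is handed `make_cache Lexer.tokens` in place of the raw lexer.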

Or you can run the lexer on the full document and then run the parser on the full token stream.
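A rough sketch of that second option, assuming the lexer returns a list per call and signals the end of input with an `EOF` token. The `token` type and the helper names `lex_all` and `replay` are hypothetical, chosen only for illustration.

```ocaml
(* Hypothetical token type; a real one would be generated from the grammar. *)
type token = INDENT | DEDENT | WORD of string | EOF

(* Run a list-returning lexer until a batch contains EOF,
   collecting every token in source order. *)
let lex_all (tokens : Lexing.lexbuf -> token list) lexbuf =
  let rec go acc =
    let toks = tokens lexbuf in
    let acc = List.rev_append toks acc in  (* acc is kept reversed *)
    if List.mem EOF toks then List.rev acc else go acc
  in
  go []

(* Wrap a finished token list as a lexer function for the parser.
   The lexbuf argument is ignored; EOF is repeated once the list runs out. *)
let replay all =
  let rest = ref all in
  fun (_ : Lexing.lexbuf) ->
    match !rest with
    | [] -> EOF
    | x :: xs -> rest := xs; x
```

With an ocamlyacc-style entry point (say a hypothetical `Parser.main`), this would be used roughly as `Parser.main (replay (lex_all Lexer.tokens lexbuf)) (Lexing.from_string "")`. The trade-off is that the parser only sees a dummy lexbuf, so source positions for error messages have to be carried some other way, for example inside the tokens themselves.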

By the way, have you seen ocaml+twt? It is a preprocessor that gives OCaml an indentation-based syntax.

ygrek answered Nov 06 '22 02:11
