How to return multiple tokens for one fslex rule pattern?

Tags:

f#

fslex

Using fslex, I would like to return multiple tokens for one pattern, but I don't see a way to accomplish that. Even using another rule function that returns multiple tokens would work for me.

I am trying to use something like this:

let identifier = [ 'a'-'z' 'A'-'Z' ]+

// ...

rule tokenize = parse
// ...
| '.' identifier '(' { let value = lexeme lexbuf
                       match operations.TryFind(value) with
                       // TODO: here is the problem:
                       // I would like to return something like [DOT; op; LPAREN]
                       | Some op -> op
                       | None    -> ID(value) }

| identifier         { ID (lexeme lexbuf) }
// ...

The problem I am trying to solve here is to match predefined tokens (see the operations map) only if the identifier appears between . and (. Otherwise, the match should be returned as an ID.

I am fairly new to fslex, so I am grateful for any pointers in the right direction.

asked Jan 14 '23 by kongo2002


1 Answer

Okay, here it is.

Each lexer rule (i.e. rule <name> = parse .. cases ..) defines a function <name> : LexBuffer<char> -> 'a, where 'a can be any type. Usually you return tokens (possibly defined for you by FsYacc), so you can parse text like this:

let parse text =
    let lexbuf = LexBuffer<char>.FromString text
    Parser.start Lexer.tokenize lexbuf

Where Parser.start is the parsing function (from your FsYacc file), of type (LexBuffer<char> -> Token) -> LexBuffer<char> -> AST (Token and AST are your types, nothing special about them).
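To make those types concrete, here is a sketch of what Token and AST might look like. These are hypothetical stand-ins for illustration only: in practice FsYacc generates the Token type from your %token declarations, and AST is whatever shape your grammar's semantic actions build.

type Token =            // normally generated by FsYacc from %token declarations
    | DOT
    | LPAREN
    | ID of string
    | EOF

type AST =              // whatever your grammar builds; a made-up example
    | Call of string * AST list
    | Var of string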

In your case, you want <name> : LexBuffer<char> -> 'a list, so all you have to do is this:

let parse' text =
    let lexbuf = LexBuffer<char>.FromString text
    let tokenize =
        let stack = ref []
        fun lexbuf ->
            // refill the stack whenever it runs empty
            while List.isEmpty !stack do
                stack := Lexer.tokenize lexbuf
            let (token :: stack') = !stack // can never get a match failure,
                                           // else the while wouldn't have exited
            stack := stack'
            token
    Parser.start tokenize lexbuf

This simply saves the tokens your lexer supplies and hands them to the parser one by one, generating more tokens as needed.
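With that wrapper in place, every case in the lexer returns a token list, and the rule from the question can emit all three tokens at once. Here is a sketch, reusing the operations map and lexeme helper from the question and the DOT/LPAREN/ID tokens assumed above; note that the matched lexeme includes the leading . and trailing (, so it is trimmed before the lookup:

rule tokenize = parse
// ...
| '.' identifier '(' { let value = lexeme lexbuf
                       // the lexeme is ".name(", so strip the first and last character
                       let name = value.Substring(1, value.Length - 2)
                       match operations.TryFind(name) with
                       | Some op -> [DOT; op; LPAREN]
                       | None    -> [DOT; ID(name); LPAREN] }
| identifier         { [ID (lexeme lexbuf)] }
// ...

Once the rule's type changes to a token list, every other case in the lexer must be wrapped in a list as well, as the identifier case above shows.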

answered Jan 29 '23 by Ramon Snir