I'm using OCaml to build a recursive descent parser for a subset of Scheme. Here's the grammar:
S -> a|b|c|(T)
T -> ST | Epsilon
So say I have:
type expr =
Num of int | String of string | Tuple of expr * expr
Pseudocode
These functions have to return expr type to build the AST
parseS lr =
if head matches '(' then
parseL lr
else
match tokens a, b, or c
Using First Set of S which are tokens and '(':
parseL lr =
if head matches '(' or the tokens then
Tuple (parseS lr, parseL lr)
else
match Epsilon
My question is "How do I return for the Epsilon part since I just can't return ()?". An OCaml function requires same return type and even if I leave blank for Epsilon part, OCaml still assumes unit type.
As far as I can see, your AST doesn't match your grammar.
You can solve the problem by having a specifically empty node in your AST type to represent the Epsilon in your grammar.
Or, you can change your grammar to factor out the Epsilon.
Here's an equivalent grammar with no Epsilon:
S -> a|b|c|()|(T)
T -> S | S T
Maybe instead of creating parser-functions manually it is better to use existent approaches: for example, LALR(1) ocamlyacc or camlp4 based LL(k) parsers ?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With