In trying to decide if an EDSL is prudent for my project, I read this paper and this paper, describing the implementation of meta-repa. They both mention HOAS and FOAS. From the first paper,
data FunC a where
  LitI     :: Int -> FunC Int
  LitB     :: Bool -> FunC Bool
  If       :: FunC Bool -> FunC a -> FunC a -> FunC a
  While    :: (FunC s -> s -> FunC Bool) -> (FunC s -> FunC s) -> FunC s -> FunC s
  Pair     :: FunC a -> FunC b -> FunC (a, b)
  Fst      :: FunC (a, b) -> FunC a
  Snd      :: FunC (a, b) -> FunC b
  Prim1    :: String -> (a -> b) -> FunC a -> FunC b
  Prim2    :: String -> (a -> b -> c) -> FunC a -> FunC b -> FunC c
  Value    :: a -> FunC a
  Variable :: String -> FunC a
We have also chosen Higher Order Abstract Syntax to represent constructs with variable binding. In the above data type, the only higher-order construct is While.
What about the While constructor makes it HOAS? Why are none of the other constructors HOAS?
In the second paper, meta-repa code is written in a HOAS tree, then transformed (at compile-time) to a FOAS for further processing. Again, I don't understand what makes the data defined in HOAS.hs HOAS while the data defined in FOASTyped is FOAS. The mysterious quote from that paper is:
The type Expr [in HOAS.hs] uses higher order abstract syntax to represent programs. This representation is convenient for programming with but somewhat less ideal for rewriting programs. The AST is therefore converted into a first order representation[.] A possible implementation would have been to skip the [HOAS]Expr type and generate the first order representation directly. We have kept the higher order representation partly because it helps maintain the type safety of the implementation and partly because it allows us to write a well typed, tagless interpreter.
Is there some general way in which HOAS is more difficult to transform than FOAS? How does HOAS help with type safety compared to FOAS?
I've read the Wikipedia article on FOAS and HOAS, but that didn't clear anything up for me.
Wikipedia suggests that HOAS is useful in languages with variable binders (also mentioned in the first quote). What is a variable binder, how does Haskell implement it, and what languages don't have variable binders?
In FOAS we represent variables with identifiers, so
data STLC = Var String
          | Lam String STLC
          | Unit
          | STLC :*: STLC

term = Lam "a" $
         Lam "b" $
           Var "a" :*: (Lam "a" $ Var "a")
We have explicit variables, and now it's up to us to make sure that scoping and variable binding work properly. The extra work has its rewards, however, since we can now inspect and pattern match across a lambda's body, which is vital for most transformations.
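As a small illustration of what "pattern match across a lambda's body" buys us, here is a sketch that computes the free variables of a FOAS term. The function name freeVars is made up for this example; the data type and term are the ones from above.

```haskell
import Data.List (union)

-- FOAS: variables are plain identifiers stored in the tree.
data STLC = Var String
          | Lam String STLC
          | Unit
          | STLC :*: STLC

-- Hypothetical helper: walk the tree, looking under binders,
-- and collect every variable that no enclosing Lam binds.
freeVars :: STLC -> [String]
freeVars (Var x)      = [x]
freeVars (Lam x body) = filter (/= x) (freeVars body)
freeVars Unit         = []
freeVars (a :*: b)    = freeVars a `union` freeVars b

-- The term from above; every variable in it is bound.
term :: STLC
term = Lam "a" $
         Lam "b" $
           Var "a" :*: (Lam "a" $ Var "a")

-- Nothing stops us from writing an ill-scoped term; GHC accepts it.
openTerm :: STLC
openTerm = Lam "a" (Var "b")
```

Note that openTerm typechecks even though "b" is never bound, which is exactly the bookkeeping that FOAS leaves to us.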
HOAS is essentially where we use the host language's (Haskell's) implementation of variables instead of representing them in the AST.
For example, consider STLC
data STLC = Unit
          | Lam (STLC -> STLC)
          | STLC :*: STLC
Notice how we use a Haskell function of type STLC -> STLC to represent the body of a lambda, with the function's argument standing for the bound variable. That means that we can write
term = Lam $ \a ->
         Lam $ \b ->
           a :*: (Lam $ \a -> a)
and it works. In a normal AST, we'd have to make sure that we alpha-convert everything properly to respect scoping. This same advantage applies to all things that bind variables (variable binders): let expressions, continuations, exception handlers, whatever.
This comes with a major disadvantage, though: since Lam holds a fully abstract function, we can't inspect the body of the function at all. This makes a lot of transformations, well, painful, since everything is wrapped up under the Haskell binding.
Another benefit is that since we don't provide an explicit constructor for variables, all terms are guaranteed closed.
Usually this means we represent things with a combination of HOAS and FOAS.
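One common way to combine the two, sketched here under my own assumptions rather than taken from either paper, is to add a constructor for free variables to the HOAS type and use it only while converting to FOAS. The names H, F, HFree and toFOAS are invented for this example.

```haskell
-- HOAS terms. HFree is a hypothetical extra constructor that is only
-- meant to be used by the conversion below, not by user programs.
data H = HUnit
       | HLam (H -> H)
       | HPair H H
       | HFree String

-- The first-order counterpart, with explicit variable names.
data F = FVar String
       | FLam String F
       | FUnit
       | FPair F F
  deriving (Eq, Show)

-- Convert by applying each Haskell function to a fresh named variable.
-- The counter is the binder depth, so a name is never captured by an
-- enclosing binder.
toFOAS :: H -> F
toFOAS = go 0
  where
    go :: Int -> H -> F
    go _ HUnit       = FUnit
    go _ (HFree x)   = FVar x
    go n (HPair a b) = FPair (go n a) (go n b)
    go n (HLam f)    =
      let x = "x" ++ show n
      in  FLam x (go (n + 1) (f (HFree x)))

-- The HOAS term from above, written against this sketch's constructors.
example :: H
example = HLam $ \a -> HLam $ \b -> HPair a (HLam $ \a' -> a')
```

The price is that exposing HFree gives up the "all terms are closed" guarantee, so in practice one would hide it behind a module boundary.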
jozefg's answer explains what FOAS and HOAS are, so in this answer, I just try to answer the various smaller points from the question. Read jozefg's answer first, I guess.
What about the While constructor makes it HOAS?
Let's look at the second argument of the While constructor: While :: ... -> (FunC s -> FunC s) -> ... . In the type of this field, FunC shows up to the left of an arrow. So if you use While in a FunC program, your program is not an abstract syntax tree in memory, but something more complicated. The intended meaning of FunC s -> FunC s is "a FunC s with a free variable of type s". I guess this is used for the body of a while loop, and the free variable contains the value that changes in each loop iteration.
Why are none of the other constructors HOAS?
They don't have the ... -> (FunC ... -> ...) -> ... pattern we saw with the While constructor above. So if a FunC value only uses the other constructors, its memory representation looks like an abstract syntax tree.
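To make the difference between While and the other constructors concrete, here is a sketch of an evaluator for a cut-down FunC. It is my own simplification, not the paper's code: I dropped several constructors and assumed the loop condition has the type FunC s -> FunC Bool. The function fields of While can only be applied, so the evaluator feeds them the current state wrapped in Value.

```haskell
{-# LANGUAGE GADTs #-}

-- A cut-down FunC (assumption: condition simplified to FunC s -> FunC Bool).
data FunC a where
  LitI  :: Int -> FunC Int
  LitB  :: Bool -> FunC Bool
  While :: (FunC s -> FunC Bool) -> (FunC s -> FunC s) -> FunC s -> FunC s
  Prim2 :: String -> (a -> b -> c) -> FunC a -> FunC b -> FunC c
  Value :: a -> FunC a

eval :: FunC a -> a
eval (LitI i)        = i
eval (LitB b)        = b
eval (Prim2 _ f a b) = f (eval a) (eval b)
eval (Value a)       = a
eval (While cond body initial) = loop (eval initial)
  where
    -- The only thing we can do with cond and body is apply them,
    -- here to the current loop state.
    loop s
      | eval (cond (Value s)) = loop (eval (body (Value s)))
      | otherwise             = s

-- Hypothetical example program: count up from 0 while the state is below 10.
countUp :: FunC Int
countUp = While (\s -> Prim2 "<" (<) s (LitI 10))
                (\s -> Prim2 "+" (+) s (LitI 1))
                (LitI 0)
```

Evaluating is easy because Haskell's own function application does the substitution for us; inspecting the loop body is what becomes hard.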
Again, I don't understand what makes the data defined in HOAS.hs HOAS while the data defined in FOASTyped is FOAS.
You can look at the FOAS version of the code in the paper to see how they change the type of While to avoid the HOAS pattern, and what else they need to change to make it work.
Is there some general way in which HOAS is more difficult to transform than FOAS?
A HOAS program is not a tree, so you cannot pattern match on it. For example, you cannot pattern match on While (\_ -> LitB False) ... because you cannot match on lambdas like this.
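For contrast, here is a sketch of how the same rewrite looks once While is first order. The type FunF and its constructor WhileF are hypothetical and much smaller than the FOAS type in the paper; the loop variable is an explicit name, and condition and body are ordinary subtrees.

```haskell
{-# LANGUAGE GADTs #-}

-- Hypothetical first-order fragment: WhileF names its loop variable and
-- stores the condition and body as plain subtrees.
data FunF a where
  LitI     :: Int -> FunF Int
  LitB     :: Bool -> FunF Bool
  Variable :: String -> FunF a
  WhileF   :: String -> FunF Bool -> FunF s -> FunF s -> FunF s

-- A loop whose condition is literally False never runs,
-- so it can be replaced by its initial state.
simplify :: FunF a -> FunF a
simplify (WhileF _ (LitB False) _ initial) = simplify initial
simplify (WhileF v c b i) = WhileF v (simplify c) (simplify b) (simplify i)
simplify e = e

-- Small printer so the result of a rewrite can be inspected.
render :: FunF a -> String
render (LitI i)         = show i
render (LitB b)         = show b
render (Variable v)     = v
render (WhileF v c b i) =
  "while[" ++ v ++ " := " ++ render i ++ "] " ++ render c ++ " do " ++ render b

deadLoop :: FunF Int
deadLoop = WhileF "s" (LitB False) (Variable "s") (LitI 3)
```

Here simplify can see that the condition is LitB False because it is data, not a Haskell function.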
How does HOAS help with type safety compared to FOAS?
In a HOAS program, you use Haskell variables to represent FunC variables. The Haskell typechecker will check that you only use Haskell variables in the scope of a corresponding variable binding. (GHC tells you "Not in scope: `foo'" otherwise). Because FunC variables are represented as Haskell variables, this check is also useful for the type safety of FunC. If you use a HOAS-encoded FunC variable out of scope, the Haskell typechecker will complain about the Haskell variable being out of scope.
Now in FOAS, if you use Haskell Strings as FunC variables, the Haskell type checker will never complain if you use the wrong string, because you can use whatever string you want as far as GHC is concerned. There are techniques for improving FOAS to make the Haskell typechecker check your embedded program, but they tend to require more work from the user of the embedded language.
What is a variable binder?
A variable binder is a language construct that introduces new names that you can use in other parts of the program. For example, in Haskell, if I write let x = 14 in ... I introduce a new name x that I can use in the ... . Other binders in Haskell include lambda expressions, pattern matching, and top-level definitions.
how does Haskell implement it?
I don't really get this question. For typechecking, GHC keeps track of what variables are in scope where and complains if you use variables at the wrong place. For compilation, GHC generates machine code that "knows" where the values denoted by the variables are, usually because a pointer to the value of the variable is stored in a processor register or the stack or the heap.
and what languages don't have variable binders?
Many small and specialized languages don't have variable binders.
For example, consider regular expressions. At least originally, they cannot bind variables. (Some regular expression engines use backreferences, which are a form of variables, though).
Another example is the "language" of URLs. A URL is made of various parts (the protocol, server name, path, parameters, ...) with rules about what you can and cannot write, so it is a language. But you cannot introduce a name in a URL that you can later use in the URL.
Many low-level languages don't have variable binders.
There are Turing-complete languages without variable binders.