Using the De Bruijn notation, it is possible to define lambda terms as:
data BTerm = BVar Int | BLam BTerm | BApp BTerm BTerm
Or using the usual notation,
data Term = Var String | Lam String Term | App Term Term
These two datatypes allow construction of both closed terms and terms containing free variables.
Is it possible to define a datatype that allows construction of only closed terms. ie only the terms such as: \x.x, \x. x x, \x.\y. x y, \x.\y. y, \x.\y.\z.z(x y)
A rough definition of a data structure is that it allows you to store data and apply a set of operations on that data while preserving consistency of data before and after the operation. However some people insist that a primitive variable like 'int' can also be considered as a data structure.
is a definition with the field details and which can not hold any data in it.
You can use GADTs to force the list of free variables to be empty. The free variables can be kept in a type-level list. Below, I chose to use De Bruijn indices to represent variables.
We start by defining how to append two type level lists:
{-# LANGUAGE KindSignatures, DataKinds, TypeFamilies, TypeOperators,
GADTs, ScopedTypeVariables, TypeApplications #-}
{-# OPTIONS -Wall #-}
import GHC.TypeLits
import Data.Proxy
-- Type level lists append
type family (xs :: [Nat]) ++ (ys :: [Nat]) :: [Nat] where
'[] ++ ys = ys
(x ': xs) ++ ys = x ': (xs ++ ys)
Then we compute the free variables of \ t
given those of t
.
-- Adjust Debuijn indices under a lambda:
-- remove zeros, decrement positives
type family Lambda (xs :: [Nat]) where
Lambda '[] = '[]
Lambda (0 ': xs) = Lambda xs
Lambda (x ': xs) = x-1 ': Lambda xs
Finally our GADT:
-- "BTerm free" represents a lambda term with free variables "free"
data BTerm (free :: [Nat]) where
BVar :: KnownNat n => BTerm '[n]
BLam :: BTerm free -> BTerm (Lambda free)
BApp :: BTerm free1 -> BTerm free2 -> BTerm (free1 ++ free2)
A type for closed terms is now trivial to define:
-- Closed terms have no free variables
type Closed = BTerm '[]
We are done. Let's write some tests. We start from a Show
instance to be able to actually print the terms.
showBVar :: forall n. KnownNat n => BTerm '[n] -> String
showBVar _ = "var" ++ show (natVal (Proxy @n))
instance Show (BTerm free) where
show t@BVar = showBVar t
show (BLam t) = "\\ " ++ show t
show (BApp t1 t2) = "(" ++ show t1 ++ ")(" ++ show t2 ++ ")"
And here's a couple of tests:
-- \x. \y. \z. z (x y)
-- Output: \ \ \ (var0)((var2)(var1))
test1 :: Closed
test1 = BLam (BLam (BLam (BApp z (BApp x y))))
where
z = BVar @0
y = BVar @1
x = BVar @2
-- \x. \y. x y (\z. z (x y))
-- Output: \ \ ((var1)(var0))(\ (var0)((var2)(var1)))
test2 :: Closed
test2 = BLam (BLam (BApp (BApp x' y') (BLam (BApp z (BApp x y)))))
where
z = BVar @0
y = BVar @1
x = BVar @2
y' = BVar @0
x' = BVar @1
If you want to encode it in some form that looks close to arbitrary lambda expressions, I suspect it will require some type-level programming to track the current lambda depth. This will make it hard to combine terms without knowing at compile time what types those terms have.
But if you don't mind something equivalent that looks quite different, it is well known that SKI combinators are equivalent to lambda calculus. And since SKI offers no explicit lambdas or indeed variable references at all, there's no way to encode a non-closed term.
Sure, this is even implemented as a library in bound, which is a generalisation of a representation of De Bruijn indices using a (polymorphically) recursive data type:
data Var bound free
= Bound bound
| Free free
newtype Scope bound term a
= Scope { runScope :: term (Var bound (term a)) }
data Term var
= Var var
| App (Term var) (Term var)
| Lam (Scope () Term var)
-- == Lam (Term (Var () (Term var)))
The type of terms is indexed by the type of variables in those terms, both bound and free. When defining a new nested scope in a lambda term, we change that type: for bound variables, we just add another level of nesting, here annotated with ()
to give plain De Bruijn indices; for free variables, we just pass the type along, but also add a level of nesting with Term
. The nesting at the type level reflects the number of De Bruijn levels that indices may refer to.
Now Term Void
is the type of a term with no free variables; it may still have bound variables, by virtue of the fact that the recursion is polymorphic: Lam (Scope (Var (Bound ()))) :: Term Void
represents (λx. x).
This method works in plain Haskell 98 (i.e. without GADTs), although there are advantages to adding some fancy types, e.g. I regularly use them for typed ASTs and statically typed typecheckers. Ed Kmett gave a nice overview of the design of bound
at School of Haskell. There are related libraries in this space such as unbound and unbound-generics.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With