Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it possible to write a data structure or data structures that represent only closed terms in Haskell or any other language?

Using the De Bruijn notation, it is possible to define lambda terms as:

data BTerm = BVar Int | BLam BTerm | BApp BTerm BTerm

Or using the usual notation, data Term = Var String | Lam String Term | App Term Term

These two datatypes allow construction of both closed terms and terms containing free variables.

Is it possible to define a datatype that allows construction of only closed terms. ie only the terms such as: \x.x, \x. x x, \x.\y. x y, \x.\y. y, \x.\y.\z.z(x y)

like image 837
Edgar Avatar asked May 29 '21 06:05

Edgar


People also ask

Can a variable be a data structure?

A rough definition of a data structure is that it allows you to store data and apply a set of operations on that data while preserving consistency of data before and after the operation. However some people insist that a primitive variable like 'int' can also be considered as a data structure.

Can a structure hold data?

is a definition with the field details and which can not hold any data in it.


3 Answers

You can use GADTs to force the list of free variables to be empty. The free variables can be kept in a type-level list. Below, I chose to use De Bruijn indices to represent variables.

We start by defining how to append two type level lists:

{-# LANGUAGE KindSignatures, DataKinds, TypeFamilies, TypeOperators,
    GADTs, ScopedTypeVariables, TypeApplications #-}
{-# OPTIONS -Wall #-}

import GHC.TypeLits
import Data.Proxy

-- Type level lists append
type family (xs :: [Nat]) ++ (ys :: [Nat]) :: [Nat] where
   '[]       ++ ys = ys
   (x ': xs) ++ ys = x ': (xs ++ ys)

Then we compute the free variables of \ t given those of t.

-- Adjust Debuijn indices under a lambda:
-- remove zeros, decrement positives
type family Lambda (xs :: [Nat]) where
   Lambda '[]       = '[]
   Lambda (0 ': xs) = Lambda xs
   Lambda (x ': xs) = x-1 ': Lambda xs

Finally our GADT:

-- "BTerm free" represents a lambda term with free variables "free"
data BTerm (free :: [Nat]) where
   BVar :: KnownNat n => BTerm '[n]
   BLam :: BTerm free -> BTerm (Lambda free)
   BApp :: BTerm free1 -> BTerm free2 -> BTerm (free1 ++ free2)

A type for closed terms is now trivial to define:

-- Closed terms have no free variables
type Closed = BTerm '[]

We are done. Let's write some tests. We start from a Show instance to be able to actually print the terms.

showBVar :: forall n. KnownNat n => BTerm '[n] -> String
showBVar _ = "var" ++ show (natVal (Proxy @n))

instance Show (BTerm free) where
   show t@BVar       = showBVar t
   show (BLam t)     = "\\ " ++ show t
   show (BApp t1 t2) = "(" ++ show t1 ++ ")(" ++ show t2 ++ ")"

And here's a couple of tests:

-- \x. \y. \z. z (x y)
-- Output: \ \ \ (var0)((var2)(var1))
test1 :: Closed
test1 = BLam (BLam (BLam (BApp z (BApp x y))))
   where
   z = BVar @0
   y = BVar @1
   x = BVar @2

-- \x. \y. x y (\z. z (x y))
-- Output: \ \ ((var1)(var0))(\ (var0)((var2)(var1)))
test2 :: Closed
test2 = BLam (BLam (BApp (BApp x' y') (BLam (BApp z (BApp x y)))))
   where
   z  = BVar @0
   y  = BVar @1
   x  = BVar @2
   y' = BVar @0
   x' = BVar @1
like image 103
chi Avatar answered Oct 18 '22 08:10

chi


If you want to encode it in some form that looks close to arbitrary lambda expressions, I suspect it will require some type-level programming to track the current lambda depth. This will make it hard to combine terms without knowing at compile time what types those terms have.

But if you don't mind something equivalent that looks quite different, it is well known that SKI combinators are equivalent to lambda calculus. And since SKI offers no explicit lambdas or indeed variable references at all, there's no way to encode a non-closed term.

like image 32
amalloy Avatar answered Oct 18 '22 09:10

amalloy


Sure, this is even implemented as a library in bound, which is a generalisation of a representation of De Bruijn indices using a (polymorphically) recursive data type:

data Var bound free
  = Bound bound
  | Free free

newtype Scope bound term a
  = Scope { runScope :: term (Var bound (term a)) }

data Term var
  = Var var
  | App (Term var) (Term var)
  | Lam (Scope () Term var)
  -- == Lam (Term (Var () (Term var)))

The type of terms is indexed by the type of variables in those terms, both bound and free. When defining a new nested scope in a lambda term, we change that type: for bound variables, we just add another level of nesting, here annotated with () to give plain De Bruijn indices; for free variables, we just pass the type along, but also add a level of nesting with Term. The nesting at the type level reflects the number of De Bruijn levels that indices may refer to.

Now Term Void is the type of a term with no free variables; it may still have bound variables, by virtue of the fact that the recursion is polymorphic: Lam (Scope (Var (Bound ()))) :: Term Void represents (λx. x).

This method works in plain Haskell 98 (i.e. without GADTs), although there are advantages to adding some fancy types, e.g. I regularly use them for typed ASTs and statically typed typecheckers. Ed Kmett gave a nice overview of the design of bound at School of Haskell. There are related libraries in this space such as unbound and unbound-generics.

like image 44
Jon Purdy Avatar answered Oct 18 '22 08:10

Jon Purdy