Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

LaTeX natural deduction proofs using Haskell

How can one create LaTeX source for natural deduction proof trees (like those shown here) via Haskell eg using HaTeX? I'd like to emulate LaTeX .stys like bussproofs.sty or proof.sty.

like image 422
pravnar Avatar asked Nov 12 '13 01:11

pravnar


1 Answers

I'm using your question as an excuse to improve and demo a Haskell call-tracing library I'm working on. In the context of tracing, an obvious way to create a proof tree is to trace a type checker and then format the trace as a natural-deduction proof. To keep things simple my example logic is the simply-typed lambda calculus (STLC), which corresponds to the implicational fragment of propositional intuitionistic logic.

I am using proofs.sty, but not via HaTeX or any other Haskell Latex library. The Latex for proof trees is very simple and using a Haskell Latex library would just complicate things.

I've written the proof-tree generation code twice:

  1. in a self-contained way, by writing a type checker that also returns a proof tree;

  2. using my tracing library, by call-tracing a type checker and then post processing the trace into a proof tree.

Since you didn't ask about call-tracing libraries, you may be less interested in the call-trace based version, but I think it's interesting to compare both versions.

Examples

Let's start with some output examples first, to see what all this gets us. The first three examples are motivated by an axiom system for implicational propositional calculus; the first two also happen to correspond to S and K:

  • The first axiom, K, with proof terms:

    K combinator, with proof terms

  • The second axiom, S, with proof terms, but with premises in the context, not lambda bound:

    S combinator, via context, with proof terms

  • The fourth axiom, modus ponens, without proof terms:

    Modus Ponens, without proof terms

The third axiom in that Wikipedia article (Peirce's law) is non-constructive and so we can't prove it here.

For a different kind of example, here's a failed type check of the Y combinator:

Y combinator, error, with proof terms

The arrows are meant to lead you to the error, which is marked with a bang (!).

Code

Now I'll describe the code which generated those examples. The code is from this file unless otherwise noted. I'm not including every line of code here; see that link if you want something you can actually build using GHC 7.6.3.

Most of the code -- the grammar, parser, and pretty printer -- is the same for both versions; only the type checkers and proof-tree generators differ. All of the common code is in the file just referenced.

STLC grammar

The STLC grammar in ASCII:

-- Terms
e ::= x | \x:T.e | e e
-- Types
T ::= A | T -> T
-- Contexts
C ::= . | C,x:T

And the corresponding Haskell:

type TmVar = String
type TyVar = String
data Tm = Lam TmVar Ty Tm
        | TmVar TmVar
        | Tm :@: Tm
  deriving Show
data Ty = TyVar TyVar
        | Ty :->: Ty
  deriving (Eq , Show)
type Ctx = [(TmVar,Ty)]

Type checking + proof tree generation

Both versions implement the same abstract STLC type checker. In ASCII:

(x:T) in C
---------- Axiom
C |- x : T

C,x:T1 |- e : T2
--------------------- -> Introduction
C |- \x:T1.e : T1->T2

C |- e : T1 -> T2    C |- e1 : T1
--------------------------------- -> Elimination
C |- e e1 : T2

Version 1: self-contained with inline proof-tree generation

The full code for this version is here.

The proof-tree generation happens in the type checker, but the actual proof-tree generation code is factored out into addProof and conclusion.

Type checking

-- The mode is 'True' if proof terms should be included.
data R = R { _ctx :: Ctx , _mode :: Bool }
type M a = Reader R a

extendCtx :: TmVar -> Ty -> M a -> M a
extendCtx x t = local extend where
  extend r = r { _ctx = _ctx r ++ [(x,t)] }

-- These take the place of the inferred type when there is a type
-- error.
here , there :: String
here = "\\,!"
there = "\\,\\uparrow"

-- Return the inferred type---or error string if type inference
-- fails---and the latex proof-tree presentation of the inference.
--
-- This produces different output than 'infer' in the error case: here
-- all premises are always computed, whereas 'infer' stops at the
-- first failing premise.
inferProof :: Tm -> M (Either String Ty , String)
inferProof tm@(Lam x t e) = do
  (et' , p) <- extendCtx x t . inferProof $ e
  let et'' = (t :->:) <$> et'
  addProof et'' [p] tm
inferProof tm@(TmVar x) = do
  mt <- lookup x <$> asks _ctx
  let et = maybe (Left here) Right mt
  addProof et [] tm
inferProof tm@(e :@: e1) = do
  (et , p) <- inferProof e
  (et1 , p1) <- inferProof e1
  case (et , et1) of
    (Right t , Right t1) ->
      case t of
        t1' :->: t2 | t1' == t1 -> addProof (Right t2)   [p , p1] tm
        _ ->                       addProof (Left here)  [p , p1] tm
    _ ->                           addProof (Left there) [p , p1] tm

Proof tree generation

The addProof corresponds to proofTree in the other version:

-- Given the inferred type, the proof-trees for all premise inferences
-- (subcalls), and the input term, annotate the inferred type with a
-- result proof tree.
addProof :: Either String Ty -> [String] -> Tm -> M (Either String Ty , String)
addProof et premises tm = do
  R { _mode , _ctx } <- ask
  let (judgment , rule) = conclusion _mode _ctx tm et
  let tex = "\\infer[ " ++ rule ++ " ]{ " ++
            judgment ++ " }{ " ++
            intercalate " & " premises ++ " }"
  return (et , tex)

The code for conclusion is common to both versions:

conclusion :: Mode -> Ctx -> Tm -> Either String Ty -> (String , String)
conclusion mode ctx tm e = (judgment mode , rule tm)
  where
    rule (TmVar _) = "\\textsc{Axiom}"
    rule (Lam {}) = "\\to \\text{I}"
    rule (_ :@: _) = "\\to \\text{E}"

    tyOrError = either id pp e

    judgment True = pp ctx ++ " \\vdash " ++ pp tm ++ " : " ++ tyOrError
    judgment False = ppCtxOnlyTypes ctx ++ " \\vdash " ++ tyOrError

Version 2: via call-tracing, with proof-tree generation as post processing

Here the type checker is not even aware of proof-tree generation, and adding call-tracing is just one line.

Type checking

type Mode = Bool
type Stream = LogStream (ProofTree Mode)
type M a = ErrorT String (ReaderT Ctx (Writer Stream)) a

type InferTy = Tm -> M Ty
infer , infer' :: InferTy
infer = simpleLogger (Proxy::Proxy "infer") ask (return ()) infer'

infer' (TmVar x) = maybe err pure . lookup x =<< ask where
  err = throwError $ "Variable " ++ x ++ " not in context!"
infer' (Lam x t e) = (t :->:) <$> (local (++ [(x,t)]) . infer $ e)
infer' (e :@: e1) = do
  t <- infer e
  t1 <- infer e1
  case t of
    t1' :->: t2 | t1' == t1 -> pure t2
    _ -> throwError $ "Can't apply " ++ show t ++ " to " ++ show t1 ++ "!"

The LogStream type and ProofTree class are from the library. The LogStream is the type of log events that the "magic" simpleLogger logs. Note the line

infer = simpleLogger (Proxy::Proxy "infer") ask (return ()) infer'

which defines infer to be a logged version of infer', the actual type checker. That's all you have to do to trace a monadic function!

I won't get into how simpleLogger actually works here, but the result is that each call to infer gets logged, including the context, arguments, and return value, and these data get grouped together with all logged subcalls (here only to infer). It would be easy to manually write such logging code for infer, but it's nice that with the library you don't have to.

Proof-tree generation

To generate the Latex proof trees, we implement ProofTree to post process infer's call trace. The library provides a proofTree function that calls the ProofTree methods and assembles the proof trees; we just need to specify how the conclusions of the typing judgments will be formatted:

instance ProofTree Mode (Proxy (SimpleCall "infer" Ctx InferTy ())) where
  callAndReturn mode t = conclusion mode ctx tm (Right ty)
    where
      (tm , ()) = _arg t
      ty = _ret t
      ctx = _before t
  callAndError mode t = conclusion mode ctx tm (Left error)
    where
      (tm , ()) = _arg' t
      how = _how t
      ctx = _before' t
      error = maybe "\\,!" (const "\\,\\uparrow") how

The pp calls are to a user defined pretty printer; obviously, the library can't know how to pretty print your data types.

Because calls can be erroneous -- the library detects errors -- we have to say how to format successful and failing calls. Refer to the Y-combinator example above for an example a failing type check, corresponding to the callAndError case here.

The library's proofTree function is quite simple: it builds a proofs.sty proof tree with the current call as conclusion, and the subcalls as premises:

proofTree :: mode -> Ex2T (LogTree (ProofTree mode)) -> String
proofTree mode (Ex2T t@(CallAndReturn {})) =
  "\\infer[ " ++ rule ++ " ]{ " ++ conclusion ++ " }{ " ++ intercalate " & " premises ++ " }"
  where
    (conclusion , rule) = callAndReturn mode t
    premises = map (proofTree mode) (_children t)
proofTree mode (Ex2T t@(CallAndError {})) =
  "\\infer[ " ++ rule ++ " ]{ " ++ conclusion ++ " }{ " ++ intercalate " & " premises ++ " }"
  where
    (conclusion , rule) = callAndError mode t
    premises = map (proofTree mode)
                   (_children' t ++ maybe [] (:[]) (_how t))

I use proofs.sty in the library because it allows arbitrarily many premises, although bussproofs.sty would work for this STLC example since no rule has more than five premises (the limit for bussproofs). Both Latex packages are described here.

Pretty printing

Now we return to code that is common between both versions.

The pretty printer that defines the pp used above is rather long -- it handles precedence and associativity, and is written in a way that should be extensible if more terms, e.g. products, are added to the calculus -- but mostly straightforward. First, we set up a precedence table and a precedence-and-associativity-aware parenthesizer:

- Precedence: higher value means tighter binding.
type Prec = Double

between :: Prec -> Prec -> Prec
between x y = (x + y) / 2

lowest , highest , precLam , precApp , precArr :: Prec
highest = 1
lowest = 0
precLam = lowest
precApp = between precLam highest
precArr = lowest

-- Associativity: left, none, or right.
data Assoc = L | N | R deriving Eq

-- Wrap a pretty print when the context dictates.
wrap :: Pretty a => Assoc -> a -> a -> String
wrap side ctx x = if prec x `comp` prec ctx
                  then pp x
                  else parens . pp $ x
  where
    comp = if side == assoc x || assoc x == N
           then (>=)
           else (>)
    parens s = "(" ++ s ++ ")"

And then we define the individual pretty printers:

class Pretty t where
  pp :: t -> String
  prec :: t -> Prec
  prec _ = highest
  assoc :: t -> Assoc
  assoc _ = N

instance Pretty Ty where
  pp (TyVar v) = v
  pp t@(t1 :->: t2) = wrap L t t1 ++ " {\\to} " ++ wrap R t t2
  prec (_ :->: _) = precArr
  prec _ = highest
  assoc (_ :->: _) = R
  assoc _ = N

instance Pretty Tm where
  pp (TmVar v) = v
  pp (Lam x t e) = "\\lambda " ++ x ++ " {:} " ++ pp t ++ " . " ++ pp e
  pp e@(e1 :@: e2) = wrap L e e1 ++ " " ++ wrap R e e2
  prec (Lam {}) = precLam
  prec (_ :@: _) = precApp
  prec _ = highest
  assoc (_ :@: _) = L
  assoc _ = N

instance Pretty Ctx where
  pp [] = "\\cdot"
  pp ctx@(_:_) =
    intercalate " , " [ x ++ " {:} " ++ pp t | (x,t) <- ctx ]

By adding a "mode" argument, it would be easy to use the same pretty printer to print plain ASCII, which would be useful with other call-trace post processors, such as the (unfinished) UnixTree processor.

Parsing

A parser is not essential to the example, but of course I did not enter the example input terms directly as Haskell Tms.

Recall the STLC grammar in ASCII:

-- Terms
e ::= x | \x:T.e | e e
-- Types
T ::= A | T -> T
-- Contexts
C ::= . | C,x:T

This grammar is ambiguous: both the term application e e and function type T -> T have no associativity given by the grammar. But in STLC term application is left associative and function types are right associative, and so the corresponding disambiguated grammar we actually parse is

-- Terms
e ::= e' | \x:T.e | e e'
e' ::= x | ( e )
-- Types
T ::= T' | T' -> T
T' ::= A | ( T )
-- Contexts
C ::= . | C,x:T

The parser is maybe too simple -- I'm not using a languageDef and it's whitespace sensitive -- but it gets the job done:

type P a = Parsec String () a

parens :: P a -> P a
parens = Text.Parsec.between (char '(') (char ')')

tmVar , tyVar :: P String
tmVar = (:[]) <$> lower
tyVar = (:[]) <$> upper

tyAtom , arrs , ty :: P Ty
tyAtom = parens ty
     <|> TyVar <$> tyVar
arrs = chainr1 tyAtom arrOp where
  arrOp = string "->" *> pure (:->:)
ty = arrs

tmAtom , apps , lam , tm :: P Tm
tmAtom = parens tm
     <|> TmVar <$> tmVar
apps = chainl1 tmAtom appOp where
  appOp = pure (:@:)
lam = uncurry Lam <$> (char '\\' *> typing)
                  <*> (char '.' *> tm)
tm = apps <|> lam

typing :: P (TmVar , Ty)
typing = (,) <$> tmVar
             <*> (char ':' *> ty)

ctx :: P Ctx
ctx = typing `sepBy` (char ',')

To clarify what the input terms look like, here are the examples from the Makefile:

#           OUTFILE  CONTEXT                TERM
./tm2latex.sh S.ctx  'x:P->Q->R,y:P->Q,z:P' 'xz(yz)'
./tm2latex.sh S.lam  ''                     '\x:P->Q->R.\y:P->Q.\z:P.xz(yz)'
./tm2latex.sh S.err  ''                     '\x:P->Q->R.\y:P->Q.\z:P.xzyz'
./tm2latex.sh K.ctx  'x:P,y:Q'              'x'
./tm2latex.sh K.lam  ''                     '\x:P.\y:Q.x'
./tm2latex.sh I.ctx  'x:P'                  'x'
./tm2latex.sh I.lam  ''                     '\x:P.x'
./tm2latex.sh MP.ctx 'x:P,y:P->Q'           'yx'
./tm2latex.sh MP.lam ''                     '\x:P.\y:P->Q.yx'
./tm2latex.sh ZERO   ''                     '\s:A->A.\z:A.z'
./tm2latex.sh SUCC   ''                     '\n:(A->A)->(A->A).\s:A->A.\z:A.s(nsz)'
./tm2latex.sh ADD    '' '\m:(A->A)->(A->A).\n:(A->A)->(A->A).\s:A->A.\z:A.ms(nsz)'
./tm2latex.sh MULT   '' '\m:(A->A)->(A->A).\n:(A->A)->(A->A).\s:A->A.\z:A.m(ns)z'
./tm2latex.sh Y.err  ''                        '\f:A->A.(\x:A.f(xx))(\x:A.f(xx))'
./tm2latex.sh Y.ctx  'a:A->(A->A),y:(A->A)->A' '\f:A->A.(\x:A.f(axx))(y(\x:A.f(axx)))'

Latex document generation

The ./tm2latex.sh script just calls pdflatex on the output of the Haskell programs described above. The Haskell programs produce the proof tree and then wrap it in a minimal Latex document:

unlines
  [ "\\documentclass[10pt]{article}"
  , "\\usepackage{proof}"
  , "\\usepackage{amsmath}"
  , "\\usepackage[landscape]{geometry}"
  , "\\usepackage[cm]{fullpage}"
  -- The most slender font I could find:
  -- http://www.tug.dk/FontCatalogue/iwonalc/
  , "\\usepackage[light,condensed,math]{iwona}"
  , "\\usepackage[T1]{fontenc}"
  , "\\begin{document}"
  , "\\tiny"
  , "\\[" ++ tex ++ "\\]"
  , "\\end{document}"
  ]

As you can see, most of the Latex is devoted to making the proof trees as small as possible; I plan to also write an ASCII proof tree post processor, which may be more useful in practice when the examples are larger.

Conclusion

As always, it takes a bit of code to write a parser, type checker, and pretty printer. On top of that, adding proof-tree generation is pretty simple in both versions. This is a fun toy example, but I expect to do something similar in the context of a "real" unification-based type checker for a dependently-typed language; there I expect call tracing and proof-tree generation (in ASCII) to provide significant help in debugging the type checker.

like image 154
ntc2 Avatar answered Nov 11 '22 00:11

ntc2