Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Infinite type error when defining zip with foldr only; can it be fixed?

(for the context to this see this recent SO entry).

I tried to come up with the definition of zip using foldr only:

zipp :: [a] -> [b] -> [(a,b)]
zipp xs ys = zip1 xs (zip2 ys)
  where
     -- zip1 :: [a] -> tq -> [(a,b)]          -- zip1 xs :: tr ~ tq -> [(a,b)]
     zip1 xs q = foldr (\ x r q -> q x r ) n xs q 
                       -------- c --------
     n    q  = []

     -- zip2 :: [b] -> a -> tr -> [(a,b)]     -- zip2 ys :: tq ~ a -> tr -> [(a,b)]
     zip2 ys x r = foldr (\ y q x r -> (x,y) : r q ) m ys x r  
                         ---------- k --------------
     m  x r  = []
{-
      zipp [x1,x2,x3] [y1,y2,y3,y4]

    = c x1 (c x2 (c xn n)) (k y1 (k y2 (k y3 (k y4 m))))
           ---------------       ----------------------
            r                    q

    = k y1 (k y2 (k y3 (k y4 m))) x1 (c x2 (c xn n))
           ----------------------    ---------------
           q                         r
-}

It "works" on paper, but gives two "infinite type" errors:

Occurs check: cannot construct the infinite type:
  t1 ~ (a -> t1 -> [(a, b)]) -> [(a, b)]             -- tr

Occurs check: cannot construct the infinite type:
  t0 ~ a -> (t0 -> [(a, b)]) -> [(a, b)]             -- tq

Evidently, each type tr, tq, depends on the other, in a circular manner.

Is there any way to make it work, by some type sorcery or something?

I use Haskell Platform 2014.2.0.0 with GHCi 7.8.3 on Win7.

like image 447
Will Ness Avatar asked Apr 27 '15 01:04

Will Ness


2 Answers

My issue with using Fix to type zipp (as I pointed out in the comments to Carsten's answer to the prior question) is that no total language contains the Fix type:

newtype Fix a = Fix { unFix :: Fix a -> [a] }

fixList :: ([a] -> [a]) -> [a]
fixList f = (\g -> f (unFix g g)) $ Fix (\g -> f (unFix g g))

diverges :: [a]
diverges = fixList id

This may seem to be an obscure issue, but it really is nice to have an implementation in a total language, because that also constitutes a formal proof of termination. So let's find a type for zipp in Agda.

First, let's stick for a while with Haskell. If we manually unfold the definitions of zip1 and zip2 for some fixed lists, we find that all of the unfoldings have proper types, and we can apply any unfolding of zip1 to any unfolding of zip2, and the types line up (and we get the correct results).

-- unfold zip1 for [1, 0]
f0 k = []        -- zip1 []
f1 k = k 0 f0    -- zip1 [0]
f2 k = k 1 f1    -- zip1 [1, 0]

-- unfold zip2 for [5, 3]
g0 x r = []               -- zip2 []
g1 x r = (x, 3) : r g0    -- zip2 [3]
g2 x r = (x, 5) : r g1    -- zip2 [3, 5]

-- testing
f2 g2 -- [(1, 5), (0, 3)]
f2 g0 -- []

-- looking at some of the types in GHCI
f0 :: t -> [t1]
f1 :: Num a => (a -> (t1 -> [t2]) -> t) -> t
g0 :: t -> t1 -> [t2]
g1 :: Num t1 => t -> ((t2 -> t3 -> [t4]) -> [(t, t1)]) -> [(t, t1)]

We conjecture that the types can be unified for any particular combination of zip1-s and zip2-s, but we can't express this with the usual foldr, because there is an infinite number of different types for all the unfoldings. So we switch to Agda now.

Some preliminaries and the usual definition for dependent foldr:

open import Data.Nat
open import Data.List hiding (foldr)
open import Function
open import Data.Empty
open import Relation.Binary.PropositionalEquality
open import Data.Product

foldr :
    {A : Set}
    (B : List A → Set)
  → (∀ {xs} x → B xs → B (x ∷ xs))
  → B []    
  → (xs : List A)
  → B xs
foldr B f z []       = z
foldr B f z (x ∷ xs) = f x (foldr B f z xs)

We notice that the types of the unfoldings depend on the length of the to-be-zipped list, so we concoct two functions to generate these types. A is the type of the elements of the first list, B is the type of the elements of the second list and C is a parameter for the argument that we ignore when we get to the end of the list. n is the length of the list, of course.

Zip1 : Set → Set → Set → ℕ → Set
Zip1 A B C zero    = C → List (A × B)
Zip1 A B C (suc n) = (A → Zip1 A B C n → List (A × B)) → List (A × B)

Zip2 : Set → Set → Set → ℕ → Set
Zip2 A B C zero    = A → C → List (A × B)
Zip2 A B C (suc n) = A → (Zip2 A B C n → List (A × B)) → List (A × B)

We need to prove now that we can indeed apply any Zip1 to any Zip2, and get back a List (A × B) as result.

unifyZip : ∀ A B n m → ∃₂ λ C₁ C₂ → Zip1 A B C₁ n ≡ (Zip2 A B C₂ m → List (A × B))
unifyZip A B zero    m       = Zip2 A B ⊥ m , ⊥ , refl
unifyZip A B (suc n) zero    = ⊥ , Zip1 A B ⊥ n , refl
unifyZip A B (suc n) (suc m) with unifyZip A B n m
... | C₁ , C₂ , p = C₁ , C₂ , cong (λ t → (A → t → List (A × B)) → List (A × B)) p

The type of unifyZip in English: "for all A and B types and n and m natural numbers, there exist some C₁ and C₂ types such that Zip1 A B C₁ n is a function from Zip2 A B C₂ m to List (A × B)".

The proof itself is straightforward; if we hit the end of either zippers, we instantiate the input type of the empty zipper to the type of the other zipper. The use of the empty type (⊥) communicates that the choice of type for that parameter is arbitrary. In the recursive case we just bump the equality proof by one step of iteration.

Now we can write zipp:

zip1 : ∀ A B C (as : List A) → Zip1 A B C (length as)
zip1 A B C = foldr (Zip1 A B C ∘ length) (λ x r k → k x r) (λ _ → [])

zip2 : ∀ A B C (bs : List B) → Zip2 A B C (length bs)
zip2 A B C = foldr (Zip2 A B C ∘ length) (λ y k x r → (x , y) ∷ r k) (λ _ _ → [])

zipp : ∀ {A B : Set} → List A → List B → List (A × B)
zipp {A}{B} xs ys with unifyZip A B (length xs) (length ys)
... | C₁ , C₂ , p with zip1 A B C₁ xs | zip2 A B C₂ ys
... | zxs | zys rewrite p = zxs zys

If we squint a bit and try to ignore ignore the proofs in the code, we find that zipp is indeed operationally the same as the Haskell definition. In fact, the code becomes exactly the same after all the erasable proofs have been erased. Agda probably doesn't do this erasure, but the Idris compiler certainly does.

(As a side note, I wonder if we could make use of clever functions like zipp in fusion optimizations. zipp seems to be more efficient than Oleg Kiselyov's fold zipping. But zipp doesn't seem to have a System F type; maybe we could try encoding data as dependent eliminators (induction principles) instead of the usual eliminators, and try to fuse those representations?)

like image 77
András Kovács Avatar answered Nov 05 '22 19:11

András Kovács


Applying the insight from my other answer, I was able to solve it, by defining two mutually-recursive types:

-- tr ~ tq -> [(a,b)]
-- tq ~ a -> tr -> [(a,b)]

newtype Tr a b = R { runR ::  Tq a b -> [(a,b)] }
newtype Tq a b = Q { runQ ::  a -> Tr a b -> [(a,b)] }

zipp :: [a] -> [b] -> [(a,b)]
zipp xs ys = runR (zip1 xs) (zip2 ys)
  where
     zip1 = foldr (\ x r -> R $ \q -> runQ q x r ) n 
     n = R (\_ -> [])

     zip2 = foldr (\ y q -> Q $ \x r -> (x,y) : runR r q ) m 
     m = Q (\_ _ -> [])

main = print $ zipp [1..3] [10,20..]
-- [(1,10),(2,20),(3,30)]

The translation from type equivalency to type definition was purely mechanical, so maybe compiler could do this for us, too!

like image 27
Will Ness Avatar answered Nov 05 '22 21:11

Will Ness