Skip to content

Lambda calculus typer from scratch - Correction: unifyCtx was wrong

WARNING

This post corrects a bug in Lambda calculus typer from scratch - Part 2. If you read that post, please read this one too.

Corresponding code: https://github.com/xiaoshihou514/xingli

The bug

In Part 2, I presented unifyctx as follows:

haskell
unifyctx :: TypeCtx -> TypeCtx -> Maybe (PrincipalPair -> PrincipalPair)
unifyctx ctx1 ctx2 = (liftPP . foldr (.) id) <$> sequence subs
  where
    subs = [unify a b | (x, a) <- Map.toList env1, b <- maybeToList $ Map.lookup x env2]
    env1 = env ctx1
    env2 = env ctx2

This is incorrect.

The problem is that subs is computed by unifying each common-variable pair independently. The substitution produced by unifying one pair is never applied to the remaining pairs before those are unified. This is wrong for the same reason that naively composing independent equations doesn't give you a most-general unifier — later pairs may share type variables with earlier ones, and the substitution from the first unification should propagate before the next one is attempted.

A concrete illustration: suppose both contexts assign types to two shared variables x and y, with:

ctx1: x : A -> B,  y : A
ctx2: x : C -> D,  y : C

The correct procedure is:

  1. Unify A -> B with C -> D, producing s1 = [A ↦ C, B ↦ D].
  2. Apply s1 to the remaining pairs: the y-pair becomes (s1 A, s1 C) = (C, C).
  3. Unify C with C, producing s2 = id.
  4. Compose: s2 ∘ s1.

The old code skips step 2: it unifies A with C independently of what s1 already said about A, producing a separate substitution [A ↦ C] and composing that with s1. In simple cases the result is the same, but in general it is not — you can end up with a substitution that doesn't actually make all common-variable type pairs equal under the composed result.

This bug was identified while formalising the soundness proof of the algorithm in Lean 4. Proving UnifyCtxMapCommon — that the output substitution of unifyCtx equates all types assigned to common variables — required the sequential threading to go through.

The fix

Introduce a recursive helper unifyCtx' that threads substitutions:

haskell
-- Takes a list of (varName, typeFromCtx1, typeFromCtx2) triples
unifyCtx' :: [(Label, CurryType, CurryType)] -> Maybe [CurryType -> CurryType]
unifyCtx' [] = Just []
unifyCtx' ((_, a, b) : rest) = do
    s1    <- unify a b
    -- Apply s1 to all remaining pairs before recursing
    exs'  <- unifyCtx' (map (\(x, ta, tb) -> (x, s1 ta, s1 tb)) rest)
    return (s1 : exs')

unifyctx :: TypeCtx -> TypeCtx -> Maybe (PrincipalPair -> PrincipalPair)
unifyctx ctx1 ctx2 = do
    exs <- unifyCtx' common
    return $ liftPP (foldl (.) id exs)
  where
    common = [ (x, a, b)
             | (x, a) <- Map.toList env1
             , b      <- maybeToList $ Map.lookup x env2
             ]
    env1 = env ctx1
    env2 = env ctx2

The key change is in unifyCtx': after computing s1 = unify a b, we map it over all remaining triples before recursing. This ensures that each subsequent unification sees a type pair that is already consistent with all earlier substitutions, exactly as Robinson's algorithm proceeds.

Note also the switch from foldr to foldl when composing the final list: foldl (.) id [s1, s2, ...] applies s1 first (innermost), then s2, and so on — the natural left-to-right order matching the recursion.

The rest of the algorithm is unchanged.


The overall algorithm (once unifyctx is corrected) has now been formally proved sound in Lean 4. You can find the details in the accompanying paper Part 1, Part 2.