a trick from Matt Noonan's Ghosts of Departed Proofs

Posted on February 1, 2021

The violation of preconditions is a main cause of incorrect semantics including run time errors. The approach to functions requiring preconditions that is traditionally applied in the Haskell community as described below addresses the problem poorly and suffers from multiple problems. Noonan [@noonan2018] presents a novel technique to design library APIs leveraging the type system to ensure that any preconditions are satisfied. Work currently in progress by researchers and GHC core developers has the potential to make this technique vastly more ergonomic.

The Problem

For example, consider a Haskell library for working with sorted lists. The entire example code from this paper can be found on https://gitlab.com/rdnz/ghosts-proofs-example. The library shall provide a function to merge two lists into a single sorted list if the input lists were already sorted.

-- MergeExpand.hs

-- | Merge two lists into one sorted list
-- if the input lists are already sorted.
-- The first argument defines the ordering.
mergeBy :: (a -> a -> Ordering) -> [a] -> [a] -> Maybe [a]

This is an example of a function requiring a precondition as stated in its specification. The input lists must already be sorted in the order defined by the first argument, the comparator. The possibility of a violated precondition is represented by the Maybe return type forcing library users to handle a violated precondition. The return type is expanded by the value Nothing because the function cannot reasonably return a list if the precondition is violated. That is why I called the file MergeExpand.hs.

This approach to functions requiring preconditions is common in the Haskell community. Examples are the lookup functions in the base and containers package [@hoogle_lookup]. But there are multiple problems with this approach as will soon become evident.

I will later call the file containing Noonan’s [@noonan2018] novel approach simply Merge.hs to distinguish it from the traditional approach of expanding the return type. But before that, I will present a first unsuccessful attempt and call that file MergeFirst.hs.

But let us get back to MergeExpand.hs for now. This library shall also provide a function to check mergeBy’s precondition.

-- MergeExpand.hs
sortedBy :: (a -> a -> Ordering) -> [a] -> Bool

Now consider an example of how this library might be used. The library user might have a data type to represent errors in their application

-- MainExpand.hs
data AppError =
  List1Unsorted | List2Unsorted | SomeOtherProblems
  deriving Show

and a function validating input lists and potentially returning an appropriate value of that error data type AppError.

-- MainExpand.hs
validateLists ::
  (Integer -> Integer -> Ordering) ->
  [Integer]                        ->
  [Integer]                        ->
  Maybe AppError
validateLists comparator list1 list2 =
  case (sortedBy comparator list1, sortedBy comparator list2) of
    (True, True) -> Nothing
    (False, _)   -> Just List1Unsorted
    _            -> Just List2Unsorted

The AppError might carry as its payload the index of the first list element that is out of order to render this information as an error message later but I have left that out for brevity.

validateLists could be used as follows.

-- MainExpand.hs
processLists ::
  (Integer -> Integer -> Ordering) ->
  [Integer]                        ->
  [Integer]                        ->
  Either AppError [Integer]
processLists comparator list1 list2 =
  case validateLists comparator list1 list2 of
    Just appError -> Left appError
    Nothing       ->
      case mergeBy comparator list1 list2 of
        Just result -> Right result
        Nothing     ->
          error "impossible because validateLists guarantees order" -- (1)

example :: String
example =
  case processLists descending [3,1] [5,4,2] of
    Right result           -> show result
    Left List1Unsorted     -> "The first list needs to be sorted."
    Left List2Unsorted     -> "The second list needs to be sorted."
    Left SomeOtherProblems -> "some other problems"

descending :: Integer -> Integer -> Ordering
descending a1 a2
  | a1 > a2   = LT
  | a1 == a2  = EQ
  | otherwise = GT

The approach of expanding the return type of functions that require preconditions demonstrated in MergeExpand.hs suffers from the following problems.

  • If the Nothing value representing a violated precondition needs to be propagated multiple levels up the call tree, there might be multiple preconditions of multiple functions that would be represented by the Nothing value. The information, which specific precondition of which functions was violated, is lost if they are all represented by the same value Nothing. If you try to repair this by using Either return types instead of Maybe return types, you will face the problem that simple Either-based checked exceptions do not compose well [@parsons_trouble_typed_errors].

  • The argument and return types of mergeBy are larger than necessary. That means that mergeBy’s type accepts more incorrect implementations than necessary. The more refined your types are, the more the type system helps you avoid bugs [@parsons_type_back_forth].

  • A much worse problem is the performance cost of checking twice that both lists are sorted, once when validateLists calls sortedBy, and again when mergeBy has to examine if both lists are sorted in order to decide whether to return Nothing or Just. I have not shown any implementation of mergeBy yet because it does not matter much but here is one to illustrate the need to examine if both lists are sorted.

    -- MergeExpand.hs
    mergeBy :: (a -> a -> Ordering) -> [a] -> [a] -> Maybe [a]
    mergeBy comparator list1 list2
      | sortedBy comparator list1 && sortedBy comparator list2 =
        Just (go list1 list2)
      | otherwise = Nothing
      where
        go list1 [] = list1
        go [] list2 = list2
        go (head1 : tail1) (head2 : tail2)
          | LT <- comparator head2 head1 =
            head2 : go (head1 : tail1) tail2
          | otherwise =
            head1 : go tail1 (head2 : tail2)
  • And worst of all, the library user is forced to use error in line (1) of the implementation of processLists in the “impossible” case because they know that mergeBy’s precondition is satisfied and therefore cannot reasonably handle a violated precondition. But what if validateLists were modified in the future to stop checking that the lists are sorted, intentionally or unintentionally? The library user might not remember to update processLists, and suddenly the “impossible” error becomes a very possible run time error [@king_parse_validate].

A First Attempt at a Solution

The fact that you cannot express propositions and postconditions about arguments and variables through types, because you cannot refer to term-level variables on the type level, is the fundamental obstacle to expressing mergeBy’s precondition through its type. Noonan’s [@noonan2018] first key idea is to wrap values into a newtype that has got a phantom type variable and is defined in its own module.

-- NamedFirst.hs
newtype Named name a = Named a

The library module provides functions to wrap and unwrap too.

-- NamedFirst.hs
name :: a -> Named name a
name = Named

forgetName :: Named name a -> a
forgetName (Named a) = a

This phantom type variable shall then be used to refer to the wrapped term-level value on the type level. In a way, it is providing a type-level name.

An example of that is the following data type in the MergeFirst.hs library module.

-- MergeFirst.hs
data SortedBy comparator list = SortedBy

It does not contains any information at run time. But its phantom type variables refer to a comparator’s name and a list’s name. And if a value of SortedBy comparator list is in scope, then any list Named list [Integer] is guaranteed at compile time to be sorted according to any comparator Named comparator (Integer -> Integer -> Ordering). A value of that type can be understood as a proof or a witness of this guarantee.

How can a SortedBy value make such a guarantee? Because the only way to obtain a SortedBy value is by calling a library function like sortedBy that provides you with it only after confirming that the referred to list is indeed sorted according to the referred to comparator. Otherwise, you get Nothing.

-- MergeFirst.hs
import qualified MergeExpand as ME
sortedBy ::
  Named comparator (a -> a -> Ordering) -> -- (1)
  Named list       [a]                  -> -- (2)
  Maybe (SortedBy comparator list)         -- (3)
sortedBy comparator list =
  if ME.sortedBy (forgetName comparator) (forgetName list)
  then Just SortedBy
  else Nothing

Notice, how the return type’s type variables in line (3) must equal the ones in the argument types in line (1) and (2) respectively.

In a way, we have just expressed sortedBy’s postcondition through its type. We can now express mergeBy’s precondition through its type.

-- MergeFirst.hs
mergeBy ::
  Named comparator (a -> a -> Ordering)         ->
  (Named list1  [a], SortedBy comparator list1) ->
  (Named list2  [a], SortedBy comparator list2) ->
  (Named result [a], SortedBy comparator result)

The function can only be called if SortedBy values for both input lists are available, which guarantee mergeBy‘s precondition. Consequently, mergeBy need not return a Maybe type anymore nor check the lists’ order at run time anymore either.

-- MergeFirst.hs
mergeBy comparator (list1, _proof1) (list2, _proof2) =
  let result = name $ go (forgetName list1) (forgetName list2)
    in (result, SortedBy)
  where
    go list1 [] = list1
    go [] list2 = list2
    go (head1 : tail1) (head2 : tail2)
      | LT <- (forgetName comparator) head2 head1 =
        head2 : go (head1 : tail1) tail2
      | otherwise =
        head1 : go tail1 (head2 : tail2)

Taking inspiration from the library functions sortedBy and mergeBy fabricating arbitrary SortedBy values from thin air by simply using its data constructor SortedBy, you might ask “What stops a library user from doing the same?”

-- MainFirst.hs
preconditionViolated :: [Integer]
preconditionViolated =
  forgetName $
  fst $
  mergeBy
    (name descending)
    (name [3,1,99], SortedBy)
    (name [5,4,2],  SortedBy)

The answer is in the library’s module declaration.

-- MergeFirst.hs
module MergeFirst (SortedBy (), mergeBy, sortedBy) where

SortedBy’s is an abstract data type, its data constructor is not exported. This restricts the power to fabricate its values from thin air to the library code, where the data type is defined. Mistakes in the library will not be caught by the type system. But the user cannot use the library in an unsafe way because it can only obtain a SortedBy value from library functions like sortedBy.

Let us see what the library user’s validateLists and processLists look like when using the new MergeFirst.hs. The rest of the user code stays the same.

-- MainFirst.hs
validateLists ::
  Named comparator (Integer -> Integer -> Ordering) ->
  Named list1      [Integer]                        ->
  Named list2      [Integer]                        ->
  Either AppError (SortedBy comparator list1, SortedBy comparator list2)
validateLists comparator list1 list2 =
  case (sortedBy comparator list1, sortedBy comparator list2) of
    (Just proof1, Just proof2) -> Right (proof1, proof2)
    (Nothing, _)               -> Left List1Unsorted
    _                          -> Left List2Unsorted

processLists ::
  (Integer -> Integer -> Ordering) ->
  [Integer]                        ->
  [Integer]                        ->
  Either AppError [Integer]
processLists comparator list1 list2 =
  let
    comparatorNamed :: Named comparator (Integer -> Integer -> Ordering)
    comparatorNamed = name comparator
    list1Named      :: Named list1 [Integer]
    list1Named      = name list1
    list2Named      :: Named list2 [Integer]
    list2Named      = name list2
  in
    case validateLists comparatorNamed list1Named list2Named of -- (1)
      Left appError          -> Left appError
      Right (proof1, proof2) ->
        let
          (result, _proof) =
            mergeBy comparatorNamed (list1Named, proof1) (list2Named, proof2)
        in Right (forgetName result)

mergeBy does not return a Maybe anymore as mentioned before. Therefore, the library user is not forced to use error in any “impossible” case in the implementation of processLists anymore. The user can communicate their conviction that mergeBy’s precondition is satisfied to the library in a way that the library can verify.

At least, that would be the case, if there was not still a big problem with this first attempt. Look at processLists again and imagine, only the first input list was sorted but not the second one. In this case, there should be no way to apply mergeBy to the lists because the necessary SortedBy comparator list2 value cannot be obtained from sortedBy, which only returns Nothing. But what if the library user does not pass the second unsorted list to validateLists but the first sorted list again instead in line (1)?

-- MainFirst.hs
processLists ::
  (Integer -> Integer -> Ordering) ->
  [Integer]                        ->
  [Integer]                        ->
  Either AppError [Integer]
processLists comparator list1 list2 =
  let
    comparatorNamed :: Named comparator (Integer -> Integer -> Ordering)
    comparatorNamed = name comparator
    list1Named      :: Named list1 [Integer]
    list1Named      = name list1
    list2Named      :: Named list2 [Integer]
    list2Named      = name list2
  in
--  case validateLists comparatorNamed list1Named list2Named of
    case validateLists comparatorNamed list1Named list1Named of -- (1)
      Left appError          -> Left appError
      Right (proof1, proof2) ->
        let
          (result, _proof) =
            -- (2)
            mergeBy comparatorNamed (list1Named, proof1) (list2Named, proof2)
        in Right (forgetName result)

proof2 is now of type SortedBy comparator list1, list2Named is still of type Named list2 [Integer]. You might expect a type error from the line below line (2) considering line (3) of mergeBy’s type signature.

-- MergeFirst.hs
mergeBy ::
  Named comparator (a -> a -> Ordering)         ->
  (Named list1  [a], SortedBy comparator list1) ->
  (Named list2  [a], SortedBy comparator list2) -> -- (3)
  (Named result [a], SortedBy comparator result)

list2 and list1 would need to be the same type to satisfy the type checker. But there is no type error unfortunately. The reason is that we are indeed allowed to assume that the type variables list2 and list1 are the same type. They are universally quantified and therefore unify. Indeed, we may assume each of them to be any arbitrary type. There is no type error just like there is no type error in line (1) and (2) of the following code for example.

wellTyped :: Int
wellTyped =
  let
    aList    :: [variable1]
    aList    = []
    bList    :: [variable2]
    bList    = aList -- (1)
    boolList :: [Bool]
    boolList = aList -- (2)
  in
    length bList

So violating mergeBy’s precondition is not a type error. We could accidentally apply the latest version of processLists to an unsorted list resulting in unspecified behavior by MergeFirst.mergeBy. This is worse than MergeExpand.hs.

The Solution

Let us again try to trick the library into evaluating mergeBy without its precondition satisfied just as we did before my manipulating processLists, this time by manipulating validateLists’s implementation. Again, we imagine only the first input list was sorted but not the second one.

-- MainFirst.hs
validateLists ::
  Named comparator (Integer -> Integer -> Ordering) ->
  Named list1      [Integer]                        ->
  Named list2      [Integer]                        ->
  Either AppError (SortedBy comparator list1, SortedBy comparator list2)
validateLists comparator list1 list2 =
--case (sortedBy comparator list1, sortedBy comparator list2) of
  case (sortedBy comparator list1, sortedBy comparator list1) of -- (1)
    (Just proof1, Just proof2) -> Right (proof1, proof2)
    (Nothing, _)               -> Left List1Unsorted
    _                          -> Left List2Unsorted

We do not apply sortedBy to list2 but again to list1 instead in line (1). Similarly to before, proof2 now is of type SortedBy comparator list1 while expected to be of type SortedBy comparator list2. But very much unlike before, we now do get the exact type error that we hoped for.

• Couldn't match type ‘list1’ with ‘list2’
[...]
Expected type: Either AppError
                 (SortedBy comparator list1, SortedBy comparator list2)
  Actual type: Either AppError
                 (SortedBy comparator list1, SortedBy comparator list1)

Why may we not assume that the type variables list2 and list1 are the same type anymore? Because they are now part of the function’s parameter types. Naturally, we may not make any assumptions about type variables in function parameters because we have to be able to deal with whatever types the user chooses to apply that function too. In the implementation of a function length :: [a] -> Integer, for example, we may not assume that a is Bool because the caller might choose to apply length to a list of Char’s.

So how do we consistently force a library user into the situation of validateLists where they cannot make any assumptions about the Named type variables?

-- Named.hs
{-# language RankNTypes #-}
module Named (Named (), name, forgetName) where
name ::
  a ->
  (forall name. Named name a -> result) ->
  result
name a continuation = continuation (Named a)

The answer is that whenever the library user wants to use a Named value, they have to write a function because the library function name does not return Named values anymore. It does not offer “Give me an a and I will return you a Named name a” anymore. It now offers “Give me an a and tell me what you would do with a Named name a. I will do that for you and return you the result of that.” This is also called “continuation-passing style”. That is Noonan’s [@noonan2018] most important key idea.

Let us see what the library user’s processLists look like when using the library adjusted to the new Named.hs. The rest of the user code stays the same including validateLists. That is unsurprising because we have just determined that validateLists is exactly the kind of code we want to force the library user to write all the time.

-- Main.hs
processLists ::
  (Integer -> Integer -> Ordering) ->
  [Integer]                        ->
  [Integer]                        ->
  Either AppError [Integer]
processLists comparator list1 list2 =
  name comparator $ \comparatorNamed ->
  name list1      $ \list1Named      ->
  name list2      $ \list2Named      ->
  case validateLists comparatorNamed list1Named list2Named of
    Left appError          -> Left appError
    Right (proof1, proof2) ->
      mergeBy comparatorNamed (list1Named, proof1) (list2Named, proof2) $
        \(result, _proof) ->
        Right (forgetName result)

Here is MainFirst.processLists from our previous unsuccessful first attempt again in case you want to compare them.

-- MainFirst.hs
processLists ::
  (Integer -> Integer -> Ordering) ->
  [Integer]                        ->
  [Integer]                        ->
  Either AppError [Integer]
processLists comparator list1 list2 =
  let
    comparatorNamed :: Named comparator (Integer -> Integer -> Ordering)
    comparatorNamed = name comparator
    list1Named      :: Named list1 [Integer]
    list1Named      = name list1
    list2Named      :: Named list2 [Integer]
    list2Named      = name list2
  in
    case validateLists comparatorNamed list1Named list2Named of
      Left appError          -> Left appError
      Right (proof1, proof2) ->
        let
          (result, _proof) =
            mergeBy comparatorNamed (list1Named, proof1) (list2Named, proof2)
        in Right (forgetName result)

New Developments

In summary, you have got a way of referring to term-level variables on the type level. This allows you to express function’s preconditions and postconditions through their type. This technique can easily be applied to many other libraries.

But there is the drawback of making libraries less ergonomic and harder to understand.

-- Named.hs
{-# language RankNTypes #-}
name ::
  a ->
  (forall name. Named name a -> result) ->
  result

name and its type is somewhat alien and complicated. And it is somewhat unergonomic to use as you can see in the lines below line (1) in Main.processLists.

-- Main.hs
processLists ::
  (Integer -> Integer -> Ordering) ->
  [Integer]                        ->
  [Integer]                        ->
  Either AppError [Integer]
processLists comparator list1 list2 =
  -- (1)
  name comparator $ \comparatorNamed ->
  name list1      $ \list1Named      ->
  name list2      $ \list2Named      ->
  case validateLists comparatorNamed list1Named list2Named of
    Left appError          -> Left appError
    Right (proof1, proof2) ->
      -- (2)
      mergeBy comparatorNamed (list1Named, proof1) (list2Named, proof2) $
        \(result, _proof) ->
        Right (forgetName result)

But it does not stop there. Other library functions or even user functions might become infected with this continuation-passing style too as you can see in the lines below line (2) where mergeBy is used. This happens when functions return new Named values. mergeBy needs to wrap its result into Named in order to express its postcondition that its result is sorted.

-- Merge.hs
{-# language RankNTypes #-}
mergeBy ::
  Named comparator (a -> a -> Ordering)                                ->
  (Named list1 [a], SortedBy comparator list1)                         ->
  (Named list2 [a], SortedBy comparator list2)                         ->
  (forall result. (Named result [a], SortedBy comparator result) -> r) ->
  r
mergeBy comparator (list1, _proof1) (list2, _proof2) continuation =
  name (go (forgetName list1) (forgetName list2)) $ \result ->
    continuation (result, SortedBy)
  where
    go list1 [] = list1
    go [] list2 = list2
    go (head1 : tail1) (head2 : tail2)
      | LT <- (forgetName comparator) head2 head1 =
        head2 : go (head1 : tail1) tail2
      | otherwise =
        head1 : go tail1 (head2 : tail2)

GHC core developer Richard Eisenberg has recently revealed [@eisenberg_compositional] that he is working on a keyword to declare about type variables that no assumptions may be made about them, getting rid of the continuation-passing style and their ergonomic overhead entirely.

He is working on a paper on lightweight existential types with some collaborators, which he hopes to submit for ICFP 2021 by 2021-03-02 [@eisenberg_berlin].

This work is not motivated by Ghosts of Departed Proofs but interestingly by the other two approaches to formal verification of Dependent Haskell and LiquidHaskell [@eisenberg_tweag].

He thinks we will have a way of getting an exists keyword that syntactically works like forall [@eisenberg_tweag] and hopes to have a new extension to GHC roughly by May [@eisenberg_berlin]. name’s type might then look as straightforward as follows again.

name :: exists name. a -> Named name a

References