The violation of preconditions is a main cause of incorrect semantics including run time errors. The approach to functions requiring preconditions that is traditionally applied in the Haskell community as described below addresses the problem poorly and suffers from multiple problems. Noonan [@noonan2018] presents a novel technique to design library APIs leveraging the type system to ensure that any preconditions are satisfied. Work currently in progress by researchers and GHC core developers has the potential to make this technique vastly more ergonomic.
The Problem
For example, consider a Haskell library for working with sorted lists. The entire example code from this paper can be found on https://gitlab.com/rdnz/ghosts-proofs-example. The library shall provide a function to merge two lists into a single sorted list if the input lists were already sorted.
-- MergeExpand.hs
-- | Merge two lists into one sorted list
-- if the input lists are already sorted.
-- The first argument defines the ordering.
mergeBy :: (a -> a -> Ordering) -> [a] -> [a] -> Maybe [a]
This is an example of a function requiring a precondition as stated in
its specification. The input lists must already be sorted in the order
defined by the first argument, the comparator. The possibility of a
violated precondition is represented by the Maybe
return
type forcing library users to handle a violated precondition. The
return type is expanded by the value Nothing
because the
function cannot reasonably return a list if the precondition is
violated. That is why I called the file MergeExpand.hs
.
This approach to functions requiring preconditions is common in the
Haskell community. Examples are the lookup
functions in
the base
and containers
package [@hoogle_lookup]. But there are
multiple problems with this approach as will soon become evident.
I will later call the file containing Noonan’s [@noonan2018] novel
approach simply Merge.hs
to distinguish it from the traditional
approach of expanding the return type. But before that, I will present
a first unsuccessful attempt and call that file MergeFirst.hs
.
But let us get back to MergeExpand.hs
for now. This library shall
also provide a function to check mergeBy
’s precondition.
-- MergeExpand.hs
sortedBy :: (a -> a -> Ordering) -> [a] -> Bool
Now consider an example of how this library might be used. The library user might have a data type to represent errors in their application
-- MainExpand.hs
data AppError =
List1Unsorted | List2Unsorted | SomeOtherProblems
deriving Show
and a function validating input lists and potentially returning an
appropriate value of that error data type AppError
.
-- MainExpand.hs
validateLists ::
Integer -> Integer -> Ordering) ->
(Integer] ->
[Integer] ->
[Maybe AppError
=
validateLists comparator list1 list2 case (sortedBy comparator list1, sortedBy comparator list2) of
True, True) -> Nothing
(False, _) -> Just List1Unsorted
(-> Just List2Unsorted _
The AppError
might carry as its payload the index of the
first list element that is out of order to render this information as
an error message later but I have left that out for brevity.
validateLists
could be used as follows.
-- MainExpand.hs
processLists ::
Integer -> Integer -> Ordering) ->
(Integer] ->
[Integer] ->
[Either AppError [Integer]
=
processLists comparator list1 list2 case validateLists comparator list1 list2 of
Just appError -> Left appError
Nothing ->
case mergeBy comparator list1 list2 of
Just result -> Right result
Nothing ->
error "impossible because validateLists guarantees order" -- (1)
example :: String
=
example case processLists descending [3,1] [5,4,2] of
Right result -> show result
Left List1Unsorted -> "The first list needs to be sorted."
Left List2Unsorted -> "The second list needs to be sorted."
Left SomeOtherProblems -> "some other problems"
descending :: Integer -> Integer -> Ordering
descending a1 a2| a1 > a2 = LT
| a1 == a2 = EQ
| otherwise = GT
The approach of expanding the return type of functions that require
preconditions demonstrated in MergeExpand.hs
suffers from
the following problems.
If the
Nothing
value representing a violated precondition needs to be propagated multiple levels up the call tree, there might be multiple preconditions of multiple functions that would be represented by theNothing
value. The information, which specific precondition of which functions was violated, is lost if they are all represented by the same valueNothing
. If you try to repair this by usingEither
return types instead ofMaybe
return types, you will face the problem that simpleEither
-based checked exceptions do not compose well [@parsons_trouble_typed_errors].The argument and return types of
mergeBy
are larger than necessary. That means thatmergeBy
’s type accepts more incorrect implementations than necessary. The more refined your types are, the more the type system helps you avoid bugs [@parsons_type_back_forth].A much worse problem is the performance cost of checking twice that both lists are sorted, once when
validateLists
callssortedBy
, and again whenmergeBy
has to examine if both lists are sorted in order to decide whether to returnNothing
orJust
. I have not shown any implementation ofmergeBy
yet because it does not matter much but here is one to illustrate the need to examine if both lists are sorted.-- MergeExpand.hs mergeBy :: (a -> a -> Ordering) -> [a] -> [a] -> Maybe [a] mergeBy comparator list1 list2| sortedBy comparator list1 && sortedBy comparator list2 = Just (go list1 list2) | otherwise = Nothing where = list1 go list1 [] = list2 go [] list2 : tail1) (head2 : tail2) go (head1 | LT <- comparator head2 head1 = : go (head1 : tail1) tail2 head2 | otherwise = : go tail1 (head2 : tail2) head1
And worst of all, the library user is forced to use
error
in line (1) of the implementation ofprocessLists
in the “impossible” case because they know thatmergeBy
’s precondition is satisfied and therefore cannot reasonably handle a violated precondition. But what ifvalidateLists
were modified in the future to stop checking that the lists are sorted, intentionally or unintentionally? The library user might not remember to updateprocessLists
, and suddenly the “impossible” error becomes a very possible run time error [@king_parse_validate].
A First Attempt at a Solution
The fact that you cannot express propositions and postconditions about
arguments and variables through types, because you cannot refer to
term-level variables on the type level, is the fundamental
obstacle to expressing mergeBy
’s precondition through its
type. Noonan’s [@noonan2018] first key idea is to wrap values into a
newtype
that has got a phantom type variable and is
defined in its own module.
-- NamedFirst.hs
newtype Named name a = Named a
The library module provides functions to wrap and unwrap too.
-- NamedFirst.hs
name :: a -> Named name a
= Named
name
forgetName :: Named name a -> a
Named a) = a forgetName (
This phantom type variable shall then be used to refer to the wrapped term-level value on the type level. In a way, it is providing a type-level name.
An example of that is the following data type in the MergeFirst.hs
library module.
-- MergeFirst.hs
data SortedBy comparator list = SortedBy
It does not contains any information at run time. But its phantom type
variables refer to a comparator’s name and a list’s name. And if a
value of SortedBy comparator list
is in scope, then any
list Named list [Integer]
is guaranteed at compile time to
be sorted according to any comparator Named comparator (Integer -> Integer -> Ordering)
. A value of that type can be
understood as a proof or a witness of this guarantee.
How can a SortedBy
value make such a guarantee? Because
the only way to obtain a SortedBy
value is by calling a
library function like sortedBy
that provides you with it
only after confirming that the referred to list is indeed sorted
according to the referred to comparator. Otherwise, you get
Nothing
.
-- MergeFirst.hs
import qualified MergeExpand as ME
sortedBy ::
Named comparator (a -> a -> Ordering) -> -- (1)
Named list [a] -> -- (2)
Maybe (SortedBy comparator list) -- (3)
=
sortedBy comparator list if ME.sortedBy (forgetName comparator) (forgetName list)
then Just SortedBy
else Nothing
Notice, how the return type’s type variables in line (3) must equal the ones in the argument types in line (1) and (2) respectively.
In a way, we have just expressed sortedBy
’s postcondition
through its type. We can now express mergeBy
’s
precondition through its type.
-- MergeFirst.hs
mergeBy ::
Named comparator (a -> a -> Ordering) ->
Named list1 [a], SortedBy comparator list1) ->
(Named list2 [a], SortedBy comparator list2) ->
(Named result [a], SortedBy comparator result) (
The function can only be called if SortedBy
values for
both input lists are available, which guarantee mergeBy
‘s
precondition. Consequently, mergeBy
need not return a
Maybe
type anymore nor check the lists’ order at run time
anymore either.
-- MergeFirst.hs
=
mergeBy comparator (list1, _proof1) (list2, _proof2) let result = name $ go (forgetName list1) (forgetName list2)
in (result, SortedBy)
where
= list1
go list1 [] = list2
go [] list2 : tail1) (head2 : tail2)
go (head1 | LT <- (forgetName comparator) head2 head1 =
: go (head1 : tail1) tail2
head2 | otherwise =
: go tail1 (head2 : tail2) head1
Taking inspiration from the library functions sortedBy
and
mergeBy
fabricating arbitrary SortedBy
values
from thin air by simply using its data constructor
SortedBy
, you might ask “What stops a library user from
doing the same?”
-- MainFirst.hs
preconditionViolated :: [Integer]
=
preconditionViolated $
forgetName fst $
mergeBy
(name descending)3,1,99], SortedBy)
(name [5,4,2], SortedBy) (name [
The answer is in the library’s module declaration.
-- MergeFirst.hs
module MergeFirst (SortedBy (), mergeBy, sortedBy) where
SortedBy
’s is an abstract data type, its data constructor
is not exported. This restricts the power to fabricate its values from
thin air to the library code, where the data type is defined. Mistakes
in the library will not be caught by the type system. But the user
cannot use the library in an unsafe way because it can only obtain a
SortedBy
value from library functions like
sortedBy
.
Let us see what the library user’s validateLists
and
processLists
look like when using the new
MergeFirst.hs
. The rest of the user code stays the same.
-- MainFirst.hs
validateLists ::
Named comparator (Integer -> Integer -> Ordering) ->
Named list1 [Integer] ->
Named list2 [Integer] ->
Either AppError (SortedBy comparator list1, SortedBy comparator list2)
=
validateLists comparator list1 list2 case (sortedBy comparator list1, sortedBy comparator list2) of
Just proof1, Just proof2) -> Right (proof1, proof2)
(Nothing, _) -> Left List1Unsorted
(-> Left List2Unsorted
_
processLists ::
Integer -> Integer -> Ordering) ->
(Integer] ->
[Integer] ->
[Either AppError [Integer]
=
processLists comparator list1 list2 let
comparatorNamed :: Named comparator (Integer -> Integer -> Ordering)
= name comparator
comparatorNamed list1Named :: Named list1 [Integer]
= name list1
list1Named list2Named :: Named list2 [Integer]
= name list2
list2Named in
case validateLists comparatorNamed list1Named list2Named of -- (1)
Left appError -> Left appError
Right (proof1, proof2) ->
let
=
(result, _proof)
mergeBy comparatorNamed (list1Named, proof1) (list2Named, proof2)in Right (forgetName result)
mergeBy
does not return a Maybe
anymore as
mentioned before. Therefore, the library user is not forced to use
error
in any “impossible” case in the implementation of
processLists
anymore. The user can communicate their
conviction that mergeBy
’s precondition is satisfied to the
library in a way that the library can verify.
At least, that would be the case, if there was not still a big problem
with this first attempt. Look at processLists
again and
imagine, only the first input list was sorted but not the second
one. In this case, there should be no way to apply mergeBy
to the
lists because the necessary SortedBy comparator list2
value cannot be obtained from sortedBy
, which only returns
Nothing
. But what if the library user does not pass the
second unsorted list to validateLists
but the first sorted
list again instead in line (1)?
-- MainFirst.hs
processLists ::
Integer -> Integer -> Ordering) ->
(Integer] ->
[Integer] ->
[Either AppError [Integer]
=
processLists comparator list1 list2 let
comparatorNamed :: Named comparator (Integer -> Integer -> Ordering)
= name comparator
comparatorNamed list1Named :: Named list1 [Integer]
= name list1
list1Named list2Named :: Named list2 [Integer]
= name list2
list2Named in
-- case validateLists comparatorNamed list1Named list2Named of
case validateLists comparatorNamed list1Named list1Named of -- (1)
Left appError -> Left appError
Right (proof1, proof2) ->
let
=
(result, _proof) -- (2)
mergeBy comparatorNamed (list1Named, proof1) (list2Named, proof2)in Right (forgetName result)
proof2
is now of type SortedBy comparator list1
, list2Named
is still of type Named list2 [Integer]
. You might expect a type error from the line
below line (2) considering line (3) of mergeBy
’s type
signature.
-- MergeFirst.hs
mergeBy ::
Named comparator (a -> a -> Ordering) ->
Named list1 [a], SortedBy comparator list1) ->
(Named list2 [a], SortedBy comparator list2) -> -- (3)
(Named result [a], SortedBy comparator result) (
list2
and list1
would need to be the same type
to satisfy the type checker. But there is no type error
unfortunately. The reason is that we are indeed allowed to assume that
the type variables list2
and list1
are the
same type. They are universally quantified and therefore
unify. Indeed, we may assume each of them to be any arbitrary
type. There is no type error just like there is no type error in line
(1) and (2) of the following code for example.
wellTyped :: Int
=
wellTyped let
aList :: [variable1]
= []
aList bList :: [variable2]
= aList -- (1)
bList boolList :: [Bool]
= aList -- (2)
boolList in
length bList
So violating mergeBy
’s precondition is not a type
error. We could accidentally apply the latest version of
processLists
to an unsorted list resulting in unspecified
behavior by MergeFirst.mergeBy
. This is worse than
MergeExpand.hs
.
The Solution
Let us again try to trick the library into evaluating
mergeBy
without its precondition satisfied just as we did
before my manipulating processLists
, this time by
manipulating validateLists
’s implementation. Again, we
imagine only the first input list was sorted but not the second one.
-- MainFirst.hs
validateLists ::
Named comparator (Integer -> Integer -> Ordering) ->
Named list1 [Integer] ->
Named list2 [Integer] ->
Either AppError (SortedBy comparator list1, SortedBy comparator list2)
=
validateLists comparator list1 list2 --case (sortedBy comparator list1, sortedBy comparator list2) of
case (sortedBy comparator list1, sortedBy comparator list1) of -- (1)
Just proof1, Just proof2) -> Right (proof1, proof2)
(Nothing, _) -> Left List1Unsorted
(-> Left List2Unsorted _
We do not apply sortedBy
to list2
but again to
list1
instead in line (1). Similarly to before,
proof2
now is of type SortedBy comparator list1
while expected to be of type SortedBy comparator list2
. But very much unlike before, we now do get the exact
type error that we hoped for.
• Couldn't match type ‘list1’ with ‘list2’
[...]
Expected type: Either AppError
(SortedBy comparator list1, SortedBy comparator list2)
Actual type: Either AppError
(SortedBy comparator list1, SortedBy comparator list1)
Why may we not assume that the type variables list2
and
list1
are the same type anymore? Because they are now part
of the function’s parameter types. Naturally, we may not make any
assumptions about type variables in function parameters because we
have to be able to deal with whatever types the user chooses to apply
that function too. In the implementation of a function length :: [a] -> Integer
, for example, we may not assume that
a
is Bool
because the caller might choose to
apply length
to a list of Char
’s.
So how do we consistently force a library user into the situation of
validateLists
where they cannot make any assumptions about
the Named
type variables?
-- Named.hs
{-# language RankNTypes #-}
module Named (Named (), name, forgetName) where
name ::
->
a forall name. Named name a -> result) ->
(
result= continuation (Named a) name a continuation
The answer is that whenever the library user wants to use a
Named
value, they have to write a function because the
library function name
does not return Named
values anymore. It does not offer “Give me an a
and I will
return you a Named name a
” anymore. It now offers “Give me
an a
and tell me what you would do with a Named name a
. I will do that for you and return you the result of
that.” This is also called “continuation-passing style”. That is
Noonan’s [@noonan2018] most important key idea.
Let us see what the library user’s processLists
look like
when using the library adjusted to the new Named.hs
. The rest of the
user code stays the same including validateLists
. That is
unsurprising because we have just determined that
validateLists
is exactly the kind of code we want to force
the library user to write all the time.
-- Main.hs
processLists ::
Integer -> Integer -> Ordering) ->
(Integer] ->
[Integer] ->
[Either AppError [Integer]
=
processLists comparator list1 list2 $ \comparatorNamed ->
name comparator $ \list1Named ->
name list1 $ \list2Named ->
name list2 case validateLists comparatorNamed list1Named list2Named of
Left appError -> Left appError
Right (proof1, proof2) ->
$
mergeBy comparatorNamed (list1Named, proof1) (list2Named, proof2) ->
\(result, _proof) Right (forgetName result)
Here is MainFirst.processLists
from our previous
unsuccessful first attempt again in case you want to compare them.
-- MainFirst.hs
processLists ::
Integer -> Integer -> Ordering) ->
(Integer] ->
[Integer] ->
[Either AppError [Integer]
=
processLists comparator list1 list2 let
comparatorNamed :: Named comparator (Integer -> Integer -> Ordering)
= name comparator
comparatorNamed list1Named :: Named list1 [Integer]
= name list1
list1Named list2Named :: Named list2 [Integer]
= name list2
list2Named in
case validateLists comparatorNamed list1Named list2Named of
Left appError -> Left appError
Right (proof1, proof2) ->
let
=
(result, _proof)
mergeBy comparatorNamed (list1Named, proof1) (list2Named, proof2)in Right (forgetName result)
New Developments
In summary, you have got a way of referring to term-level variables on the type level. This allows you to express function’s preconditions and postconditions through their type. This technique can easily be applied to many other libraries.
But there is the drawback of making libraries less ergonomic and harder to understand.
-- Named.hs
{-# language RankNTypes #-}
name ::
->
a forall name. Named name a -> result) ->
( result
name
and its type is somewhat alien and complicated. And
it is somewhat unergonomic to use as you can see in the lines below
line (1) in Main.processLists
.
-- Main.hs
processLists ::
Integer -> Integer -> Ordering) ->
(Integer] ->
[Integer] ->
[Either AppError [Integer]
=
processLists comparator list1 list2 -- (1)
$ \comparatorNamed ->
name comparator $ \list1Named ->
name list1 $ \list2Named ->
name list2 case validateLists comparatorNamed list1Named list2Named of
Left appError -> Left appError
Right (proof1, proof2) ->
-- (2)
$
mergeBy comparatorNamed (list1Named, proof1) (list2Named, proof2) ->
\(result, _proof) Right (forgetName result)
But it does not stop there. Other library functions or even user
functions might become infected with this continuation-passing style
too as you can see in the lines below line (2) where
mergeBy
is used. This happens when functions return new
Named
values. mergeBy
needs to wrap its result
into Named
in order to express its postcondition that its
result is sorted.
-- Merge.hs
{-# language RankNTypes #-}
mergeBy ::
Named comparator (a -> a -> Ordering) ->
Named list1 [a], SortedBy comparator list1) ->
(Named list2 [a], SortedBy comparator list2) ->
(forall result. (Named result [a], SortedBy comparator result) -> r) ->
(
r=
mergeBy comparator (list1, _proof1) (list2, _proof2) continuation $ \result ->
name (go (forgetName list1) (forgetName list2)) SortedBy)
continuation (result, where
= list1
go list1 [] = list2
go [] list2 : tail1) (head2 : tail2)
go (head1 | LT <- (forgetName comparator) head2 head1 =
: go (head1 : tail1) tail2
head2 | otherwise =
: go tail1 (head2 : tail2) head1
GHC core developer Richard Eisenberg has recently revealed [@eisenberg_compositional] that he is working on a keyword to declare about type variables that no assumptions may be made about them, getting rid of the continuation-passing style and their ergonomic overhead entirely.
He is working on a paper on lightweight existential types with some collaborators, which he hopes to submit for ICFP 2021 by 2021-03-02 [@eisenberg_berlin].
This work is not motivated by Ghosts of Departed Proofs but interestingly by the other two approaches to formal verification of Dependent Haskell and LiquidHaskell [@eisenberg_tweag].
He thinks we will have a way of getting an exists
keyword that
syntactically works like forall
[@eisenberg_tweag] and
hopes to have a new extension to GHC roughly by May
[@eisenberg_berlin]. name
’s type might then look as
straightforward as follows again.
name :: exists name. a -> Named name a