A Language Agnostic Introduction to Dependent Types
Image Attribution:
Roulex 45. Élévation du pont de Shibapo May 7th, 2010. CC-BY-SA 3.0 License (NOT a CC-BY 4.0 License).
August 16, 2021
Miscellaneous
EDIT (2021-10-05): Tom Smeding has kindly pointed out several errors in a few
of my examples. In particular I did not include additional IsSanitized(s)
and
Sanitized(s)
arguments to Some
and None
in checkIfStringIsSafe
. I also
did not actually end up using the recursiveEquality
value in
addingZeroOnRightDoesNothing
. Those errors have now been fixed. Many thanks to
Tom! I have separately also added an additional simple example of proving a
contradiction and elaborated somewhat on my string sanitization example.
“Dependent types” seem to be one of those things a lot of programmers have heard about, but don’t really know much about it. Discussion forums talk about how you can prove things with dependent types and how they’re so so awesome for program correctness, but the discussions often are very confused with people talking past each other. Unfortunately most introductions to dependent types seem to assume a deep familiarity with functional programming or to assume a certain amount of familiarity with programming language theory.
I personally think dependent types are a fantastic feature I would like to see many more programming languages implement, rather than the paltry few that currently exist, but I also am quite skeptical of how dependent types are often used today in the few languages that do implement them and would like to suggest new ways of using them. However, I will not be talking about the latter in this post and will mainly be focusing on describing dependent types as they are found today.
Here I present an introduction to dependent types that is hopefully accessible for anyone who has worked with a statically typed language before that has some sort of a switch or case matching sort of functionality. I’ll go over some of the basics of how to use dependent types to prove things. This will be very heavy on pseudocode examples and very light on theory. Once you have an intuition for how the examples work, I believe you will be better equipped to understand other more theoretical introductions to dependent types.
Because this is targetted at people who have no experience with dependent types the pace will be slow and potentially have some repetition.
To those who are familiar with dependent types, it is worth mentioning that this introduction focuses heavily on dependent pattern matching as its primary method of interaction with the typechecker rather than eliminators. While the latter is theoretically cleaner, the former is often more familiar to software developers unfamiliar with CS and logic theory. It is also the most common presentation among existing dependently-typed languages.
As mentioned, this post will present common uses of dependent types uncritically. This does not represent a personal endorsement of some of the patterns I’ll be presenting here. In a follow-up post I will detail what common patterns in dependently-typed programming I believe to entail the wrong set of trade-offs when it comes to production code in most businesses. In particular I believe that a proof-heavy way of using dependent types is usually not the best way to leverage dependent types and that this is even more aggravated when the proofs are integrated into the data structure (such as the common example of length-indexed vectors and type-safe append).
However, that will all come later. For now I’ll be presenting aspects of dependent types as they exist “in the wild” mostly divorced of my own thoughts about them.
This post is long and is structured into several parts:
- Introduction to our pseudo-code : Our pseudo-code will consist of three
main concepts: type constructors, pattern matching, and functions. Dependent
types derive their power from a special type of types called
Type
that allows for more flexible type constructors and additional information passed to the type checker while pattern matching. - Using the Type type: Dependent types let you work with types just the
same as you would any other value. This lets you recreate stuff like generics
without any built-in support for them. Note that in the interests of
accessibility we will not be addressing the issue of “type-in-type” (i.e.
what is the type of
Type
?). - Creating dependent types: We can use this increased flexibility to create dependent types: types that depend on runtime values.
- Using pattern matching to narrow down runtime values: Our static typechecker needs to be able to statically narrow down runtime values to actually work. To do that we will use pattern matching.
- Determining when types are equal: A large part of typechecking dependently typed code is proving two things are equal.
- Using type equality to prove runtime properties about our code: Boolean edition: Using the mechanisms that dependent types give us we go over some examples about how to prove properties of code using booleans.
- Using type equality to prove runtime properties about our code: NaturalNumber edition: We do the same thing but now with natural numbers.
- Dependent types in data structures: We create data structures that intertwine dependent types with their underlying data.
- Using dependent types as a way of tagging data: We use dependent types in a very lightweight manner to just tag data with properties.
- Why do dependently typed languages tend to be pure functional languages?: We close out by talking about why dependently typed languages tend to be pure functional languages.
Introduction to our pseudo-code
We will be working with some pseudocode inspired by Typescript, Java, Kotlin, and Scala. Note that none of these languages have full-blown dependent types of the sort I’ll be presenting here although Typescript and Scala have some small fragments.
The pseudocode looks something like this:
// This is a comment
/*
* This is a block comment
*/
/*
* This is how we create a new type
*
* Type constructors are the atomic "building blocks" of our programs. Most
* non-function expressions ultimately evaluates to one of these constructors of
* some type.
*
* We can create the usual primitive types this way, such as `Boolean`. We'll
* show some other examples later as well.
*/
type Boolean constructors {
// The ": Boolean" may look redundant right now, but it will come in handy
// later when we have more complicated constructors
True: Boolean
False: Boolean
}
function not(x: Boolean): Boolean = {
// This is an example of pattern matching
case x of {
True => False
False => True
}
}
// The last line of an expression is the return value of the entire expression
function notNot(x: Boolean): Boolean = {
val oneNot = not(x)
// This last line is the return value
not(oneNot)
}
// Here's how we declare variables
// Once assigned they're immutable, we'll take more later about why this is
// important
val thisIsFalse = not(True)
type Shape constructors {
Rectangle(isSquare: Boolean): Shape
Circle: Shape
Triangle(isEquilateral: Boolean): Shape
}
val square: Shape = Rectangle(True)
function isPointy(shape: Shape): Boolean = {
case shape of {
// We can bind new variables in pattern matches although in this case
// it's unused
Rectangle(unused) => True
Circle => False
Triangle(alsoUnused) => True
}
}
// isPointy(Circle) is False
// isPointy(Triangle(True)) is True
// isPointy(Triangle(False)) is also True
function isRegularPolygon(shape: Shape): Boolean = {
case shape of {
// Another example of variable binding in pattern matches
Rectangle(isThisASquare) => isThisASquare
Circle => True
Triangle(areWeEquilateral) => areWeEquilateral
}
}
// isRegularPolygon(Rectangle(False)) is False
// isRegularPoylgon(Triangle(True)) is True
So far so good. This is a pretty bog-standard cross of languages such as Java, Kotlin, Scala, Javascript, etc. It turns out that all the tools we have here are actually enough to do basic programming. You can’t interact with the outside world since we haven’t provided any primitives to do so (such as printing to your console). However it turns out just with pattern matching and the ability to create types with custom constructors, we can do all the logical bits of computing you’d normally do (e.g. arithmetic, business logic, etc.).
We can build up data with the custom constructors we’ve defined, then pattern matching lets us break down the custom types we create and conditionally dispatch logic on them.
Here’s some fancier type examples. We can have recursive types such as the following.
type NaturalNumber constructors {
Zero: NaturalNumber
Successor(x: NaturalNumber): NaturalNumber
}
val zero: NaturalNumber = Zero
val one: NaturalNumber = Successor(Zero)
val two: NaturalNumber = Successor(Successor(Zero))
// We have recursion in functions as well!
function add(x: NaturalNumber, y: NaturalNumber): NaturalNumber = {
case x of {
Zero => y
Successor(prev) => Successor(add(prev, y))
}
}
// We can also recurse on both arguments
// True if x is greater than y, False otherwise
function greaterThan(x: NaturalNumber, y: NaturalNumber): Boolean = {
case x of {
Zero => False
Successor(prevX) => case y of {
Zero => True
Successor(prevY) => greaterThan(prevX, prevY)
}
}
}
Using the Type type
We can also have the equivalent of generics in other languages, however we’re going to do this in a little different manner.
Here’s the first semi-exotic thing that you might not have seen in other
statically typed languages. We have a special “type of types” called Type
.
Any expression that is a Type
can be used in a type signature.
This means that we don’t have separate generic arguments to a type denoted by
angle or square brackets such as List<Int>
or List[Int]
that you might find
in other languages. We just use the same round parentheses as usual. This is a
little wordier than the usual treatment of generics, but that’s because I’m
intentionally showing you all the machinery and not hiding things behind various
inference mechanisms.
// This is similar to an Option<T>, Option[T], or Maybe t type you might find in
// other languages
// Notice that `t: Type` is just a normal argument like any other!
type Option(t: Type) constructors {
// Because x is a Type it can be used in the type signature of value
Some(x: Type, value: x): Option(x)
None(x: Type): Option(x)
}
// The Some(Boolean, ...) is explicitly passing the Boolean argument where this
// is often inferred in other languages with generics, e.g. Some<Boolean>(...)
// or Some[Boolean](...) which could be written as Some(...)
val someBoolean: Option(Boolean) = Some(Boolean, True)
val noBooleanFound: Option(Boolean) = None(Boolean)
type List(t: Type) constructors {
Nil(x: Type): List(x)
Cons(x: Type, element: x, rest: List(x)): List(x)
}
val listOfThreeElements: List(Boolean) =
Cons(Boolean, True, Cons(Boolean, False, Cons(Boolean, True, Nil(Boolean))))
Now that we have the ability to create a Type
explicitly, we can create
Type
s using all sorts of expression. Indeed, the ability for expressions to
evaluate to Type
s themselves creates a lot of interesting other possibilities.
// We get type synonyms for free
// Note that choosing to capitalize the first letter of this function is just a
// convention for expressions that return a Type. There's no special mechanism
// here other than simple naming
function SynonymForBoolean(): Type = Boolean
// Remember, anything that evaluates to something of type Type can be used in a
// type signature and SynonymForBoolean returns a value of type Type!
val someBool: SomeSynonymForBoolean() = True
Creating dependent types
But an explicit Type
by itself isn’t a dependent type quite yet. Here’s our
first true dependent type, that is a Type
that depends on a runtime value.
// Again note that capitalizing the first letter of this function is just
// convention, nothing more
// Notice that unlike List which took an argument of type Type, here we have an
// argument boolean which is the normal runtime type Boolean
function ShapeOrNumber(boolean: Boolean): Type = {
case boolean of {
True => NaturalNumber
False => Shape
}
}
val circle: ShapeOrNumber(False) = Circle
val anotherZero: ShapeOrNumber(True) = Zero
Whew that’s kind of trippy isn’t it? You see runtime values in the form of literals in a type signature! But of course the question arises, yeah this is all and well when we have type constructor literals such as False or True, but what happens when you don’t have a literal?
Using pattern matching to narrow down runtime values
Let’s take a look at perhaps the simplest “hail Mary” approach where we just try to stick in an abstract variable and see what happens.
function useAtRuntime(boolean: Boolean): NaturalNumber = {
// Does this type check?
// The answer is no, we'll go into a bit more later, but basically the
// compiler can't evaluate abstract variables so it can't figure out what
// type ShapeOrNumber(boolean) evaluates to, which is just as well since
// this can't possibly work if boolean is False
val thisDoesntCompile: ShapeOrNumber(boolean) = one
thisDoesntCompile
}
So there’s no magic here that can make this work. At a first glance this seems to severely cripple dependent types. After all the majority of runtime values are not going to be literals. And in fact we are pretty screwed if we have arbitrary runtime values. But we have a way of narrowing down what the value of a runtime value can be so that it’s not arbitrary anymore: pattern matching!
Every pattern match gives the compiler additional information about the runtime
value of a variable in that branch. Then in the branches of your pattern match
you can recover literals that you pass to ShapeOrNumber
.
function useAtRuntimeTryAgain(boolean: Boolean): NaturalNumber = {
case boolean of {
True => {
// The typechecker now knows in this branch boolean is True
//
// Notice that we're reusing boolean here! In a lot of statically
// typed FP languages such as Haskell, ML, Elm, etc. the best you
// could do is do ShapeOrNumber(True), where we are forced to repeat
// the True even though we know that boolean can only be True.
// In a dependently typed language we get this for free.
//
// Because the typechecker now knows that boolean is True it can
// proceed to evaluate ShapeOrNumber to get NaturalNumber.
val result: ShapeOrNumber(boolean) = one
result
}
False =>
// In this branch the typechecker now knows that boolean is False
// We aren't doing anything with this information but we could've
// used it to create something like `val someValue: ShapeOrNumber(boolean) = Circle`
zero
}
}
// useAtRuntimeTryAgain(True) is Successor(Zero)
// useAtRuntimeTryAgain(False) is Zero
This is important! Pattern matching in a dependently typed language can be thought of as a more powerful form of flow typing that you might see in languages such as Kotlin or Typescript. That is pattern matching introduces additional information about various equalities among different values and types that occur in that branch.
This bears repeating: pattern matching is the fundamental tool that we use to tell the compiler more information about which types are equal to each other. This makes dependent pattern matching a more powerful tool than in a non-dependently-typed setting.
Determining when types are equal
But to understand this in more detail, we have to understand the phrase “what
type are equal to each other” in more detail. All languages with static type
systems have an implicit idea of what type equality is. For example any static
type checker would reject the following expression because it knows for certain
the Boolean
and Shape
are not equal.
// Type error
// Circle is type Shape and the compiler knows Shape != Boolean
val thisDoesntWork : Boolean = Circle
On the other hand, all statically typed languages have the ability to realize trivial type equalities.
// Luckily our type checker knows that True is of type Boolean and Boolean =
// Boolean so therefore the types for thisDoesWork and True match!
val thisDoesWork : Boolean = True
Certain other statically typed languages might allow for very basic computations on top of types, such as calculating type equality after resolving various type synonyms, and somewhat more rarely statically typed languages might also allow for non-trivial type-level computations before calculating type equality.
Dependent types extend this and allow for runtime functions to be used in computations before calculating type equality. How exactly does this happen? Well the forms of computation that exist in our language so far are function application and function evaluation and the only non-trivial bit of that is pattern matching. So the typechecker must have some way of evaluating functions. However, we don’t always have all the information at compile time to fully evaluate a function (e.g. a number may be provided as user input). So instead the typechecker will often only partially evaluate a function.
In particular, if a function appears in a signature, when type checking the compiler will try to evaluate a function until it reaches an abstract variable in a pattern match.
Then it will check whether what it has after this evaluation is syntactically equal or not. If they are syntactically equal then the compiler concludes the two types are equal. Otherwise the compiler declares it doesn’t know whether they are equal or not. Note that this is not the same as the compiler declaring that two types are not equal! Due to stuff like the halting problem the compiler cannot always prove types are equal and sometimes just throws its hands up and says it doesn’t know. It is then up to the programmer to prove that they are equal.
Don’t worry if you didn’t quite get that, we’ll get to some examples that will illustrate the idea further. The main idea here is that the typechecker tries to evaluate functions unti it hits an abstract variable, then it compares for syntactic equality, and then it comes to one of the following conclusions:
- Sees that the types are syntactically equal
- Sees that the types are syntactically different, but that there were functions that could not be further evaluated and therefore it doesn’t know whether the types were equal
- Sees that the types are syntactically different, but notices that all functions have been completely evaluated, and therefore it knows the types must be different.
Let’s go over some standalone examples of how the typechecker might try to compare two types for equality.
Comparing
Boolean
withInt
: The typechecker notices that the types are syntactically different and that there are no functions in these types. Therefore it knows these types are different.Comparing
Shape
withShapeOrNumber(False)
: The typechecker evaluatesShapeOrNumber(False)
. It gets tocase boolean of
, which it can further evaluate because boolean is set toFalse
via theFalse
literal whose value is known at compile-time. It therefore realizes thatShape
is equal toShapeOrNumber(False)
.Comparing
Shape
withShapeOrNumber(True)
: The typechecker evaluatesShapeOrNumber(True)
. It gets tocase boolean of
, which it can further evaluate because boolean is set toTrue
via theTrue
literal whose value is known at compile-time. It therefore realizes thatShape
is not equal toShapeOrNumber(True)
because it has evaluatedShapeOrNumber(True)
toNaturalNumber
.Comparing
Shape
withShapeOrNumber(someUnknownValue)
: The typechecker attempts to evaluateShapeOrNumber(someUnknownValue)
, but it gets stuck on the case checkcase boolean of
inShapeOrNumber
because this timeboolean
is an abstract variable whose value is set tosomeUnknownValue
which is also an abstract value. Therefore it can’t access the actual pattern match aftercase boolean of
and stops evaluation. Without full evaluation it cannot conclude whetherShapeOrNumber(someUnknownValue)
is equal toShape
.Comparing
Shape
withShapeOrNumber(someUnknownValue)
where we are in theFalse
branch of a pattern match somewhere that looks likecase someUnknownValue of { case True => ... case False => ... // Where we are }
: The typechecker now knows that
someUnknownValue
isFalse
and can evaluateShapeOrNumber(someUnknownValue)
completely, which alllows it to realize thatShapeOrNumber(someUnknownValue)
andShape
are equal.Comparing
ShapeOrNumber(someUnknownValue)
withShapeOrNumber(someUnknownValue)
: Even though the typechecker cannot evaluate either type fully, it sees that they are syntactically identical and therefore concludes that they are equal.
Using type equality to prove runtime properties about our code: Boolean edition
So that’s a brief look at type equality. Let’s see how we can make of use of
that more explictly in our code. First let’s return to our very first type
declaration of Boolean
.
type Boolean constructors {
// The ": Boolean" may look redundant right now, but it will come in handy
// later when we have more complicated constructors
True: Boolean
False: Boolean
}
At the time we noted that the : Boolean
looks redundant and up until now it
really has been kind of redundant. All of our types could’ve been written
without it and it could’ve (in theory) been inferred by the compiler. However,
now let’s make use of the fact that our constructors have explicit return types.
type IsEqual(t: Type, x: t, y: t) constructors {
// Refl is short for reflexive. As we start to use IsEqual more, you may
// start to see how usages of Refl are equivalent to the reflexivity
// property of equality
Refl(t: Type, x: t): IsEqual(t, x, x)
}
The crucial point to see here is that the return value of Refl
is a
constrained version of IsEqual
rather than a completely unconstrained
IsEqual(t, x, y)
.
By having a single value x
in the argument to Refl
duplicated in the return
type of IsEqual
, we constrain IsEqual
’s x
and y
arguments to be equal to
each other in a way that the type checker can see. IsEqual
and its analogues
are very fundamental types in dependently typed languages. They allow us to take
the implicit notion of type equality in the “brains” of our type checker and use
an explicit representation of it. And because dependent types allow for runtime
values to exist in types, we can create a type that uses the type checker’s
implicit notion of equality to create a compile-time representation of equality
among runtime values!
How does it work? Well let’s take it for a spin.
val trueIsEqualToTrue: IsEqual(Boolean, True, True) = {
// Refl(Boolean, True) has type of IsEqual(Boolean, True, True) and so our
// type checker accepts it as a return value of trueIsEqualToTrue
Refl(Boolean, True)
}
// This is impossible to create, since Refl(Boolean, True) is of type
// IsEqual(Boolean, True, True) and Refl(Boolean, False) is of type
// IsEqual(Boolean, False, False). We don't have any way of creating a value of
// type IsEqual(Boolean, True, False) or IsEqual(Boolean, False, True)
val trueIsEqualToFalse: IsEqual(Boolean, True, False) = // Impossible
This is a pretty cool trick. As we will soon see, this lets us express runtime
properties about our code as type signatures! So far the properties we’ve seen
are trivial, namely that True
is equal to True
. Let’s move on to a fancier
proposition.
// This type signature represents the fact that applying not twice to a Boolean
// returns the same value you started with
function notNotIsIdentity(x: Boolean): IsEqual(Boolean, x, not(not(x))) = {
// TODO: implement this
}
This proposition isn’t some masterful insight, but it is at least more complex than the proposition that things are equal to themselves! Let’s see how we can get the type checker to accept this.
First let’s just try stuffing Refl
there.
function notNotIsIdentity(x: Boolean): IsEqual(Boolean, x, not(not(x))) = {
// Nope this blows up with a type error!
Refl(Boolean, x)
}
The compiler fails to accept that Refl(Boolean, x)
’s type of IsEqual(Boolean, x, x)
can unify with IsEqual(Boolean, x, not(not(x)))
, that is it can’t tell
that IsEqual(Boolean, x, x)
is equal to IsEqual(Boolean, x, not(not(x)))
.
You can think of the process as something like the following
1. Compare IsEqual(Boolean, x, x) with IsEqual(Boolean, x, not(not(x)))
2. Boolean = Boolean, x = x, now compare x with not(not(x))
3. x has no further function applications so look at not(not(x))
4. not(not(x)) is evaluated via the definition of not (after expanding out both
nots) to be
case {
case x of {
True => False
False => True
}
} of {
True => False
False => True
}
5. x is an abstract variable in that case expression and so the compiler cannot
evaluate not(not(x)) any further
6. The compiler cannot prove x to be equal to the nested case expression so it
cannot prove x to be equal to not(not(x))
7. The compiler cannot prove IsEqual(Boolean, x, x) = IsEqual(Boolean, x, not(not(x))
So the compiler was unable to prove that x = not(not(x))
during the course of
type checking. Note that the compiler was also unable to prove x != not(not(x))
since it was unable to finish evaluating not(not(x))
. We’ll get
back to the !=
case later, but for now let’s return to just =
.
As humans we know that not(not(x))
evaluates to x
for all x
, so how we can
provide that information to the compiler? Once again the magic ingredient is
pattern matching! Remember that pattern matching provides the compiler with more
information in each branch that is pattern matched that it can then use to
further evaluate functions while type checking.
// This definition is accepted by the type checker
function notNotIsIdentity(x: Boolean): IsEqual(Boolean, x, not(not(x))) = {
case x of {
True => Refl(Boolean, x)
False => Refl(Boolean, x)
}
}
That might look weird if you’re coming from a non-dependently typed background.
We are ultimately returning the exact same expression, namely Refl(Boolean, x)
. How can this type check but not our previous function body? Well the
pattern match gives the compiler that extra bit of necessary information. Let’s
look at the True
branch in greater detail.
True => Refl(Boolean, x)
On the right-hand side of True
, the compiler now knows that x = True
. So
let’s run through the same steps we saw before.
1. Compare IsEqual(Boolean, x, x) with IsEqual(Boolean, x, not(not(x)))
2. Boolean = Boolean, x = x, now compare x with not(not(x))
3. x has no further function applications so look at not(not(x))
4. not(not(x)) is evaluated via the definition of not (after expanding out both
nots) to be
case {
case x of {
True => False
False => True
}
} of {
True => False
False => True
}
5. x = True, so now we have
case False of {
True => False
False => True
}
which evaluates to True.
6. So the compiler now knows that not(not(x)) evaluates to True. It also knows x
evaluates to True.
7. Therefore the compiler can prove that not(not(x)) = x.
8. Therefore the compiler can prove IsEqual(Boolean, x, x) = IsEqual(Boolean, x, not(not(x))
Notice that crucial change in step 5! That extra bit of information that x = True
allows the type checker to continue evaluating not(not(x))
whereas
before it was stuck since x
was just an abstract variable. This allows it to
fully evaluate not(not(x))
and x
and find that they are the same.
The exact same thing happens in the False
branch, and by adding what might
look like a redundant pattern match, we’ve helped the compiler prove that
IsEqual(Boolean, x, not(not(x)))
for any boolean x
!
Let’s have the compiler prove some other things for us.
Note that we previously defined notNot(x)
which is just the same thing as
not(not(x))
. Can we express that in code?
function notNotIsJustNotTwice(x: Boolean): IsEqual(Boolean, notNot(x), not(not(x))) = {
case x of {
True => Refl(Boolean, x)
False => Refl(Boolean, x)
}
}
Yep just the same thing as before! The only difference is that in each branch of
the pattern match we simultaneously evaluate both notNot(x)
and not(not(x))
,
not just not(not(x))
.
We’ve seen how pattern matching on ordinary, non-dependent types can introduce new equality constraints by constraining an abstract variable to a type constrctor. However, we can also introduce equality constraints among abstract variables by pattern matching on dependent types.
For example we can pattern match on Refl
to turn IsEqual
into equality
constraints the type checker can understand.
function equalityIsPreservedForNot(
x: Boolean,
y: Boolean,
xEqualToY: IsEqual(Boolean, x, y)
): IsEqual(Boolean, not(x), not(y)) = {
case xEqualToY of {
Refl(unused0, unused1) =>
// Refl(Boolean, not(y)) would work too
Refl(Boolean, not(x))
}
}
In the Refl
branch (which is the only branch), the type checker must unify the
original type of xEqualToY
with the new one introduced by Refl
. So in
particular the type checker now knows IsEqual(Boolean, x, y) = IsEqual(Boolean, x, x)
. With this constraint, the type checker can further derive the fact that
x = y
.
So roughly we get the following steps, which results in a successful type check.
1. Derive x = y from the pattern match
case xEqualToY of {
Refl(unused0, unused1) => ... // Matching against Refl introduces
// the constraint x = y
}
2. Compare IsEqual(Boolean, not(x), not(x)) (type of Refl(Boolean, not(x))) with IsEqual(Boolean, not(x), not(y))
3. Boolean = Boolean, not(x) = not(x), now compare not(x) with not(y)
4. But we know that x = y so therefore the compiler can prove IsEqual(Boolean, not(x), not(x)) = IsEqual(Boolean, not(x), not(y))
5. Therefore Refl(Boolean, not(x)) successfully type checks
Given a function type A -> B
so that e.g. not
has the type Boolean -> Boolean
we can write a more general function
and implement it in the same
way.
function functionsPreserveEquality(
t0: Type,
t1: Type,
x: t0,
y: t0,
f: t0 -> t1,
xEqualToY: IsEqual(t0, x, y)
): IsEqual(t1, f(x), f(y)) = {
case xEqualToY of {
Refl(unused0, unused1) => Refl(t1, f(x))
}
}
The steps look much the same, just with more abstract variables:
1. Derive x = y from the pattern match
2. Compare IsEqual(t1, f(x), f(x)) (type of Refl(t1, f(x))) with IsEqual(t1, f(x), f(y))
3. t1 = t1, f(x) = f(x), now compare f(x) with f(y)
4. But we know that x = y so therefore the compiler can prove f(x) = f(y) by
syntactic substitution
5. Therefore the compiler can prove IsEqual(t1, f(x), f(x)) = IsEqual(t1, f(x), f(y))
6. Therefore Refl(t1, f(x)) successfully type checks
So far we’ve seen how to prove that two values are equal to each other. What about if we want to prove that two values are not equal to each other?
Let’s introduce a new keyword called impossible
to our pattern matches. As
we’ve seen, when pattern matching we introduce new equality constraints for our
type checker. However, if and only if the type checker detects that equality
constraints are self-contradictory, then we are allowed to write impossible
on
the right-hand side, which will always type check with whatever the overall type
signature is.
function onlyIfXAndYAreEqual(
x: Boolean,
y: Boolean,
xEqualToY: IsEqual(Boolean, x, y)
): Boolean = {
case (x, y, xEqualToY) of {
(True, False, Refl(unused0, unused1)) => impossible
(False, True, Refl(unused0, unused1)) => impossible
(True, True, Refl(unused0, unused1)) => True
(False, False, Refl(unused0, unused1)) => False
}
}
Let’s look at (True, False, Refl(unused0, unused1)) => impossible
in more
detail.
1. Derive x = True, y = False, and IsEqual(Boolean, x, y) = IsEqual(Boolean, x,
x). The last one comes from matching on Refl.
2. IsEqual(Boolean, x, y) = IsEqual(Boolean, x, x) implies that y = x
3. But if y = x and x = True and y = False, then True = False
4. This is a contradiction among base constructors of a type and therefore the
typechecker permits usage of the keyword impossible in this branch
5. impossible also works with any type, in particular impossible unifies with
Boolean, so this branch fuly typechecks
Note that syntactic equality was enough to prove equality, i.e. the type
checker knows that f(x) = f(x), but syntactic inequality is not enough to prove
inequality, i.e. the typechecker cannot conclude that simply because x and y are
different variable names that x != y. In order to use impossible
the type
checker must derive a contradiction among constructors, e.g. True = False
.
So for example the following function does not type check because it uses
impossible
incorrectly.
function onlyIfXAndYAreEqualBrokenVersion(
x: Boolean,
y: Boolean,
xEqualToY: IsEqual(Boolean, x, y)
): Boolean = {
// Notice that we fail to match against xEqualToY
case (x, y) of {
// The compiler now knows that x = True and y = False, but it cannot derive
// any contradictions among its equality constraints because it doesn't
// know x = y since you haven't pattern matched against Refl, so it
// rejects impossible and this whole thing fails with a type error
(True, False) => impossible
// Same thing here
(False, True) => impossible
(True, True) => True
(False, False) => False
}
}
Before we continue, I’m going to introduce a new type, suggestively called
Contradiction
that has no constructors.
type Contradiction constructors {}
This means Contradiction
cannot ever exist since there are no actual values of type
Contradiction
. Well if you can’t actually create a value of type Contradiction
, how could
you ever return it from a function? Well we just said that impossible
works
with any type… so what happens if every branch of your case statement ends in
impossible
? Well then you can use any type, including Contradiction
.
function xCannotBeBothTrueAndFalse(
x: Boolean,
xEqualToTrue: IsEqual(Boolean, x, True),
xEqualToFalse: IsEqual(Boolean, x, False)
): Contradiction = {
case (x, xEqualToTrue, xEqualToFalse) of {
(True, Refl(unused0, unused1), Refl(unused2, unused3)) =>
impossible
(False, Refl(unused0, unused1), Refl(unused2, unused3)) =>
impossible
}
}
// Since impossible matches any type, we could've done
// function xCannotBeBothTrueAndFalse(...): Boolean
// or
// function xCannotBeBothTrueAndFalse(...): NaturalNumber
// but those are a lot less semantically useful than Contradiction
//
// We'll revisit this with our principleOfExplosion function later
Note here that in the first branch Refl(unused0, unused1)
implies x = True
and Refl(unused2, unused3)
implies x = False
so again the type checker
derives the contradiction True = False
which it can use to validate the usage
of impossible
.
In fact we can see that the two Refl
s are enough to derive a contradiction, so
we don’t actually need to pattern match on x
at all.
function xCannotBeBothTrueAndFalseNewVersion(
x: Boolean,
xEqualToTrue: IsEqual(Boolean, x, True),
xEqualToFalse: IsEqual(Boolean, x, False)
): Contradiction = {
case (xEqualToTrue, xEqualToFalse) of {
(Refl(unused0, unused1), Refl(unused2, unused3)) =>
impossible
}
}
Hence we can read any function with Contradiction
as its output type as proving that
all its input types cannot all simultaneously be true, they must contradict each
other.
And indeed, with this interpretation it becomes possible to state that in fact
IsEqual(Boolean, True, False)
is impossible. By pattern matching on Refl
we
immediately introduce a contradiction among base constructors, namely True
and
False
.
function trueCannotBeFalse(
trueEqualToFalse: IsEqual(Boolean, True, False)
): Contradiction = {
case trueEqualToFalse of {
Refl(unused0, unused1) => impossible
}
}
Here’s another example.
function xCannotBeEqualToNotX(
x: Boolean,
xEqualToNotX: IsEqual(Boolean, x, not(x))
): Contradiction = {
case (x, xEqualToNotX) of {
(True, Refl(unused0, unused1)) => impossible
(False, Refl(unused0, unused1)) => impossible
}
}
Unlike xCannotBeBothTrueAndFalseNewVersion
here we must pattern match on x
.
By pattern matching on x
we allow evaluation of not(x)
to occur, so that we
end up with in the first branch the constraints x = True
(by pattern matching
on x
), not(x) = x
(by pattern matching on Refl
) and then not(True) = True
and finally False = True
, which is a contradiction in constructors and
allows us to write impossible
.
Having an explicit Contradiction
type also lets us nicely write out the
https://en.wikipedia.org/wiki/Principle_of_explosion.
function principleOfExplosion(
proposition: Type,
contradiction: Contradiction
): proposition = {
// Note that an empty pattern match unifies with any return type
case contradiction of {}
}
// Now, assuming we have a consistent type system, we can't actually create a
// top-level value of `Contradiction`, but perhaps we are in a function that has
// contradictory values passed in.
val someContradiction: Contradiction = ...
val anythingGoes: IsEqual(Boolean, True, False) =
principleOfExplosion(IsEqual(Boolean, True, False), someContradiction)
Using type equality to prove runtime properties about our code: NaturalNumber edition
Now let’s move away from Boolean
s and make some propositions about
NaturalNumber
s.
function addingZeroOnLeftDoesNothing(x: NaturalNumber): IsEqual(NaturalNumber, x, add(Zero, x)) = {
Refl(NaturalNumber, x)
}
In this case no pattern match is necessary. add(Zero, x)
is able to
immediately evaluate to x
because add
pattern matches on the Zero
and just
returns x
. Roughly the steps are as follows.
1. Compare IsEqual(NaturalNumber, x, add(Zero, x)) with IsEqual(NaturalNumber, x, x)
2. NaturalNumber = NaturalNumber, x = x, now compare add(Zero, x) with x
3. add(Zero, x) is evaluated via the definition of add to be
case Zero of {
Zero => x
Successor(prev) => Successor(add(prev, x))
}
which just becomes x.
4. Therefore the compiler can prove that add(Zero, x) = x
5. Therefore the compiler can prove IsEqual(NaturalNumber, x, add(Zero, x)) = IsEqual(NaturalNumber, x, x)
What about the right side?
function addingZeroOnRightDoesNothing(x: NaturalNumber): IsEqual(NaturalNumber, x, add(x, Zero)) = {
// Type error!
Refl(NaturalNumber, x)
}
This doesn’t work with Refl
because now when the type checker goes to evaluate
add
it needs to pattern match against an abstract variable x
and therefore
cannot proceed.
// Reminder that add looks like this
function add(x, y: NaturalNumber): NaturalNumber = {
case x of {
Zero => y
Successor(prev) => Successor(add(prev, y))
}
}
// When you have add(x, Zero), we aren't matching against Zero anymore, but an
// abstract x, since Zero is now on the right-hand side
case x of {
Zero => x
Successor(prev) => Successor(add(prev, x))
}
So we need to manually pattern match on x
at an earlier stage to constrain the
value of x
for the compiler to successfully evaluate the pattern match.
function addingZeroOnRightDoesNothing(x: NaturalNumber): IsEqual(NaturalNumber, x, add(x, Zero)) = {
case x of {
Zero => Refl(NaturalNumber, x) // Refl(NaturalNumber, Zero) would also work here
Successor(x0) => ???
}
}
The Zero
branch is pretty easy. We can just directly use Zero
because if the
type checker knows x = Zero
then it can easily evaluate add(x, Zero)
to just
Zero
. However, the Successor
branch is a bit trickier. We now know that x = Successor(x0)
, which is enough for add
to evaluate once.
Let’s walk through how this affects evaluation of add
.
// Again, reminder that add looks like this
function add(x, y: NaturalNumber): NaturalNumber = {
case x of {
Zero => y
Successor(prev) => Successor(add(prev, y))
}
}
// Let's focus on what happens if x = Successor(x0) and y = Zero
case (Successor(x0)) of {
Zero => Zero // Note that the second argument to add was Zero, replacing y here
Successor(prev) => Successor(add(prev, Zero)) // Again note that the second argument was Zero
}
// So therefore add(Successor(x0), Zero) just evaluates to
Successor(add(x0, Zero))
However, Successor(add(x0, Zero))
still isn’t syntactically the
same thing as x
(or Successor(x0)
) since the compiler knows x = Successor(x0)
in this branch)!
We have one trick up our sleeves, which is recursion on
addingZeroOnRightDoesNothing
. We can call addingZeroOnRightDoesNothing(x0)
which has the type IsEqual(NaturalNumber, x0, add(x0, Zero))
.
That looks pretty promising! If we could somehow use IsEqual(NaturalNumber, x0, add(x0, Zero))
to tell the compiler that Successor(x0) = Successor(add(x0, Zero))
, we would then be able to fill in that ???
.
function addingZeroOnRightDoesNothing(x: NaturalNumber): IsEqual(NaturalNumber, x, add(x, Zero)) = {
case x of {
Zero => Refl(NaturalNumber, x) // Refl(NaturalNumber, Zero) would also work here
Successor(x0) =>
// Assume we knew the following three things:
//
// x = Successor(x0)
// add(x, Zero) = Successor(add(x0, Zero))
// Successor(x0) = Successor(add(x0, Zero)) or equivalently Successor(add(x0, Zero)) = Successor(x0)
//
// Then we know that x = add(x, Zero) by the following substitutions
//
// x = add(x, Zero)
// Successor(x0) = add(x, Zero)
// Successor(x0) = Successor(add(x0, Zero))
// Successor(x0) = Successor(x0)
//
// And we could therefore write Refl(NaturalNumber, x) and be done.
// We already know x = Successor(x0) by pattern matching.
// We know add(x, Zero) = Successor(add(x0, Zero)) by evaluating add
// after taking into account x = Successor(x0)
// So we just need to prove Successor(x0) = Successor(add(x0, Zero))
???
}
}
Luckily we have that in the form of functionsPreserveEquality
that we
constructed earlier.
// I'm using a named argument syntax here just to make it clear what arguments
// are what
val desiredEquality: IsEqual(
NaturalNumber,
Successor(x0),
Successor(add(x0, Zero))
) = functionsPreserveEquality(
t0 = NaturalNumber,
t1 = NaturalNumber,
x = x0,
y = add(x0, Zero),
f = function(a) { Successor(a) },
xEqualToY = // Insert some value of type IsEqual(NaturalNumber, x0, add(x0, Zero))
)
So putting that all together we get
function addingZeroOnRightDoesNothing(x: NaturalNumber): IsEqual(NaturalNumber, x, add(x, Zero)) = {
case x of {
Zero => Refl(NaturalNumber, x) // Refl(NaturalNumber, Zero)
Successor(x0) =>
// This is exactly what can go into xEqualToY in desiredEquality
val recursiveEquality: IsEqual(NaturalNumber, x0, add(x0, Zero)) =
addingZeroOnRightDoesNothing(x0)
val desiredEquality: IsEqual(
NaturalNumber,
Successor(x0),
Successor(add(x0, Zero))
) = functionsPreserveEquality(
t0 = NaturalNumber,
t1 = NaturalNumber,
x = x0,
y = add(x0, Zero),
f = function(a) { Successor(a) },
xEqualToY = recursiveEquality
)
// Note that because of pattern matching on Successor(x0) we know
// x = Successor(x0)
// This allows add(x, Zero) to evaluate via add(Successor(x0), Zero)
// to become Successor(add(x0, Zero))
//
desiredEquality
}
}
Let’s examine why desiredEquality
type checks in a little bit more detail.
1. We compare IsEqual(NaturalNumber, x, add(x, Zero)) with
IsEqual(
NaturalNumber,
Successor(x0),
Successor(add(x0, Zero))
)
2. We know x = Successor(x0) by the pattern match.
3. By evaluating add(x, Zero) on x = Successor(x0) we know
add(x, Zero) = Successor(add(x0, Zero))
4. Therefore by substituting those equalities we find that our two IsEquals are
equal
Whew that one was a bit of a doozy! It gets easier once you’ve done a lot of these and start to develop some intuition.
Dependent types in data structures
We’ve seen so far how we can use dependent types to make assertions about your
code that can be proved in the type system. We can also use dependent types
to create data structures that directly obey some sort of invariant. The most
common example of this is length-indexed vectors. First let’s begin with an
ordinary list, which we construct through recursion in the same way we did for
NaturalNumber
.
type List(t: Type) constructors {
Nil(type: Type): List(type)
Cons(type: Type, value: type, restOfList: List(type)): List(type)
}
val listOfTwoElements = Cons(Boolean, True, Cons(Boolean, False, Nil(Boolean)))
Now we can create a length
function for this List
and use it to enforce
certain invariants in other functions.
function length(t: Type, list: List(t)): NaturalNumber = {
case list of {
Nil(unused) => Zero
Cons(unused0, unused1, rest) => Successor(length(t, rest))
}
}
// This is how it's used
val thisWillBeTwo: NaturalNumber = length(Boolean, listOfTwoElements)
function head(
t: Type,
list: List(t),
isNonZeroLength: IsEqual(Boolean, greaterThan(length(t, list), zero), True)
): t {
case list of {
Nil(unused) =>
case isNonZeroLength of {
Refl(unused0, unused1) =>
// We can use impossible here because our earlier Nil
// pattern match tells the compiler list = Nil(...) which
// allows it to completely evaluate `length` and therefore
// `greaterThan` to get
// `greaterThan(length(1, list), zero) = False`.
// But then matching against `Refl` gets the constraint
// `greaterThan(length(1, list), zero) = True`.
//
// This contradiction lets us then use impossible.
impossible
}
Cons(unusedType, value, unusedRest) =>
value
}
}
function append(t: Type, list0: List(t), list1: List(t)): List(t) = {
case list0 of {
Nil(unused) =>
list1
Cons(unusedType, value, restOfList) =>
Cons(t, value, append(t, restOfList, list1))
}
}
function appendCombinesLength(
t: Type,
list0: List(t),
list1: List(t),
): IsEqual(NaturalNumber, length(t, append(t, list0, list1)), add(length(t, list0), length(t, list1))) = {
// Proof omitted for brevity
}
Or we could directly encode the length of a list into its type.
type LengthIndexedList(t: Type, length: NaturalNumber) constructors {
// LI for LengthIndexed
LINil(type: Type): LengthIndexedList(type, zero)
LICons(
type: Type,
value: type,
lengthOfRestOfList: NaturalNumber,
restOfList: LengthIndexedList(type, lengthOfRestOfList)
): List(type, Successor(lengthOfRestOfList))
}
val aListOfOneElement: LengthIndexedList(Boolean, one) =
LICons(Boolean, True, zero, LINil(Boolean))
// Note that lengthOfRestOfList restricts what restOfList can be
// So e.g. the following is a type error
val thisDoesNotWork: LengthIndexedList(Boolean, one) =
LICons(Boolean, True, one, LINil(Boolean)) // notice one instead of zero
This way we can create an append
function that directly encodes the fact that
lengths of the lists sum together, rather than relying on a separate
appendCombinesLength
.
function append(
t: Type,
lengthOfList0: NaturalNumber,
lengthOfList1: NaturalNumber,
list0: LengthIndexedList(t: Type, length: lengthOfList0),
list1: LengthIndexedList(t: Type, length: lengthOfList1),
): LengthIndexedList(t: Type, length: add(lengthOfList0, lengthOfList1)) = {
case list0 of {
LINil(unused) =>
case lengthOfList0 of {
Zero =>
list1
Successor(unused) =>
impossible
}
LICons(unusedType, value, lengthOfRestOfList, restOfList) =>
case append(t, restOfList, list1) of {
LICons(t, value)
}
LICons(t, value, Successor(lengthOfRestOfList), append(t, restOfList, list1))
}
}
Personally I prefer using length
over LengthIndexedList
, but I’ll talk more
about that in a future post.
Using dependent types as a way of tagging data
Our final examples of dependent types won’t introduce anything new, but will move away from the proof-heavy style I’ve presented above to a much “lighter” style that only uses dependent types to tag certain bits of data with additional information.
This is the style of programming I personally prefer to use when using dependent types. I’ll talk about it more at length in a future post.
// Imagine that these constructors are private and cannot be accessed by other
// modules in our code
type Sanitized(s: String) private constructors {
IsSanitized(s: String): Sanitized(s)
}
// We may have some low-level way of checking whether a string is sanitized or
// not, but we generally won't be using this in other code
private function isStringSafe(s: String): Boolean = {
// Some implementation here...
}
// This is the main function we'll be using and the only way for other modules
// to create Sanitized data since the constructors are private
function checkIfStringIsSafe(s: String): Option(Sanitized(s)) = {
case isStringSafe(s) of {
True => Some(Sanitized(s), IsSanitized(s))
False => None(Sanitized(s))
}
}
// Imagine we have this function in another module. Now we're guaranteed that s
// must have been checked for safety at some point before passing it into
// processString.
// You can't ever forget to check the string!
function processString(s: String, stringIsSanitized: Sanitized(s)): SomeCustomType = {...}
val someString: String = ...
case (checkIfStringIsSafe(someString)) of {
// We don't use the first arguments of Some and None because those are just
// types. This is similar to how in a non-dependently-typed language when we
// match on e.g. an Option<Int> type we only match against Some(someInt) and
// None instead of also matching on the Int generic.
Some(unused, stringIsSafe) =>
processString(someString, stringIsSafe)
None(unused) => // raise some sort of error
}
// Now let's do an example of where the type system tells us we've messed up.
val someOtherString: String = ...
// Note that this is an error! We are checking someString instead of
// someOtherString
case (checkIfStringIsSafe(someString)) of {
Some(unused, stringIsSafe) =>
// This is a type error and the compiler would reject this code, because
// stringIsSafe here is associated with someString, not someOtherString.
// Dependent types ensure you are checking the correct string, not any
// old string.
processString(someOtherString, stringIsSafe)
None(unused) => // raise some sort of error
}
Of course sometimes we can’t simply check whether a string is safe or not. We can only run it through a sanitizer, which might give us a new and different string. So we can instead just tag the data that comes out of the sanitizer.
function sanitizeString(s: String): String = {
// Some implementation here...
}
function proofOfSanitization(s: String): Sanitized(sanitizeString(s)) = {
IsSanitized(sanitizeString(s))
}
val someString: String = ...
val sanitizedString: String = sanitizeString(someString)
// No matter how we implement sanitization, as long as the constructors for
// Sanitized are kept private, we always guaranteed that all strings passed to
// processString must have either been checked for or undergone sanitization.
processString(
sanitizedString,
// It may look like we're doing redundant work by calling sanitizeString
// again in proofOfSanitization. I'll talk in a future post about a feature
// that allows this to be purely a compile-time tag with no runtime
// performance consequences.
proofOfSanitization(someString)
)
We can encode the latter pattern in a concrete data type called a dependent pair, where the type of the second element depends on the value of the first, to “carry around the tag with its value.”
type DependentPair(t: Type, propType: t -> Type) {
CreateDependentPair(t: Type, propType: t -> Type, value: t, prop: propType t): DependentPair(t, propType)
}
val sanitizeStringWrapped(s: String): DependentPair(String, Sanitized) = {
CreateDependentPair(
String,
Sanitized,
sanitizeString(s),
// Same thing as function(s) { IsSanitized(s) }
IsSanitized
)
}
case sanitizeStringWrapped(someString) of {
CreateDependentPair(unused0, unused1, sanitizedString, proofOfSanitization) =>
processString(sanitizedString, proofOfSanitization)
}
Why do dependently typed languages tend to be pure functional languages?
This is a question that comes up from time to time and now that we’ve fully developed the basics of how to program with dependent types we can explore the answer. It boils down to the insight that dependent types assume immutability and function purity.
Let’s think back to the crucial fulcrum of dependent types: dependent pattern matching. We implicitly always assumed that the values we were matching against were immutable, otherwise it would be incorrect for the typechecker to assume that the different branches of the pattern match guaranteed what the runtime value of what we were matching against could be.
case someBoolean of {
True =>
// Our typechecker now assumes someBoolean is True, but that's not a
// safe assumption if someBoolean is mutable. Imagine what would happen
// if someBoolean could suddenly change from True to False here!
...
False => ...
}
This means that for most of the places we used dependent types, we implicitly assumed that the runtime values we were depending on were immutable.
As for function purity, dependent types rely on evaluation of functions at compile time to be able to type check things in the first place. This would become rapidly intractable if functions were mutable and so changed their behavior over the course of typechecking or if they had side effects and could do all sorts of crazy things as your compiler runs.
It is important to note that the way the typechecker evaluates functions does not have to be the same way that the the runtime system evaluates the code. For example, all throughout this post the typechecker has only been ever partially evaluating functions rather than fully evaluating them, which is usually illegal for a runtime system to do. You could also imagine that while the typechecker performs a very naive, step-by-step evaluation of functions, the runtime system compiles functions into something completely different so that e.g. recursion turns into loops. You could even imagine different evaluation regimes where the typechecker lazily evalates functions but the runtime system eagerly evaluates them.
The only constraint (but it is a big constraint) is that the typechecker and the runtime system must agree on equality. If the typechecker thinks two things are equal the runtime system should have no way of distinguishing them. If the typechecker thinks two things are not equal the runtime system should always be able to distinguish them. And of course if the typechecker is unable to determine equality then the runtime system can do whatever it wants.
Otherwise all your proofs about equality and inequality don’t mean anything when your code actually runs.
And so that concludes this introduction to dependent types! The next post will build upon this understanding of dependent types to talk about their trade-offs when it comes to production codebases and how to avoid their pitfalls and capitalize on their strengths.