Let’s begin by reviewing the classical definition. A **monad** is given by the following data:

- a category \({\mathcal{C}}\);
- an endofunctor \(T : {\mathcal{C}}\to {\mathcal{C}}\);
- natural transformations \(\eta : I \to T\) and \(\mu : T \circ T \to T\);

satisfying certain laws (namely: \(\mu \circ \eta T = \mu \circ T \eta = {\mathsf{id}}\) and \(\mu \circ T \mu = \mu \circ \mu T\)). Note that the category \({\mathcal{C}}\) is considered to be part of the data, rather than fixed beforehand.
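As a concrete point of reference, here is how the data \((T, \eta, \mu)\) looks for the list functor on Haskell types, with the three laws written as predicates that can be evaluated on sample inputs. This is only a sketch: the names `eta`, `mu` and the `law…` functions are made up for illustration, and checking a law on a few values is of course not a proof.

```haskell
import Control.Monad (join)

-- Unit and multiplication of the list monad.
eta :: a -> [a]
eta = return

mu :: [[a]] -> [a]
mu = join

-- mu . eta T = id: wrap the whole list in a singleton, then flatten.
lawUnitLeft :: [Int] -> Bool
lawUnitLeft xs = mu (eta xs) == xs

-- mu . T eta = id: wrap every element in a singleton, then flatten.
lawUnitRight :: [Int] -> Bool
lawUnitRight xs = mu (fmap eta xs) == xs

-- mu . T mu = mu . mu T: flatten the inner layers first, or the outer ones.
lawAssoc :: [[[Int]]] -> Bool
lawAssoc xsss = mu (fmap mu xsss) == mu (mu xsss)
```

Here `eta xs` plays the role of the component of \(\eta T\) and `fmap eta xs` that of \(T \eta\), which is the only subtle point when translating the whiskered notation.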

In this post, I will illustrate a compact formulation of the above definition that can easily be generalised to include other similar notions, which appear from time to time in functional programming.

Here is the punchline:

A monad is a lax 2-functor from the terminal 2-category 1 to \({\mathsf{Cat}}\).

To make sense of this definition, we need to venture into the marvellous world of *higher categories*. If we take the definition of category that we are familiar with, we can regard it as some sort of 1-dimensional structure: we have a set of objects, which we can picture as points, and a set of arrows between them, which we imagine as (oriented) 1-dimensional lines.

It is then relatively easy to go one dimension up, and imagine an entity with 3 levels of structure: objects, morphisms, and 2-dimensional “cells” connecting arrows. This is what we call a 2-category.

More precisely, a 2-category is given by:

- a set of *objects* (or 0-cells);
- for any two objects \(x, y\), a set of *morphisms* (or 1-cells) \({\mathsf{hom}}(x, y)\);
- for any two objects \(x, y\), and any two morphisms \(f, g : {\mathsf{hom}}(x, y)\), a set of *2-morphisms* \({\mathsf{hom}}_2 (f, g)\).

Of course, this cannot really be the complete definition of 2-category, as we also need to be able to compose morphisms and 2-morphisms, but we won’t go into much detail here. The interested reader can find more details on this nLab page.

The primary example of 2-category is \({\mathsf{Cat}}\), the 2-category of categories. Objects of \({\mathsf{Cat}}\) are (ordinary) categories (also called 1-categories), morphisms are functors, and 2-morphisms are natural transformations. Another example is the terminal 2-category, containing only 1 object, and no non-identity morphisms or 2-morphisms.

As always happens in mathematics, every new structure that we define is accompanied by a corresponding notion of morphism. Given 2-categories \({\mathcal{C}}\) and \({\mathcal{D}}\), we want to define what it means to give a “map” \({\mathcal{C}}\to {\mathcal{D}}\) that respects the 2-category structure. We call such maps *2-functors*.

As it turns out, there are multiple ways to give a definition of 2-functor. They differ in the amount of *strictness* that they require. More precisely, a 2-functor \({\mathcal{C}}\to {\mathcal{D}}\) is given by:

- a function \(F\) mapping objects of \({\mathcal{C}}\) to objects of \({\mathcal{D}}\);
- for any two objects \(x, y\) of \({\mathcal{C}}\), a function (also denoted \(F\)) mapping morphisms between \(x\) and \(y\) in \({\mathcal{C}}\) to morphisms between \(F x\) and \(F y\) in \({\mathcal{D}}\);
- for any two objects \(x, y\) of \({\mathcal{C}}\), and morphisms \(f, g : {\mathsf{hom}}(x, y)\), a function mapping 2-morphisms between \(f\) and \(g\) to 2-morphisms between \(F f\) and \(F g\);

subject to certain “functoriality” properties. We can make this functoriality requirement precise in a number of different (non-equivalent) ways.

First, we might directly generalise the functoriality properties for functors, and require: \[ \begin{aligned} & F{\mathsf{id}}= {\mathsf{id}}, \\ & F (g \circ f) = F g \circ F f. \end{aligned} \]

If we do that, we get the notion of *strict functor*. However, the elements appearing in the above equations are objects of certain categories (namely, \({\mathsf{hom}}\)-categories of \({\mathcal{D}}\)), and if category theory has taught us anything, it is the idea that comparing objects of categories up to equality is often not very fruitful.

Therefore, we are naturally led to the notion of *pseudofunctor*, which weakens the equalities to *isomorphisms*: \[
\begin{aligned}
& F{\mathsf{id}}\cong {\mathsf{id}}, \\
& F (g \circ f) \cong F g \circ F f.
\end{aligned}
\]

However, we are interested in an even weaker notion here, called *lax 2-functor*, which replaces the isomorphisms above with arbitrary (possibly not invertible) 2-morphisms: \[
\begin{aligned}
& {\mathsf{id}}\to F {\mathsf{id}}, \\
& F g \circ F f \to F (g \circ f).
\end{aligned}
\]

The direction of the arrows can be reversed, yielding the dual notion of *oplax functor*, which we won’t need here.

Now we understand all the terminology used in the definition above. Let \(F : 1 \to {\mathsf{Cat}}\) be a lax 2-functor. At the level of objects, \(F\) maps the unique object of \(1\) to an object of \({\mathsf{Cat}}\), which amounts to just picking a single category \({\mathcal{C}}\). At the level of morphisms, we map the single (identity) morphism of \(1\) to a functor \(T: {\mathcal{C}}\to {\mathcal{C}}\). Now, the “lax structure” produces 2-morphisms in \({\mathsf{Cat}}\) (i.e. natural transformations): \(\eta : I \to T\) and \(\mu : T \circ T \to T\).

So it looks like lax 2-functors to \({\mathsf{Cat}}\), at least ignoring certain details that we haven’t discussed, correspond perfectly to the classical definition of monad. I encourage the interested reader to look at the complete definition of lax functor, and verify that everything does indeed match, including the monad laws.

After all this work, generalising the definition is now extremely easy: just replace the 2-category 1 with a more general 2-category. A simple example is: given a monoid \(S\), regard \(S\) as a 2-category with 1 object and no non-trivial 2-morphisms. Lax functors \(S \to {\mathsf{Cat}}\) are exactly Wadler’s indexed monads.
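Spelled out, and glossing over the coherence conditions just as we did for \(1\), a lax functor \(S \to {\mathsf{Cat}}\) consists of roughly the following data, where \(e\) is the unit of \(S\) and \(s \cdot t\) its multiplication (the names \(T_s\) and \(\mu_{s,t}\) are chosen here purely for illustration): \[
\begin{aligned}
& \text{a category } {\mathcal{C}}, \\
& T_s : {\mathcal{C}}\to {\mathcal{C}} \quad \text{for every } s \in S, \\
& \eta : I \to T_e, \\
& \mu_{s,t} : T_s \circ T_t \to T_{s \cdot t} \quad \text{for all } s, t \in S.
\end{aligned}
\] Taking \(S\) to be the trivial monoid leaves a single functor \(T_e\) with \(\eta : I \to T_e\) and \(\mu_{e,e} : T_e \circ T_e \to T_e\), which is the data of a monad listed at the beginning.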

It is also possible (although slightly more involved) to recover Atkey’s parameterised monads as lax functors. I’ll leave this as a fun exercise.

In the previous post, we investigated free monads, i.e. those whose monad algebras are the same as algebras of some functor. In general, however, not all monads are free, not even in Haskell! Nevertheless, monad algebras can often be regarded as algebras of some functor, satisfying certain “algebraic laws”.

In the first post of the series, we looked at the list monad \(L\). We observed that monad algebras of \(L\) can be regarded as monoids, which is to say they are algebras of the functor \(F\) given by \(F X = 1 + X²\), subject to unit and associativity laws.

The list example is interesting, because it suggests a strong connection between monads and algebraic structures. Can we always regard algebraic structures (such as groups, rings, vector spaces, etc…) as the algebras of some monad?

In this post, we will try to generalise this example to other monads by developing a categorical definition of *algebraic theory* based on monads and monad algebras.

The theory of monoids is a particular instance of a general pattern that occurs over and over in mathematics. We have a set of operations, each with a specified arity, and a set of laws that these operations are required to satisfy. The laws all have the form of equations with universally quantified variables.

For monoids, we have two operations: a unit \(e\), which is a nullary operation (i.e. a constant), and multiplication \(·\), a binary operation (written infix). The laws should be very familiar: \[ \begin{aligned} e · x & = x \\ x · e & = x \\ x · (y · z) & = (x · y) · z \end{aligned} \] where every free variable is implicitly considered to be universally quantified.

As we observed in the first post of this series, the functor \(F\) corresponding to the algebraic theory of monoids is given by \(F X = 1 + X²\). Algebras of \(F\) are sets equipped with the operations of a monoid, but there is no requirement that they satisfy the laws.

Since \(F\) is polynomial, it has an algebraically free monad \(F^*\), so \(F^* X\) is in particular an \(F\)-algebra. If we focus on the first law above, we see that it just consists of a pair of terms in \(F^* X\), parameterised over some unspecified element \(x : X\). This can be expressed as a natural transformation: \[ X → F^*X × F^* X \] The same holds for the second law, while the third can be regarded as a function: \[ X³ → F^*X × F^*X \]

We can assemble those three functions into a single datum, consisting of a pair of natural transformations: \[ X + X + X³ ⇉ F^* X \]

If we set \(G X = X + X + X³\), we have that the laws can be summed up concisely by giving a pair of natural transformations: \[
G ⇉ F^*,
\] which, since algebraically free monads are free, is the same as a parallel pair of monad morphisms: \[
l, r : G^* ⇉ F^*,
\] and this is something that we can easily generalise. Namely, we say that an *algebraic theory* is a parallel pair of morphisms of algebraically free monads.

Note that the terminology here is a bit fuzzy. Some authors might refer to the parallel pair above as a *presentation* of an algebraic theory. It ultimately depends on whether or not you want to consider theories with different syntactical presentations but identical models to be equal. With our definition, they would be considered different.

To really motivate this definition, we now need to explain what the models of an algebraic theory are. This is quite easy if we just follow our derivation of the general definition from the example.

We know that a monoid is an \(F\)-algebra \(θ : F X → X\) that satisfies the monoid laws. Since \(F\)-algebras are the same as \(F^*\)-algebras, we can work with the corresponding \(F^*\)-algebra instead, which we denote by \(θ^* : F^*X → X\).

This algebra satisfies the laws exactly when the two natural transformations above become equal when composed with \(θ^*\), i.e. when \(θ^* ∘ l = θ^* ∘ r\).

We thus define the category of models of an algebraic theory \(l, r : G^* ⇉ F^*\) as the full subcategory of \(\cat{Alg}_F ≅ \cat{mAlg}_{F^*}\) consisting of all those monad algebras \(θ^* : F^* X → X\) such that \(θ^* ∘ l = θ^* ∘ r\).
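To make the condition \(θ^* ∘ l = θ^* ∘ r\) tangible, here is a small Haskell sketch for the theory of monoids. It works with the natural transformations \(G ⇉ F^*\) that generate \(l\) and \(r\), rather than with the monad morphisms themselves. The names `MonF`, `LawG`, `lhs`, `rhs` and `satisfies` are invented for this example, and `satisfies` only tests a single instance of the laws, so it can refute that an algebra is a model but not prove it.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Signature of monoids: F X = 1 + X^2.
data MonF x = Unit | Mul x x deriving Functor

-- The free monad on a functor, with the same shape as in the `free` package.
data Free f a = Pure a | Free (f (Free f a))

-- Extend an F-algebra to the corresponding F*-algebra.
iter :: Functor f => (f x -> x) -> Free f x -> x
iter _ (Pure x) = x
iter theta (Free t) = theta (fmap (iter theta) t)

-- Variables occurring in the three laws: G X = X + X + X^3.
data LawG x = LeftUnit x | RightUnit x | Assoc x x x

unit :: Free MonF x
unit = Free Unit

mul :: Free MonF x -> Free MonF x -> Free MonF x
mul s t = Free (Mul s t)

-- Left-hand and right-hand sides of the monoid laws, as terms in F* X.
lhs, rhs :: LawG x -> Free MonF x
lhs (LeftUnit x) = mul unit (Pure x)
lhs (RightUnit x) = mul (Pure x) unit
lhs (Assoc x y z) = mul (Pure x) (mul (Pure y) (Pure z))
rhs (LeftUnit x) = Pure x
rhs (RightUnit x) = Pure x
rhs (Assoc x y z) = mul (mul (Pure x) (Pure y)) (Pure z)

-- Does the algebra identify both sides of the given law instance?
satisfies :: Eq x => (MonF x -> x) -> LawG x -> Bool
satisfies theta g = iter theta (lhs g) == iter theta (rhs g)

-- Addition is a model; subtraction has the right signature but is not.
addAlg, subAlg :: MonF Int -> Int
addAlg Unit = 0
addAlg (Mul a b) = a + b
subAlg Unit = 0
subAlg (Mul a b) = a - b
```

For instance, `satisfies addAlg (Assoc 1 2 3)` holds, while `satisfies subAlg (Assoc 1 2 3)` fails because \(1 - (2 - 3) ≠ (1 - 2) - 3\).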

Now, we know that, in the case of monoids, this subcategory is monadic over \(\set\), but is this true in general?

We begin by defining the notion of a *free model* for some algebraic theory. In the monoid example, we used the list monad to build a monoid out of any set, and then proceeded to prove that this construction gives the left adjoint of the forgetful functor \(\cat{Alg}_F → \set\). This is of course the first step towards proving monadicity.

In general, there does not seem to be a way to generalise this construction directly. We pulled the list monad out of a hat, and showed that it was exactly the monad that we were looking for. We did not derive it using the functor \(F\) in a systematic way that we could replicate in the general case.

Fortunately, there is another way to produce the free monoid over a set \(X\). We start with the free \(F\)-algebra \(F^* X\), then *quotient* it according to the laws. Intuitively, we define an equivalence relation that relates two elements \(t₁\) and \(t₂\) whenever there is a law that requires them to be equal.

The straightforward way to formalise this intuition is to take the equivalence relation generated by such pairs \((t₁, t₂)\), then take the corresponding quotient. A more conceptual approach is to say that \(T X\) is obtained as a coequaliser: \[ G^* X ⇉ F^* X → T X. \]

In the monoid example, \(F^* X\) is the set of all trees with leaves labelled by elements of \(X\). If we regard a tree as a parenthesised string of elements of \(X\), then the equivalence relation on \(F^* X\) given by the coequaliser above corresponds to identifying strings with the same underlying *list* of elements but possibly different parenthesisations. Therefore, \(T\) is clearly isomorphic to the list monad.
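The quotient map \(F^* X → T X\) for monoids can be sketched in Haskell as the function that reads off the leaves of a tree. The names `MonF`, `flatten`, `t1` and `t2` are illustrative, and the code only demonstrates that terms related by the laws get the same image, not the full universal property of the coequaliser.

```haskell
-- Signature of monoids, F X = 1 + X^2, and the free monad over a functor.
data MonF x = Unit | Mul x x
data Free f a = Pure a | Free (f (Free f a))

-- Forget the bracketing and the unit nodes, keeping only the list of leaves.
flatten :: Free MonF x -> [x]
flatten (Pure x) = [x]
flatten (Free Unit) = []
flatten (Free (Mul s t)) = flatten s ++ flatten t

-- Two bracketings of the same string, the second padded with a unit.
t1, t2 :: Free MonF Char
t1 = Free (Mul (Pure 'a') (Free (Mul (Pure 'b') (Pure 'c'))))
t2 = Free (Mul (Free (Mul (Pure 'a') (Free Unit)))
               (Free (Mul (Pure 'b') (Pure 'c'))))
```

Both `flatten t1` and `flatten t2` evaluate to `"abc"`, as expected for terms that the monoid laws force to be equal.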

More generally, we can take any algebraic theory, which we defined as a parallel pair of monad morphisms between free monads \(F^*\) and \(G^*\), and take the coequaliser in the category of (finitary) monads.

With some reasonable assumptions on the functors \(F\) and \(G\), we can show that this coequaliser always exists, and that the algebras of the resulting monad are exactly the models of the algebraic theory we started with.

This concludes my series on the underlying theory of free monads and their relation with universal algebra.

Here is a list of resources where you can learn more about this topic:

“Triple” is the old term for monads. Chapter 3 is about the monadicity theorem and some of the material that I covered in this series.

Chapter 6 is about monads and their algebras.

The last chapter explains the relationship between initial algebras and monadic functors.

A very comprehensive resource, with detailed proofs.

In the previous post, I introduced the notion of *monadic functor*, exemplified by the forgetful functor from the category of monoids to \(\set\). We saw that monoids form a subcategory of the category of algebras of the functor \(F\) defined by \(F X = 1 + X²\), and we observed that those are the same as the monad algebras of the list monad.

More generally, we can try different subcategories of \(\cat{Alg}_F\) and check whether they are monadic as well. So let’s start with possibly the simplest one: the whole of \(\cat{Alg}_F\).

This leads us to the following definition: we say that an endofunctor \(F\) *admits an algebraically free monad* if \(\cat{Alg}_F\) is monadic. The corresponding monad is called the *algebraically free monad* over \(F\).

Informally, the algebraically free monad over \(F\) is a monad \(T\) such that monad algebras of \(T\) are the same as functor algebras of \(F\).

Unfortunately, not all functors admit an algebraically free monad. For example, it is easy to see that the powerset functor does not.

The free package on Hackage defines something called “free monad” for every Haskell functor. What does this have to do with the notion of algebraically free monad defined above?

Here is the definition of `Free` from the above package:

```
data Free f a
  = Pure a
  | Free (f (Free f a))
```

Translating into categorical language, we can define, for an endofunctor \(F\), the functor \(F^*\), which returns, for a set \(X\), a fixpoint of the functor \[ G Y = X + F Y. \]

Let’s assume that the fixpoint is to be intended as inductive, i.e. as an initial algebra. Therefore, we get, for all objects \(X\), an initial algebra: \[ X + F (F^* X) → F^* X. \]

Of course, those initial algebras might not exist, but they do if we choose \(F\) carefully. For example, if \(F\) is polynomial, then all the functors \(G\) above are also polynomial, thus they have initial algebras.

In general, if we assume that those initial algebras all exist, then we can prove that the resulting functor \(F^*\) is a monad, and is indeed the algebraically free monad over \(F\).

We will first show that \(F^*\) allows us to define a left adjoint \(L\) for the forgetful functor \(U : \cat{Alg}_F → \set\). In fact, for any set \(X\), let the carrier of \(L X\) be \(F^* X\), and define the algebra morphism by restriction from the initial algebra structure on \(F^* X\): \[ F (F^* X) → X + F (F^* X) → F^* X. \]

By definition, \(F^* X\) is the initial object in the category of algebras of the functor \(Y ↦ X + F Y\). Moreover, it is easy to see that the latter category is equivalent to the comma category \((X ↓ U)\), where the equivalence maps \(F^* X\) to the obvious morphism \(X → U L X\). By the characterisation of adjunctions in terms of universal arrows, it follows that \(L\) is left adjoint to \(U\). Clearly, \(U L = F^*\), therefore \(F^*\) is a monad.
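In Haskell, the monad structure on \(F^*\) obtained this way can be written down directly. The instances below are a sketch that follows the usual definition for the `Free` type above; `Pair` and `leaves` are hypothetical helpers introduced only so that the behaviour of `>>=` can be observed.

```haskell
{-# LANGUAGE DeriveFunctor #-}

data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a) = Pure (g a)
  fmap g (Free t) = Free (fmap (fmap g) t)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g <*> x = fmap g x
  Free t <*> x = Free (fmap (<*> x) t)

instance Functor f => Monad (Free f) where
  -- A variable is replaced by the term assigned to it.
  Pure a >>= k = k a
  -- Under an operation, substitution is pushed down to the subterms.
  Free t >>= k = Free (fmap (>>= k) t)

-- Illustrative signature with a single binary operation: F X = X^2.
data Pair x = Pair x x deriving Functor

-- Read off the variables of a term from left to right.
leaves :: Free Pair a -> [a]
leaves (Pure a) = [a]
leaves (Free (Pair s t)) = leaves s ++ leaves t
```

Thinking of `Free f a` as terms with variables in `a`, bind is substitution: replacing each variable `n` of the term `Free (Pair (Pure 1) (Pure 2))` by the term `Free (Pair (Pure n) (Pure (n * 10)))` yields a term with leaves `[1, 10, 2, 20]`.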

To conclude the proof, we need to show that the adjunction \(L ⊣ U\) is monadic, i.e. that the comparison functor from \(F\)-algebras to \(F^*\)-algebras is an equivalence. One way to do that is to appeal to Beck’s monadicity theorem. Verifying the hypotheses is a simple exercise.

It is also instructive to look at the comparison functor as implemented in Haskell:

```
iter :: Functor f => (f x → x) → (Free f x → x)
iter θ (Pure x) = x
iter θ (Free t) = θ (fmap (iter θ) t)
```

and its inverse

```
-- Restrict an algebra of the free monad along liftF to recover an f-algebra.
uniter :: Functor f => (Free f x → x) → (f x → x)
uniter ψ = ψ . liftF
  where liftF = Free . fmap Pure
```

It is not hard to prove directly, using equational reasoning, that `iter θ` is a monad algebra, and that `iter` and `uniter` are inverses to each other.
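One half of the round trip can be observed on a concrete algebra. The following is a plain-ASCII restatement of the two functions so that it compiles without the `UnicodeSyntax` extension; `MonF` and `prodAlg` are example definitions that do not come from the `free` package.

```haskell
{-# LANGUAGE DeriveFunctor #-}

data Free f a = Pure a | Free (f (Free f a))

-- Signature of monoids, used here only as a sample functor.
data MonF x = Unit | Mul x x deriving Functor

iter :: Functor f => (f x -> x) -> Free f x -> x
iter _ (Pure x) = x
iter theta (Free t) = theta (fmap (iter theta) t)

uniter :: Functor f => (Free f x -> x) -> f x -> x
uniter psi = psi . liftF
  where liftF = Free . fmap Pure

-- A sample F-algebra: integers under multiplication.
prodAlg :: MonF Int -> Int
prodAlg Unit = 1
prodAlg (Mul a b) = a * b
```

Unfolding the definitions, `uniter (iter prodAlg) (Mul 3 4)` reduces to `prodAlg (Mul 3 4)`, i.e. `12`, which is the instance of `uniter . iter = id` at this input.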

The documentation for `Free` says:

> A Monad `n` is a free Monad for `f` if every monad homomorphism from `n` to another monad `m` is equivalent to a natural transformation from `f` to `m`

which doesn’t look at all like our definition of algebraically free monad. Rather, this says that \(N\) is defined to be the *free monad* over \(F\) if the canonical natural transformation \(F → N\) is a universal arrow from \(F\) to the forgetful functor \(\cat{Mon}(\set) → \cat{Func}(\set, \set)\).

If that forgetful functor had a left adjoint, then we could just say that the free monad is obtained by applying this left adjoint to any endofunctor. This is actually the case if we replace \(\set\) with a so-called *algebraically complete category*, such as the ones modelled by Haskell, where the left adjoint is given by the (higher order) functor `Free`.

In \(\set\), however, we need to stick to the more awkward definition in terms of universal arrows, as not all functors are going to admit free monads. In any case, the relationship with the previously defined notion of algebraically free monad is not immediately clear.

Fortunately, we can prove that a monad is algebraically free if and only if it is free! Proving that an algebraically free monad \(F^*\) on \(F\) is free amounts to proving that the following natural transformation (corresponding to `liftF` in the Haskell code above): \[
\require{AMScd}
\begin{CD}
F X @>{F η}>> F (F^* X) @>>> F^* X
\end{CD}
\] is universal, which is a simple exercise.

To prove the converse, we will be using Haskell notation. Suppose we are given a functor `f`, and a monad `t` that is free on `f`. Therefore, we have a natural transformation:

`l :: f x → t x`

and a function that implements the universal property for `t`:

`hoist :: Monad m => (∀ x . f x → m x) → t x → m x`

Now we define a functor \(\set → \cat{Alg}_f\) which is going to be the left adjoint of the forgetful functor. The carrier of this functor is given by `t` itself, so we only need to define the algebra morphism:

```
alg :: f (t x) → t x
alg u = join (l u)
```

To show that this functor is the sought left adjoint, we have to fix a type `x` and an `f`-algebra `θ : f y → y`, define functions:

```
φ :: (x → y) → (t x → y)
ψ :: (t x → y) → (x → y)
```

then prove that `φ g` is an `f`-algebra morphism for all `g : x → y`, and that `φ` and `ψ` are inverses to each other.

The function `ψ` is easy to implement:

`ψ h = h . return`

Defining `φ` is a bit more involved. The only tool at our disposal to define functions out of `t x` is `hoist`. For that, we need a monad `m`, and a natural transformation `f → m`.

The trick is to consider the *continuation monad* `Cont y`. Using `θ`, we define a natural transformation

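```
-- `cont` is the constructor function for Cont in current transformers/mtl,
-- where Cont itself is a type synonym.
w :: f z → Cont y z
w u = cont (\k → θ (fmap k u))
```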

on which we can apply the universal property of `t` to get `φ`:

``φ g = (`runCont` id) . hoist w . fmap g``

From here, the proof proceeds by straightforward equational reasoning, and is left as an exercise.
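To see the construction run, we can specialise `t` to the concrete `Free f` type, for which the universal property is implemented by a fold. The sketch below makes several assumptions that are not part of the argument above: `t` is `Free f`, `hoist` is defined by recursion rather than taken as given, the continuation monad is a local copy so that the example only depends on `base`, and `MonF` and `addAlg` are sample definitions.

```haskell
{-# LANGUAGE RankNTypes, ScopedTypeVariables, DeriveFunctor #-}

-- Local copy of the continuation monad.
newtype Cont r a = Cont { runCont :: (a -> r) -> r }

instance Functor (Cont r) where
  fmap g (Cont c) = Cont (\k -> c (k . g))

instance Applicative (Cont r) where
  pure a = Cont (\k -> k a)
  Cont cf <*> Cont ca = Cont (\k -> cf (\g -> ca (k . g)))

instance Monad (Cont r) where
  Cont c >>= h = Cont (\k -> c (\a -> runCont (h a) k))

data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a) = Pure (g a)
  fmap g (Free t) = Free (fmap (fmap g) t)

-- Universal property of Free f: extend a natural transformation f -> m
-- to a map out of Free f.
hoist :: (Functor f, Monad m) => (forall z. f z -> m z) -> Free f x -> m x
hoist _ (Pure x) = return x
hoist w (Free t) = w t >>= hoist w

-- The map φ of the text, with the algebra θ passed explicitly.
phi :: forall f x y. Functor f => (f y -> y) -> (x -> y) -> Free f x -> y
phi theta g = (`runCont` id) . hoist w . fmap g
  where
    w :: forall z. f z -> Cont y z
    w u = Cont (\k -> theta (fmap k u))

-- Sample algebra: integers under addition, for the signature of monoids.
data MonF x = Unit | Mul x x deriving Functor

addAlg :: MonF Int -> Int
addAlg Unit = 0
addAlg (Mul a b) = a + b
```

With these definitions, `phi addAlg (* 10) (Free (Mul (Pure 1) (Pure 2)))` first renames the variables to `10` and `20` and then evaluates the term with `addAlg`, giving `30`.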

We looked at two definitions of “free monad”, proved that they are equivalent, and showed how they relate to the Haskell definition of `Free`. In the next post, we will resume our discussion of algebraic theories “with laws” and try to approach them from the point of view of free monads and monadic functors.

Free monads can be used in Haskell for modelling a number of different concepts: trees with arbitrary branching, terms with free variables, or program fragments of an EDSL.

This series of posts is *not* an introduction to free monads in Haskell, but to the underlying theory. In the following, we will work in the category \(\set\) of sets and functions. However, most of what we say can be trivially generalised to an arbitrary category.

If \(F\) is an endofunctor on \(\set\), an **algebra** of \(F\) is a set \(X\) (called its *carrier*), together with a morphism \(FX → X\). Algebras of \(F\) form a category, where morphisms are functions between their respective carriers that make the obvious square commute.

Bartosz Milewski wrote a nice introductory post on functor algebras from the point of view of functional programming, which I strongly recommend reading to get a feel for why it is useful to consider such objects.

More abstractly, a functor \(F : \set → \set\) generalises the notion of a *signature* of an algebraic theory. For a signature with \(a_i\) operators of arity \(i\), for \(i = 0, \ldots, n\), the corresponding functor is the polynomial: \[
F X = a₀ + a₁ × X + ⋯ + a_n × X^n,
\] where the natural number \(a_i\) denotes a finite set of cardinality \(a_i\).

For example, the theory of monoids has 1 nullary operation, and 1 binary operation. That results in the functor: \[ F X = 1 + X^2 \]

Suppose that \((X, θ)\) is an algebra for this particular functor. That is, \(X\) is a set, and \(θ\) is a function \(1 + X² → X\). We can split \(θ\) into its two components: \[ θ_e : 1 → X, \] which we can simply think of as an element of \(X\), and \[ θ_m : X × X → X. \]

So we see that an algebra for \(F\) is exactly a set, together with the operations of a monoid. However, there is nothing that tells us that \(X\) is indeed a monoid with those operations!

In fact, for \(X\) to be a monoid, the operations above need to satisfy the following laws: \[ \begin{aligned} & θ_m (θ_e(∗), x) = x \\ & θ_m (x, θ_e(∗)) = x \\ & θ_m (θ_m (x, y), z) = θ_m (x, θ_m (y, z)). \end{aligned} \]

However, any two operations \(θ_e\) and \(θ_m\) with the above types can be assembled into an \(F\)-algebra, regardless of whether they do satisfy the monoid laws or not.

The above example shows that functor algebras don’t quite capture the general notion of “algebraic structure” in the usual sense. They can express the idea of a set equipped with operations conforming to a given signature, but we cannot enforce any sort of *laws* on those operations.

For the monoid example above, we noticed that we can realise any actual monoid as an \(F\)-algebra (for \(FX = 1 + X²\)), but that not every such algebra is a monoid. This means that monoids can be regarded as the objects of the subcategory of \(\cat{Alg}_F\) consisting of the “lawful” algebras (exercise: make this statement precise and prove it).

Therefore, we have the following commutative diagram of functors: \[ \require{AMScd} \begin{CD} \mathsf{Mon} @>⊆>> \mathsf{Alg}_F\\ @VUVV @VVV\\ \mathsf{Set} @>=>> \mathsf{Set} \end{CD} \]

and it is easy to see that \(U\) (which is just the restriction of the obvious forgetful functor \(\cat{Alg}_F → \set\) on the right side of the diagram) has a left adjoint \(L\), the functor that returns the free monoid on a set.

Explicitly, \(LX\) has \(X^*\) as carrier (i.e. the set of *lists* of elements of \(X\)), and the algebra is given by the coproduct of the function \(1 → X^*\) that selects the empty list, and the list concatenation function \(X^* × X^* → X^*\).

In Haskell, this algebra looks like:

```
alg :: Either () ([x], [x]) → [x]
alg (Left _) = []
alg (Right (xs, ys)) = xs ++ ys
```

The endofunctor \(UL\), obtained by taking the carrier of the free monoid, is a monad, namely the *list monad*.

Given a monad \((T, η, μ)\) on \(\set\), a monad algebra of \(T\) is an algebra \((X, θ)\) of the underlying functor of \(T\), such that the following two diagrams commute:

\[ \begin{CD} X @>η>> T X \\ @V=VV @VVθV\\ X @>=>> X \end{CD} \]

\[ \begin{CD} T(T X) @>μ>> T X \\ @V{T θ}VV @VVθV \\ T X @>θ>> X \end{CD} \]

In Haskell notation, this means that the following two equations are satisfied:

```
θ (return x) = x
θ (fmap θ m) = θ (join m)
```

In the case where the monad \(T\) returns the set of “terms” of some language for a given set of free variables, a monad algebra can be thought of as an evaluation function.

The first law says that a variable is evaluated to itself, while the second law expresses the fact that when you have a “term of subterms”, you can either evaluate every subterm and then evaluate the resulting term, or regard it as a single term and evaluate it directly, and these two procedures should give the same result.
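For the list monad, the two equations can be tried out on a candidate algebra as follows. This is a small sketch with made-up names (`unitLaw`, `multLaw`), and evaluating the laws on sample inputs can only refute, not establish, that a function is a monad algebra.

```haskell
import Control.Monad (join)

-- First law: evaluating a single variable returns that variable.
unitLaw :: ([Int] -> Int) -> Int -> Bool
unitLaw theta x = theta (return x) == x

-- Second law: evaluate the subterms first, or flatten first.
multLaw :: ([Int] -> Int) -> [[Int]] -> Bool
multLaw theta m = theta (fmap theta m) == theta (join m)
```

Taking `sum` as the candidate, both `unitLaw sum 7` and `multLaw sum [[1, 2], [3], []]` hold, whereas `length` already fails the first law, since `length [7]` is `1`.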

Naturally, monad algebras of \(T\) form a full subcategory of \(\cat{Alg}_T\) which we denote by \(\cat{mAlg}_T\).

We can now go back to our previous example, and look at what the monad algebras for the list monad are. Suppose we have a set \(X\) and a function \(θ : X^* → X\) satisfying the two laws stated above.

We can now define a monoid instance for \(X\). In Haskell, it looks like this:

```
instance Monoid X where
  mempty = θ []
  mappend x y = θ [x, y]
```

The monoid laws follow easily from the monad algebra laws. Verifying them explicitly is a useful (and fun!) exercise. Vice versa, any monoid can be given a structure of a \(T\)-algebra, simply by taking `mconcat` as \(θ\).
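Here is a version of that instance that compiles with current GHC, for one specific carrier. The type `Total` and its algebra `theta` are hypothetical examples; the only change to the scheme above is that, since `Semigroup` became a superclass of `Monoid`, the binary operation has to be given in a `Semigroup` instance.

```haskell
-- Example carrier: integers, to be combined by addition.
newtype Total = Total Int deriving (Eq, Show)

-- A monad algebra of the list monad on Total: add everything up.
theta :: [Total] -> Total
theta ts = Total (sum [n | Total n <- ts])

-- The binary operation evaluates a two-element list.
instance Semigroup Total where
  x <> y = theta [x, y]

-- The unit evaluates the empty list.
instance Monoid Total where
  mempty = theta []
```

Going back, `mconcat` for this instance agrees with `theta`, for example on `[Total 1, Total 2, Total 3]`, where both give `Total 6`.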

Therefore, we can extend the previous diagram of functors with an equivalence of categories: \[
\begin{CD}
\mathsf{mAlg}_T @>≅>> \mathsf{Mon} @>⊆>> \mathsf{Alg}_F\\
@VVV @VUVV @VVV\\
\mathsf{Set} @>=>> \mathsf{Set} @>=>> \mathsf{Set}
\end{CD}
\] where the top-left equivalence (which is actually an isomorphism) is determined by the `Monoid` instance that we defined above, while its inverse is given by `mconcat`.

Let’s step back from this whole derivation, and reflect on what it is exactly that we have proved. We started with some category of “lawful” algebras, a subcategory of \(\cat{Alg}_F\) for some endofunctor \(F\). We then observed that the forgetful functor from this category to \(\set\) admits a left adjoint \(L\). We then considered monad algebras of the monad \(UL\), and we finally observed that these are exactly those “lawful” algebras that we started with!

We will now generalise the previous example to an arbitrary category of algebra-like objects.

Suppose \(\cat{D}\) is a category equipped with a functor \(G : \cat{D} → \set\). We want to think of \(G\) as some sort of “forgetful” functor, stripping away all the structure on the objects of \(\cat{D}\), and returning just their carrier.

To make this intuition precise, we say that \(G\) is *monadic* if:

- \(G\) has a left adjoint \(L\);
- the *comparison functor* \(\cat{D} → \cat{mAlg}_T\) is an equivalence of categories, where \(T = GL\).

The comparison functor is something that we can define for any adjunction \(L ⊣ G\), and it works as follows. For any object \(A : \cat{D}\), it returns the monad algebra structure on \(G A\) given by \(G \epsilon_A\), where \(\epsilon\) is the counit of the adjunction (exercise: check all the details).
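As a hint for the exercise, here is a sketch of why \(G \epsilon_A\) satisfies the two monad algebra laws, writing \(η\) for the unit of the adjunction and using that the multiplication of \(T = GL\) is \(μ = G \epsilon L\): \[
\begin{aligned}
G \epsilon_A ∘ η_{G A} & = {\mathsf{id}}_{G A} && \text{(triangle identity)} \\
G \epsilon_A ∘ T (G \epsilon_A) & = G (\epsilon_A ∘ L G \epsilon_A) \\
& = G (\epsilon_A ∘ \epsilon_{L G A}) && \text{(naturality of } \epsilon \text{)} \\
& = G \epsilon_A ∘ μ_{G A}.
\end{aligned}
\]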

So, what this definition is saying is that a functor is monadic if it really is the forgetful functor for the category of monad algebras for some monad. Sometimes, we say that a *category* is monadic, when the functor \(G\) is clear.

The monoid example above can then be summarised by saying that the category of monoids is monadic.

I’ll stop here for now. In the next post we will look at algebraically free monads and how they relate to the corresponding Haskell definition.

The original proof by Voevodsky that univalence implies function extensionality has been simplified over time, and eventually assumed the distilled form presented in the HoTT book.

All the various versions of the proof have roughly the same outline. They first show that the *weak function extensionality principle* (WFEP) follows from univalence, and then prove that this is enough to establish function extensionality.

Following the book, WFEP is the statement that contractible types are closed under \(Π\), i.e.:

```
WFEP : ∀ i j → Set _
WFEP i j = {A : Set i}{B : A → Set j}
         → ((x : A) → contr (B x))
         → contr ((x : A) → B x)
```

Showing that WFEP implies function extensionality does not need univalence, and is quite straightforward. First, we define what we mean by function extensionality:

```
Funext : ∀ i j → Set _
Funext i j = {A : Set i}{B : A → Set j}
           → {f g : (x : A) → B x}
           → ((x : A) → f x ≡ g x)
           → f ≡ g
```

Now we want to show the following:

```
wfep-to-funext : ∀ {i}{j} → WFEP i j → Funext i j
wfep-to-funext {i}{j} wfep {A}{B}{f}{g} h = p
  where
```

To prove that \(f\) and \(g\) are equal, we show that they both have values in the following dependent type, which we can think of as a subtype of \(B(x)\) for all \(x : A\):

```
    C : A → Set j
    C x = Σ (B x) λ y → f x ≡ y
```

We denote by \(f'\) and \(g'\) the range restrictions of \(f\) and \(g\) to \(C\):

```
    f' g' : (x : A) → C x
    f' x = (f x , refl)
    g' x = (g x , h x)
```

where we made use of the homotopy \(h\) between \(f\) and \(g\) to show that \(g\) has values in \(C\). Now, \(C(x)\) is a singleton for all \(x : A\), so, by WFEP, \(f'\) and \(g'\) have the same contractible type, hence they are equal:

```
    p' : f' ≡ g'
    p' = contr⇒prop (wfep (λ x → singl-contr (f x))) f' g'
```
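The contractibility of \(C(x)\) used here is the standard fact about singletons, which is presumably what `singl-contr` packages up. In outline: the centre of contraction is \((f(x), \mathsf{refl})\), and we need \[
\prod_{(y, q) : \Sigma_{y : B(x)} f(x) ≡ y} (f(x), \mathsf{refl}) ≡ (y, q),
\] which follows by path induction on \(q\): it is enough to consider the case where \(y\) is \(f(x)\) and \(q\) is \(\mathsf{refl}\), and then both sides coincide.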

The fact that \(f\) and \(g\) are equal then follows immediately by applying the first projection and (implicitly) using \(η\) conversion for \(Π\)-types:

```
    p : f ≡ g
    p = ap (λ u x → proj₁ (u x)) p'
```

In the book, the strong version of extensionality, i.e.

```
StrongFunext : ∀ i j → Set _
StrongFunext i j = {A : Set i}{B : A → Set j}
                 → {f g : (x : A) → B x}
                 → ((x : A) → f x ≡ g x)
                 ≅ (f ≡ g)
```

is obtained directly using a more sophisticated, but very similar argument.

Now we turn to proving WFEP itself. Most of the proofs I know use the fact that univalence implies a certain congruence rule for function-types, i.e. if \(B\) and \(B'\) are equivalent types, then \(A → B\) and \(A → B'\) are also equivalent, and furthermore the equivalence is given by postcomposing with the equivalence between \(B\) and \(B'\).

However, if we have η conversion for record types, there is a much simpler way to obtain WFEP from univalence.

The idea is as follows: since \(B(x)\) is contractible for all \(x : A\), univalence implies that \(B(x) ≡ ⊤\), so the contractibility of \((x : A) → B(x)\) is a consequence of the contractibility of \(A → ⊤\), which is itself an immediate consequence of the definitional \(η\) rule for \(⊤\):

```
record ⊤ : Set j where
  constructor tt

⊤-contr : contr ⊤
⊤-contr = tt , λ { tt → refl }

contr-exp-⊤ : ∀ {i}{A : Set i} → contr (A → ⊤)
contr-exp-⊤ = (λ _ → tt) , (λ f → refl)
```

However, the proof sketch above is missing a crucial step: even though \(B(x)\) is pointwise equal to \(⊤\), in order to substitute \(⊤\) for \(B(x)\) in the \(Π\)-type, we need to show that \(B ≡ λ \_ → ⊤\), but we’re not allowed to use function extensionality, yet!

Fortunately, we only need a very special case of function extensionality. So the trick here is to apply the argument to this special case first, and then use it to prove the general result.

First we prove WFEP for non-dependent \(Π\)-types, by formalising the above proof sketch.

```
nondep-wfep : ∀ {i j}{A : Set i}{B : Set j}
            → contr B
            → contr (A → B)
nondep-wfep {A = A}{B = B} hB = subst contr p contr-exp-⊤
  where
    p : (A → ⊤) ≡ (A → B)
    p = ap (λ X → A → X) (unique-contr ⊤-contr hB)
```

Since \(B\) is non-dependent in this case, the proof goes through without function extensionality, so we don’t get stuck in an infinite regress: two iterations are enough!

Now we can prove the special case of function extensionality that we will need for the proof of full WFEP:

```
funext' : ∀ {i j}{A : Set i}{B : Set j}
        → (f : A → B)(b : B)(h : (x : A) → b ≡ f x)
        → (λ _ → b) ≡ f
funext' f b h =
  ap (λ u x → proj₁ (u x))
     (contr⇒prop (nondep-wfep (singl-contr b))
                 (λ _ → (b , refl))
                 (λ x → f x , h x))
```

Same proof as for `wfep-to-funext`, only written more succinctly.

Finally, we are ready to prove WFEP:

```
wfep : ∀ {i j} → WFEP i j
wfep {i}{j}{A}{B} hB = subst contr p contr-exp-⊤
  where
    p₀ : (λ _ → ⊤) ≡ B
    p₀ = funext' B ⊤ (λ x → unique-contr ⊤-contr (hB x))

    p : (A → ⊤ {j}) ≡ ((x : A) → B x)
    p = ap (λ Z → (x : A) → Z x) p₀
```

By putting it all together we get function extensionality:

```
funext : ∀ {i j} → Funext i j
funext = wfep-to-funext wfep
```

This proof can also be modified to work in a theory where \(⊤\) does not have definitional η conversion.

The only point where η is used is in the proof of `contr-exp-⊤` above. So let’s define a version of \(⊤\) without η, and prove `contr-exp-⊤` for it.

```
data ⊤ : Set j where
  tt : ⊤

⊤-contr : contr ⊤
⊤-contr = tt , λ { tt → refl }
```

We begin by defining the automorphism \(k\) of \(⊤\) which maps everything to \(\mathsf{tt}\). Clearly, \(k\) is going to be the identity, but we can’t prove that until we have function extensionality.

```
k : ⊤ → ⊤
k _ = tt
k-we : weak-equiv k
k-we tt = Σ-contr ⊤-contr (λ _ → h↑ ⊤-contr tt tt)
```

Now we apply the argument sketched above, based on the fact that univalence implies congruence rules for function types. We extract an equality out of \(k\), and then transport it to the exponentials:

```
k-eq : ⊤ ≡ ⊤
k-eq = ≈⇒≡ (k , k-we)
k-exp-eq : ∀ {i}(A : Set i) → (A → ⊤) ≡ (A → ⊤)
k-exp-eq A = ap (λ X → A → X) k-eq
```

If we were working in a theory with computational univalence, coercion along `k-exp-eq` would reduce to postcomposition with \(k\). In any case, we can manually show that this is the case propositionally by using path induction and the computational rule for `≈⇒≡`:

```
ap-comp : ∀ {i k}{A : Set i}{X X' : Set k}
        → (p : X ≡ X')
        → (f : A → X)
        → coerce (ap (λ X → A → X) p) f
        ≡ coerce p ∘ f
ap-comp refl f = refl

k-exp-eq-comp' : ∀ {i}{A : Set i}(f : A → ⊤)
               → coerce (k-exp-eq A) f
               ≡ λ _ → tt
k-exp-eq-comp' f = ap-comp k-eq f
                 · ap (λ c → c ∘ f)
                      (uni-coherence (k , k-we))
```

Now it’s easy to conclude that \(A → ⊤\) is a mere proposition (hence contractible, since it is inhabited): given functions \(f, g : A → ⊤\), postcomposing them with \(k\) makes them both equal to \(λ \_ → \mathsf{tt}\). Since postcomposing with \(k\) is an equivalence by the computational rule above, \(f\) must be equal to \(g\).

```
prop-exp-⊤ : ∀ {i}{A : Set i} → prop (A → ⊤)
prop-exp-⊤ {i}{A} f g = ap proj₁
  ( contr⇒prop (coerce-equiv (k-exp-eq A) (λ _ → tt))
               (f , k-exp-eq-comp' f)
               (g , k-exp-eq-comp' g) )

contr-exp-⊤ : ∀ {i}{A : Set i} → contr (A → ⊤)
contr-exp-⊤ = (λ _ → tt) , prop-exp-⊤ _
```
