Algebraic Graphs With Class (Functional Pearl) : Andrey Mokhov
Andrey Mokhov
Newcastle University, United Kingdom
Abstract

The paper presents a minimalistic and elegant approach to working with graphs in Haskell. It is built on a rigorous mathematical foundation, an algebra of graphs, that allows us to apply equational reasoning for proving the correctness of graph transformation algorithms. Algebraic graphs let us avoid partial functions typically caused by ‘malformed graphs’ that contain an edge referring to a non-existent vertex. This helps to liberate APIs of existing graph libraries from partial functions.

The algebra of graphs can represent directed, undirected, reflexive and transitive graphs, as well as hypergraphs, by appropriately choosing the set of underlying axioms. The flexibility of the approach is demonstrated by developing a library for constructing and transforming polymorphic graphs.

CCS Concepts • Mathematics of computing;

Keywords Haskell, algebra, graph theory

ACM Reference Format:
Andrey Mokhov. 2017. Algebraic Graphs with Class (Functional Pearl). In Proceedings of 10th ACM SIGPLAN International Haskell Symposium, Oxford, UK, September 7-8, 2017 (Haskell’17), 12 pages. https://doi.org/10.1145/3122955.3122956

Haskell’17, September 7-8, 2017, Oxford, UK
© 2017 Copyright held by the owner/author(s). Publication rights licensed to Association for Computing Machinery.
This is the author’s version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of 10th ACM SIGPLAN International Haskell Symposium, September 7-8, 2017, https://doi.org/10.1145/3122955.3122956.

1 Introduction

Graphs are ubiquitous in computing, yet working with graphs often requires painfully low-level fiddling with sets of vertices and edges. Building high-level abstractions is difficult, because the commonly used foundation – the pair (V, E) of vertex set V and edge set E ⊆ V × V – is a source of partial functions. We can represent the pair (V, E) by the following simple data type¹:

    data G a = G { vertices :: [a], edges :: [(a,a)] }

Now G [1,2,3] [(1,2),(2,3)] is the graph with three vertices V = {1, 2, 3} and two edges E = {(1, 2), (2, 3)}. The consistency invariant E ⊆ V × V holds. But what is G [1] [(1,2)]? The edge refers to the non-existent vertex 2, breaking the invariant, and there is no easy way to reflect this in types. Perhaps, our data type is just too simplistic; let us look at state-of-the-art graph libraries instead.

The containers library is designed for performance and powers GHC itself. It represents graphs by adjacency arrays [King and Launchbury 1995] whose consistency invariant is not statically checked, which can lead to runtime usage errors such as ‘index out of range’. Another popular library fgl uses the inductive graph representation [Erwig 2001], but its API also has partial functions, e.g. inserting an edge can fail with the ‘edge from non-existent vertex’ error. Both containers and fgl are treasure troves of graph algorithms, but it is easy to make an error when using them. Is there a safe graph construction interface we can build on top?

In this paper we present algebraic graphs, a new interface for constructing and transforming graphs (more precisely, graphs with labelled vertices and unlabelled edges). We abstract away from graph representation details and characterise graphs by a set of axioms, much like numbers are algebraically characterised by rings [Mac Lane and Birkhoff 1999]. Our approach is based on the algebra of parameterised graphs, a mathematical formalism used in digital circuit design [Mokhov and Khomenko 2014], which we simplify and adapt to the context of functional programming.

Algebraic graphs have a safe and minimalistic core of four graph construction primitives, as captured by the following data type:

    data Graph a = Empty
                 | Vertex a
                 | Overlay (Graph a) (Graph a)
                 | Connect (Graph a) (Graph a)

Here Empty and Vertex construct the empty and single-vertex graphs, respectively; Overlay composes two graphs by taking the union of their vertices and edges, and Connect is similar to Overlay but also creates edges between vertices of the two graphs, see Fig. 1 for examples. The overlay and connect operations have two important properties: (i) they are closed on the set of graphs, i.e. are total functions, and (ii) they can be used to construct any graph starting from the empty and single-vertex graphs. For example, Connect (Vertex 1) (Vertex 2) is the graph with two vertices {1, 2} and a single edge (1, 2). Malformed graphs, such as G [1] [(1,2)], cannot be expressed in this core language.

The main goal of this paper is to demonstrate that this core is a safe, flexible and elegant foundation for working with graphs that have no edge labels. Our specific contributions are:

• Compared to existing libraries, algebraic graphs have a smaller core (just four graph construction primitives), are more compositional (hence greater code reuse), and have no partial functions (hence fewer opportunities for usage errors). We present the core and justify these claims in §2.
• The core has a simple mathematical structure fully characterised by a set of axioms (§3). This makes the proposed interface easier for testing and formal verification. We show that the core is complete, i.e. any graph can be constructed, and sound, i.e. malformed graphs cannot be constructed.
• Under the basic set of axioms, algebraic graphs correspond to directed graphs. As we show in §4, by extending the algebra with additional axioms, we can represent undirected, reflexive, transitive graphs, their combinations, and hypergraphs. Importantly, the core remains unchanged, which allows us to define highly reusable polymorphic functions on graphs.
• We develop a library² for constructing and transforming algebraic graphs and demonstrate its flexibility in §5.

¹ Although in this paper we exclusively use Haskell, the problem we solve is general and the proposed approach can be readily adapted to other programming languages.
² The library is on Hackage: http://hackage.haskell.org/package/algebraic-graphs.
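To make the contrast between the pair-based type G and the four-constructor core concrete, here is a minimal sketch that interprets a Graph expression as a pair of vertex and edge lists. The helpers normalise, toPair and isConsistent are hypothetical names introduced only for this illustration; they are not taken from the paper or from the library.

```haskell
import Data.List (nub, sort)

-- The four-constructor core from the text.
data Graph a = Empty
             | Vertex a
             | Overlay (Graph a) (Graph a)
             | Connect (Graph a) (Graph a)

-- Hypothetical helper: sort and deduplicate so a list behaves like a set.
normalise :: Ord b => [b] -> [b]
normalise = nub . sort

-- Hypothetical helper: interpret an expression as (vertices, edges).
toPair :: Ord a => Graph a -> ([a], [(a, a)])
toPair Empty         = ([], [])
toPair (Vertex x)    = ([x], [])
toPair (Overlay x y) = (normalise (vx ++ vy), normalise (ex ++ ey))
  where
    (vx, ex) = toPair x
    (vy, ey) = toPair y
toPair (Connect x y) =
  (normalise (vx ++ vy), normalise (ex ++ ey ++ [ (u, v) | u <- vx, v <- vy ]))
  where
    (vx, ex) = toPair x
    (vy, ey) = toPair y

-- Hypothetical check of the invariant: every edge endpoint is a vertex.
isConsistent :: Eq a => ([a], [(a, a)]) -> Bool
isConsistent (vs, es) = all (\(u, v) -> u `elem` vs && v `elem` vs) es
```

Under these assumptions, toPair (Connect (Vertex 1) (Vertex 2)) evaluates to ([1,2],[(1,2)]), while the hand-written pair ([1],[(1,2)]) fails isConsistent. Every pair produced by toPair passes the check, because edges are only ever created between vertices that are already present.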
[Figure 1 panels: (a) 1 + 2; (b) 1 → 2; (c) 1 → (2 + 3); (d) 1 → 1; (e) 1 → 2 + 2 → 3]
Figure 1. Examples of graph construction. The overlay and connect operations are denoted by + and →, respectively.
Graphs and functional programming have a long history. We review related work in §6. Limitations of the presented approach and future research directions are discussed in §7.

2 The Core

In this section we define the core of algebraic graphs comprising four graph construction primitives. We describe the semantics of the primitives using the common representation of graphs by sets of vertices and edges, and then abstract away from this representation by focusing on the laws that these primitives satisfy.

Let 𝔾 be the set of all directed graphs whose vertices come from a fixed universe 𝕍. As an example, we can think of graphs whose vertices are positive integers. A graph g ∈ 𝔾 can be represented by a pair (V, E) where V ⊆ 𝕍 is the set of its vertices and E ⊆ V × V is the set of its edges. As mentioned in §1, when E ⊈ V × V the pair (V, E) is inconsistent and does not correspond to a graph.

When one needs to guarantee the internal consistency of a data structure, the standard solution is to define an abstract interface that encapsulates the data structure and provides a set of safe construction primitives. This is exactly the approach we take.

2.1 Constructing Graphs

The simplest possible graph is the empty graph. We denote it by ε, therefore ε = (∅, ∅) and ε ∈ 𝔾. A graph with a single vertex v ∈ 𝕍 is denoted simply by v. For example, 1 ∈ 𝔾 is the graph ({1}, ∅).

To construct larger graphs from the above primitives we define binary operations overlay and connect, denoted by + and →, respectively. The overlay operation + is defined as

$$(V_1, E_1) + (V_2, E_2) \overset{\text{def}}{=} (V_1 \cup V_2,\; E_1 \cup E_2).$$

That is, the overlay of two graphs comprises the union of their vertices and edges. The connect operation → is defined similarly:

$$(V_1, E_1) \to (V_2, E_2) \overset{\text{def}}{=} (V_1 \cup V_2,\; E_1 \cup E_2 \cup V_1 \times V_2).$$

The difference is that when we connect two graphs, an edge is added from each vertex of the left-hand argument to each vertex of the right-hand argument³. Note that the connect operation is the only source of edges when constructing graphs. As we will see in §3, overlay and connect are very similar to addition and multiplication. We therefore give connect a higher precedence, i.e. 1 + 2 → 3 is interpreted as 1 + (2 → 3). Fig. 1 illustrates a few examples of graph construction using the defined primitives:

• 1 + 2 is the graph with two isolated vertices 1 and 2.
• 1 → 2 is the graph with an edge between vertices 1 and 2.
• 1 → (2 + 3) comprises vertices {1, 2, 3} and edges {(1, 2), (1, 3)}.
• 1 → 1 is the graph with vertex 1 and the self-loop.
• 1 → 2 + 2 → 3 is the path graph on vertices {1, 2, 3}.

As shown in §1, the core can be represented by a simple data type Graph, parameterised by the type of vertices a. To make the core more reusable, the next subsection defines the core type class that has the usual inhabitants, such as the pair (V, E), data types from containers and fgl, as well as other, stranger forms of life.

2.2 Type Class

We abstract the graph construction primitives defined in §2.1 as the type class Graph⁴:

    class Graph g where
        type Vertex g
        empty   :: g
        vertex  :: Vertex g -> g
        overlay :: g -> g -> g
        connect :: g -> g -> g

Here the associated type⁵ Vertex g corresponds to the universe of graph vertices 𝕍, empty is the empty graph ε, vertex constructs a graph with a single vertex, and overlay and connect compose given graphs according to the definitions in §2.1. All methods of the type class are total, i.e. are defined for all possible inputs; therefore, the presented API allows fewer opportunities for usage errors and greater opportunities for reuse.

Let us put the interface to the test and construct some graphs. A single edge is obtained by connecting two vertices:

    edge :: Graph g => Vertex g -> Vertex g -> g
    edge x y = connect (vertex x) (vertex y)

The graphs in Fig. 1(b,d) are edge 1 2 and edge 1 1, respectively. A graph that contains a given list of isolated vertices can be constructed as follows:

    vertices :: Graph g => [Vertex g] -> g
    vertices = foldr overlay empty . map vertex

That is, we turn each vertex into a singleton graph and overlay the results. The graph in Fig. 1(a) is vertices [1,2]. By replacing overlay with connect in the above definition, we obtain a directed clique – a fully connected graph on a given list of vertices:

    clique :: Graph g => [Vertex g] -> g
    clique = foldr connect empty . map vertex

For example, clique [1,2,3] expands to 1 → 2 → 3 → ε, i.e. the graph with three vertices {1, 2, 3} and three edges (1, 2), (1, 3) and (2, 3). Note that it is different from the graph in Fig. 1(e).

The graph construction functions defined above are total, fully polymorphic, and elegant. Thanks to the minimalistic core type class, it is easy to wrap your favourite graph library into the described interface, and reuse the above functions, as well as many others that we define throughout this paper.

³ Our definitions of overlay and connect coincide with those of graph union and join, respectively, e.g. see Harary [1969]; however, the arguments of union and join are typically assumed to have disjoint sets of vertices. We make no such assumptions, hence our definitions are total: any graphs can be composed using overlay and connect.
⁴ The name collision (data Graph and class Graph) is not a problem in practice, because the data type and type class are not used together and live in separate modules.
⁵ Associated types [Chakravarty et al. 2005] require the TypeFamilies GHC extension.
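As an illustration of wrapping an existing data structure into the type class of §2.2, the sketch below gives a Graph instance for an adjacency map built on Data.Map and Data.Set from the containers package. The names AdjMap and edgeList are assumptions made for this example; the class and the functions edge, vertices and clique are copied from the text.

```haskell
{-# LANGUAGE TypeFamilies #-}

import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

class Graph g where
  type Vertex g
  empty   :: g
  vertex  :: Vertex g -> g
  overlay :: g -> g -> g
  connect :: g -> g -> g

-- Hypothetical wrapper: each vertex maps to the set of its successors.
newtype AdjMap a = AdjMap (Map.Map a (Set.Set a)) deriving (Eq, Show)

instance Ord a => Graph (AdjMap a) where
  type Vertex (AdjMap a) = a
  empty    = AdjMap Map.empty
  vertex x = AdjMap (Map.singleton x Set.empty)
  -- Union of vertices and of successor sets.
  overlay (AdjMap x) (AdjMap y) = AdjMap (Map.unionWith Set.union x y)
  -- As overlay, plus an edge from every vertex of x to every vertex of y.
  connect (AdjMap x) (AdjMap y) =
    AdjMap (Map.unionsWith Set.union
              [x, y, Map.fromSet (const (Map.keysSet y)) (Map.keysSet x)])

edge :: Graph g => Vertex g -> Vertex g -> g
edge x y = connect (vertex x) (vertex y)

vertices :: Graph g => [Vertex g] -> g
vertices = foldr overlay empty . map vertex

clique :: Graph g => [Vertex g] -> g
clique = foldr connect empty . map vertex

-- Hypothetical helper: list the edges in ascending order.
edgeList :: AdjMap a -> [(a, a)]
edgeList (AdjMap m) =
  [ (u, v) | (u, vs) <- Map.toAscList m, v <- Set.toAscList vs ]
```

With this instance, edgeList (clique [1,2,3] :: AdjMap Int) yields [(1,2),(1,3),(2,3)], matching the expansion of 1 → 2 → 3 → ε described above. All four methods are total: there is no way to insert an edge without also inserting both of its endpoints.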
vertices (h:t) = foldr overlay empty (map vertex (h:t)) (definition of vertices)
= foldr overlay empty (vertex h : map vertex t) (definition of map)
= overlay (vertex h) (vertices t) (definition of foldr)
⊆ overlay (vertex h) (clique t) (monotony and the inductive hypothesis)
⊆ connect (vertex h) (clique t) (overlay-connect order)
= foldr connect empty (vertex h : map vertex t) (definition of foldr)
= foldr connect empty (map vertex (h:t)) (definition of map)
= clique (h:t) (definition of clique)
It turns out that this definition corresponds to the subgraph relation, i.e. we can define:

$$x \subseteq y \;\overset{\text{def}}{=}\; x + y = y.$$

Indeed, expanding x + y = y to (V_x, E_x) + (V_y, E_y) = (V_y, E_y) gives us V_x ∪ V_y = V_y and E_x ∪ E_y = E_y, which is equivalent to V_x ⊆ V_y and E_x ⊆ E_y, as desired.

Therefore, we can check if a graph is a subgraph of another one if we know how to compare graphs for equality:

    isSubgraphOf :: (Graph g, Eq g) => g -> g -> Bool
    isSubgraphOf x y = overlay x y == y

The following theorems about the partial order on graphs can be proved:

• Least element: ε ⊆ x.
• Overlay order: x ⊆ x + y.
• Overlay-connect order: x + y ⊆ x → y.
• Monotony: x ⊆ y ⇒ (x + z ⊆ y + z) ∧ (x → z ⊆ y → z) ∧ (z → x ⊆ z → y).

3.3 Equational Reasoning

In this subsection we show how to use equational reasoning and the laws of the algebra to prove properties of functions on graphs. For example, to prove that vertex x = vertices [x] we rewrite the right-hand side using the function definitions and x + ε = x:

    vertices [x] = foldr overlay empty (map vertex [x])
                 = foldr overlay empty [vertex x]
                 = overlay (vertex x) empty
                 = vertex x

Proving that vertices xs ⊆ clique xs requires more work. We start with the case when xs is the empty list [], which is straightforward: vertices [] = ε ⊆ ε = clique [], as follows from the definition of foldr. If xs is non-empty, i.e. xs = h:t, we make the inductive hypothesis that vertices t ⊆ clique t and proceed as shown in Fig. 3.

We formally proved all properties and theorems discussed in this paper in Agda⁷.

4 Graphs à la Carte

In this section we define several useful Graph instances, and show that the algebra presented in the previous section §3 is not restricted to directed graphs, but can be extended to axiomatically represent undirected (§4.3), reflexive (§4.4) and transitive (§4.5) graphs, their various combinations (§4.6), and even hypergraphs (§4.7).

4.1 Binary Relation

We start by a direct encoding of the graph construction primitives defined in §2.1 into the abstract data type Relation isomorphic to a pair of sets (V, E), see Fig. 4. As we have seen, this implementation satisfies the axioms of the graph algebra. Furthermore, it is a free graph in the sense that it does not satisfy any other laws. This follows from the fact that any algebraic graph expression g can be rewritten in the following canonical form:

$$g = \sum_{v \in V_g} v \;+ \sum_{(u,v) \in E_g} u \to v,$$

where V_g is the set of vertices that appear in g, and (u, v) ∈ E_g if vertices u and v appear in the left-hand and right-hand arguments of the connect operation → at least once (and should thus be connected by an edge). The canonical form of an expression g can be represented as R V_g E_g, and any additional law on Relation would therefore violate the canonicity property. The existence of the canonical form was proved by Mokhov and Khomenko [2014] for an extended version of the algebra. The proof fundamentally builds on the decomposition axiom: one can apply it repeatedly to an expression, breaking up connect sequences x → y → z into pairs x → y until the decomposition can no longer be applied. We can then open parentheses, such as x → (y + z), using the distributivity axiom and rearrange terms into the canonical form by the commutativity and idempotence of overlay +.

It is convenient to make Relation an instance of the Num type class to use the standard + and * operators as shortcuts for overlay and connect, respectively:

    instance (Ord a, Num a) => Num (Relation a) where
        fromInteger = vertex . fromInteger
        (+)         = overlay
        (*)         = connect
        signum      = const empty
        abs         = id
        negate      = id

Note that the Num law abs x * signum x == x is satisfied by the above definition since x → ε = x. Any Graph instance can be made a Num instance if need be, using a definition similar to the above.

We can now experiment with graphs and binary relations using the interactive GHC:

    λ> 1 * (2 + 3) :: Relation Int
    R {domain = fromList [1,2,3], relation = fromList [(1,2),(1,3)]}

    λ> 1 * (2 + 3) + 2 * 3 == (clique [1..3] :: Relation Int)
    True

    λ> 1 * 2 == (2 * 1 :: Relation Int)
    False

⁷ The proofs are available at https://github.com/snowleopard/alga-theory.
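Since Fig. 4 is not reproduced here, the following is a minimal sketch of how a Relation type consistent with the GHCi session above could be written, together with the Num instance and isSubgraphOf from the text. The record field names domain and relation follow the printed output; the bodies of the Graph instance are an assumption based on the set-theoretic definitions of overlay and connect in §2.1.

```haskell
{-# LANGUAGE TypeFamilies #-}

import qualified Data.Set as Set

class Graph g where
  type Vertex g
  empty   :: g
  vertex  :: Vertex g -> g
  overlay :: g -> g -> g
  connect :: g -> g -> g

-- Field names follow the GHCi output in the text; the rest is assumed.
data Relation a = R { domain :: Set.Set a, relation :: Set.Set (a, a) }
  deriving (Eq, Show)

instance Ord a => Graph (Relation a) where
  type Vertex (Relation a) = a
  empty       = R Set.empty Set.empty
  vertex x    = R (Set.singleton x) Set.empty
  overlay x y = R (Set.union (domain x) (domain y))
                  (Set.union (relation x) (relation y))
  connect x y = R (Set.union (domain x) (domain y))
                  (Set.unions [relation x, relation y, cross])
    where
      -- All pairs (u, v) with u from the left and v from the right argument.
      cross = Set.fromList
                [ (u, v) | u <- Set.toList (domain x), v <- Set.toList (domain y) ]

instance (Ord a, Num a) => Num (Relation a) where
  fromInteger = vertex . fromInteger
  (+)         = overlay
  (*)         = connect
  signum      = const empty
  abs         = id
  negate      = id

vertices, clique :: Graph g => [Vertex g] -> g
vertices = foldr overlay empty . map vertex
clique   = foldr connect empty . map vertex

isSubgraphOf :: (Graph g, Eq g) => g -> g -> Bool
isSubgraphOf x y = overlay x y == y
```

Loaded into GHCi, this sketch should reproduce the comparisons shown above, and isSubgraphOf (vertices [1,2,3]) (clique [1,2,3] :: Relation Int) evaluates to True, in line with the theorem vertices xs ⊆ clique xs.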
We can embed the core graph construction primitives into a simple data type (excuse and ignore the name clash with the type class):

    data Graph a = Empty
                 | Vertex a
                 | Overlay (Graph a) (Graph a)
                 | Connect (Graph a) (Graph a)

The instance definition is a direct mapping from the shallow embedding of the core primitives, represented by the type class, into the corresponding deep embedding, represented by the data type. It is known, e.g. see Gibbons and Wu [2014], that by folding the data type one can always obtain the inverse mapping:

    fold :: Graph g => Graph (Vertex g) -> g
    fold Empty         = empty
    fold (Vertex x)    = vertex x
    fold (Overlay x y) = overlay (fold x) (fold y)
    fold (Connect x y) = connect (fold x) (fold y)

We cannot use the derived Eq instance of the Graph data type, because it would clearly violate the axioms of the algebra, e.g. Overlay Empty Empty is structurally different from Empty, but they must be equal according to the axioms. One way to implement a custom law-abiding Eq instance is to reinterpret the graph expression as a binary relation, thereby gaining access to the canonical graph representation:

    instance Ord a => Eq (Graph a) where
        x == y = fold x == (fold y :: Relation a)

An interesting feature of this graph instance is that it allows us to represent densely connected graphs more compactly. For example, clique [1..n] :: Graph Int has a linear-size representation in memory, while clique [1..n] :: Relation Int stores each edge separately and therefore requires O(n²) memory. Exploiting the compact graph representation for deriving algorithms that are asymptotically faster on dense graphs, compared to conventional algorithms operating on ‘uncompressed’ graph representations isomorphic to (V, E), is outside the scope of this paper, but is an interesting direction of future research.

    (x ↔ y) ↔ z = x ↔ y + x ↔ z + y ↔ z    (decomposition)
                = y ↔ z + y ↔ x + z ↔ x    (commutativity)
                = (y ↔ z) ↔ x              (decomposition)
                = x ↔ (y ↔ z)              (commutativity)

Therefore, the minimal algebraic characterisation of undirected graphs comprises only 6 axioms:

• + is commutative and associative, i.e. x + y = y + x and x + (y + z) = (x + y) + z.
• ↔ is commutative, x ↔ y = y ↔ x, and has ε as the identity: x ↔ ε = x.
• Left distributivity: x ↔ (y + z) = x ↔ y + x ↔ z.
• Left decomposition: (x ↔ y) ↔ z = x ↔ y + x ↔ z + y ↔ z.

Commutativity of the connect operator forces graph expressions that differ only in the direction of edges into the same equivalence class. One can implement this by the symmetric closure of the underlying binary relation:

    newtype Symmetric a = S (Relation a) deriving (Graph, Num)

    instance Ord a => Eq (Symmetric a) where
        S x == S y = symmetricClosure x == symmetricClosure y

Note that algebraic expressions of undirected graphs have the canonical form where all edges are directed in a canonical order, e.g. according to some total order on vertices.

Let’s test that the custom equality works as desired:

    λ> clique "abcd" == (clique "dcba" :: Relation Char)
    False

    λ> clique "abcd" == (clique "dcba" :: Symmetric Char)
    True

As you can see, polymorphic graph construction functions, such as clique, can be reused when working with undirected graphs. We can define a subclass class Graph g => UndirectedGraph g and use the UndirectedGraph g constraint for functions that rely on the commutativity of the connect method.
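The deep embedding, its law-abiding equality, and the Symmetric wrapper can be tried out together in one module with a few adjustments, sketched below. These adjustments are assumptions of this example rather than the paper's code: the data type is renamed to Expr (a hypothetical name) because a data type and a class called Graph cannot share a module, the Graph instance for Symmetric is written by hand instead of being derived, a helper toRelation replaces the inline type annotation, and Relation and symmetricClosure are given plausible set-based definitions.

```haskell
{-# LANGUAGE TypeFamilies #-}

import qualified Data.Set as Set
import Data.Tuple (swap)

class Graph g where
  type Vertex g
  empty   :: g
  vertex  :: Vertex g -> g
  overlay :: g -> g -> g
  connect :: g -> g -> g

-- Assumed set-based relation (vertex set, edge set).
data Relation a = R (Set.Set a) (Set.Set (a, a)) deriving Eq

instance Ord a => Graph (Relation a) where
  type Vertex (Relation a) = a
  empty    = R Set.empty Set.empty
  vertex x = R (Set.singleton x) Set.empty
  overlay (R v1 e1) (R v2 e2) = R (Set.union v1 v2) (Set.union e1 e2)
  connect (R v1 e1) (R v2 e2) =
    R (Set.union v1 v2)
      (Set.unions [e1, e2, Set.fromList [ (u, v) | u <- Set.toList v1, v <- Set.toList v2 ]])

-- Deep embedding; called Expr here (hypothetical name) to avoid the clash.
data Expr a = Empty | Vertex a | Overlay (Expr a) (Expr a) | Connect (Expr a) (Expr a)

instance Graph (Expr a) where
  type Vertex (Expr a) = a
  empty   = Empty
  vertex  = Vertex
  overlay = Overlay
  connect = Connect

fold :: Graph g => Expr (Vertex g) -> g
fold Empty         = empty
fold (Vertex x)    = vertex x
fold (Overlay x y) = overlay (fold x) (fold y)
fold (Connect x y) = connect (fold x) (fold y)

-- Hypothetical helper fixing the target of fold to Relation.
toRelation :: Ord a => Expr a -> Relation a
toRelation = fold

instance Ord a => Eq (Expr a) where
  x == y = toRelation x == toRelation y

newtype Symmetric a = S (Relation a)

-- Written by hand here; the text derives it instead.
instance Ord a => Graph (Symmetric a) where
  type Vertex (Symmetric a) = a
  empty               = S empty
  vertex              = S . vertex
  overlay (S x) (S y) = S (overlay x y)
  connect (S x) (S y) = S (connect x y)

-- Assumed definition: add the reverse of every edge.
symmetricClosure :: Ord a => Relation a -> Relation a
symmetricClosure (R vs es) = R vs (Set.union es (Set.map swap es))

instance Ord a => Eq (Symmetric a) where
  S x == S y = symmetricClosure x == symmetricClosure y

clique :: Graph g => [Vertex g] -> g
clique = foldr connect empty . map vertex
```

Under these assumptions Overlay Empty Empty == Empty holds for Expr, and the two clique comparisons behave as in the session above: the cliques differ as Relation values and coincide as Symmetric values.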
Figure 5. 3-decomposition: 1 → 2 → 3 → 4 = 1 → 2 → 3 + 1 → 2 → 4 + 1 → 3 → 4 + 2 → 3 → 4.
(a) Derived graph construction primitives and the subgraph relation

    vertices     :: Graph g => [Vertex g] -> g
    clique       :: Graph g => [Vertex g] -> g
    edge         :: Graph g => Vertex g -> Vertex g -> g
    edges        :: Graph g => [(Vertex g, Vertex g)] -> g
    graph        :: Graph g => [Vertex g] -> [(Vertex g, Vertex g)] -> g
    isSubgraphOf :: (Graph g, Eq g) => g -> g -> Bool

(b) Standard families of graphs and graph folding

    path    :: Graph g => [Vertex g] -> g
    circuit :: Graph g => [Vertex g] -> g
    star    :: Graph g => Vertex g -> [Vertex g] -> g
    tree    :: Graph g => Tree (Vertex g) -> g
    forest  :: Graph g => Forest (Vertex g) -> g
    fold    :: Graph g => Graph (Vertex g) -> g
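The listing above gives only type signatures. As a rough guide to how some of these functions can be built from the four core methods, here is a sketch with one plausible definition each for edges, graph, path, circuit and star. These bodies are assumptions consistent with the signatures and with the GHCi outputs shown elsewhere in the text; the small Relation instance and edgeList are included only so the definitions can be tried out.

```haskell
{-# LANGUAGE TypeFamilies #-}

import qualified Data.Set as Set

class Graph g where
  type Vertex g
  empty   :: g
  vertex  :: Vertex g -> g
  overlay :: g -> g -> g
  connect :: g -> g -> g

edge :: Graph g => Vertex g -> Vertex g -> g
edge x y = connect (vertex x) (vertex y)

vertices :: Graph g => [Vertex g] -> g
vertices = foldr overlay empty . map vertex

-- Assumed definition: overlay of single edges.
edges :: Graph g => [(Vertex g, Vertex g)] -> g
edges = foldr overlay empty . map (uncurry edge)

-- Assumed definition: explicit vertex list plus edge list.
graph :: Graph g => [Vertex g] -> [(Vertex g, Vertex g)] -> g
graph vs es = overlay (vertices vs) (edges es)

-- Assumed definition: connect each vertex to its successor in the list.
path :: Graph g => [Vertex g] -> g
path []  = empty
path [x] = vertex x
path xs  = edges (zip xs (drop 1 xs))

-- Assumed definition: a path that returns to its first vertex.
circuit :: Graph g => [Vertex g] -> g
circuit []         = empty
circuit xs@(x : _) = path (xs ++ [x])

-- Assumed definition: a centre connected to every leaf.
star :: Graph g => Vertex g -> [Vertex g] -> g
star x ys = connect (vertex x) (vertices ys)

-- Minimal set-based instance used only for experimenting.
data Relation a = R (Set.Set a) (Set.Set (a, a))

instance Ord a => Graph (Relation a) where
  type Vertex (Relation a) = a
  empty    = R Set.empty Set.empty
  vertex x = R (Set.singleton x) Set.empty
  overlay (R v1 e1) (R v2 e2) = R (Set.union v1 v2) (Set.union e1 e2)
  connect (R v1 e1) (R v2 e2) =
    R (Set.union v1 v2)
      (Set.unions [e1, e2, Set.fromList [ (u, v) | u <- Set.toList v1, v <- Set.toList v2 ]])

edgeList :: Relation a -> [(a, a)]
edgeList (R _ es) = Set.toAscList es
```

With these definitions, edgeList (path "Hello" :: Relation Char) gives [('H','e'),('e','l'),('l','l'),('l','o')], and edgeList (circuit [1,2,3] :: Relation Int) gives [(1,2),(2,3),(3,1)].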
5.2 Graph Transpose

In the rest of this section we present a toolbox for transforming polymorphic graph expressions. The functions in the presented toolbox are listed in Fig. 6(c).

One of the simplest transformations one can apply to a graph is to flip the direction of all of its edges. Transpose is usually straightforward to implement but whichever data structure you use to represent graphs, you will spend at least O(1) time to modify it (say, by flipping the treatAsTransposed flag); much more often you will have to traverse the data structure and flip every edge, resulting in O(|V| + |E|) time complexity. However, by working with polymorphic graphs, i.e. graphs of type forall g. Graph g => g, and using Haskell’s zero-cost newtype wrappers, we can implement transpose that takes zero time.

Consider the following Graph instance:

    newtype Transpose g = T { transpose :: g } deriving Eq

    instance Graph g => Graph (Transpose g) where
        type Vertex (Transpose g) = Vertex g
        empty       = T empty
        vertex      = T . vertex
        overlay x y = T $ overlay (transpose x) (transpose y)
        connect x y = T $ connect (transpose y) (transpose x)

That is, we wrap a graph in a newtype flipping the order of connect arguments. Let us check if this works:

    λ> edgeList $ 1 * (2 + 3) * 4
    [(1,2),(1,3),(1,4),(2,4),(3,4)]

    λ> edgeList $ transpose $ 1 * (2 + 3) * 4
    [(2,1),(3,1),(4,1),(4,2),(4,3)]

The transpose has zero runtime cost, because all we do is wrapping and unwrapping the newtype, which is guaranteed to be free or, to be more precise, is handled by GHC at compile time.

To make sure transpose is only applied to polymorphic graphs, we do not export the constructor T, therefore the only way to call transpose is to give it a polymorphic argument and let the type inference interpret it as a value of type Transpose.

5.3 Graph Functor

We now implement a function gmap that given a function a -> b and a polymorphic graph whose vertices are of type a will produce a polymorphic graph with vertices of type b by applying the function to each vertex. This is almost a Functor but it does not have the usual type signature, because Graph is not a higher-kinded type⁸:

    newtype GraphFunctor a =
        F { gfor :: forall g. Graph g => (a -> Vertex g) -> g }

    instance Graph (GraphFunctor a) where
        type Vertex (GraphFunctor a) = a
        empty       = F $ \_ -> empty
        vertex x    = F $ \f -> vertex (f x)
        overlay x y = F $ \f -> overlay (gmap f x) (gmap f y)
        connect x y = F $ \f -> connect (gmap f x) (gmap f y)

    gmap :: Graph g => (a -> Vertex g) -> GraphFunctor a -> g
    gmap = flip gfor

⁸ It is possible to define a higher-kinded version of Graph, but it has fewer instances.

Essentially, we are defining another newtype wrapper, which pushes the given function all the way towards the vertices of a given graph expression. This has no runtime cost, just as before, although the actual evaluation of the given function at each vertex will not be free, of course. Here is gmap in action:

    λ> edgeList $ 1 * 2 * 3 + 4 * 5
    [(1,2),(1,3),(2,3),(4,5)]

    λ> edgeList $ gmap (+1) $ 1 * 2 * 3 + 4 * 5
    [(2,3),(2,4),(3,4),(5,6)]

As you can see, we can increment the value of each vertex by mapping the function (+1) over the graph. The resulting expression is a polymorphic graph, as desired. Note that gmap satisfies the functor laws gmap id = id and gmap f . gmap g = gmap (f . g), because it does not change the structure of the given expression and only pushes the given function down to its leaves – the vertices.

An alert reader might wonder: what happens if the function maps two different vertices into the same one? They will be merged. Merging graph vertices is a useful graph transformation, so let us define it in terms of gmap:

    mergeVertices :: Graph g => (Vertex g -> Bool)
                  -> Vertex g -> GraphFunctor (Vertex g) -> g
    mergeVertices p v = gmap $ \u -> if p u then v else u

    λ> edgeList $ mergeVertices odd 3 $ 1 * 2 * 3 + 4 * 5
    [(2,3),(3,2),(3,3),(4,3)]

The function takes a predicate on graph vertices and a target vertex and maps all vertices satisfying the predicate into the target, thereby merging them. In our example the odd vertices {1, 3, 5} are merged into 3, in particular creating the self-loop 3 → 3. Note: it takes linear time O(|g|) for mergeVertices to traverse the graph and apply the predicate to each vertex (where |g| is the size of the graph expression g), which may be much more efficient than merging vertices in a concrete data structure. For example, if the graph is represented by an adjacency matrix, it will likely be necessary to rebuild the resulting matrix from scratch, which takes O(|V|²) time. Since for many graphs we have |g| = O(|V|), our mergeVertices may be quadratically faster than the matrix-based one.

As another application of gmap, we implement the Cartesian graph product operation box, or G □ H, where the resulting vertex set is V_G × V_H and vertex (x, y) is connected to vertex (x′, y′) if either x = x′ and (y, y′) ∈ E_H, or y = y′ and (x, x′) ∈ E_G. An example of the Cartesian product of graphs pentagon and p4 is shown in Fig. 7.

    box :: (Graph g, Vertex g ∼ (a, b))
        => GraphFunctor a -> GraphFunctor b -> g
    box x y = foldr overlay empty $ xs ++ ys
      where
        xs = map (\b -> gmap (,b) x) . toList $ gmap id y
        ys = map (\a -> gmap (a,) y) . toList $ gmap id x

The Cartesian product G □ H is assembled by creating |V_H| copies of graph G and overlaying them with |V_G| copies of graph H. We get access to the list of graph vertices using toList and turn vertices of original graphs into pairs of vertices by gmap. Note that we need to reinterpret the input of type GraphFunctor as a polymorphic graph by gmap id before passing it to the toList function, which expects inputs of type ToList. As you can see,
• x_u = removeVertex u x and y_uv = removeEdge u v y, thus x_u → y_uv definitely does not contain the edge (u, v) at the cost of losing the vertex u in the left-hand side x_u.
• y_v = removeVertex v y and x_uv = removeEdge u v x, thus x_uv → y_v definitely does not contain the edge (u, v) at the cost of losing the vertex v in the right-hand side y_v.

The overlay x_u → y_uv + x_uv → y_v contains the vertices u and v, because at least one copy of each vertex has been preserved, but the edge (u, v) is removed in both subexpressions as intended. We demonstrate removeEdge on two simple examples:

    λ> edgeList $ path "Hello"
    [('H','e'),('e','l'),('l','l'),('l','o')]

    λ> edgeList $ removeEdge 'H' 'e' $ path "Hello"
    [('e','l'),('l','l'),('l','o')]

    λ> edgeList $ removeEdge 'l' 'l' $ path "Hello"
    [('H','e'),('e','l'),('l','o')]

The removeEdge function is expensive: given an expression of size |g| it may produce a transformed expression of the quadratic size O(|g|²). Many concrete Graph instances provide much faster equivalents of removeEdge.

5.6 De Bruijn Graphs

To demonstrate that one can easily construct sophisticated graphs using the presented library, let us try it on De Bruijn graphs, an interesting combinatorial object that frequently shows up in computer engineering and bioinformatics. The implementation is very short, but requires some explanation:

    deBruijn :: (Graph g, Vertex g ∼ [a]) => Int -> [a] -> g
    deBruijn len alphabet = bind skeleton expand
      where
        overlaps = mapM (const alphabet) [2..len]
        skeleton = edges [ (Left s, Right s) | s <- overlaps ]
        expand v = vertices
                   [ either ([a]++) (++[a]) v | a <- alphabet ]

The function builds a De Bruijn graph of dimension len from symbols of the given alphabet. The vertices of the graph are all possible words of length len containing symbols of the alphabet, and two words are connected x → y whenever x and y match after we remove the first symbol of x and the last symbol of y (equivalently, when x = az and y = zb for some symbols a and b). The process of construction of a 3-dimensional De Bruijn graph on the alphabet {0, 1} is illustrated in Fig. 9. Here are all the ingredients of the solution:

• overlaps contains all possible words of length len-1 that correspond to overlaps of connected vertices.
• skeleton contains one edge per overlap, with Left and Right vertices acting as temporary placeholders.
• We replace a vertex Left s with a subgraph of two vertices {0s, 1s}, i.e. the vertices whose suffix is s. Symmetrically, Right s is replaced by vertices {s0, s1}. This is captured by the function expand.
• The result is obtained by computing bind skeleton expand.

Below we construct the De Bruijn graph shown in Fig. 9.

    λ> edgeList $ deBruijn 3 "01"
    [("000","000"),("000","001"),("001","010"),("001","011")
    ,("010","100"),("010","101"),("011","110"),("011","111")
    ,("100","000"),("100","001"),("101","010"),("101","011")
    ,("110","100"),("110","101"),("111","110"),("111","111")]

    λ> g = deBruijn 9 "abc"
    λ> all (\(x,y) -> drop 1 x == dropEnd 1 y) $ edgeList g
    True

    λ> Set.size $ domain g
    19683 -- i.e. 3^9

    λ> Set.size $ relation g
    59049 -- i.e. 3^10

Note that a De Bruijn graph of dimension len on the alphabet has |alphabet|^len vertices and |alphabet|^(len+1) edges.

5.7 Summary

We have presented a library of polymorphic graph construction and transformation functions that provide a flexible and elegant way to manipulate graph expressions polymorphically. Polymorphic graphs are highly reusable and composable, and can be interpreted using any of the Graph instances defined in §4, as well as other instances provided by the algebraic-graphs library that is available on Hackage. The library is written in the vanilla functional programming style and has no dependencies apart from core GHC libraries. Many of the presented graph transformation algorithms are expressed using familiar functional programming abstractions, such as functors and monads.
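The vertex and edge counts stated for De Bruijn graphs in §5.6 can be cross-checked independently of the library. The sketch below does not use bind, skeleton or expand; it is a direct list-based construction that enumerates the edges (az, zb) from the definition in the text. The function names deBruijnEdges and deBruijnVertices are hypothetical, and len is assumed to be at least 1.

```haskell
import qualified Data.Set as Set

-- Hypothetical helper: all edges (a:z, z ++ [b]) for every overlap z of
-- length len-1 and every pair of symbols a, b; sorted and deduplicated.
deBruijnEdges :: Ord a => Int -> [a] -> [([a], [a])]
deBruijnEdges len alphabet =
  Set.toAscList . Set.fromList $
    [ (a : z, z ++ [b]) | z <- overlaps, a <- alphabet, b <- alphabet ]
  where
    -- Same expression as in the text: all words of length len-1.
    overlaps = mapM (const alphabet) [2 .. len]

-- Hypothetical helper: the distinct endpoints of those edges.
deBruijnVertices :: Ord a => Int -> [a] -> [[a]]
deBruijnVertices len alphabet =
  Set.toAscList . Set.fromList $
    concat [ [x, y] | (x, y) <- deBruijnEdges len alphabet ]
```

For len = 3 and the alphabet "01" this produces 16 edges, starting with ("000","000") and ("000","001"), in agreement with the GHCi output above. Each edge is determined uniquely by the triple (a, z, b), which is why the number of edges is |alphabet|^(len+1).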
[Figure 9: construction of the 3-dimensional De Bruijn graph on the alphabet {0, 1}. The skeleton has one edge Left s → Right s for each overlap s ∈ {00, 01, 10, 11}; expand replaces Left s by the vertices {0s, 1s} and Right s by the vertices {s0, s1}.]
6 Related Work

Historically, the first approaches to graph representation in functional programming used edge lists, adjacency lists, as well as mutually recursive data structures representing cyclic graphs by the so-called ‘tying the knot’ approach. The former were generally slower than their imperative counterparts, while the latter were very difficult to work with. An asymptotically optimal implementation of the depth-first search algorithm developed by King and Launchbury [1995] used arrays to represent graphs and state-transformer monads [Launchbury and Peyton Jones 1994] to mimic imperative array updates in pure functional programming. The developed algorithms are still in use today and are available from the containers library shipped with GHC. The API of the library contains partial functions.

A fundamentally different approach by Erwig [2001] is based on inductive graphs, whereby a graph can be decomposed into a context (a node with its neighbourhood) and the rest of the graph. This inductive definition makes it possible to share common subgraphs and provides a way to implement graph algorithms in a more functional style compared to the previous approaches based on array representations. Inductive graphs are implemented in the fgl library that contains implementations of many standard graph algorithms, from depth-first search to maximum flow on weighted graphs. The library defines type classes Graph and DynGraph for working with static (unchangeable) and dynamic (changeable) graphs, comprising 10 class methods in total. Compared to algebraic graphs proposed in this paper, fgl has a larger core of graph construction primitives (10 vs 4), some of which are partial. An important advantage of fgl is the support of edge-labelled graphs.

Several other authors investigated ways to define graphs compositionally. For example, Gibbons [1995] proposed an algebraic framework for modelling directed acyclic graphs comprising 6 core graph construction primitives, but the approach was not general enough to handle other practically useful classes of graphs.

Gibbons's algebra is an example of a large body of research on categorical graph algebras, e.g. see a survey by Selinger [2010]. These algebras are typically much more complex than the one presented in this paper⁹, because they can represent graphs with heterogeneous vertices and edges, where not all vertices and edges are compatible. Graphs in this paper are homogeneous, i.e. an edge is allowed between any pair of vertices. This is a limitation for some applications, but it allows us to have a much simpler theory and implementation. Petri nets [Murata 1989] are an example of graphs where not all edges are allowed¹⁰. Algebraic graphs proposed in this paper cannot represent Petri nets in a safe way.

From a very different angle, simple algebraic structures, such as semirings, have been successfully applied to solving various path problems on graphs using functional programming, e.g. see Dolan [2013]. These approaches typically use matrix-based data structures for manipulating connectivity and distance information with the goal of solving optimisation problems on graphs, and are not suitable as an abstract interface for graph representation.

Simple graph construction cores are known for special families of graphs. For example, non-empty series-parallel graphs require only three primitives: a single vertex, and series and parallel composition operations. A classical result [Valdes et al. 1979] states that only N-free graphs can be constructed using these primitives. Similarly, the family of cographs corresponds to P4-free graphs, which also require only three graph construction primitives: a single vertex, graph complement, and disjoint graph union [Corneil et al. 1981]. Interestingly, there is an alternative core for cographs: a single vertex, disjoint graph union, and disjoint graph join. The only difference from the core used in this paper is the disjointness requirement. By dropping this requirement, we can construct arbitrary graphs. In particular, both N = 1 → 2 + 3 → (2 + 4) and P4 = 1 → 2 + 2 → 3 + 3 → 4 can be easily constructed.

This paper builds on the work by Mokhov and Khomenko [2014], where the algebra of parameterised graphs, a mathematical structure very similar to a semiring, was proposed as a complete and sound formalism for graph representation in the context of digital circuit design. In that paper the authors did not investigate applications of the algebra in functional programming but proved many important results that are essential for this work. Alekseyev [2014] derived a formalisation of the algebra of parameterised graphs in Agda, using an encoding similar to the core type class that we define.

⁹ As an example, Signal Flow Graphs [Bonchi et al. 2015] have 17 primitives and a few dozen laws. Smaller characterisations of Signal Flow Graphs exist; however, minimising the number of graph construction primitives has not (so far) been a priority for the authors (private communication with Pawel Sobocinski).

¹⁰ Petri nets have vertices of two types, called places and transitions, and edges are only allowed between vertices of different types.

7 Discussion and Future Research Opportunities

The paper presented a new algebraic foundation for working with graphs. It is particularly well-suited for functional programming
languages and benefits from functional programming abstractions, such as functors and monads. Compared to the state of the art, algebraic graphs are easier to use and reuse, more compositional, and have a smaller core of only four graph construction primitives, fully characterised by an elegant algebra of graphs.

We demonstrated the flexibility of algebraic graphs by several examples and developed a Haskell library for constructing and transforming polymorphic graphs.

The presented approach has a few important limitations:

• This paper has not addressed edge-labelled graphs. In particular, there is no known extension of the presented algebra characterising graphs with arbitrary vertex and edge labels. However, Mokhov and Khomenko [2014] give an algebraic characterisation for graphs labelled with Boolean functions, which can be generalised to labels that form a semiring.

  We found that one can represent edge-labelled graphs by functions from labels to graphs. For example, a finite automaton can be thought of as a collection of graphs, one for each symbol of the alphabet:

      type Automaton a s = a -> Relation s

  Here a and s stand for the alphabet and the set of states of the automaton, respectively. This representation of labelled graphs is supported by the following graph instance:

      instance Graph g => Graph (a -> g) where
          type Vertex (a -> g) = Vertex g
          empty       = pure empty
          vertex      = pure . vertex
          overlay x y = overlay <$> x <*> y
          connect x y = connect <$> x <*> y

  Therefore, Automaton a s is a valid Graph instance.

• As mentioned in §6, the presented approach is designed for homogeneous graphs, where an edge is allowed between any pair of vertices. It is an open research question whether it is possible to extend algebraic graphs for modelling heterogeneous graphs, such as Petri nets, without sacrificing the simplicity of the algebraic core.

• Many graph instances, e.g. Relation, incur a logarithmic overhead during graph construction, and may therefore be unsuitable for high-performance applications. One possible solution is to operate on deeply-embedded algebraic graphs (such as data Graph), and perform conversions to more conventional representations only when necessary.

• There are no known efficient implementations of fundamental graph algorithms, such as depth-first search, that work directly on the algebraic core. Therefore, we need to translate core expressions to conventional graph representations, such as adjacency lists, and utilise existing graph libraries, which may be suboptimal for certain algorithmic problems.

Despite these limitations, algebraic graphs have been successfully used in the design of processor microcontrollers [Mokhov and Khomenko 2014] and asynchronous circuits [Beaumont et al. 2015].

Our future research will focus on addressing the above limitations, and on the exploration of the following topics:

• Algebraic graph expressions can be minimised via the modular decomposition of graphs [McConnell and De Montgolfier 2005], thereby reducing their memory footprint, as well as speeding up their processing. Modular decomposition is a canonical graph representation, which can therefore be used to efficiently compare algebraic graph expressions for equality. Exploiting the compactness of algebraic graphs in algorithms is a promising research direction.

• By using the algebraic approach to graph representation one can formulate graph algorithms in the form of solving systems of algebraic equations with unknowns. This may open the way to the discovery of novel graph algorithms.

Acknowledgments

I would like to thank Arseniy Alekseyev and Neil Mitchell for numerous discussions on algebraic graphs, their advice, criticism and encouragement. Without their help this work would have likely remained an unfinished toy project. Simon Peyton Jones, Brent Yorgey, Danil Sokolov, Ben Lippmeier, Ulan Degenbaev and several anonymous reviewers provided constructive feedback on an earlier draft of this paper, helping to substantially improve it. Last but not least, I’m very grateful to Victor Khomenko for his contribution to the algebra of parameterised graphs that forms the mathematical foundation of this work.

References

Arseniy Alekseyev. 2014. Compositional approach to design of digital circuits. Ph.D. Dissertation. Newcastle University.
Jonathan Beaumont, Andrey Mokhov, Danil Sokolov, and Alex Yakovlev. 2015. Compositional design of asynchronous circuits from behavioural concepts. In ACM/IEEE International Conference on Formal Methods and Models for Codesign (MEMOCODE). IEEE, 118–127.
Filippo Bonchi, Pawel Sobocinski, and Fabio Zanasi. 2015. Full abstraction for signal flow graphs. In ACM SIGPLAN Notices, Vol. 50. ACM, 515–526.
Manuel Chakravarty, Gabriele Keller, and Simon Peyton Jones. 2005. Associated type synonyms. In ACM SIGPLAN Notices, Vol. 40. ACM, 241–253.
Derek G Corneil, H Lerchs, and L Stewart Burlingham. 1981. Complement reducible graphs. Discrete Applied Mathematics 3, 3 (1981), 163–174.
Stephen Dolan. 2013. Fun with semirings: a functional pearl on the abuse of linear algebra. In ACM SIGPLAN Notices, Vol. 48. ACM, 101–110.
Martin Erwig. 2001. Inductive graphs and functional graph algorithms. Journal of Functional Programming 11, 05 (2001), 467–492.
Jeremy Gibbons. 1995. An initial-algebra approach to directed acyclic graphs. In International Conference on Mathematics of Program Construction. Springer, 282–303.
Jeremy Gibbons and Nicolas Wu. 2014. Folding domain-specific languages: Deep and shallow embeddings (functional pearl). In ACM SIGPLAN Notices, Vol. 49. ACM, 339–347.
Jonathan S Golan. 1999. Semirings and their Applications. Springer Science & Business Media.
Frank Harary. 1969. Graph theory. Addison-Wesley.
David J King and John Launchbury. 1995. Structuring depth-first search algorithms in Haskell. In Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages. ACM, 344–354.
John Launchbury and Simon L Peyton Jones. 1994. Lazy functional state threads. In ACM SIGPLAN Notices, Vol. 29. ACM, 24–35.
Saunders Mac Lane and Garrett Birkhoff. 1999. Algebra. Chelsea Publishing Company.
Ross M McConnell and Fabien De Montgolfier. 2005. Linear-time modular decomposition of directed graphs. Discrete Applied Mathematics 145, 2 (2005), 198–209.
Andrey Mokhov. 2015. Algebra of switching networks. IET Computers & Digital Techniques (2015).
Andrey Mokhov and Victor Khomenko. 2014. Algebra of Parameterised Graphs. ACM Transactions on Embedded Computing Systems 13, 4s (2014), 1–22.
Tadao Murata. 1989. Petri nets: Properties, analysis and applications. Proc. IEEE 77, 4 (1989), 541–580.
Peter Selinger. 2010. A survey of graphical languages for monoidal categories. In New structures for physics. Springer, 289–355.
Robert E Tarjan and Jan Van Leeuwen. 1984. Worst-case analysis of set union algorithms. Journal of the ACM (JACM) 31, 2 (1984), 245–281.
Jacobo Valdes, Robert E Tarjan, and Eugene L Lawler. 1979. The recognition of series parallel digraphs. In Proceedings of the eleventh annual ACM symposium on Theory of computing. ACM, 1–12.
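As a supplementary illustration of the function instance discussed in §7, the following self-contained sketch shows how graphs of type a -> g are combined pointwise. The class declaration and the Relation type are simplified stand-ins assumed here so that the example compiles on its own; they approximate, but are not copied from, the definitions in the paper and the library. The automaton toggle and the value withLoop are hypothetical.

```haskell
{-# LANGUAGE TypeFamilies #-}

import           Data.Set (Set)
import qualified Data.Set as Set

-- Assumed shape of the core type class (simplified stand-in).
class Graph g where
  type Vertex g
  empty   :: g
  vertex  :: Vertex g -> g
  overlay :: g -> g -> g
  connect :: g -> g -> g

-- Simplified stand-in for a binary relation on a vertex set.
data Relation s = Relation { domain :: Set s, relation :: Set (s, s) }
  deriving (Eq, Show)

instance Ord s => Graph (Relation s) where
  type Vertex (Relation s) = s
  empty       = Relation Set.empty Set.empty
  vertex x    = Relation (Set.singleton x) Set.empty
  overlay x y = Relation (domain x `Set.union` domain y)
                         (relation x `Set.union` relation y)
  connect x y = Relation (domain x `Set.union` domain y)
                         (Set.unions [relation x, relation y, cross])
    where
      -- every vertex of x is connected to every vertex of y
      cross = Set.fromList
        [ (a, b) | a <- Set.toList (domain x), b <- Set.toList (domain y) ]

-- The function instance from §7: operations are lifted pointwise.
instance Graph g => Graph (a -> g) where
  type Vertex (a -> g) = Vertex g
  empty       = pure empty
  vertex      = pure . vertex
  overlay x y = overlay <$> x <*> y
  connect x y = connect <$> x <*> y

type Automaton a s = a -> Relation s

-- Hypothetical two-state automaton: 'a' moves 0 -> 1, 'b' moves 1 -> 0,
-- any other symbol has no transitions.
toggle :: Automaton Char Int
toggle 'a' = connect (vertex 0) (vertex 1)
toggle 'b' = connect (vertex 1) (vertex 0)
toggle _   = overlay (vertex 0) (vertex 1)

-- Overlaying a constant graph adds a self-loop on state 1 for every symbol.
withLoop :: Automaton Char Int
withLoop = overlay toggle (connect (vertex 1) (vertex 1))
```

Under these assumptions, withLoop 'a' contains the transitions (0,1) and (1,1), while withLoop 'b' contains (1,0) and (1,1): the same algebraic expression is interpreted once per label, which is what makes the function type a valid Graph instance.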