Functional Programming Essentials
The most powerful techniques of functional programming are those that treat
functions as data. Most functional languages give function values full rights,
free of arbitrary restrictions. Like other values, functions may be arguments and
results of other functions and may belong to pairs, lists and trees.
Procedural languages like Fortran and Pascal accept this idea as far as is con-
venient for the compiler writer. Functions may be arguments: say, the compar-
ison to be used in sorting or a numerical function to be integrated. Even this
restricted case is important.
A function is higher-order (or a functional) if it operates on other functions.
For instance, the functional map applies a function to every element of a list,
creating a new list. A sufficiently rich collection of functionals can express
all functions without using variables. Functionals can be designed to construct
parsers (see Chapter 9) and theorem proving strategies (see Chapter 10).
Infinite lists, whose elements are evaluated upon demand, can be implemented
using functions as data. The tail of a lazy list is a function that, if called, pro-
duces another lazy list. A lazy list can be infinitely long and any finite number
of its elements can be evaluated.
Chapter outline
The first half presents the essential programming techniques involving
functions as data. The second half serves as an extended, practical example.
Lazy lists can be represented in ML (despite its strict evaluation rule) by means
of function values.
The chapter contains the following sections:
Functions as values. The fn notation can express a function without giving it
a name. Any function of two arguments can be expressed as a ‘curried’ function
of one argument, whose result is another function. Simple examples of higher-
order functions include polymorphic sorting functions and numerical operators.
General-purpose functionals. Higher-order functional programming largely
174 5 Functions and Infinite Data
Functions as values
Functions in ML are abstract values: they can be created; they can be
applied to an argument; they can belong to other data structures. Nothing else is
allowed. A function is given by patterns and expressions but taken as a ‘black
box’ that transforms arguments to results.
fn x => E
fn P1 => E1 | · · · | Pn => En
denotes the function defined by the patterns P1 , . . . , Pn . It has the same meaning
as the let expression
Exercise 5.2 Modify these function declarations to use val instead of fun:
fun area (r ) = pi *r *r ;
fun title(name) = "The Duke of " ^ name;
fun lengthvec (x ,y) = Math.sqrt(x *x + y *y);
Given a string pre, the result of prefix is a function that concatenates pre to the
front of its argument. For instance, prefix "Sir " is the function
1 It has been credited to Schönfinkel, but Schönfinkeling has never caught on.
Syntax for curried functions. The functions above are declared by val, not
fun. A fun declaration must have explicit arguments. There may be several
arguments, separated by spaces, for a curried function. Here is an equivalent
declaration of prefix :
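A minimal sketch of such a declaration, assuming prefix simply concatenates its two curried arguments (the argument names pre and post are illustrative):

```sml
(* Sketch: curried prefix declared with fun; both arguments appear
   explicitly, separated by spaces. *)
fun prefix pre post = pre ^ post;

(* Partial application yields a function of type string -> string. *)
val knightify = prefix "Sir ";
val name = knightify "James Tyrrell";
```

Applying prefix to one argument still works as before, since a fun declaration with several space-separated arguments is curried.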
Recursion works by the usual evaluation rules, even with currying. The result
of replist 3 is the function
The final call returns nil and the overall result is [true, true, true].
An analogy with arrays. The choice between pairing and currying is analogous
to the choice, in Pascal, between a 2-dimensional array and nested arrays.
The former array is subscripted A[i , j ], the latter as B [i ][j ]. Nested arrays permit
partial subscripting: B [i ] is a 1-dimensional array.
Exercise 5.3 What functions result from partial application of the following
curried functions? (Do not try them at the machine.)
fun plus i j : int = i+j ;
fun lesser a b : real = if a<b then a else b;
fun pair x y = (x ,y);
fun equals x y = (x =y);
Exercise 5.4 Is there any practical difference between the following two decla-
rations of the function f ? Assume that the function g and the curried function h
are given.
fun f x y = h (g x ) y;
fun f x = h (g x );
This is a curried function call: hd titlefns returns the function dukify. The
polymorphic function hd has, in this example, the type
2 Recall that the keyword op yields the value of an infix operator, as a function.
The functions stored in the tree must have the same type, here real → real .
Although different types can be combined into one datatype, this can be incon-
venient. As mentioned at the end of Section 4.6, type exn can be regarded as
including all types. A more flexible type for the functions is exn list → exn.
Exercise 5.5 What type does the polymorphic function Dict.lookup have in
the example above?
Functions ins and sort are declared locally, referring to lessequal . Though it
may not be obvious, insort is a curried function. Given an argument of type τ ×
τ → bool it returns the function sort, which has type τ list → τ list. The types
of the ordering and the list elements must agree.
Integers can now be sorted. (Although the operator <= is overloaded, its type
is constrained by the list of integers.)
insort (op<=) [5,3,7,5,9,8];
> [3, 5, 5, 7, 8, 9] : int list
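For reference, here is a sketch of an insertion sort of the shape described above, with ins and sort declared locally and the ordering lessequal passed as an argument. The exact declaration is an assumption reconstructed from that description.

```sml
(* Sketch: insertion sort parameterised by an ordering lessequal. *)
fun insort lessequal =
  let
    (* insert x into an already sorted list *)
    fun ins (x, []) = [x]
      | ins (x, y :: ys) =
          if lessequal (x, y) then x :: y :: ys
          else y :: ins (x, ys)
    (* sort the tail, then insert the head *)
    fun sort [] = []
      | sort (x :: xs) = ins (x, sort xs)
  in
    sort
  end;

val ascending  = insort (op <=) [5, 3, 7, 5, 9, 8];
val descending = insort (op >=) [5, 3, 7, 5, 9, 8];
```

Because insort returns sort, supplying a different ordering gives a different sorting function without touching the algorithm.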
The fn notation works well with functionals. Here it eliminates the need to
declare a squaring function prior to computing the sum $\sum_{k=0}^{9} k^2$:
Observe that summation f has the same type as f, namely int → real, and that
$\sum_{i=0}^{m-1} \sum_{j=0}^{i-1} f(j)$ may be computed by summation (summation f) m.
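The following sketch assumes that summation f m computes f(0) + f(1) + ... + f(m-1) with an accumulating local function. The definition is an assumption consistent with the type int → real mentioned above.

```sml
(* Sketch: summation f m = f 0 + f 1 + ... + f (m-1). *)
fun summation f m =
  let
    fun sum (i, z) : real =
      if i >= m then z else sum (i + 1, z + f i)
  in
    sum (0, 0.0)
  end;

(* Sum of k*k for k = 0..9, with the squaring function written inline. *)
val squares = summation (fn k => real (k * k)) 10;

(* Double sum: for each i = 0..3, add up j for j = 0..i-1. *)
val nested = summation (summation real) 4;
```

Here summation real has type int → real, so it can itself be passed to summation.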
They were indeed legal in earlier versions of ML, but now trigger a message such as
‘Non-value in polymorphic declaration.’ This restriction has to do with references;
Section 8.3 explains the details. Changing the function declaration val f = E to
fun f x = E x
Exercise 5.6 Write a polymorphic function for top-down merge sort, passing
the ordering predicate (≤) as an argument.
tegers m and n.
General-purpose functionals
Functional programmers often use higher-order functions to express
programs clearly and concisely. Functionals to process lists have been popu-
lar since the early days of Lisp, appearing in infinite variety and under many
names. They express operations that otherwise would require separate recursive
function declarations. Similar recursive functionals can be defined for trees.
A comprehensive set of functionals provides an abstract language for express-
ing other functions. After reading this section, you may find it instructive to re-
view previous chapters and simplify the function definitions using functionals.
5.5 Sections
Imagine applying an infix operator to only one operand, either left or
right, leaving the other operand unspecified. This defines a function of one
argument, called a section. Here are some examples in the notation of Bird and
Wadler (1988):
Sections can be added to ML (rather crudely) by the functionals secl and secr :
fun secl x f y = f (x ,y);
> val secl = fn : ’a -> (’a * ’b -> ’c) -> ’b -> ’c
fun secr f y x = f (x ,y);
> val secr = fn : (’a * ’b -> ’c) -> ’b -> ’a -> ’c
These functionals are typically used with infix functions and op, but may be
applied to any function of suitable type. Here are some left sections:
val knightify = (secl "Sir " op^);
> val knightify = fn : string -> string
knightify "Geoffrey";
> "Sir Geoffrey" : string
val recip = (secl 1.0 op/);
> val recip = fn : real -> real
recip 5.0;
> 0.2 : real
Exercise 5.8 Is there any similarity between sections and curried functions?
Exercise 5.9 What functions do the following sections yield? Recall that take
returns the first i elements of a list (Section 3.4) while inter forms the
intersection of two lists (Section 3.15).
secr op@ ["Richard"]
secl ["heed", "of", "yonder", "dog!"] List.take
secr List.take 3
secl ["his", "venom", "tooth"] inter
5.6 Combinators
The theory of the λ-calculus is in part concerned with expressions known
as combinators. Many combinators can be coded in ML as higher-order func-
tions, and have practical applications.
Composition. The infix o (yes, the letter ‘o’) denotes function composition. The
standard library declares it as follows:
infix o;
fun (f o g) x = f (g x );
> val o = fn : (’b -> ’c) * (’a -> ’b) -> ’a -> ’c
verts integers to reals) are composed. Composition is more readable than fn no-
tation:
summation (Math.sqrt o real ) 10;
The combinator K makes constant functions. Given x it makes the function that
always returns x :
fun K x y = x ;
> val K = fn : ’a -> ’b -> ’a
summation (K 7.0) 5;
> 35.0 : real
Every function in the λ-calculus can be expressed using just S and K — with no
variables! David Turner (1979) has exploited this celebrated fact to obtain lazy
evaluation: since no variables are involved, no mechanism is required for bind-
ing their values. Virtually all lazy functional compilers employ some refinement
of this technique.
Here is a remarkable example of the expressiveness of S and K . The identity
function I can be defined as S K K :
S K K 17;
> 17 : int
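The reduction can be checked directly. This sketch assumes the usual definition of S, with S x y z = x z (y z), so that S K K z = K z (K z) = z.

```sml
(* The combinators S and K as curried functions. *)
fun S x y z = x z (y z);
fun K x y = x;

(* S K K z = K z (K z) = z, so S K K behaves as the identity.
   Declared with fun and an explicit argument so that I stays polymorphic. *)
fun I x = S K K x;

val seventeen = I 17;
val word = I "Clarence";
```

Writing val I = S K K instead would bind a non-value expression, which ML will not generalize to a polymorphic type.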
map f [x1 , . . . , xn ] = [f x1 , . . . , f xn ]
fun map f [] = []
| map f (x ::xs) = (f x ) :: map f xs;
> val map = fn : (’a -> ’b) -> ’a list -> ’b list
map recip [0.1, 1.0, 5.0, 10.0];
> [10.0, 1.0, 0.2, 0.1] : real list
map size ["York","Clarence","Gloucester"];
> [4, 8, 10] : int list
Similarly, map(filter pred )[l1 , l2 , . . . , ln ] applies filter pred to each of the lists l1 ,
l2 , . . . . It returns a list of lists of elements satisfying the predicate pred .
map (filter (secr op< "m"))
[["my","hair","doth","stand","on","end"],
["to","hear","her","curses"]];
> [["hair", "doth", "end"], ["hear", "her", "curses"]]
> : string list list
Many list functions can be coded trivially using map and filter . Our matrix
transpose function (Section 3.9) becomes
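A sketch of transpose in this style, assuming the matrix is a list of rows of equal length (the name transp and the extra clause for the empty matrix are assumptions):

```sml
(* Sketch: the first row of the result is the heads of all rows;
   the rest is the transpose of the tails. *)
fun transp [] = []
  | transp ([] :: _) = []
  | transp rows = map hd rows :: transp (map tl rows);

val t = transp [[1, 2, 3], [4, 5, 6]];
```

The two applications of map replace the auxiliary recursive functions that would otherwise extract the heads and tails.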
Recall how we defined the intersection of two ‘sets’ in terms of the membership
relation, in Section 3.15. That declaration can be reduced to a single line:
fun inter (xs,ys) = filter (secr (op mem) ys) xs;
> val inter = fn : ’’a list * ’’a list -> ’’a list
$[\underbrace{x_0, \ldots, x_{i-1}}_{\text{takewhile}}, \underbrace{x_i, \ldots, x_{n-1}}_{\text{dropwhile}}]$
The initial segment, which consists of elements satisfying the predicate, is re-
turned by takewhile:
fun takewhile pred [] = []
| takewhile pred (x ::xs) =
if pred x then x :: takewhile pred xs
else [];
> val takewhile = fn : (’a -> bool) -> ’a list -> ’a list
The remaining elements (if any) begin with the first one to falsify the predicate.
This list is returned by dropwhile:
These two functionals can process text in the form of character lists. The pred-
icate Char .isAlpha recognizes letters. Given this predicate, takewhile returns
the first word from a sentence and dropwhile returns the remaining characters.
Since they are curried, takewhile and dropwhile combine with other functionals.
For instance, map(takewhile pred ) returns a list of initial segments.
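The pair of functionals can be sketched as follows. The declaration of dropwhile is an assumption that mirrors takewhile, and the sample sentence is illustrative.

```sml
fun takewhile pred [] = []
  | takewhile pred (x :: xs) =
      if pred x then x :: takewhile pred xs else [];

(* Sketch of dropwhile: skip elements while pred holds, return the rest. *)
fun dropwhile pred [] = []
  | dropwhile pred (x :: xs) =
      if pred x then dropwhile pred xs else x :: xs;

(* Splitting a sentence, held as a character list, at the end of its first word. *)
val sentence  = explode "that deed is all thy own";
val firstWord = implode (takewhile Char.isAlpha sentence);
val remainder = implode (dropwhile Char.isAlpha sentence);
```

For any list l, takewhile pred l @ dropwhile pred l gives back l.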
The function disjoint tests whether two lists have no elements in common:
fun disjoint(xs,ys) = all (fn x => all (fn y => x <>y) ys) xs;
> val disjoint = fn : ’’a list * ’’a list -> bool
Because of their argument order, exists and all are hard to read as quantifiers
when nested; it is hard to see that disjoint tests ‘for all x in xs and all y in ys,
x ≠ y.’ However, exists and all combine well with the other functionals.
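A sketch of the two functionals, assuming the conventional definitions in which exists stops at the first element satisfying the predicate and all stops at the first element falsifying it:

```sml
(* exists pred l: does some element of l satisfy pred? *)
fun exists pred [] = false
  | exists pred (x :: xs) = pred x orelse exists pred xs;

(* all pred l: do all elements of l satisfy pred? *)
fun all pred [] = true
  | all pred (x :: xs) = pred x andalso all pred xs;

(* The nested quantifiers: for all x in xs and all y in ys, x <> y. *)
fun disjoint (xs, ys) = all (fn x => all (fn y => x <> y) ys) xs;

val apart   = disjoint ([1, 2, 3], [4, 5]);
val overlap = disjoint ([1, 2, 3], [3, 4]);
```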
exists(exists pred )
filter (exists pred )
takewhile(all pred )
Since expressions are evaluated from the inside out, the foldl call applies f to
the list elements from left to right, while the foldr call applies it to them from
right to left. The functionals are declared by
fun foldl f e [] = e
| foldl f e (x ::xs) = foldl f (f (x , e)) xs;
> val foldl = fn : (’a * ’b -> ’b) -> ’b -> ’a list -> ’b
fun foldr f e [] = e
| foldr f e (x ::xs) = f (x , foldr f e xs);
> val foldr = fn : (’a * ’b -> ’b) -> ’b -> ’a list -> ’b
Numerous functions can be expressed using foldl and foldr . The sum of a list
of numbers is computed by repeated addition starting from 0:
val sum = foldl op+ 0;
> val sum = fn : int list -> int
sum [1,2,3,4];
> 10 : int
These definitions work because 0 and 1 are the identity elements of + and ×, re-
spectively; in other words, 0+k = k and 1× k = k for all k . Many applications
of foldl and foldr are of this sort.
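A small sketch of definitions of this sort, using the foldl of the standard library, which has the same type as the one declared above. The names prod and len are illustrative.

```sml
(* Sum and product start from the identity elements 0 and 1. *)
val sum  = foldl op+ 0;
val prod = foldl op* 1;

(* Length ignores each element and adds 1 to the running count. *)
fun len l = foldl (fn (_, n) => n + 1) 0 l;

val ten        = sum [1, 2, 3, 4];
val twentyFour = prod [1, 2, 3, 4];
val three      = len ["a", "b", "c"];
```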
Both functionals take as their first argument a function of type σ × τ → τ .
This function may itself be expressed using functionals. A nested application
of foldl adds a list of lists:
foldl (fn (ns,n) => foldl op+ n ns) 0 [[1], [2,3], [4,5,6]];
> 21 : int
This is more direct than sum(map sum [[1], [2, 3], [4, 5, 6]]), which forms the
intermediate list of sums [1, 5, 15].
List construction (the operator ::) has a type of the required form. Supplying
it to foldl yields an efficient reverse function:
foldl op:: [] (explode "Richard");
> [#"d", #"r", #"a", #"h", #"c", #"i", #"R"] : char list
To append xs and ys, apply :: through foldr to each element of xs, starting
with ys:
foldr op:: ["out", "thee?"] ["And", "leave"];
> ["And", "leave", "out", "thee?"] : string list
Applying append through foldr joins a list of lists, like the function List.concat;
note that [] is the identity element of append:
foldr op@ [] [[1], [2,3], [4,5,6]];
> [1, 2, 3, 4, 5, 6] : int list
Recall that newmem adds a member, if not already present, to a list (Sec-
tion 3.15). Applying that function through foldr builds a ‘set’ of distinct ele-
ments:
foldr newmem [] (explode "Margaret");
> [#"M", #"g", #"a", #"r", #"e", #"t"] : char list
Cartesian products can be computed more clearly using map and List.concat,
at the expense of creating an intermediate list. Declare a curried pairing func-
tion:
fun pair x y = (x ,y);
> val pair = fn : ’a -> ’b -> ’a * ’b
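With pair in hand, a Cartesian product can be sketched as below. The name cartprod is an assumption; the inner map builds one list of pairs per element of xs, and List.concat flattens the resulting list of lists.

```sml
fun pair x y = (x, y);

(* Sketch: map (pair x) ys pairs x with every y; the outer map does this
   for every x, creating the intermediate list of lists. *)
fun cartprod (xs, ys) = List.concat (map (fn x => map (pair x) ys) xs);

val product = cartprod ([1, 2], ["a", "b"]);
```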
Exercise 5.14 Express the function union (Section 3.15) using functionals.
Surprisingly many functions have this form. Examples include drop and replist
(declared in Sections 3.4 and 5.2, respectively):
repeat tl 5 (explode "I’ll drown you in the malmsey-butt...");
> [#"d", #"r", #"o", #"w", #"n", #" ", ...] : char list
repeat (secl "Ha!" op::) 5 [];
> ["Ha!", "Ha!", "Ha!", "Ha!", "Ha!"] : string list
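These calls rely on a functional repeat. A minimal sketch, assuming that repeat f n x applies f to x exactly n times:

```sml
(* Sketch: repeat f n x = f (f (... (f x))), with n applications of f. *)
fun repeat f n x = if n > 0 then repeat f (n - 1) (f x) else x;

(* Dropping five characters by taking the tail five times. *)
val dropped = implode (repeat tl 5 (explode "I'll drown you"));

(* Building a list by consing the same element three times. *)
val laughs = repeat (fn l => "Ha!" :: l) 3 [];

(* Doubling 1 five times gives a power of two. *)
fun double k = 2 * k;
val thirtyTwo = repeat double 5 1;
```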
Tree recursion. The functional treefold, for binary trees, is analogous to foldr .
Calling foldr f e xs, figuratively speaking, replaces :: by f and nil by e in a
list. Given a tree, treefold replaces each leaf by some value e and each branch
by the application of a 3-argument function f .
fun treefold f e Lf = e
| treefold f e (Br (u,t1,t2)) = f (u, treefold f e t1, treefold f e t2);
> val treefold = fn
> : (’a * ’b * ’b -> ’b) -> ’b -> ’a tree -> ’b
This functional can express many of the tree functions of the last chapter. The
function size replaces each leaf by 0 and each branch by a function to add 1 to
the sizes of the subtrees:
treefold (fn(_,c1,c2) => 1+c1+c2) 0
To compute a preorder list, each branch joins its label to the lists for the subtrees:
treefold (fn(u,l1,l2) => [u] @ l1 @ l2) []
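The two instances can be tried on a small tree. This sketch assumes the binary tree datatype with constructors Lf and Br used in the declaration of treefold; the sample tree is illustrative.

```sml
datatype 'a tree = Lf | Br of 'a * 'a tree * 'a tree;

fun treefold f e Lf = e
  | treefold f e (Br (u, t1, t2)) = f (u, treefold f e t1, treefold f e t2);

(* Declared with an explicit argument so that both stay polymorphic. *)
fun size t = treefold (fn (_, c1, c2) => 1 + c1 + c2) 0 t;
fun preorder t = treefold (fn (u, l1, l2) => [u] @ l1 @ l2) [] t;

val birnam =
  Br ("The", Br ("wood", Lf, Br ("of", Lf, Lf)), Br ("Birnam", Lf, Lf));
val n = size birnam;
val labels = preorder birnam;
```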
Exercise 5.18 Declare the functional prefold such that prefold f e t is equiv-
alent to foldr f e (preorder t).
Exercise 5.22 Consider counting the Fun nodes in a term. Express this as
a function modelled on vars, then as a function modelled on accumVars and
finally without using functionals.
Exercise 5.23 Note that the result of vars tm mentions x twice. Write a func-
tion to compute the list of variables in a term without repetitions. Can you find
a simple solution using functionals?
To inspect the tail, apply the function xf to (). The argument, the sole value of
type unit, conveys no information; it merely forces evaluation of the tail.
fun tl (Cons(x ,xf )) = xf ()
| tl Nil = raise Empty;
> val tl = fn : ’a seq -> ’a seq
Calling cons(x , xq) combines a head x and tail sequence xq to form a longer
sequence:
fun cons(x ,xq) = Cons(x , fn()=>xq);
> val cons = fn : ’a * ’a seq -> ’a seq
take(from 30, 2)
⇒ take(Cons(30, fn()=>from(30 + 1)), 2)
⇒ 30 :: take(from(30 + 1), 1)
⇒ 30 :: take(Cons(31, fn()=>from(31 + 1)), 1)
⇒ 30 :: 31 :: take(from(31 + 1), 0)
⇒ 30 :: 31 :: take(Cons(32, fn()=>from(32 + 1)), 0)
⇒ 30 :: 31 :: []
⇒ [30, 31]
Observe that the element 32 is computed but never used. Type α seq is not really
lazy; the head of a non-empty sequence is always computed. What is worse,
each inspection of the tail evaluates it afresh; we do not have call-by-need,
only call-by-name. Such defects can be cured at the cost of considerable
extra complication (see Section 8.4).
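The reduction above can be reproduced with the following sketch. The datatype and the functions from and take are assumed to have the form implied by the trace: a Cons cell holds a head and a tail function.

```sml
(* A sequence is empty, or a head paired with a function yielding the tail. *)
datatype 'a seq = Nil | Cons of 'a * (unit -> 'a seq);

(* from k is the infinite sequence k, k+1, k+2, ... *)
fun from k = Cons (k, fn () => from (k + 1));

(* take (xq, n) returns the first n elements as a list. The argument
   xf () is evaluated before the count is tested, so one more cell is
   built than is used. *)
fun take (xq, 0) = []
  | take (Nil, n) = raise Subscript
  | take (Cons (x, xf), n) = x :: take (xf (), n - 1);

val firstTwo = take (from 30, 2);
```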
Exercise 5.24 Explain what is wrong with this version of from, describing the
computation steps of take(badfrom 30, 2).
fun badfrom k = cons(k , badfrom(k +1));
Exercise 5.25 This variant of type α seq represents every non-empty sequence
by a function, preventing premature evaluation of the first element (Reade, 1989,
page 324). Code the functions from and take for this type of sequences:
datatype 'a seq = Nil
                | Cons of unit -> 'a * 'a seq;
Exercise 5.26 This variant of α seq, declared using mutual recursion, is even
lazier than the one above. Every sequence is a function, delaying even the com-
putation needed to tell if a sequence is non-empty. Code the functions from and
take for this type of sequences:
datatype 'a seqnode = Nil
                    | Cons of 'a * 'a seq
and 'a seq = Seq of unit -> 'a seqnode;
The append function for sequences works like the one for lists. The elements
of xq @ yq are first taken from xq; when xq becomes empty, elements are taken
from yq.
fun Nil @ yq = yq
| (Cons(x ,xf )) @ yq = Cons(x , fn()=> (xf ()) @ yq);
> val @ = fn : ’a seq * ’a seq -> ’a seq
In its recursive call, interleave exchanges the two sequences so that neither can
exclude the other.
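A sketch of interleave along these lines, assuming the sequence datatype with a tail function in each Cons cell. The helper declarations are repeated so that the fragment stands alone.

```sml
datatype 'a seq = Nil | Cons of 'a * (unit -> 'a seq);

fun from k = Cons (k, fn () => from (k + 1));

fun take (xq, 0) = []
  | take (Nil, n) = raise Subscript
  | take (Cons (x, xf), n) = x :: take (xf (), n - 1);

(* The recursive call swaps the arguments, so elements are drawn
   alternately from the two sequences. *)
fun interleave (Nil, yq) = yq
  | interleave (Cons (x, xf), yq) =
      Cons (x, fn () => interleave (yq, xf ()));

val mixed = take (interleave (from 0, from 50), 6);
```

Appending from 0 and from 50 would never reach 50; interleaving reaches every element of both.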
Functionals for sequences. List functionals like map and filter can be general-
ized to sequences. The function squares is an instance of the functional map,
which applies a function to every element of a sequence:
fun map f Nil = Nil
| map f (Cons(x ,xf )) = Cons(f x , fn()=> map f (xf ()));
> val map = fn : (’a -> ’b) -> ’a seq -> ’b seq
To filter a sequence, successive tail functions are called until an element is found
to satisfy the given predicate. If no such element exists, the computation will
never terminate.
fun filter pred Nil = Nil
| filter pred (Cons(x ,xf )) =
if pred x then Cons(x , fn()=> filter pred (xf ()))
else filter pred (xf ());
> val filter = fn : (’a -> bool) -> ’a seq -> ’a seq
filter (fn n => n mod 10 = 7) (from 50);
> Cons (57, fn) : int seq
take(it, 8);
> [57, 67, 77, 87, 97, 107, 117, 127] : int list
A structure for sequences. Let us again gather up the functions we have ex-
plored, making a structure. As in the binary tree structure (Section 4.13), we
leave the datatype declaration outside to allow direct reference to the construc-
tors. Imagine that the other sequence primitives have been declared not at top
level but in a structure Seq satisfying the following signature:
signature SEQUENCE =
sig
exception Empty
val cons : 'a * 'a seq -> 'a seq
val null : 'a seq -> bool
val hd : 'a seq -> 'a
val tl : 'a seq -> 'a seq
val fromList : 'a list -> 'a seq
val toList : 'a seq -> 'a list
val take : 'a seq * int -> 'a list
val drop : 'a seq * int -> 'a seq
val @ : 'a seq * 'a seq -> 'a seq
val interleave : 'a seq * 'a seq -> 'a seq
val map : ('a -> 'b) -> 'a seq -> 'b seq
val filter : ('a -> bool) -> 'a seq -> 'a seq
val iterates : ('a -> 'a) -> 'a -> 'a seq
val from : int -> int seq
end;
Exercise 5.27 Declare the missing functions null and drop by analogy with
the list versions. Also declare toList, which converts a finite sequence to a list.
Exercise 5.28 Show the computation steps of add (from 5, squares(from 9)).
$[\underbrace{x_1, \ldots, x_1}_{k \text{ times}}, \underbrace{x_2, \ldots, x_2}_{k \text{ times}}, \ldots]$
Exercise 5.31 Which of the list functionals takewhile, dropwhile, exists and all
can sensibly be generalized to infinite sequences? Code those that can be, and
explain what goes wrong with the others.
Making change, revisited. The function allChange (Section 3.7) computes all
possible ways of making change. It is not terribly practical: using British coin
values, there are 4366 different ways of making change for 99 pence!
If the function returned a sequence, it could compute solutions upon demand,
saving time and storage. Getting the desired effect in ML requires care. Replac-
ing the list operations by sequence operations in allChange would achieve little.
The new function would contain two recursive calls, with nothing to delay the
second call’s execution. The resulting sequence would be fully evaluated.
Seq.@ (allChange(c::coins, c::coinvals, amount-c),
allChange(coins, coinvals, amount))
Better is to start with the solution of Exercise 3.14, where the append is replaced
by an argument to accumulate solutions. An accumulator argument is usually a
list. Should we change it to a sequence?
fun seqChange (coins, coinvals, 0, coinsf ) = Cons(coins,coinsf )
| seqChange (coins, [], amount, coinsf ) = coinsf ()
| seqChange (coins, c::coinvals, amount, coinsf ) =
if amount<0 then coinsf ()
else seqChange(c::coins, c::coinvals, amount-c,
fn()=> seqChange(coins, coinvals, amount, coinsf ));
> val seqChange = fn : int list * int list * int *
> (unit -> int list seq) -> int list seq
Instead of a sequence there is a tail function coinsf of type unit → int list seq.
This allows us to use Cons in the first line, instead of the eager Seq.cons. And
it requires a fn around the inner recursive call, delaying it. This sort of thing is
easier in Haskell.
We can now enumerate solutions, getting each one instantly:
seqChange([], gb_coins, 99, fn ()=> Nil);
> Cons ([2, 2, 5, 20, 20, 50], fn) : int list seq
Seq.tl it;
> Cons ([1, 1, 2, 5, 20, 20, 50], fn) : int list seq
Seq.tl it;
> Cons ([1, 1, 1, 1, 5, 20, 20, 50], fn) : int list seq
The overheads are modest. Computing all solutions takes 354 msec, which is
about 1/3 slower than the list version of the function and twice as fast as the
original allChange.
Prime numbers. The sequence of prime numbers can be computed by the Sieve
of Eratosthenes.
• Start with the sequence [2, 3, 4, 5, 6, . . .].
• Take 2 as a prime. Delete all multiples of 2, since they cannot be prime.
This leaves the sequence [3, 5, 7, 9, 11, . . .].
• Take 3 as a prime and delete its multiples. This leaves the sequence
[5, 7, 11, 13, 17, . . .].
• Take 5 as a prime . . . .
At each stage, the sequence contains those numbers not divisible by any of the
primes generated so far. Therefore its head is prime, and the process can con-
tinue indefinitely.
The function sift deletes multiples from a sequence, while sieve repeatedly
sifts a sequence:
fun sift p = Seq.filter (fn n => n mod p <> 0);
> val sift = fn : int -> int seq -> int seq
fun sieve (Cons(p,nf )) = Cons(p, fn()=> sieve (sift p (nf ())));
> val sieve = fn : int seq -> int seq
The sequence primes results from sieve [2, 3, 4, 5, . . .]. No primes beyond the
first are generated until the sequence is inspected.
val primes = sieve (Seq.from 2);
> val primes = Cons (2, fn) : int seq
Seq.take (primes, 25);
> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43,
> 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97] : int list
The simplest termination test is to stop when the absolute difference between
two approximations is smaller than a given tolerance ε > 0 (written eps below).4
fun within (eps:real ) (Cons(x ,xf )) =
let val Cons(y,yf ) = xf ()
in if Real .abs(x -y) < eps then y
else within eps (Cons(y,yf ))
end;
> val within = fn : real -> real seq -> real
Putting $10^{-6}$ for the tolerance and 1 for the initial approximation yields a square
root function:
fun qroot a = within 1E~6 (Seq.iterates (nextApprox a) 1.0);
> val qroot = fn : real -> real
qroot 5.0;
> 2.236067977 : real
it *it;
> 5.0 : real
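The pieces fit together as in the sketch below. It assumes that nextApprox is the Newton-Raphson step (a/x + x)/2 and that iterates f x produces x, f x, f (f x), and so on; both definitions are assumptions. The function within is written with a case expression so that a finite sequence raises Empty instead of a pattern-match failure.

```sml
datatype 'a seq = Nil | Cons of 'a * (unit -> 'a seq);

(* iterates f x is the infinite sequence x, f x, f (f x), ... *)
fun iterates f x = Cons (x, fn () => iterates f (f x));

(* Assumed Newton-Raphson step for the square root of a. *)
fun nextApprox a x = (a / x + x) / 2.0;

(* Stop at the first pair of successive approximations closer than eps. *)
fun within (eps : real) (Cons (x, xf)) =
      (case xf () of
           Cons (y, yf) =>
             if Real.abs (x - y) < eps then y
             else within eps (Cons (y, yf))
         | Nil => raise Empty)
  | within _ Nil = raise Empty;

fun qroot a = within 1E~6 (iterates (nextApprox a) 1.0);

val r = qroot 5.0;
```

The generator of approximations and the termination test are separate functions, so either can be replaced without touching the other.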
Would not a Fortran program be better? This example follows Hughes (1989)
and Halfant and Sussman (1988), who show how interchangeable parts involv-
ing sequences can be assembled into numerical algorithms. Each algorithm is
tailor made to suit its application.
For instance, there are many termination tests to choose from. The absolute
difference ($|x - y| < \epsilon$) tested by within is too strict for large numbers. We
could test relative difference ($|x/y - 1| < \epsilon$) or something fancier:
$$\frac{|x - y|}{(|x| + |y|)/2 + 1} < \epsilon$$
4 The recursive call passes Cons(y, yf ) rather than xf (), which denotes the same
value, to avoid calling xf () twice. Recall that our sequences are not truly lazy,
but employ a call-by-name rule.
$$e^x = \frac{1}{0!} + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^k}{k!} + \cdots$$
A sequence of sequences can be viewed using takeqq(xqq, (m, n)). This list of
lists is the m × n upper left rectangle of xqq.
The function List.concat appends the members of a list of lists, forming one
list. Let us declare an analogous function enumerate to combine a sequence
of sequences. Because the sequences may be infinite, we must use interleave
instead of append.
Here is the idea. If the input sequence has head xq and tail xqq, recursively
enumerate xqq and interleave the result with xq. If we take List.concat as a
model we end up with faulty code:
If the input to this function is infinite, ML will make an infinite series of recur-
sive calls, generating no output. This version would work in a lazy functional
language, but with ML we must explicitly terminate the recursive calls as soon
as some output can be produced. This requires a more complex case analysis.
If the input sequence is non-empty, examine its head; if that is also non-empty
then it contains an element for the output.
fun enumerate Nil = Nil
| enumerate (Cons(Nil , xqf )) = enumerate (xqf ())
| enumerate (Cons(Cons(x ,xf ), xqf )) =
Cons(x , fn()=> Seq.interleave(enumerate (xqf ()), xf ()));
> val enumerate = fn : ’a seq seq -> ’a seq
The second and third cases simulate the incorrect version’s use of interleave, but
the enclosing fn()=>· · · terminates the recursive calls.
Here is the sequence of all pairs of positive integers.
val pairqq = makeqq (Seq.from 1, Seq.from 1);
> val pairqq = Cons (Cons ((1, 1), fn), fn)
> : (int * int) seq seq
Seq.take(enumerate pairqq, 18);
> [(1, 1), (2, 1), (1, 2), (3, 1), (1, 3), (2, 2), (1, 4),
> (4, 1), (1, 5), (2, 3), (1, 6), (3, 2), (1, 7), (2, 4),
> (1, 8), (5, 1), (1, 9), (2, 5)] : (int * int) list
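The enumeration can be reproduced with the self-contained sketch below. The definition of makeqq is an assumption (row i pairs the i-th element of the first sequence with every element of the second), and mapq stands in for Seq.map.

```sml
datatype 'a seq = Nil | Cons of 'a * (unit -> 'a seq);

fun from k = Cons (k, fn () => from (k + 1));

fun take (xq, 0) = []
  | take (Nil, n) = raise Subscript
  | take (Cons (x, xf), n) = x :: take (xf (), n - 1);

fun mapq f Nil = Nil
  | mapq f (Cons (x, xf)) = Cons (f x, fn () => mapq f (xf ()));

fun interleave (Nil, yq) = yq
  | interleave (Cons (x, xf), yq) =
      Cons (x, fn () => interleave (yq, xf ()));

fun enumerate Nil = Nil
  | enumerate (Cons (Nil, xqf)) = enumerate (xqf ())
  | enumerate (Cons (Cons (x, xf), xqf)) =
      Cons (x, fn () => interleave (enumerate (xqf ()), xf ()));

fun pair x y = (x, y);

(* Assumed definition: a sequence of rows, one row per element of xq. *)
fun makeqq (xq, yq) = mapq (fn x => mapq (pair x) yq) xq;

val pairqq = makeqq (from 1, from 1);
val firstPairs = take (enumerate pairqq, 6);
```

Half of the output positions go to the first row, a quarter to the second row, and so on, so every pair appears after finitely many steps.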
We can be more precise about the order of enumeration. Consider the following
declarations:
fun powof2 n = repeat double n 1;
> val powof2 = fn : int -> int
fun pack (i,j) = powof2(i-1) * (2*j - 1);
> val pack = fn : int * int -> int
> [8, 24, 40, 56, 72, 88]] : int list list
Our enumeration decodes the packing function, returning the sequence of posi-
tive integers in their natural order:
Seq.take (enumerate nqq, 12);
> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] : int list
It is not hard to see why this is so. Each interleaving takes half its elements
from one sequence and half from another. Repeated interleaving distributes the
places in the output sequence by powers of two, as in the packing function.
Exercise 5.35 Generate the sequence of all finite lists of positive integers.
(Hint: first, declare a function to generate the sequence of lists having a given
length.)
Exercise 5.36 Show that for every positive integer k there are unique positive
integers i and j such that k = pack (i , j ). What is pack (i , j ) in binary notation?
Exercise 5.37 Adapt the definition of type α seq to declare a type of infinite
binary trees. Write a function itr that, applied to an integer n, constructs the
tree whose root has the label n and the two subtrees itr (2n) and itr (2n + 1).
By representing the set of solutions as a lazy list, the search strategy can be
chosen independently from the process that consumes the solutions. The lazy
list serves as a communication channel: the producer generates its elements
and the consumer removes them. Because the list is lazy, its elements are not
produced until the consumer requires them.
Figures 5.1 and 5.2 contrast the depth-first and breadth-first strategies, apply-
ing both to the same tree. The tree is portrayed at some point during the search,
with subtrees not yet visited as wedges. Throughout this section, no tree node
may have an infinite number of branches. Trees may have infinite depth.
In depth-first search, each subtree is fully searched before its brother to the
right is considered. The numbers in the figure show the order of the visits.
Node 5 is reached because node 4 is a leaf, while four subtrees remain to be
visited. If the subtree below node 5 is infinite, the other subtrees will never
be reached: the strategy is incomplete. Depth-first search is familiarly called
backtracking.
Breadth-first search visits all nodes at the current depth before moving on to
the next depth. In Figure 5.2 it has explored the tree to three levels. Because
of finite branching, all nodes will be reached: the strategy is complete. But it
is seldom practical, except in trivial cases. To reach a given depth, it visits an
exponential number of nodes and uses an exponential amount of storage.
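[Figures 5.1 and 5.2: the same tree under depth-first and breadth-first search, with nodes numbered in order of visit.]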
Breadth-first search stores the pending nodes on a queue, not on a stack. When y
is visited, its successors in next y are put at the end of the queue.5
fun breadthFirst next x =
let fun bfs [] = Nil
| bfs(y::ys) = Cons(y, fn()=> bfs(ys @ next y))
in bfs [x ] end;
> val breadthFirst = fn : (’a -> ’a list) -> ’a -> ’a seq
Both strategies simply enumerate all nodes in some order. Solutions are iden-
tified using the functional Seq.filter with a suitable predicate on nodes. Other
search strategies can be obtained by modifying these functions.
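The two strategies differ only in where the successors are placed. The sketch below assumes that depthFirst mirrors breadthFirst but pushes next y onto the front of the pending list, treating it as a stack; the function children describes a small hypothetical tree used purely for illustration.

```sml
datatype 'a seq = Nil | Cons of 'a * (unit -> 'a seq);

(* Convert a finite sequence to a list. *)
fun toList Nil = []
  | toList (Cons (x, xf)) = x :: toList (xf ());

(* Pending nodes form a stack: successors go on the front. *)
fun depthFirst next x =
  let fun dfs [] = Nil
        | dfs (y :: ys) = Cons (y, fn () => dfs (next y @ ys))
  in dfs [x] end;

(* Pending nodes form a queue: successors go on the end. *)
fun breadthFirst next x =
  let fun bfs [] = Nil
        | bfs (y :: ys) = Cons (y, fn () => bfs (ys @ next y))
  in bfs [x] end;

(* Hypothetical finite tree: node n has children 2n and 2n+1 while n < 4. *)
fun children n = if n < 4 then [2 * n, 2 * n + 1] else [];

val dfsOrder = toList (depthFirst children 1);
val bfsOrder = toList (breadthFirst children 1);
```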
Best-first search. Searches in Artificial Intelligence frequently employ a heuris-
tic distance function, which estimates the distance to a solution from any given
node. The estimate is added to the known distance from that node to the root, thereby
estimating the distance from the root to a solution via that node. These estimates impose
an order on the pending nodes, which are stored in a priority queue. The node with the
least estimated total distance is visited next.
If the distance function is reasonably accurate, best-first search converges rapidly to
a solution. If it is a constant function, then best-first search degenerates to breadth-first
search. If it overestimates the true distance, then best-first search may never find any
solutions. The strategy takes many forms, the simplest of which is the A* algorithm.
See Rich and Knight (1991) for more information.
5 Stacks and queues are represented here by lists. Lists make efficient stacks but
poor queues. Section 7.3 presents efficient queues.
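[Figure: the search tree of strings over A, B and C. Its first level holds A, B, C; its second level holds AA, BA, CA, AB, BB, CB, AC, BC, CC.]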
A palindrome is a list that equals its own reverse. Let us declare the correspond-
ing predicate:
fun isPalin l = (l = rev l );
> val isPalin = fn : ’’a list -> bool
There are, of course, more efficient ways of generating palindromes. Our ap-
proach highlights the differences between different search strategies. Let us
declare a function to help us examine sequences of nodes (implode joins a list
of characters to form a string):
fun show n csq = map implode (Seq.take(csq,n));
> val show = fn : int -> char list seq -> string list
Breadth-first search is complete and generates all the palindromes. Let us in-
spect the sequences before and after filtering:
show 8 (breadthFirst nextChar []);
> ["", "A", "B", "C", "AA", "BA", "CA", "AB"] : string list
show 8 (Seq.filter isPalin (breadthFirst nextChar []));
> ["", "A", "B", "C", "AA", "BB", "CC", "AAA"] : string list
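For experimentation, here is a self-contained sketch of the breadth-first palindrome search. The definition of nextChar is an assumption: it extends a list by putting A, B or C on the front. The function filterq stands in for Seq.filter.

```sml
datatype 'a seq = Nil | Cons of 'a * (unit -> 'a seq);

fun take (xq, 0) = []
  | take (Nil, n) = raise Subscript
  | take (Cons (x, xf), n) = x :: take (xf (), n - 1);

fun filterq pred Nil = Nil
  | filterq pred (Cons (x, xf)) =
      if pred x then Cons (x, fn () => filterq pred (xf ()))
      else filterq pred (xf ());

fun breadthFirst next x =
  let fun bfs [] = Nil
        | bfs (y :: ys) = Cons (y, fn () => bfs (ys @ next y))
  in bfs [x] end;

(* Assumed successor function: three children per node. *)
fun nextChar l = [#"A" :: l, #"B" :: l, #"C" :: l];

fun isPalin l = (l = rev l);

fun show n csq = map implode (take (csq, n));

val nodes  = show 8 (breadthFirst nextChar []);
val palins = show 8 (filterq isPalin (breadthFirst nextChar []));
```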
Depth-first search fails to find all solutions. Since the tree’s leftmost branch is
infinite, the search never leaves it. We need not bother calling Seq.filter :
show 8 (depthFirst nextChar []);
> ["", "A", "AA", "AAA", "AAAA", "AAAAA", "AAAAAA",
> "AAAAAAA"] : string list
. . . runs forever.
On the other hand, breadth-first search explores the entire subtree below B .
Filtering yields the sequence of all palindromes ending in B :
show 6 (breadthFirst nextChar [#"B"]);
> ["B", "AB", "BB", "CB", "AAB", "BAB"] : string list
show 6 (Seq.filter isPalin (breadthFirst nextChar [#"B"]));
> ["B", "BB", "BAB", "BBB", "BCB", "BAAB"] : string list
To generate the search tree, function nextQueen takes a board and returns the
list of the safe board positions having a new Queen. Observe the use of the
list functionals, map with a section and filter with a curried function. The
Eight Queens problem is generalized to the n Queens problem, which is to place
n Queens safely on an n × n board. Calling upto (declared in Section 3.1)
generates the list [1, . . . , n] of candidate Queens.
fun nextQueen n qs =
map (secr op:: qs) (List.filter (safeQueen qs) (upto(1,n)));
> val nextQueen = fn : int -> int list -> int list list
Let us declare a predicate to recognize solutions. Since only safe board positions
are considered, a solution is any board having n Queens.
fun isFull n qs = (length qs=n);
> val isFull = fn : int -> ’a list -> bool
Function depthFirst finds all 92 solutions for 8 Queens. This takes 130 msec:
fun depthQueen n = Seq.filter (isFull n) (depthFirst (nextQueen n) []);
> val depthQueen = fn : int -> int list seq
Seq.toList (depthQueen 8);
> [[4, 2, 7, 3, 6, 8, 5, 1], [5, 2, 4, 7, 3, 8, 6, 1],
> [3, 5, 2, 8, 6, 4, 7, 1], [3, 6, 4, 2, 8, 5, 7, 1],
> [5, 7, 1, 3, 8, 6, 4, 2], [4, 6, 8, 3, 1, 7, 5, 2],
> ...] : int list list
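The whole search can be assembled as in the sketch below. The definitions of safeQueen, upto and depthFirst are assumptions: a board is a list of row numbers with the most recent Queen first, and a new Queen is safe if it shares no row and no diagonal with the earlier ones. The section secr op:: qs is written out as an fn expression, and filterq stands in for Seq.filter.

```sml
datatype 'a seq = Nil | Cons of 'a * (unit -> 'a seq);

fun toList Nil = []
  | toList (Cons (x, xf)) = x :: toList (xf ());

fun filterq pred Nil = Nil
  | filterq pred (Cons (x, xf)) =
      if pred x then Cons (x, fn () => filterq pred (xf ()))
      else filterq pred (xf ());

fun depthFirst next x =
  let fun dfs [] = Nil
        | dfs (y :: ys) = Cons (y, fn () => dfs (next y @ ys))
  in dfs [x] end;

fun upto (m, n) = if m > n then [] else m :: upto (m + 1, n);

(* Assumed test: the Queen i columns back must not lie i rows away. *)
fun safeQueen oldqs newq =
  let fun nodiag (i, []) = true
        | nodiag (i, q :: qs) =
            Int.abs (newq - q) <> i andalso nodiag (i + 1, qs)
  in not (List.exists (fn q => q = newq) oldqs)
     andalso nodiag (1, oldqs)
  end;

fun nextQueen n qs =
  map (fn q => q :: qs) (List.filter (safeQueen qs) (upto (1, n)));

fun isFull n qs = (length qs = n);

fun depthQueen n = filterq (isFull n) (depthFirst (nextQueen n) []);

val solutions = toList (depthQueen 8);
```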
Since sequences are lazy, solutions can be demanded one by one. Depth-first
search finds the first solution quickly (6.6 msec). This is not so important for
the Eight Queens problem, but the 15 Queens problem has over two million
solutions. We can compute a few of them in one second:
Seq.take(depthQueen 15, 3);
> [[8, 11, 7, 15, 6, 9, 13, 4, 14, 12, 10, 2, 5, 3, 1],
> [11, 13, 10, 4, 6, 8, 15, 2, 12, 14, 9, 7, 5, 3, 1],
> [13, 11, 8, 6, 2, 9, 14, 4, 15, 10, 12, 7, 5, 3, 1]]
> : int list list
Imagine the design of a procedural program that could generate solutions upon
demand. It would probably involve coroutines or communicating processes.
Function breadthFirst finds the solutions slowly.6 Finding one solution takes
nearly as long as finding all! The solutions reside at the same depth in the search
tree; finding the first solution requires searching virtually the entire tree.
6 It takes 310 msec. A version using efficient queues takes 160 msec.
Let us examine this declaration in detail. Tail functions (of type unit → α seq)
rather than sequences must be used in order to delay evaluation. The function
call dfs k (y,sf ) constructs the sequence of all solutions found at depth k
below node y, followed by the sequence sf (). There are two cases to consider.
It can also solve the Eight Queens problem, quite slowly (340 msec). With a
larger depth interval d , iterative deepening recovers some of the efficiency of
depth-first search, while remaining complete.
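A sketch of iterative deepening consistent with the description of dfs above, with depth interval 1. The declaration is an assumption: dfs k (y, sf) returns a tail function for the nodes at depth k below y, followed by sf (), and deepen restarts the search one level deeper once a level is exhausted.

```sml
datatype 'a seq = Nil | Cons of 'a * (unit -> 'a seq);

fun take (xq, 0) = []
  | take (Nil, n) = raise Subscript
  | take (Cons (x, xf), n) = x :: take (xf (), n - 1);

fun depthIter next x =
  let
    (* Case k = 0: y itself is at the required depth.
       Case k > 0: search depth k-1 below each successor of y. *)
    fun dfs k (y, sf) =
      if k = 0 then fn () => Cons (y, sf)
      else foldr (dfs (k - 1)) sf (next y)
    (* After all nodes at depth k, continue with depth k+1. *)
    fun deepen k = dfs k (x, fn () => deepen (k + 1)) ()
  in
    deepen 0
  end;

(* Assumed successor function for the palindrome tree. *)
fun nextChar l = [#"A" :: l, #"B" :: l, #"C" :: l];

val nodes = map implode (take (depthIter nextChar [], 8));
```

Only the current path is stored, as in depth-first search, yet the nodes appear in the same order as in breadth-first search. On a finite tree this version keeps deepening forever once the last level is passed.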
Exercise 5.41 A flaw of depthIter is that it explores ever greater depths even
if the search space is finite. It can run forever, seeking the 93rd solution to the
Eight Queens problem. Correct this flaw; is your version as fast as depthIter ?