Statistical Physics: an
introduction in two parts
Posted by tselilschramm • September 15, 2018
Statistical physics has deep connections to
many computational problems, including
statistical inference, counting and sampling,
and optimization. Perhaps especially
compelling are the fieldʼs insights and
intuitions regarding “average-case
complexity” and information-computation
gaps. These are topics for which the
traditional theoretical CS approaches seem ill-
suited, while on the other hand statistical
physics has supplied a rich (albeit not always
mathematically rigorous) theory.
Statistical physics is the first topic in the
seminar course I am co-teaching with Boaz
this fall, and one of our primary goals is to
explore this theory. This blog post is a re-
working of a lecture I gave in class this past
Friday. It is meant to serve as an introduction
to statistical physics, and is composed of two
parts: in the first part, I introduce the basic
concepts from statistical physics in a hands-
on manner, by demonstrating a phase
transition for the Ising model on the complete
graph. In the second part, I introduce random
k-SAT and the satisfiability conjecture, and
give some moment-method based proofs of
bounds on the satisfiability threshold.
Update on September 16: the first version of this post contained an incorrect plot of the energy density of the Ising model on the complete graph, which I have amended below.
I. From particle interactions to
macroscopic behaviors
In statistical physics, the goal is to understand
how materials behave on a macroscopic scale
based on a simple model of particle-particle
interactions.
For example, consider a block of iron. In a block of iron, we have many iron particles, and each has a net $\pm$-polarization or “spin” which is induced by the quantum spins of its unpaired electrons. On the microscopic scale, nearby iron atoms “want” to have the same spin. From what I was able to gather on Wikipedia, this is because the unpaired electrons in the distinct iron atoms repel each other, and if two nearby iron atoms have the same spin, then they can arrange themselves in a physical configuration where the atoms are further apart in space, which results in a lower energy state (because of the repulsion between electrons).
When most of the particles in a block of iron
have correlated spins, then on a macroscopic
scale we observe this correlation as the
phenomenon of magnetism (or
ferromagnetism if we want to be technically
correct).
In the 1890ʼs, Pierre Curie showed that if you
heat up a block of iron (introducing energy
into the system), it eventually loses its
magnetization. In fact, magnetization exhibits
a phase transition: there is a critical
temperature, $T_c$, below which a block of iron
will act as a magnet, and above which it will
suddenly lose its magnetism. This is called
the “Curie temperature”. This phase transition
is in contrast to the alternative, in which the
iron would gradually lose its magnetization as
it is heated.
The Ising Model
Weʼll now set up a simple model of the
microscopic particle-particle interactions, and
see how the global phenomenon of the
magnetization phase transition emerges. This
is called the Ising model, and it is one of the
more canonical models in statistical physics.
Suppose that we have $n$ iron atoms, and that their interactions are described by the (for simplicity unweighted) graph $G$ with adjacency matrix $A$. For example, we may think of the atoms as being arranged in a 3D cubic lattice, and then $G$ would be the 3D cubic lattice graph. We give each atom a label in $[n]$, and we associate with each atom $i \in [n]$ a spin $x_i \in \{\pm 1\}$.

For each choice of spins, or state, $x \in \{\pm 1\}^n$, we associate the total energy

$$E(x) = -\sum_{i,j \in [n]} A_{ij}\, x_i x_j.$$

If two interacting particles have the same spin, then they are in a “lower energy” configuration, and they contribute $-1$ to the total energy (once for each ordered pair $(i,j)$ with $A_{ij} = 1$). If two neighboring particles have opposite spins, then they are in a “higher energy” configuration, and they contribute $+1$ to the total energy.
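As a tiny worked example (mine, just to fix the convention): if $G$ is a single edge on two atoms, so that $A_{12} = A_{21} = 1$, then

$$E(+1,+1) = E(-1,-1) = -2, \qquad E(+1,-1) = E(-1,+1) = +2,$$

since each ordered pair contributes once, and aligned spins give the lower-energy states.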
We also introduce a temperature parameter $T \ge 0$. At each $T$, we want to describe what a “typical” configuration of our block of iron looks like. When $T = 0$, there is no kinetic energy in the system, so we expect the system to be in the lowest-energy state, i.e. all atoms have the same spin. As the temperature increases, the kinetic energy also increases, and we will begin to see more anomalies.
In statistical physics, the “description” takes the form of a probability distribution over states $x \in \{\pm 1\}^n$. To this end we define the Boltzmann distribution, with density function

$$\mu_T(x) \propto \exp\left(-E(x)/T\right).$$

As $T \to 0$, $\mu_T$ becomes supported entirely on the $x$ that minimize $E(x)$; we call these the ground states (for connected $G$ these are exactly the all-$(+1)$ and all-$(-1)$ states). On the other hand, as $T \to \infty$, all $x \in \{\pm 1\}^n$ are weighted equally according to $\mu_T$.

Above we have defined the Boltzmann distribution to be proportional to $\exp(-E(x)/T)$. To spell it out,

$$\mu_T(x) = \frac{\exp\left(-E(x)/T\right)}{\sum_{y \in \{\pm 1\}^n} \exp\left(-E(y)/T\right)}.$$
The normalizing quantity

$$Z_T = \sum_{x \in \{\pm 1\}^n} \exp\left(-E(x)/T\right)$$

is referred to as the partition function, and is interesting in its own right. For example, from $Z_T$ we can compute the free energy of the system,

$$F(T) = -T \ln Z_T,$$

as well as the internal energy

$$U(T) = T^2 \cdot \frac{\partial}{\partial T} \ln Z_T$$

and the entropy

$$S(T) = -\frac{\partial F}{\partial T}.$$

Using some straightforward calculus, we can then see that

$$S(T) = -\sum_{x \in \{\pm 1\}^n} \mu_T(x) \ln \mu_T(x)$$

is the Shannon entropy, that

$$U(T) = \mathop{\mathbb{E}}_{x \sim \mu_T}\left[E(x)\right]$$

is the average energy in the system, and that the free energy is the difference of the internal energy and the product of the temperature and the entropy,

$$F(T) = U(T) - T \cdot S(T),$$

just like the classical thermodynamic definitions!
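To make these definitions concrete, here is a minimal brute-force sketch in Python (my own illustration, not from the lecture; the function names are hypothetical) that enumerates all $2^n$ states of a small Ising system, computes the Boltzmann distribution, $Z_T$, $F$, $U$, and $S$, and checks $F(T) = U(T) - T \cdot S(T)$ numerically:

```python
import itertools
import numpy as np

def ising_energy(x, A):
    """E(x) = -<x, Ax> for spins x in {-1,+1}^n and adjacency matrix A."""
    return -x @ A @ x

def boltzmann_quantities(A, T):
    """Brute-force Z_T, F, U, S at temperature T (small n only)."""
    n = A.shape[0]
    states = [np.array(s) for s in itertools.product([-1, 1], repeat=n)]
    energies = np.array([ising_energy(x, A) for x in states])
    weights = np.exp(-energies / T)
    Z = weights.sum()
    p = weights / Z                    # the Boltzmann distribution mu_T
    F = -T * np.log(Z)                 # free energy
    U = (p * energies).sum()           # internal energy
    S = -(p * np.log(p)).sum()         # Shannon entropy
    return Z, F, U, S

# Example: a 5-cycle at temperature T = 1.5.
n, T = 5, 1.5
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
Z, F, U, S = boltzmann_quantities(A, T)
print(F, U - T * S)  # the two printed values agree: F = U - T*S
```

Of course, this brute-force approach only works for very small $n$; much of the difficulty in statistical physics lies in understanding these quantities when direct enumeration is hopeless.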
Why these functions?
The free energy, internal energy, and entropy encode information about the typical behavior of the system at temperature $T$. We can get some intuition by considering the extremes, $T \to 0$ and $T \to \infty$.

In cold systems with $T \to 0$, if we let $E_0$ be the energy of the ground state, $N_0$ be the number of ground state configurations, and $\Delta$ be the energy gap (the difference between the second-smallest and smallest achievable energies), then

$$Z_T = N_0 \cdot e^{-E_0/T} \cdot \left(1 + O\left(e^{-\Delta/T}\right)\right),$$

where the $O(\cdot)$ notation hides factors that do not depend on $T$. From this it isn't hard to work out that

$$F(T) = E_0 - T \ln N_0 + O\left(T e^{-\Delta/T}\right).$$

We can see that the behavior of the system is dominated by the few ground states. As $T \to 0$, all of the free energy can be attributed to the internal energy term.

On the other hand, as $T \to \infty$, $\mu_T$ approaches the uniform distribution over $\{\pm 1\}^n$, so $S(T) \to n \ln 2$, and the behavior of the system is chaotic, with the free energy dominated by the entropy term $-T \cdot S(T)$.
Phase transition at the critical
temperature.
We say that the system undergoes a phase transition at $T_c$ if the (free) energy density

$$f(T) = \lim_{n \to \infty} \frac{1}{n} F(T)$$

is not analytic at $T = T_c$. Often, this comes from a shift in the relative contributions of the internal energy and entropy terms to $F(T)$. Phase transitions are often associated as well with a qualitative change in system behavior.
For example, we'll now show that for the Ising model with $G$ the complete graph with self-loops, the system has a phase transition at $T_c = 2$ (the self-loops don't make much physical sense, but are convenient to work with). Furthermore, we'll show that this phase transition corresponds to a qualitative change in the system, i.e. the loss of magnetism.

Define the magnetization of the system with spins $x$ to be

$$m(x) = \frac{1}{n} \sum_{i \in [n]} x_i.$$

If $|m(x)| = \Omega(1)$, then we say the system is magnetized.

In the complete graph, normalized so that the total interaction of each particle is $1$ (that is, $A_{ij} = \frac{1}{n}$ for all $i, j \in [n]$), there is a direct relationship between the energy and the magnetization:

$$E(x) = -\frac{1}{n} \sum_{i,j \in [n]} x_i x_j = -\frac{1}{n} \Big(\sum_{i \in [n]} x_i\Big)^2 = -n \cdot m(x)^2.$$
Computing the energy density.
The magnetization takes values $m \in \{-1, -1 + \frac{2}{n}, \ldots, 1 - \frac{2}{n}, 1\}$ for $x \in \{\pm 1\}^n$. So, letting $N_m$ be the number of states with magnetization $m$, we have that

$$Z_T = \sum_{m} N_m \cdot \exp\left(n m^2 / T\right).$$

Now, $N_m$ is just the number of strings with Hamming weight $\frac{1+m}{2} \cdot n$, so $N_m = \binom{n}{\frac{1+m}{2} n}$. By Stirling's approximation $\binom{n}{\alpha n} = \exp\left(n \cdot H(\alpha) + O(\log n)\right)$, where $H(\alpha) = -\alpha \ln \alpha - (1-\alpha) \ln(1-\alpha)$ is the entropy function, so up to lower-order terms

$$Z_T \approx \sum_{m} \exp\left(n \left(\frac{m^2}{T} + H\left(\frac{1+m}{2}\right)\right)\right).$$

Now we apply the following simplification: for a vector $v$ with $N$ non-negative entries, $\|v\|_\infty \le \|v\|_1 \le N \cdot \|v\|_\infty$, and then $\ln \|v\|_1 = \ln \|v\|_\infty + O(\ln N)$. Treating our summands as the entries of the vector $v$ (here with $N = n + 1$ entries, one per value of $m$), from this we have

$$\ln Z_T = \max_{m} \, n \left(\frac{m^2}{T} + H\left(\frac{1+m}{2}\right)\right) + O(\ln n).$$

By definition of the energy density,

$$f(T) = \lim_{n \to \infty} -\frac{T}{n} \ln Z_T,$$

and since the maximand is continuous in $m$ independently of $n$, we also have

$$f(T) = -\max_{m \in [-1,1]} \left(m^2 + T \cdot H\left(\frac{1+m}{2}\right)\right),$$

because the error from rounding $m \in [-1,1]$ to the nearest multiple of $\frac{2}{n}$ vanishes as $n \to \infty$.
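As a sanity check on this derivation, here is a short Python sketch (my own, with hypothetical function names) that compares the exact finite-$n$ quantity $-\frac{T}{n} \ln Z_T$, computed stably from the binomial sum, against the limiting variational formula; the gap should shrink like $O(\frac{\log n}{n})$:

```python
import numpy as np
from scipy.special import gammaln

def H(a):
    """Natural-log binary entropy, with the convention H(0) = H(1) = 0."""
    a = np.clip(a, 1e-12, 1 - 1e-12)
    return -a * np.log(a) - (1 - a) * np.log(1 - a)

def f_exact(T, n):
    """-(T/n) * ln Z_T for the Curie-Weiss model, via the binomial sum."""
    k = np.arange(n + 1)                 # Hamming weights
    m = 2 * k / n - 1                    # corresponding magnetizations
    log_binom = gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)
    log_terms = log_binom + n * m**2 / T
    lmax = log_terms.max()               # log-sum-exp, for stability
    logZ = lmax + np.log(np.exp(log_terms - lmax).sum())
    return -T * logZ / n

def f_limit(T):
    """The n -> infinity formula: -max_m (m^2 + T * H((1+m)/2))."""
    m = np.linspace(-1, 1, 20001)
    return -np.max(m**2 + T * H((1 + m) / 2))

for T in [0.5, 1.0, 2.0, 3.0]:
    print(T, f_exact(T, 2000), f_limit(T))  # gap shrinks as n grows
```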
We can see that the first term in the expression for $f(T)$ corresponds to the square of the magnetization (and therefore the energy); the more magnetized the system is, the larger the contribution from the first term. The second term corresponds to the entropy, or the number of configurations in the support; the larger the support, the larger the contribution of the second term. As $T \to \infty$, the contribution of the entropy term overwhelms the contribution of the energy term; this is consistent with our physical intuition.
A phase transition.
We'll now demonstrate that there is indeed a phase transition in $f(T)$. To do so, we solve for this maximum. Taking the derivative with respect to $m$, we have that

$$\frac{\partial}{\partial m} \left(m^2 + T \cdot H\left(\frac{1+m}{2}\right)\right) = 2m + \frac{T}{2} \ln\left(\frac{1-m}{1+m}\right),$$

so the derivative is $0$ whenever $m = \tanh\left(\frac{2m}{T}\right)$. From this, we can check the maxima. When $T < 2$, there are two maxima $\pm m^*$ equidistant from the origin, corresponding to negatively or positively-magnetized states. When $T \ge 2$, the maximizer is $m = 0$, corresponding to an unmagnetized state.
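To find the maximizing magnetization numerically, one can run a fixed-point iteration on the equation $m = \tanh(2m/T)$; here is a minimal Python sketch (my own, with a hypothetical function name), started from a positive guess so that it finds the positive branch when one exists:

```python
import numpy as np

def maximizing_magnetization(T, iters=100000):
    """Iterate m <- tanh(2*m/T) from a positive start. For T < 2 this
    converges to the positive maximizer m*; for T >= 2 it converges
    (slowly near T = 2) to the unique fixed point m = 0."""
    m = 0.9
    for _ in range(iters):
        m = np.tanh(2 * m / T)
    return m

for T in [1.0, 1.5, 1.9, 2.0, 2.5]:
    print(T, maximizing_magnetization(T))
# The maximizer is bounded away from 0 for T < 2 and vanishes for T >= 2,
# which is the loss of magnetism at the critical temperature.
```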
Given the maximizers, we now have the energy density. When we plot the energy density, the phase transition at $T_c = 2$ is subtle (an earlier version of this post contained a mistaken plot):

[plot of the energy density $f(T)$]

But when we plot the derivative, we can see that it is not smooth at $T_c$:

[plot of $f'(T)$, with a kink at $T_c = 2$]

And with some calculus it is possible to show that the second derivative is indeed not continuous at $T_c$.
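For readers who want to reproduce these plots, here is a short matplotlib sketch (my own reconstruction, not the post's original code): it evaluates $f(T)$ by grid search over $m$ and plots $f$ together with a finite-difference estimate of $f'$, whose kink at $T_c = 2$ is clearly visible.

```python
import numpy as np
import matplotlib.pyplot as plt

def H(a):
    a = np.clip(a, 1e-12, 1 - 1e-12)
    return -a * np.log(a) - (1 - a) * np.log(1 - a)

m = np.linspace(-1, 1, 20001)
Ts = np.linspace(0.5, 3.5, 601)
# f(T) = -max_m (m^2 + T * H((1+m)/2)), evaluated by grid search over m
f = np.array([-np.max(m**2 + T * H((1 + m) / 2)) for T in Ts])
df = np.gradient(f, Ts)  # finite-difference estimate of f'(T)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(Ts, f); ax1.set_xlabel("T"); ax1.set_ylabel("f(T)")
ax2.plot(Ts, df); ax2.set_xlabel("T"); ax2.set_ylabel("f'(T)")
ax2.axvline(2.0, linestyle="--")  # kink at the critical temperature T_c = 2
plt.show()
```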
Qualitatively, it is convincing that this phase transition in the energy density is related to a transition in the magnetization (because the maximizing $m$ corresponds to the typical magnetization). One can make this formal by performing a similar calculation to show that the internal energy undergoes a phase transition; in this case the internal energy is proportional to the expected squared magnetization,

$$U(T) = \mathop{\mathbb{E}}_{x \sim \mu_T}\left[E(x)\right] = -n \cdot \mathop{\mathbb{E}}_{x \sim \mu_T}\left[m(x)^2\right].$$
Beyond the complete graph
The Ising model on the complete graph
(also called the Curie-Weiss model) is
perhaps not a very convincing model for a
physical block of iron; we expect that locality
should govern the strength of the
interactions. But because the energy and the
magnetization are related so simply, it is easy
to solve.
Solutions are also known for the 1D and 2D
grids; solving it on higher-dimensional
lattices, as well as in many other interesting
settings, remains open. Interestingly, the
conformal bootstrap method that Boaz
mentioned has been used towards solving the
Ising model on higher-dimensional grids.
II. Constraint Satisfaction
Problems
For those familiar with constraint satisfaction problems (CSPs), it may have already been clear that the Ising model is a CSP. The spins are Boolean variables, and the energy function $E(x)$ is an objective function corresponding to the EQUALITY CSP on $G$ (a pretty boring CSP, when taken without negations). The Boltzmann distribution gives a probability distribution over assignments to the variables $x$, and the temperature $T$ determines the objective value of a typical $x \sim \mu_T$.
We can similarly define the energy, Boltzmann distribution, and free energy/entropy for any CSP (and even over continuous domains such as $\mathbb{R}^n$). Especially popular with statistical physicists are:

- CSPs (such as the Ising model) over grids and other lattices.
- Gaussian spin glasses: CSPs over $\{\pm 1\}^n$ or the unit sphere, where the energy function is proportional to $\langle W, x^{\otimes k} \rangle$, where $W$ is an order-$k$ symmetric tensor with entries sampled i.i.d. from $\mathcal{N}(0,1)$. The Ising model on a graph with random Gaussian weights is an example for $k = 2$.
- Random instances of $k$-SAT: Of the $2^k \binom{n}{k}$ possible clauses (with negations) on $n$ variables, $m$ clauses are sampled uniformly at random, and the energy function is the number of violated clauses, $E(x) = \#\{\text{clauses violated by } x\}$ (see the sketch after this list).
- Random instances of other Boolean and larger-alphabet CSPs.
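To make the random $k$-SAT item above concrete, here is a minimal Python sketch (my own illustration; the function names are hypothetical) that samples a random instance and evaluates the energy of an assignment as the number of violated clauses:

```python
import random

def random_ksat(n, m, k, rng=random):
    """Sample m clauses uniformly at random: each clause picks k distinct
    variables (1-indexed), each negated independently with prob. 1/2.
    A literal +i means x_i, and -i means NOT x_i."""
    clauses = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses

def energy(clauses, x):
    """E(x) = number of violated clauses; x maps variable index -> bool."""
    violated = 0
    for clause in clauses:
        if not any(x[abs(l)] == (l > 0) for l in clause):
            violated += 1
    return violated

# Example: a random 3-SAT instance near the conjectured threshold m ~ 4.27n.
n, k = 100, 3
m = int(4.27 * n)
instance = random_ksat(n, m, k)
assignment = {i: random.random() < 0.5 for i in range(1, n + 1)}
print(energy(instance, assignment))  # energy 0 means x satisfies the formula
```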
In some cases, these CSPs are reasonable
models for physical systems; in
other cases, they are primarily of theoretical
interest.
Algorithmic questions