Article

Generating conjectures on fundamental constants with the Ramanujan Machine

https://doi.org/10.1038/s41586-021-03229-4

Gal Raayoni1,4, Shahar Gottlieb1,4, Yahel Manor1,2,4, George Pisha1, Yoav Harris1, Uri Mendlovic3, Doron Haviv1, Yaron Hadad1 & Ido Kaminer1 ✉

Received: 30 April 2020
Accepted: 13 November 2020
Published online: 3 February 2021

Fundamental mathematical constants such as e and π are ubiquitous in diverse fields
of science, from abstract mathematics and geometry to physics, biology and
chemistry1,2. Nevertheless, for centuries new mathematical formulas relating
fundamental constants have been scarce and usually discovered sporadically3–6. Such
discoveries are often considered an act of mathematical ingenuity or profound
intuition by great mathematicians such as Gauss and Ramanujan7. Here we propose a
systematic approach that leverages algorithms to discover mathematical formulas for
fundamental constants and helps to reveal the underlying structure of the constants.
We call this approach ‘the Ramanujan Machine’. Our algorithms find dozens of well
known formulas as well as previously unknown ones, such as continued fraction
representations of π, e, Catalan’s constant, and values of the Riemann zeta function.
Several conjectures found by our algorithms were (in retrospect) simple to prove,
whereas others remain as yet unproved. We present two algorithms that proved useful
in finding conjectures: a variant of the meet-in-the-middle algorithm and a gradient
descent optimization algorithm tailored to the recurrent structure of continued
fractions. Both algorithms are based on matching numerical values; consequently,
they conjecture formulas without providing proofs or requiring prior knowledge of
the underlying mathematical structure, making this methodology complementary to
automated theorem proving8–13. Our approach is especially attractive when applied to
discover formulas for fundamental constants for which no mathematical structure is
known, because it reverses the conventional usage of sequential logic in formal
proofs. Instead, our work supports a different conceptual framework for research:
computer algorithms use numerical data to unveil mathematical structures, thus
trying to replace the mathematical intuition of great mathematicians and providing
leads to further mathematical research.

1Technion – Israel Institute of Technology, Haifa, Israel. 2The Technion Harry and Lou Stern Family Science and Technology Youth Center, Pre-University Education, Haifa, Israel. 3Google, Tel Aviv, Israel. 4These authors contributed equally: Gal Raayoni, Shahar Gottlieb, Yahel Manor. ✉e-mail: [email protected]

Nature | Vol 590 | 4 February 2021 | 67

Throughout history, simple formulas of fundamental constants symbolized simplicity, aesthetics and mathematical beauty2. A couple of well known examples include Euler’s identity e^{iπ} + 1 = 0 and the continued fraction representation of the golden ratio:

$$\varphi = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}} \qquad (1)$$

We use the term regular formulas (RFs) for any mathematical expression that can be encapsulated using a computable expression14, such as equation (1).

The act of discovering new RFs is often attributed to profound intuition, such as in the case of Gauss’ ability to see meaningful patterns in numerical data that led to the famous prime number theorem and new fields of analysis such as elliptic and modular functions. He is even famous for saying: “I have the result, but I do not yet know how to get it”15, which emphasizes the role of identifying patterns and RFs in data as enabling acts of mathematical discovery.

In a different field but a similar manner, Johannes Rydberg’s discovery of his formula of hydrogen spectral lines16 resulted from his analysis of the spectral emission by chemical elements:

$$\lambda^{-1} = R_H \left( n_1^{-2} - n_2^{-2} \right),$$

where λ is the emission wavelength, R_H is the Rydberg constant, and n_1 and n_2 are the lower and upper quantum energy levels, respectively. This insight, emerging directly from identifying patterns in data, had profound implications on quantum mechanics and modern physics.

Unlike measurements in physics and all other sciences, most mathematical constants can be calculated to an arbitrary precision (number of digits) with an appropriate formula, thus providing an absolute ground truth. In this sense, mathematical constants contain an unlimited amount of data (for example, the digits in an irrational number), which we use as ground truth for finding new RFs. Since the


fundamental constants are universal and ubiquitous in their applications, finding such patterns can reveal new mathematical structures with broad implications, for example, the Rogers–Ramanujan continued fraction (which has implications on modular forms)17. Consequently, having systematic methods to derive new RFs could help research in many fields of science.

[Figure 1: flow chart. Box labels: pattern learning and generalization, RF, validation, redundancy elimination, conjecture, rigorous proof. Phase labels: Conjecturing, Proving.]

Fig. 1 | Conceptual flow of the wider concept of the Ramanujan Machine. First, using approaches of pattern learning and generalization, we can generate a space of RF conjectures, for example, PCFs. We then apply a search algorithm, validate potential conjectures, and remove redundant results. Finally, validated results form mathematical conjectures that need to be proven analytically, thus closing a complete research endeavour from pattern generation to proof, potentially yielding further mathematical insight.

In this Article, we present a concept of learning mathematical relations of fundamental constants and provide a list of conjectures found using this method. Although the concept can be leveraged for many forms of RFs, we demonstrate its potential with equations involving polynomial continued fractions (PCFs)18

$$a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \cfrac{b_3}{a_3 + \cdots}}} \qquad (2)$$

where the partial denominators a_n and partial numerators b_n are the evaluations (at x = n) of polynomials α(x), β(x) ∈ ℤ[x], respectively. PCFs have been of interest to mathematicians for centuries and still are, for example, William Brouncker’s π representation19 and Zudilin’s work on difference equations and Catalan’s constant (for example, ref. 5).

One reason we chose to focus on PCFs is their ability to balance simplicity and broad implications. Their structure is accessible for computer-based exploration using large integer operations, making them a good testing-ground for automated conjecturing. At the same time, PCFs are related to many special functions and generalize all infinite sums. PCFs also allow us to isolate unique aspects of importance to fundamental constants such as testing irrationality and normality6 using efficient computation methods (see Supplementary Information sections D and G). Moreover, PCFs are abundant in many areas of mathematics20,21 because they constitute an important special case of a general mathematical object: linear recurrence relations with polynomial coefficients. Recurrences of depth 2 correspond to PCFs and appear in this form in many problems (PCFs with alternating polynomials, as shown below, correspond to recursion depths >2). The solutions to such recurrences are usually very complex and include special functions (for example, hypergeometric functions and the incomplete gamma function). For this reason, finding new PCF identities is valuable for different mathematical objects, especially when incorporated as a part of symbolic calculation programs (such as Maple and Wolfram Mathematica). More on PCFs in the Supplementary Information.

We demonstrate our approach by finding identities between a PCF and a fundamental constant substituted into a rational function. For efficient enumeration and expression aesthetics, we limit ourselves to integer polynomials on both sides of the equality. We propose two search algorithms. The first algorithm uses a meet-in-the-middle (MITM) technique, first executed to a relatively small precision to reduce the search space and eliminate mismatches. We then increase its precision with a higher number of PCF iterations on the remaining matching sequences to validate them as conjectured RFs; the algorithm is therefore called MITM-RF. The second algorithm uses an optimization-based gradient descent (GD) method, which we call Descent&Repel, converging to integer lattice points that define conjectured RFs.

Our MITM-RF algorithm was able to produce several novel conjectures that have short proofs, for example:

$$\frac{4}{3\pi - 8} = 3 - \cfrac{1 \times 1}{6 - \cfrac{2 \times 3}{9 - \cfrac{3 \times 5}{12 - \cfrac{4 \times 7}{\ddots}}}}$$

$$\frac{2}{\pi + 2} = 0 - \cfrac{1 \times (3 - 2 \times 1)}{3 - \cfrac{2 \times (3 - 2 \times 2)}{6 - \cfrac{3 \times (3 - 2 \times 3)}{9 - \cfrac{4 \times (3 - 2 \times 4)}{\ddots}}}}$$

$$\frac{e}{e - 2} = 4 - \cfrac{1}{5 - \cfrac{2}{6 - \cfrac{3}{7 - \cfrac{4}{\ddots}}}}$$

$$\frac{1}{e - 2} = 1 + \cfrac{1}{1 + \cfrac{-1}{1 + \cfrac{2}{1 + \cfrac{-1}{1 + \cfrac{3}{\ddots}}}}} \qquad (3)$$

These RFs are auto-generated conjectures for mathematical formulas of fundamental constants, generated by applying the MITM-RF algorithm. These conjectures were proven by contributions from the community following the first appearance of our work on the arXiv preprint server53 (see Supplementary Information section F). Both results for π converge exponentially, and both results for e converge super-exponentially. Supplementary Information section A presents additional results (Supplementary Tables 1–3) found by our algorithms along with their convergence rates.

Our MITM-RF algorithm also produced novel conjectures that are currently still unproved:

$$\frac{8}{\pi^2} = 1 - \cfrac{2 \times 1^4 - 1^3}{7 - \cfrac{2 \times 2^4 - 2^3}{19 - \cfrac{2 \times 3^4 - 3^3}{37 - \cfrac{2 \times 4^4 - 4^3}{\ddots}}}}$$

$$\frac{12}{7\zeta(3)} = 1 \times 2 - \cfrac{16 \times 1^6}{3 \times 12 - \cfrac{16 \times 2^6}{5 \times 32 - \cfrac{16 \times 3^6}{7 \times 62 - \cfrac{16 \times 4^6}{\ddots}}}}$$

$$\frac{8}{7\zeta(3)} = 1 \times 1 - \cfrac{1^6}{3 \times 7 - \cfrac{2^6}{5 \times 19 - \cfrac{3^6}{7 \times 37 - \cfrac{4^6}{\ddots}}}}$$

$$\frac{2}{-1 + 2G} = 3 + 0 \times 7 - \cfrac{6 \times 1^3}{3 + 1 \times 10 - \cfrac{8 \times 2^3}{3 + 2 \times 13 - \cfrac{10 \times 3^3}{\ddots}}} \qquad (4)$$
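As a quick numerical sanity check of identities of this kind, the following minimal sketch evaluates a PCF with exact integer arithmetic, using the standard three-term recurrence for continued-fraction convergents (the paper states it later as equation (6)). The function name `pcf_convergent`, the chosen depths and the lambda encodings of a_n and b_n are illustrative assumptions, not the authors' code. The minus signs of the printed formulas are folded into b_n so that the form of equation (2) applies.

```python
from fractions import Fraction
import math


def pcf_convergent(a, b, depth):
    """Return the convergent p_n/q_n (n = depth) of a(0) + b(1)/(a(1) + b(2)/(a(2) + ...)).

    a and b are callables returning the integers a_n and b_n.
    """
    p_prev, q_prev = 1, 0          # p_{-1}, q_{-1}
    p, q = a(0), 1                 # p_0, q_0
    for n in range(1, depth + 1):
        # Simultaneous update: the right-hand side uses the old p and q.
        p, p_prev = a(n) * p + b(n) * p_prev, p
        q, q_prev = a(n) * q + b(n) * q_prev, q
    return Fraction(p, q)


# First formula of equation (3): a_n = 3(n + 1), b_n = -n(2n - 1).
pi_side = pcf_convergent(lambda n: 3 * (n + 1), lambda n: -n * (2 * n - 1), 80)
# Third formula of equation (3): a_n = n + 4, b_n = -n.
e_side = pcf_convergent(lambda n: n + 4, lambda n: -n, 40)

print(float(pi_side), 4 / (3 * math.pi - 8))
print(float(e_side), math.e / (math.e - 2))
```

Because `Fraction` keeps every convergent exact, the only rounding happens in the final conversion to `float`; a check to hundreds of digits would need a high-precision reference value for the constant instead of `math.pi` and `math.e`.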


To the best of our knowledge, these results are previously unknown conjectures. ζ refers to the Riemann zeta function, and G refers to the Catalan constant. The section ‘Efficient computation and irrationality bounds of the Catalan constant’ presents implications of our results for the computation of the Catalan constant.

One may wonder whether the conjectures discovered by this work are indeed mathematical identities or merely mathematical coincidences that break down once enough digits are calculated. However, the method employed in this work makes it fairly unlikely for the conjectures to break down. For example, the probability of finding a false positive for an enumeration space of 10^9 with accuracy of 50 digits is smaller than 10^−40. Our algorithms tested the conjectures for up to 2,000 digits of accuracy.

Nevertheless, high accuracy will never substitute for a formal proof, as there exist mathematical coincidences of RFs that appear to accurately represent a constant to a high degree of approximation despite being fallacies22. We believe and hope that proofs of new computer-generated conjectures on fundamental constants will help to create mathematical knowledge.

In contrast to the method we present, many known RFs for fundamental constants were discovered by conventional proofs, that is, sequential logical steps derived from known properties23. In our work, we aim to reverse this process, finding new RFs for the fundamental constants using their numerical data alone, without any prior knowledge about their mathematical structure (Fig. 1). Each RF may enable reverse-engineering of the mathematical structure that produces it. In certain cases, where the proof uses new techniques, it may also provide insight into the field. Our approach could be especially valuable when applied for empirical constants, such as the Feigenbaum constant from chaos theory (Table 1), which are derived numerically from simulations and have no analytic representation.

Table 1 | A sample of fundamental constants that are relevant targets for our method

Field                            Name                           Decimal expansion
Related to continued fractions   Lévy’s constant                γ = 3.275822…
                                 Khinchin’s constant            K0 = 2.685452…
Chaos theory                     First Feigenbaum constant      δ = 4.669201…
                                 Second Feigenbaum constant     α = 2.502907…
                                 Laplace limit                  r* = 0.662743…
Number theory                    Twin prime constant            Π2 = 0.660161…
                                 Meissel–Mertens constant       M = 0.261497…
                                 Landau–Ramanujan constant      Λ = 0.764223…
Combinatorics                    Euler–Mascheroni constant      γ = 0.577215…
                                 Golomb–Dickman constant        λ = 0.624329…

There are thousands of additional constants for which enough numerical data exist, and our method is applicable. For all of these, new RF conjectures will point to deep underlying connections. With further improvement in our approach, along with new algorithms provided by the community, we expect that more expressions will be found. Note that some constants in the table, such as the Feigenbaum constants, have no analytical expression whatsoever, and so far can only be computed using numerical simulations. Therefore, having a RF for them will reveal a hidden truth not only about the constant but also about the entire field to which it relates. A wider list of constants is available in ref. 1.

Given the success of our approach to finding new RFs for fundamental constants, there are additional avenues for more advanced algorithms and future research. Inspired by worldwide collaborative efforts in mathematics such as the Great Internet Mersenne Prime Search (GIMPS; https://www.mersenne.org/), we launched the initiative http://www.RamanujanMachine.com, dedicated to finding new RFs for fundamental constants. The general community can donate computational time to find RFs, propose mathematical proofs for conjectured RFs, or suggest new algorithms for finding them (Supplementary Information section B). Since its inception53, the Ramanujan Machine initiative has already yielded fruit, and several of the conjectures posed by our algorithms have already been proved (Supplementary Information section F).

Related work
The process of mathematical research is complex, nonlinear and often leverages abstract mathematical intuition, all of which are difficult to express and study thoroughly. Respecting this fact, one may think in an oversimplified manner about mathematical research as being separated into two main steps: conjecturing and proving (as in Fig. 1).

Although both steps have received some attention in the literature, it is the second step that has been studied more extensively in the computer science literature and is known as automated theorem proving (ATP)24, which focuses on proving existing conjectures. In ATP, algorithms have already proved many theorems such as the Four Colour Theorem8, the Robbins’ problem10, the Kepler Conjecture on the density of sphere packing11, a conjectured identity for ζ(4) (see ref. 25), and various combinatorial identities9. There are also recent machine learning applications for ATP, such as graph neural networks12,13.

Our work focuses on automating the first step of the process, automated conjecture generation. Early work on automated conjecture generation appeared 60 years ago26 and included substantial contributions such as the Automated Mathematician and EURISKO27–29, which envisioned the use of computers for the entire process of scientific discovery. Notable work by Fajtlowicz (called GRAFFITI) has found new conjectures in graph theory and matrix theory30 by analysing properties such as chromatic index and independence number on a large number of graphs and deducing general rules. Recent work applied machine learning techniques to analyse millions of elliptic curves31 and explore their characteristics. Automated conjecture generation has also been used as part of a combined approach with ATP32,33, for example, on the irrationality measure of π (ref. 4).

A particularly noteworthy algorithm in this context is PSLQ34, which was employed to study the Riemann zeta function, finding formulas “by a combination of inspired guessing and extensive searching”35. PSLQ numerically discovered a new formula for π (which was later proved)36,

$$\pi = \sum_{n=0}^{\infty} \frac{1}{16^n} \left( \frac{4}{8n+1} - \frac{2}{8n+4} - \frac{1}{8n+5} - \frac{1}{8n+6} \right).$$

With further manipulation and analysis, this formula gave rise to an algorithm that computes strings of binary or base-16 digits of π starting at a given position, without needing to know the preceding digits.

The general approach of using algorithmic and computational tools to explore the mathematical universe and discover conjectures worthy of further examination is known in the mathematical literature as experimental mathematics. A famous example is the work of Wolfram, who has championed experimental-computational methods to investigate the properties of cellular automata37.

A similar approach for using computer algorithms in mathematical research is now developed and applied in physics. Specifically, supervised and unsupervised machine learning have been applied to discover physical laws from measured data (for example, refs. 38–43).

Our work differs from all of the above in several respects. We present an end-to-end automated conjecture generation that can validate conjectures to arbitrary precision using numerical data as ground truth and allowing for a fully-automatic process that removes redundancy and false positives without user input. Our conjectures focus on formulas for fundamental constants.

Proposing conjectures is sometimes more important than proving them. For this reason, some of the most original mathematicians and scientists are known for their famous unsolved conjectures rather than for their solutions to other problems, such as Fermat’s last theorem,


[Figure 2: schematic of the MITM-RF pipeline. Panel labels: preprocessing, LHS enumeration, LHS hash (key: numerical value; value: symbolic expression), RHS enumeration, calculation of RHS numerical values, matching values (‘hits’), check hits, increase accuracy, validate to high precision, new conjecture.]

Fig. 2 | The Meet-In-The-Middle Regular Formula algorithm. The figure describes the MITM-RF algorithm that finds PCFs for fundamental constants. First, we enumerate the LHS to a low precision (for example, 10 digits) and store the results in a hash table. Second, we enumerate over the RHS at low precision and search for matches. Finally, the matches are re-evaluated to higher precision and compared again, thus eliminating false positives. The final results are then presented as new conjectures.
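The three stages in the Fig. 2 caption can be sketched on a deliberately tiny search space. Everything below is an illustrative assumption rather than the authors' implementation: the constant is e, computed as an exact rational from its Taylor series; the left-hand side is restricted to ratios (u0 + u1·c)/(v0 + v1·c) with coefficients in [−2, 2]; the right-hand side uses linear a_n and b_n in the form of equation (2); and the helper names `pcf`, `key` and `mitm_search` are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import factorial

# e as an exact rational; 60 Taylor terms give far more digits than needed here.
E = sum(Fraction(1, factorial(k)) for k in range(60))


def pcf(a, b, depth):
    """Convergent of a(0) + b(1)/(a(1) + b(2)/(...)); None if the denominator is 0."""
    p_prev, q_prev, p, q = 1, 0, a(0), 1
    for n in range(1, depth + 1):
        p, p_prev = a(n) * p + b(n) * p_prev, p
        q, q_prev = a(n) * q + b(n) * q_prev, q
    return Fraction(p, q) if q else None


def key(x, digits=10):
    """Hash key: the value rounded to a fixed number of decimal digits."""
    return round(x, digits)


def mitm_search(const, coeffs=range(-2, 3), low_depth=12, high_depth=40, digits=30):
    # Stage 1: enumerate the LHS at low precision and store it in a hash table.
    table = {}
    for u0, u1, v0, v1 in product(coeffs, repeat=4):
        den = v0 + v1 * const
        if den == 0 or u0 * v1 == u1 * v0:   # undefined, or independent of the constant
            continue
        table.setdefault(key((u0 + u1 * const) / den), []).append((u0, u1, v0, v1))

    # Stage 2: enumerate PCFs at low depth and look each value up in the table.
    hits = []
    for a0, a1, b0, b1 in product(range(1, 6), range(0, 2), coeffs, coeffs):
        low = pcf(lambda n: a0 + a1 * n, lambda n: b0 + b1 * n, low_depth)
        if low is not None:
            hits += [(lhs, (a0, a1, b0, b1)) for lhs in table.get(key(low), [])]

    # Stage 3: re-evaluate the hits at higher depth to discard false positives.
    tol = Fraction(1, 10 ** digits)
    confirmed = []
    for (u0, u1, v0, v1), (a0, a1, b0, b1) in hits:
        high = pcf(lambda n: a0 + a1 * n, lambda n: b0 + b1 * n, high_depth)
        lhs_value = (u0 + u1 * const) / (v0 + v1 * const)
        if high is not None and abs(high - lhs_value) < tol:
            confirmed.append(((u0, u1, v0, v1), (a0, a1, b0, b1)))
    return confirmed


found = mitm_search(E)
for lhs, rhs in found:
    print("(u0, u1, v0, v1) =", lhs, " (a0, a1, b0, b1) =", rhs)
```

Keying on a rounded value keeps the lookup at constant time per candidate, at the cost of occasionally missing a true match whose two sides fall on opposite sides of a rounding boundary; a more careful version might also probe the neighbouring keys. The sketch does not remove redundant results, so equivalent rescalings of the same identity are reported separately.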

Hilbert’s problems, Landau’s problems, and of course the Riemann Hypothesis44,45. Maybe the most famous example is Ramanujan, who posed dozens of conjectures involving fundamental constants and considered them to be revelations from his family’s goddess7. Our work aims to automate the process of conjecture generation and demonstrate it by providing new conjectures for fundamental constants. By analysing mathematical relationships of fundamental constants that are aesthetic and concise, the Ramanujan Machine can eventually extend the work of great mathematicians such as Gauss, Riemann and Ramanujan.

The MITM-RF algorithm
The first algorithm we present searches for a PCF of a given fundamental constant c (for example, c = π) of the following form:

$$\frac{\gamma(c)}{\delta(c)} = f_i(\mathrm{PCF}(\alpha, \beta)), \qquad (5)$$

for a set of four integer-coefficient polynomials (α, β, γ and δ), and a given set of functions {f_i} (for example, f_1(x) = x, f_2(x) = 1/x, ⋯). PCF(α, β) means the PCF with the partial denominator a_n = α(n) and partial numerator b_n = β(n) defined in equation (2).

As showcased in Fig. 2, we start by enumerating over the two sides of equation (5) and successively generate integer polynomials for α, β, γ and δ. We calculate the left-hand side (LHS) of each instance up to limited precision and store the results in a hash table. We continue by evaluating the right-hand side (RHS) and attempt to match each result in the hash table, where successful attempts are considered candidate solutions. The RHS is calculated with arbitrary-size integers, directly using the recurrence formula for the numerators p_n and the denominators q_n of the rational approximation of the PCF:

$$\begin{aligned} q_{-1} &= 0, & p_{-1} &= 1, \\ q_0 &= 1, & p_0 &= a_0, \\ q_{n+1} &= a_{n+1} q_n + b_{n+1} q_{n-1}, & p_{n+1} &= a_{n+1} p_n + b_{n+1} p_{n-1}. \end{aligned} \qquad (6)$$

Since the LHS and RHS calculations are performed up to a limited precision, some of the candidate solutions are typically false positives, eliminated by calculating the RHS and LHS to higher precision in the last stage (Fig. 2). See Methods for the algorithm complexity and implementation details (see code at http://www.RamanujanMachine.com).

Our proposed MITM algorithm discovered previously known PCFs and new PCF conjectures for mathematical constants such as ζ(3) (that is, the Apéry constant) and the Catalan constant, presented in equation (4). (Supplementary Information section A provides details of the constants for which we ran searches, successful or otherwise.) After discovering dozens of PCFs, we empirically observed (and later proved, Supplementary Information section D) a relationship between the ratio of the polynomial order of a_n and b_n, and the formula’s convergence rate (Extended Data Fig. 1). Supplementary Information section C provides a wider outlook on PCFs.

The Descent&Repel algorithm
We propose a GD optimization method and demonstrate its success in finding RFs. Although proved successful, the MITM-RF method is not trivially scalable. This issue can be targeted by either a more sophisticated variant or by switching to an optimization-based method, as is done by the following algorithm (Fig. 3).

To find integer solutions to equation (5), we write the following constrained optimization problem with the loss function ℒ:

$$\min_{\alpha, \beta, \gamma, \delta} \mathcal{L} = \left| \frac{\gamma(\pi)}{\delta(\pi)} - \mathrm{PCF}(\alpha, \beta) \right| \quad \text{where } \{\alpha, \beta, \gamma, \delta\} \subset \mathbb{Z}[x]. \qquad (7)$$

Solving this optimization problem with GD seems implausible because we are only satisfied with exact ℒ = 0 for integer parameters. Non-zero ℒ solutions are usually meaningless as mathematical conjectures, as they are only approximations.

Nevertheless, we found a feature of ℒ that helped us develop a slightly modified GD, which we name Descent&Repel (Fig. 3). Examples of the


results appear in Extended Data Table 1. Without the restriction of being integers, the zero ℒ minima are not 0-dimensional points but rather (d−1)-dimensional manifolds with d being the number of optimization variables. Specifically, in the case plotted in Fig. 3, there are d = 2 optimization variables, and therefore a 1-dimensional manifold of minima, appearing as bright curves in the maps. This dimensionality of the minima is expected given the definition of the loss function ℒ, which poses only a single constraint. Consequently, the GD process is expected to result in a solution with ℒ = 0. The high dimension of the manifold of minima motivates our approach of adding the repel step to the algorithm since most minima have a neighbourhood that contains additional minima. See Methods for the algorithm initialization and stages.

[Figure 3: sequence of loss maps over the (x, y) plane, both axes running from −20 to 20. Step labels: initial points, GD, Coulomb repulsion, repeat, seek nearby integers. Map annotations: global minima curves, diverging error curves.]

Fig. 3 | The Descent&Repel algorithm. The figure describes the Descent&Repel algorithm that finds RFs for fundamental constants by relying on GD optimization. The x and y axes are parameters defining the polynomials of the continued fraction (in this case α(n) = n, β(n) = n^2 + yn + x; see Supplementary Information section E, Supplementary Table 4, and Supplementary Fig. 1). The key observation that enables this method is that almost all minima have zero loss (ℒ = 0) and appear as (d − 1)-dimensional manifolds, where d is the number of optimization variables. Starting with our initial conditions (in this example, consisting of 600 points on a vertical line), we perform ordinary GD alternated with ‘Coulomb’ repulsion between all the points. Finally, we alternate two GD optimizations to reach grid points: towards integer points and the minimum curves. Lastly, we check whether any point satisfies the equation. The colours indicate the loss ℒ (logarithmic scale): for the background, purple indicates larger loss and white indicates zero loss; for the points, red indicates larger loss and dark blue indicates zero loss.

We ran the algorithm on several different search spaces (mostly with d = 2, Supplementary Information section E). The current implementation of the algorithm serves as a proof of concept and as a testing environment for GD variants. As such, it has not yet been executed on large search spaces. The success we had in finding conjectures in these limited runs shows the prospects of using this algorithm on larger search spaces with different parameter choices.

Irrationality bounds of the Catalan constant
Finding RFs for fundamental constants can have important prospects for proving their intrinsic properties. An example is Apéry’s proof that ζ(3) is irrational, which uses a PCF representation3, and led to similar proofs for other constants46. Finding fast-converging RFs could also provide more efficient ways of computing fundamental constants; for example, one of the most efficient historical methods of computing π was based on a formula by Ramanujan47. Similarly, the fastest-converging expression for the Catalan constant was a PCF by Zudilin5 until a relatively recent contribution48. The latter was recently used in the y-cruncher algorithm for calculating the record number of digits of the Catalan constant. Efficient formulas for calculating fundamental constants to high precision are used for checking their statistical consistencies and properties, such as normality (the distribution of digits in different integer bases)49.

As a consequence of the MITM-RF results for the Catalan constant, we found an infinite family of PCFs for the Catalan constant (see Methods). Some of these PCFs have faster convergence rates than the current best formula48. Figure 4a summarizes the convergence rates alongside the computational effort per term, conveying the comparative advantage of the new PCFs we found.

Another important implication for such expressions is their potential to help prove the irrationality of the Catalan constant. Each PCF provides a Diophantine approximation sequence that can be characterized by an effective irrationality exponent that quantifies how ‘efficiently’ it approximates the constant (see Methods).

A paper from 2003 (ref. 5) found the state-of-the-art exponent of the Catalan constant to be approximately 0.524. A paper from 2016 (ref. 50) proved this value and presented the better value of about 0.554 as a conjecture. These values are now the best exponents available in the literature. One of the PCFs we found here has an exponent of around 0.567, which surpasses all the previous values in the literature, as shown in Fig. 4b. Finding an explicit sequence for which the exponent is larger than 1 will directly prove irrationality. However, it is not trivial to find such a sequence explicitly (see, for example, ref. 51), and thus, it is of interest to try to find sequences for which the exponent is as large as possible.

Figure 4b summarizes the convergence of the approximation exponent as a function of the number of computed terms. This comparison includes the best values in the literature and several of our PCFs (detailed in Supplementary Information section G). We write the numerical value of approximation exponent for each of the results in Supplementary Tables 5 and 6. Looking forward, it may well be that the automated exploration of PCF Diophantine approximation sequences


Article
a 0.6 b 0.7
PCF Ref. 56 (Supplementary Table 6, row 1)
Infinite sums New (Supplementary Table 6, row 2)
New (Supplementary Table 5, row 1)
Digits/term New (Supplementary Table 6, row 3)
=C New (Supplementary Table 6, row 4)
Degree
0.5 New (Supplementary Table 6, row 5)
Ref. 56 (Supplementary Table 6, row 1) 0.65 Ref. 50
Ref. 48 (Supplementary Table 5, row 2)

0.4 Guillera (2019)

Approximation exponent
0.6
Term/digits

New (Supplementary Table 5, row 3)


0.3
New (Supplementary Table 6, row 2)
New (Supplementary Table 6, row 3) C = 0.16
0.55
C = 0.2
0.2
C = 0.3

C = 0.56
0.5
0.1 C=1
New (Supplementary Table 6, row 4)
C = 2.5 New (Supplementary Table 6, row 5)

0 0.45
–1 2 5 8 11 14 17 20 23 0 200 400 600 800 1,000
Compute degree Number of terms

Fig. 4 | Efficient computation of the Catalan constant with new PCFs. to see this result. b, The convergence of the effective irrationality exponent
Comparison of computational metrics with previous results. a, For each (lower bound on the Liouville–Roth irrationality measure, see Methods) as a
formula computing the Catalan constant, the scatter plot shows the function of the number of computed terms. The previous best result, first
asymptotic number of terms required per digit of accuracy, relative to the found in ref. 5, is presented in dark blue. A conjecture for a better value, from
computational effort (compute degree: defined as the smallest possible ref. 50, is presented by a horizontal orange line. The new PCF marked in green
polynomial degree that can be used in the calculation, found after surpasses both previous values and yields the new best value for the Catalan
transforming the PCF into a matrix of balanced degrees). Green hyperbolas constant’s approximation exponent. See Supplementary Information section
mark the relative efficiencies. Readers should search ‘Guillera (2019)’ within G (specifically, Supplementary Tables 5 and 6).
the page http://www.numberworld.org/y-cruncher/internals/formulas.html

will eventually provide a higher approximation exponent that can lead to proving the irrationality of the Catalan constant.

The same approach can also be used with other constants. More generally, we expect further explorations of PCFs based on the Ramanujan Machine to lead to additional advances in Diophantine approximations and irrationality measures. For example, it could be intriguing to look for PCFs for values of the Riemann zeta function at odd integers, and specifically ζ(5) (ref. 52), because such PCFs may help prove their irrationality and provide more efficient ways of calculating ζ values.

Correspondence with the community
Following the appearance of the initial version of our work on arXiv in 2019 (ref. 53), numerous people ran our algorithms, some found new conjectures, and a few provided proofs for the new formulas. Over the span of a few months, proofs for all the original manuscript formulas were presented. This led us to expand our search with the MITM-RF algorithm and find more intriguing results such as PCFs for ζ(3), π², and Catalan’s constant, most of which are still unproved.

This back-and-forth dynamic between algorithms and mathematicians is the essence of what we believe can be achieved with automatically generated conjectures of fundamental constants. A recent example of this successful correspondence is the work of Zeilberger’s group (ref. 54), generalizing and proving part of the conjectures that appeared in the earlier arXiv version of our work (ref. 53) (Supplementary Information section F.3). An example from their paper is the elegant formula

$$1 + k + \cfrac{a \times 1}{2 + k + \cfrac{a \times 2}{3 + k + \cdots}} = \frac{a^{a+k+1}}{(a+k)!\left(e^{a} - \sum_{s=0}^{a+k} \frac{a^{s}}{s!}\right)}, \qquad a > -k,\ a \in \mathbb{N},\ k \in \mathbb{Z}.$$

Their method combines the proof as an inherent part of the discovery and thus can be viewed as a successful case study of algorithms that combine automated conjecture generation and ATP.

A wide range of such identities is likely to be useful in future approaches for different math problems, especially in adjacent fields (for example, proving the irrationality of Riemann zeta function values (ref. 20)). More generally, automatically discovered formulas can assist further research efforts by enriching the modern ‘integral books’, which are software and computing environments such as Maple or Wolfram Mathematica. This process provides an elegant example of the symbiosis between computer-generated mathematics and human-generated mathematics.

Although our work focuses on PCFs, we think that it can be systematically extended to other spaces of candidate RF conjectures. We envision harvesting the scientific literature (for example, over 1.5 million papers on http://arXiv.org) to generalize known formulas and identify new RFs using machine learning algorithms such as clustering methods (see, for example, ref. 55). The scientific literature provides a strong ground truth for candidate RFs, and this method may discover mathematical conjectures that go far beyond PCFs.

Outlook on the universality of fundamental constants
Our work provides the groundwork for a more comprehensive study into fundamental constants and their underlying mathematical structure. Our proposed algorithms found PCFs for the constants π, e, Catalan’s constant and ζ(3). Table 1 presents a selection of additional fundamental constants of particular interest to our approach. For some of them, such as the Feigenbaum constants, no PCF (or any RF) is known. Potentially the most interesting constants for further research are from fields like number theory (not so ironically, some of them are

also named after Ramanujan) and various fields of physics. There, any new RF can point to a hidden connection between fields of science. We believe it would be particularly interesting to extend our work to test RFs that involve several different constants. With such algorithms applied to the thousands of fundamental constants in the literature, we expect many new RFs to be found.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-021-03229-4.

1. Finch, S. Mathematical Constants (Cambridge Univ. Press, 2003).
2. Bailey, D., Plouffe, S. M., Borwein, P. & Borwein, J. The quest for pi. Math. Intell. 19, 50–56 (1997).
3. Apéry, R. Irrationalité de ζ(2) et ζ(3). Asterisque 61, 11–13 (1979).
4. Zeilberger, D. & Zudilin, W. The irrationality measure of pi is at most 7.103205334137…. Moscow J. Combin. Number. Theory 9, 407–419 (2019).
5. Zudilin, W. An Apéry-like difference equation for Catalan’s constant. J. Combin. 10, R14 (2003).
6. Hardy, G. H. & Wright, E. M. An Introduction to the Theory of Numbers 5th edn (Oxford Univ. Press, 1980).
7. Berndt, B. C. Ramanujan’s Notebooks (Springer Science & Business Media, 2012).
8. Appel, K. I. & Haken, W. Every Planar Map Is Four Colorable Vol. 98 (American Mathematical Society, 1989).
9. Wilf, H. S. & Zeilberger, D. Rational functions certify combinatorial identities. J. Am. Math. Soc. 3, 147–158 (1990).
10. McCune, W. Solution of the Robbins problem. J. Autom. Reason. 19, 263–276 (1997).
11. Hales, T. C. A proof of the Kepler conjecture. Ann. Math. 162, 1065–1185 (2005).
12. Lample, G. & Charton, F. Deep learning for symbolic mathematics. In ICLR Conf. https://openreview.net/forum?id=S1eZYeHFDS (2020).
13. Cranmer, M. et al. Discovering symbolic models from deep learning with inductive biases. Preprint at https://arxiv.org/abs/2006.11287 (2020).
14. Turing, A. M. On computable numbers, with an application to the Entscheidungsproblem. Proc. Lond. Math. Soc. s2-42, 230–265 (1937).
15. Asimov, I. & Shulman, J. A. Isaac Asimov’s Book of Science and Nature Quotations (Weidenfeld & Nicolson, 1988).
16. Bohr, N. Rydberg’s Discovery Of The Spectral Laws (C.W.K. Gleerup, 1954).
17. Shimura, G. Modular forms of half integral weight. In Modular Functions of One Variable I 57–74 (Springer, 1973).
18. Cuyt, A. A., Petersen, V., Verdonk, B., Waadeland, H. & Jones, W. B. Handbook Of Continued Fractions For Special Functions (Springer Science & Business Media, 2008).
19. Scott, J. F. The Mathematical Work Of John Wallis (1616–1703) (Taylor and Francis, 1938).
20. Bowman, D. & Laughlin, J. M. Polynomial continued fractions. Acta Arith. 103, 329–342 (2002).
21. McLaughlin, J. M. & Wyshinski, N. J. Real numbers with polynomial continued fraction expansions. Acta Arith. 116, 63–79 (2005).
22. Press, W. H. Seemingly Remarkable Mathematical Coincidences Are Easy To Generate (Univ. Texas, 2009).
23. Euler, L. Introductio In Analysin Infinitorum Vol. 2 (MM Bousquet, 1748).
24. Petkovšek, M., Wilf, H. S. & Zeilberger, D. A = B (A. K. Peters Ltd., 1996).
25. Bailey, D., Borwein, J. & Girgensohn, R. Experimental evaluation of Euler sums. Exp. Math. 3, 17–30 (1994).
26. Wang, H. Toward mechanical mathematics. IBM J. Res. Develop. 4, 2–22 (1960).
27. Lenat, D. B. & Brown, J. S. Why AM and EURISKO appear to work. Artif. Intell. 23, 269–294 (1984).
28. Lenat, D. B. The nature of heuristics. Artif. Intell. 19, 189–249 (1982).
29. Davis, R. & Lenat, D. B. Knowledge-Based Systems In Artificial Intelligence (McGraw-Hill, 1982).
30. Fajtlowicz, S. On conjectures of Graffiti. In Annals of Discrete Mathematics Vol. 38, 113–118 (Elsevier, 1988).
31. Alessandretti, L., Baronchelli, A. & He, Y.-H. Machine learning meets number theory: the data science of Birch–Swinnerton-Dyer. Preprint at https://arxiv.org/abs/1911.02008 (2019).
32. Chen, W. Y., Hou, Q. H. & Zeilberger, D. Automated discovery and proof of congruence theorems for partial sums of combinatorial sequences. J. Diff. Equ. Appl. 22, 780–788 (2016).
33. Buchberger, B. et al. Theorema: towards computer-aided mathematical theory exploration. J. Appl. Log. 4, 470–504 (2006).
34. Ferguson, H., Bailey, D. & Arno, S. Analysis of PSLQ, an integer relation finding algorithm. Math. Comput. Am. Math. Soc. 68, 351–369 (1999).
35. Bailey, D., Borwein, P. & Plouffe, S. On the rapid computation of various polylogarithmic constants. Math. Comput. Am. Math. Soc. 66, 903–913 (1997).
36. Bailey, D. & Broadhurst, D. J. Parallel integer relation detection: techniques and applications. Math. Comput. 70, 1719–1737 (2000).
37. Wolfram, S. A New Kind Of Science Vol. 5 (Wolfram Media, 2002).
38. Schmidt, M. & Lipson, H. Distilling free-form natural laws from experimental data. Science 324, 81–85 (2009).
39. He, Y.-H. Deep-learning the landscape. Preprint at https://arxiv.org/abs/1706.02714 (2017).
40. Wu, T. & Tegmark, M. Toward an artificial intelligence physicist for unsupervised learning. Phys. Rev. E 100, 033311 (2019).
41. Greydanus, S., Dzamba, M. & Yosinski, J. Hamiltonian neural networks. In Advances in Neural Information Processing Systems (NEURIPS2019) Vol. 32, 15379−15389 (2019).
42. Iten, R., Metger, T., Wilming, H., del Rio, L. & Renner, R. Discovering physical concepts with neural networks. Phys. Rev. Lett. 124, 010508 (2020).
43. Udrescu, S. & Tegmark, M. AI Feynman: a physics-inspired method for symbolic regression. Sci. Adv. 6, eaay2631 (2020).
44. Wiles, A. Modular elliptic curves and Fermat’s last theorem. Ann. Math. 141, 443–551 (1995).
45. Smale, S. Mathematical problems for the next century. Math. Intell. 20, 7–15 (1998).
46. Van der Poorten, A. & Apéry, R. A proof that Euler missed…. Math. Intell. 1, 195–203 (1979).
47. Borwein, J., Borwein, P. & Bailey, D. Ramanujan, modular equations, and approximations to pi or how to compute one billion digits of pi. Am. Math. Mon. 96, 201–219 (1989).
48. Pilehrood, K. H. & Pilehrood, T. H. Series acceleration formulas for beta values. Discret. Math. Theor. Comput. Sci. 12, 223–236 (2010).
49. Kim, S. Normality analysis of current world record computations for Catalan’s constant and arc length of a lemniscate with a = 1. Preprint at https://arxiv.org/abs/1908.08925 (2019).
50. Nesterenko, Y. V. On Catalan’s constant. Proc. Steklov Inst. Math. 292, 153–170 (2016).
51. Zudilin, W. Well-poised hypergeometric service for diophantine problems of zeta values. J. Théor. Nomb. Bordeaux 15, 593–626 (2003).
52. Zudilin, W. One of the odd zeta values from ζ(5) to ζ(25) is irrational. By elementary means. Symmetry Integr. Geom. 14, 028 (2018).
53. Raayoni, G. et al. The Ramanujan machine: automatically generated conjectures on fundamental constants. Preprint at https://arxiv.org/abs/1907.00205 (2019).
54. Dougherty-Bliss, R. & Zeilberger, D. Automatic conjecturing and proving of exact values of some infinite families of infinite continued fractions. Preprint at https://arxiv.org/abs/2004.00090 (2020).
55. Tshitoyan, V. et al. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature 571, 95–98 (2019).

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© The Author(s), under exclusive licence to Springer Nature Limited 2021

Nature | Vol 590 | 4 February 2021 | 73


Methods

Complexity of the MITM-RF algorithm
A naïve enumeration is very computationally intensive with time complexity of O(MN), where M and N are the LHS and RHS space size, respectively, and space complexity of O(1). In our algorithm, we store the LHS in the hash table in order to substantially reduce computation time at the expense of space. This makes the algorithm’s time complexity O(M + N) and its space complexity O(M). We also implemented another version of the algorithm in which the hash table stores the RHS (the PCF results). In both cases, the hash table can be saved and reused to reduce the duration of future enumerations. The main computational bottleneck is the enumeration and calculation of the RHS terms (O(N) time complexity). By parallelizing the process on C central processing unit (CPU) cores, we are able to speed up the process, and the time complexity drops to O(N/C). Moreover, we decrease the space needed in the memory by using a Bloom filter to store the LHS hash-table keys (instead of keeping the whole hash table in memory). Using a Bloom filter decreases the space by a factor of about 100 during the RHS enumeration.

Our code handles edge cases, like discarding PCFs that provide representations of rational numbers by skipping β polynomials with roots at natural numbers. For the full implementation of our MITM-RF algorithm, see the code on http://www.RamanujanMachine.com.

Generalizations of the MITM-RF algorithm
We also generalized the algorithm to allow for α and β to be integer sequences generated by any countable parametric function. For example, α and β can be interlaced sequences, that is, they may consist of multiple (alternating) integer polynomials. In the case of just two interlaced sequences, the terms at odd values of n are given by one polynomial, and the terms at even values of n are given by a different polynomial.

Seeing how successful our algorithm was despite its relative simplicity, we believe there is still ample room for new results. By leveraging more sophisticated algorithms, other results will follow, thus discovering hidden truths about even more fundamental constants, perhaps with formulas that are more complex than the PCFs used in this work.

Stages of the Descent&Repel algorithm
We chose the optimization problem’s variables as the coefficients of the α, β, γ, δ polynomials in equation (7). The algorithm is initialized with a large set of points. In the specific examples we present, all initial conditions were set on a line, as shown in Fig. 3.

The algorithm is then constructed of three main stages: GD, ‘Repel’, and Lattice GD. We iterate between the first two stages and then perform the third stage once to converge to a possible solution.

(1) GD. We perform a standard GD separately for each point $x_i$, which is a d-dimensional vector. The loss function $\mathcal{L}$ is defined in equation (7), and thus, for each point $x_i$, we define its next iteration t + 1 as $x_i^{(t+1)} = x_i^{(t)} - \mu \nabla\mathcal{L}\big|_{x_i^{(t)}}$, where μ is some small enough step size.

(2) ‘Repel’. We update the values of all the points so that they ‘push off’ one another via a Coulomb-like repulsion proportional to $1/\|x_i - x_j\|^{2}$. Namely, we define the ‘repel’ iterations as

$$x_i^{(t+1)} = x_i^{(t)} + \nu \sum_{j} \frac{x_i^{(t)} - x_j^{(t)}}{\left\|x_i^{(t)} - x_j^{(t)}\right\|^{3}},$$

with another small step size ν that accounts for the strength of the repulsion. The ‘repel’ mechanism is used to increase the search space to more effectively cover the space of integer parameters and thus increase the probability of finding a match. We tune the repulsion strength heuristically.

(3) Lattice GD. We enforce the constraint of integer results by alternating the GD optimization between the original loss $\mathcal{L}$ of equation (7) and a different loss function $\mathcal{L}_I$ that scales like the square of the difference between the value $x_i$ and its closest integer (round), $\mathcal{L}_I = \|\mathrm{round}(x_i) - x_i\|$. In cases where this stage converges (being a heuristic algorithm, this is not guaranteed), the method can find points that satisfy $\mathcal{L} = \mathcal{L}_I = 0$ for both losses, meaning an integer solution to our optimization problem.

An infinite family of PCFs for the Catalan constant
The PCF results for the Catalan constant in Supplementary Table 3 can be generalized to an infinite family of PCFs. This generalization revealed an underlying mathematical structure related to the Catalan constant (there is now active research regarding additional algebraic properties of this mathematical structure, to be presented in a separate publication). We produce eight examples of formulas resulting from this generalization and present them in Supplementary Information section G, in Supplementary Tables 5 and 6. Interestingly, part of the PCF results can be expressed as infinite sums (Supplementary Table 5). However, not every PCF can be written as a sum, as is the case in the expressions in Supplementary Table 6, which we found to have a faster convergence rate than the state of the art (ref. 48). Importantly, the complexity of these expressions may help to demonstrate how the approach proposed in this work can handle complexity that may be difficult to address without computer algorithms (we show here specific examples of polynomials of order >20 with coefficients that have >30 digits).

The irrationality measure of a constant and its lower bound
The irrationality measure of x, sometimes called the approximation exponent or the Liouville–Roth constant (ref. 6), is defined as the largest μ = μ(x) for which there exists a sequence of rational numbers $p_n/q_n$ that satisfy $0 < |x - p_n/q_n| < q_n^{-\mu}$. For every x, μ(x) is always either exactly 1 when x is rational or ≥2 when x is irrational.

We can define the effective irrationality exponent of a sequence as the largest (supremum) that satisfies the inequality. Sequences of this kind are called Diophantine approximations (ref. 6). Every PCF we find is such a sequence of rational numbers and it has an effective irrationality exponent μ′. Generally, each explicit Diophantine approximation sequence gives a μ′ that can be calculated by

$$\mu' = \liminf_{n \to +\infty} \frac{-\log\left(|x - p_n/q_n|\right)}{\log\left(q_n/\gcd(p_n, q_n)\right)},$$

where gcd indicates the greatest common divisor. Each μ′ provides a lower bound for the irrationality measure μ(x) of the value x to which the sequence converges.

However, finding an explicit sequence from which the value of μ(x) can be extracted is a challenge. This challenge motivated the search for such sequences for important fundamental constants, with the goal of extracting bounds on their value of μ(x). When a constant is not known to be irrational, the sequences all still have μ′ ≤ 1, as in the case of the Catalan constant. Then, finding an explicit sequence for which μ′ > 1 will directly prove irrationality. In principle, there must be a sequence with μ′ = 1 or μ′ ≥ 2. However, it is not trivial to find such a sequence explicitly (refs. 51, 56, 57), and thus, it is of interest to try to find sequences $p_n/q_n$ for which μ′ is as large as possible.

An infinite family of PCFs with complex variables
Example outcomes of the mathematics−algorithm correspondence in our work are aesthetic generalizations that we found based on results of the Ramanujan Machine algorithms. One example is the following PCF with a complex variable:

$$\forall z \in \mathbb{C}:\quad 1 + \cfrac{1 \times (2z - 1)}{4 + \cfrac{2 \times (2z - 3)}{7 + \cfrac{3 \times (2z - 5)}{10 + \cfrac{4 \times (2z - 7)}{13 + \cdots}}}} = \frac{2^{2z+1}}{\pi \binom{2z}{z}}$$

This PCF was found as a conjecture, by generalizing several automatically generated conjectures (specific integer values for z) generated by the MITM-RF algorithm. Like many other results involving π, it can
be proved using generalized hypergeometric functions. The proof is quite straightforward, provided one finds certain identities involving ratios of generalized hypergeometric functions, presented in Supplementary Information section F.2.1 along with other proofs and related information. It remains to be seen whether related methods would be able to prove the unproved conjectures in Supplementary Tables 1−3 of Supplementary Information section A. The above family of PCFs is brought here as an example of how automatically generated conjectures can be generalized to a wider conjecture and later a proof. We believe that this process could be used more widely with future results of the Ramanujan Machine, so that automatically generated conjectures on fundamental constants become a catalyst for mathematical research. For an extended discussion, see Supplementary Information section B.

Data availability
All the results of the Ramanujan Machine project are shared in the paper, with newer updates appearing periodically on the project website.

Code availability
Code is available at: http://www.ramanujanmachine.com/ and the GitHub links therein.

56. Zudilin, W. A third-order Apéry-like recursion for ζ(5). Mathematical Notes [Mat. Zametki] 72, 733–737 [796–800] (2002).
57. Rivoal, T. Rational approximations for values of derivatives of the Gamma function. Trans. Am. Math. Soc. 361, 6115–6149 (2009).

Acknowledgements We thank M. Soljačić, B. Weiss, D. Soudry and D. Carmon for helpful discussions. I.K. is grateful for the support of R. Magid and B. Magid and for the support of the Azrieli Faculty Fellowship. Y.M. acknowledges the support and guidance of the Israeli Alpha Program for Excellent High-School Students.

Author contributions G.R., G.P. and I.K. implemented the first proof-of-concept algorithms. G.R. implemented the first generation MITM-RF algorithm. S.G. and Y. Harris implemented the state-of-the-art MITM-RF algorithm. S.G. made the developments that led to the discovery of the ζ(3) and Catalan PCFs. Y.M. implemented the Descent&Repel algorithm. Y.M., S.G., U.M. and I.K. found how to convert the Catalan PCFs into expressions with record approximation exponents and fast convergence rates. U.M., Y.M., G.R., S.G., Y. Harris and I.K. proposed parts of the algorithms and developed proofs for some of the conjectures. D.H. and Y. Hadad developed the online community. Y. Hadad, G.P. and I.K. came up with the conceptual flow of the wider concept. I.K. conceived the idea and led the research. All authors provided substantial input to all aspects of the project and to the writing of the paper.

Competing interests The authors declare no competing interests.

Additional information
Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41586-021-03229-4.
Correspondence and requests for materials should be addressed to I.K.
Peer review information Nature thanks Yang-Hui He, Doron Zeilberger and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Reprints and permissions information is available at http://www.nature.com/reprints.
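The meet-in-the-middle enumeration described under ‘Complexity of the MITM-RF algorithm’ in Methods can be illustrated with a toy sketch. This is not the project’s implementation (for that, see the code linked under Code availability). It assumes double-precision floats instead of arbitrary precision, a plain Python dict instead of a Bloom filter, an LHS family of the form (a + b·c)/(c′ + d·c) and linear α and β only. All function names (`pcf_value`, `build_lhs_table`, `mitm_search`) are hypothetical.

```python
import math
from itertools import product

def pcf_value(alpha, beta, depth):
    """Evaluate alpha(0) + beta(1)/(alpha(1) + beta(2)/(alpha(2) + ...)) bottom-up."""
    value = float(alpha(depth))
    for n in range(depth, 0, -1):
        if value == 0.0:
            return None  # would divide by zero: discard this candidate
        value = alpha(n - 1) + beta(n) / value
    return value

def key(x, digits=8):
    """Hash key: the value truncated to a fixed number of decimal digits."""
    return round(x * 10**digits)

def build_lhs_table(constant, coeffs):
    """Store every LHS value (a + b*const)/(c + d*const) once: O(M) time and space."""
    table = {}
    for a, b, c, d in product(coeffs, repeat=4):
        if a * d - b * c == 0:
            continue  # degenerate: the expression is a rational number
        value = (a + b * constant) / (c + d * constant)
        table.setdefault(key(value), []).append((a, b, c, d))
    return table

def mitm_search(constant, lhs_coeffs=range(-2, 3), rhs_coeffs=range(0, 3),
                depth=30, check_depth=200, tol=1e-10):
    """Enumerate the RHS (PCFs) once and look each value up in the LHS table: O(N)."""
    table = build_lhs_table(constant, lhs_coeffs)
    hits = []
    for a0, a1, b0, b1 in product(rhs_coeffs, repeat=4):
        alpha = lambda n, a0=a0, a1=a1: a0 + a1 * n
        beta = lambda n, b0=b0, b1=b1: b0 + b1 * n
        # Skip beta polynomials with a root at a natural number (rational value).
        if any(beta(n) == 0 for n in range(1, check_depth + 1)):
            continue
        rough = pcf_value(alpha, beta, depth)
        if rough is None or not math.isfinite(rough):
            continue
        candidates = table.get(key(rough), [])
        if not candidates:
            continue
        # Re-check key matches at larger depth to filter out false positives.
        precise = pcf_value(alpha, beta, check_depth)
        if precise is None:
            continue
        for a, b, c, d in candidates:
            target = (a + b * constant) / (c + d * constant)
            if abs(precise - target) < tol:
                hits.append(((a0, a1), (b0, b1), (a, b, c, d)))
    return hits
```

Calling `mitm_search(math.e)` returns a list that includes `((1, 1), (0, 1), (1, 0, -2, 1))`, that is, α(n) = 1 + n and β(n) = n matched to 1/(e − 2). This is the a = 1, k = 0 case of the formula by Zeilberger’s group quoted in the main text. Every hit from such a search is only a numerical coincidence to the stated tolerance, that is, a conjecture and not a proof.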
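The effective irrationality exponent μ′ defined in Methods can be estimated at a finite depth n from the convergents of a continued fraction. The sketch below is only an illustration under stated assumptions: the function name `effective_exponent` is hypothetical, the target constant must be supplied as a `Decimal` that is accurate to well beyond the error of the convergent, and a single finite-n value only hints at the lim inf. As a sanity check it uses the simple continued fraction of the golden ratio (α = β = 1), an example that is not taken from the paper, for which the estimate should approach 2.

```python
from decimal import Decimal, getcontext
from math import gcd

getcontext().prec = 150  # working precision in decimal digits (assumed sufficient here)

def effective_exponent(alpha, beta, target, depth):
    """Finite-n estimate of mu' = -log|x - p_n/q_n| / log(q_n / gcd(p_n, q_n))."""
    # Standard three-term recurrences for the convergents p_n/q_n, in exact integers.
    p_prev, q_prev = 1, 0
    p, q = alpha(0), 1
    for n in range(1, depth + 1):
        p, p_prev = alpha(n) * p + beta(n) * p_prev, p
        q, q_prev = alpha(n) * q + beta(n) * q_prev, q
    g = gcd(p, q)  # reduce the fraction before measuring the denominator
    p, q = p // g, q // g
    error = abs(target - Decimal(p) / Decimal(q))
    return float(-error.ln() / Decimal(abs(q)).ln())

# Golden ratio: convergents are ratios of consecutive Fibonacci numbers.
phi = (1 + Decimal(5).sqrt()) / 2
mu_phi = effective_exponent(lambda n: 1, lambda n: 1, phi, 200)
```

The same function can be applied to any PCF found by the algorithms, for example α(n) = n + 1 and β(n) = n with target 1/(e − 2) computed from `Decimal(1).exp()`, provided the working precision is raised far enough that the error of the convergent is resolved.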

Extended Data Fig. 1 | Convergence rates of the PCFs. The plots present the absolute difference between the PCF value and the corresponding fundamental constant (that is, the error) versus the number of terms calculated in the PCF. On the left are PCFs with exponential/super-exponential convergence rates, and on the right are PCFs that converge polynomially. The majority of previously known PCFs for π converge polynomially, whereas all of our newly found results converge exponentially.
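The kind of error-versus-depth data shown in this figure can be reproduced in a few lines. The sketch below is illustrative only and the function name `pcf_errors` is hypothetical. It compares two PCFs for 4/π: the z = 1 member of the complex-variable family in Methods, with α(n) = 3n + 1 and β(n) = n(3 − 2n), and Brouncker’s classical fraction 1 + 1²/(2 + 3²/(2 + 5²/(2 + …))), a well-known example that is not taken from the paper and converges only polynomially.

```python
import math

def pcf_errors(alpha, beta, target, max_depth):
    """Return |p_n/q_n - target| for n = 1..max_depth.

    The convergents are built with exact integer recurrences; a zero
    denominator q_n would raise ZeroDivisionError (not handled in this sketch).
    """
    p_prev, q_prev = 1, 0
    p, q = alpha(0), 1
    errors = []
    for n in range(1, max_depth + 1):
        p, p_prev = alpha(n) * p + beta(n) * p_prev, p
        q, q_prev = alpha(n) * q + beta(n) * q_prev, q
        errors.append(abs(p / q - target))
    return errors

target = 4 / math.pi

# z = 1 member of the complex-variable family: exponential convergence.
new_errors = pcf_errors(lambda n: 3 * n + 1, lambda n: n * (3 - 2 * n), target, 60)

# Brouncker's fraction: alpha(0) = 1, alpha(n) = 2 for n >= 1, beta(n) = (2n - 1)^2.
old_errors = pcf_errors(lambda n: 1 if n == 0 else 2,
                        lambda n: (2 * n - 1) ** 2, target, 60)
```

Plotting `new_errors` and `old_errors` on a logarithmic vertical axis gives a roughly straight descending line for the first and a slowly flattening curve for the second, which is the qualitative contrast the caption describes. Double precision limits the first curve to about 16 digits.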
Extended Data Table 1 | RFs for π and e found in a proof-of-concept run of the Descent&Repel algorithm
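The three stages of Descent&Repel listed in Methods (GD, ‘Repel’ and Lattice GD) can be sketched as follows. This is a minimal illustration, not the authors’ implementation. The loss of equation (7) is not reproduced here, so a toy quadratic loss with an integer minimum at (3, −2) stands in for it. The lattice loss is taken as the squared distance to the nearest integer, the repulsion sum skips j = i, and all names and step sizes are assumptions.

```python
import math

def gd_step(points, grad, step):
    """One gradient-descent step, applied separately to every point."""
    return [[x - step * g for x, g in zip(p, grad(p))] for p in points]

def repel_step(points, nu, eps=1e-12):
    """Coulomb-like repulsion: each point is pushed away from all the others."""
    moved = []
    for i, p in enumerate(points):
        push = [0.0] * len(p)
        for j, q in enumerate(points):
            if i == j:
                continue  # a point does not repel itself
            diff = [a - b for a, b in zip(p, q)]
            dist = math.sqrt(sum(d * d for d in diff)) + eps
            for k, d in enumerate(diff):
                push[k] += d / dist**3
        moved.append([x + nu * f for x, f in zip(p, push)])
    return moved

def lattice_grad(p):
    """Gradient of sum_k (x_k - round(x_k))^2, valid away from half-integers."""
    return [2.0 * (x - round(x)) for x in p]

def descent_and_repel(grad, points, mu=0.05, nu=0.01, rounds=200, lattice_rounds=500):
    # Stages 1 and 2: alternate GD and 'Repel'.
    for _ in range(rounds):
        points = repel_step(gd_step(points, grad, mu), nu)
    # Stage 3: alternate GD on the original loss and on the lattice loss.
    for _ in range(lattice_rounds):
        points = gd_step(gd_step(points, grad, mu), lattice_grad, mu)
    return points

# Toy stand-in for the loss of equation (7): zero exactly at the integer point (3, -2).
def toy_loss(p):
    return (p[0] - 3.0) ** 2 + (p[1] + 2.0) ** 2

def toy_grad(p):
    return [2.0 * (p[0] - 3.0), 2.0 * (p[1] + 2.0)]

# Initial conditions set on a line, as in the examples described in Methods.
start = [[float(t), float(t)] for t in range(-4, 5, 2)]
final = descent_and_repel(toy_grad, start)
best = min(final, key=toy_loss)
```

With a convex toy loss the repulsion stage is not really needed, since plain GD already finds the minimum. Its role in the real problem is to keep the points spread over many candidate integer solutions of a non-convex loss.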
