CUTTING A GRAPH INTO TWO DISSIMILAR HALVES.
by
Paul Erdos(*), Mark Goldberg(**), Janos Pach(*,***), Joel Spencer(***).
ABSTRACT
Given a graph G and a subset S of the vertex set of G, the discrepancy of S is defined as the
difference between the actual and expected numbers of the edges in the subgraph induced on S. We
show that for every graph with n vertices and e edges, $n < e < n(n-1)/4$, there is an $n/2$-element
subset with the discrepancy of the order of magnitude of $\sqrt{ne}$. For graphs with fewer than n edges we
calculate the asymptotics for the maximum guaranteed discrepancy of an $n/2$-element subset. We also
introduce a new notion called the bipartite discrepancy and discuss related results and open problems.
(*) Mathematical Institute of the Hungarian Academy of Science, 1364 Budapest, POB 127, HUNGARY;
(**)Dept. of Comput. Sci., RPI, Troy, N.Y. 12181; the work of this author was supported in part by National Science Foundation
under grant DCR-8520872;
(***)Dept. of Math, State University of New York, Stony Brook, N.Y. 11794;
1. Introduction.
Let G be an arbitrary graph with v (G )=n vertices and e (G )=e edges. For any subset S of
the vertex set of G , let the discrepancy of S be defined as the difference between the actual and
expected numbers of edges in G [S ], i.e., in the subgraph of G induced by S . That is, let
\[ \mathrm{dis}(S) = e(S) - e\,\frac{\binom{|S|}{2}}{\binom{n}{2}} = e(S) - e\,\frac{|S|(|S|-1)}{n(n-1)}, \]
where e (S ) is the shorthand form of e (G [S ]). The average behavior of dis (S ) is studied in [2].
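The definition can be evaluated directly on small examples. The following is a minimal sketch, assuming a graph is given as a vertex count n (vertices 0, ..., n-1) and a list of edges; the function names `induced_edges` and `dis` are illustrative and do not come from the paper.

```python
from fractions import Fraction
from itertools import combinations


def induced_edges(edges, S):
    """Number of edges with both endpoints in S, i.e. e(G[S])."""
    S = set(S)
    return sum(1 for u, v in edges if u in S and v in S)


def dis(n, edges, S):
    """dis(S) = e(S) - e * |S|(|S|-1) / (n(n-1)), as an exact fraction."""
    s = len(set(S))
    e = len(edges)
    expected = Fraction(e * s * (s - 1), n * (n - 1))
    return induced_edges(edges, S) - expected


# Illustrative graph: the path 0-1-2-3.
path_edges = [(0, 1), (1, 2), (2, 3)]
print(dis(4, path_edges, {0, 1}))  # one edge inside, 1/2 expected
print(sum(dis(4, path_edges, S) for S in combinations(range(4), 2)))
```

Summing dis(S) over all subsets of a fixed size gives 0, which is the sense in which the subtracted term is the expected number of edges.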
At the problem session of the last South-Eastern Conference on Combinatorics, Boca Raton
(1986), the senior author raised the following question. Is it true that for every $c>0$ there exists a
constant $\hat c>0$ with the property that any graph G with n vertices and $cn < e < \binom{n}{2} - cn$ edges contains
two sets of vertices S and T such that $|S| = |T| = n/2$ and $|e(S) - e(T)| > \hat c n$? Our following result
answers this question in the affirmative.
Theorem 1. Let G be a graph with n vertices and e edges, $n < e < n(n-1)/4$, and assume that n is
even. Then one can find two subsets $S, T \subset V(G)$ such that $|S| = |T| = n/2$ and
\[ |e(S) - e(T)| > \alpha\sqrt{en}, \]
where $\alpha > 0$ is an absolute constant.
At first glance, one might naively conjecture (as we did) that in the above theorem S and T can
be chosen to be disjoint. However, if G is any regular graph and S ∪ T is any partition of its vertex
set into two equal halves, then e (S ) and e (T ) are always equal.
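The obstruction for regular graphs follows from counting degrees: in a d-regular graph, $d|S| = 2e(S) + e(S,T)$ and $d|T| = 2e(T) + e(S,T)$, so $|S| = |T|$ forces $e(S) = e(T)$. A brute-force check on a small case is sketched below; the Petersen graph is our choice of example, and the helper names are illustrative.

```python
from itertools import combinations


def induced_edges(edges, S):
    """Number of edges with both endpoints in S."""
    S = set(S)
    return sum(1 for u, v in edges if u in S and v in S)


def petersen_edges():
    """Edge list of the Petersen graph (3-regular, 10 vertices, 15 edges)."""
    outer = [(i, (i + 1) % 5) for i in range(5)]
    spokes = [(i, i + 5) for i in range(5)]
    inner = [(5 + i, 5 + (i + 2) % 5) for i in range(5)]
    return outer + spokes + inner


def halves_always_balanced(n, edges):
    """True if e(S) == e(T) for every split of the vertices into equal halves."""
    vertices = set(range(n))
    for S in combinations(range(n), n // 2):
        T = vertices - set(S)
        if induced_edges(edges, S) != induced_edges(edges, T):
            return False
    return True


print(halves_always_balanced(10, petersen_edges()))
print(halves_always_balanced(4, [(0, 1), (1, 2), (2, 3)]))  # a path is not regular
```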
The following, slightly weaker assertion is still true.
Theorem 2. For every $\mu$, $0<\mu<1/2$, there exists a $\nu>0$ such that in any graph with n vertices and e
edges, $n < e < n(n-1)/4$, one can find two disjoint subsets S and T such that $|S| = |T| = \mu n$ and
\[ |e(S) - e(T)| > \nu\sqrt{en}. \]
The proofs of the above theorems rely heavily on a generalization of an old quasi-Ramsey type result of
the first and the last named authors [5], [6], [1] (see Section 2) and on the following Expansion-Retraction
Theorem.
Theorem 3. Let G be a graph with n vertices and assume that $|\mathrm{dis}(R)| = D$ for some subset
$R \subset V(G)$. Then there exists a subset $S \subset V(G)$ with $|S| = n/2$ such that
\[ |\mathrm{dis}(S)| > \left(\tfrac14 + o(1)\right) D, \]
where the o(1) term goes to 0 as D tends to infinity.
In the case when G has fewer than n edges we have much sharper results. To formulate them we
introduce some further notation. For any graph G with n vertices, let
\[ d^+(G) = \max\, \mathrm{dis}(S), \]
\[ d^-(G) = -\min\, \mathrm{dis}(S), \]
\[ d(G) = \max(d^+(G), d^-(G)) = \max |\mathrm{dis}(S)|, \]
where the max and min are taken over all $n/2$-element subsets $S \subset V(G)$. Further, for any $c>0$,
let
\[ d^+(n,c) = \min\{d^+(G) : e = cn\}, \qquad d^-(n,c) = \min\{d^-(G) : e = cn\}, \]
\[ d(n,c) = \min\{d(G) : e = cn\}. \]
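Theorem 4.
\[ (*)\quad \lim_{n\to\infty} \frac{d^-(n,c)}{n} = \begin{cases} c/4 & \text{if } 0<c\le 1/2,\\ (2-c)/12 & \text{if } 1/2<c\le 1. \end{cases} \]
\[ (**)\quad \lim_{n\to\infty} \frac{d^+(n,c)}{n} = \begin{cases} 3c/4 & \text{if } 0<c\le 1/4,\\ (1-c)/4 & \text{if } 1/4<c\le 1/2,\\ c/4 & \text{if } 1/2<c\le 1. \end{cases} \]
\[ (***)\quad \lim_{n\to\infty} \frac{d(n,c)}{n} = \lim_{n\to\infty} \frac{d^+(n,c)}{n} \quad \text{if } 0<c\le 1. \]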
Note that, in general, $d^+(G)$ and $d^-(G)$ can be essentially different from each other. For
example, if G consists of two disjoint cliques of size $n/2$, then $d^+(G) \sim n^2/16$ and $d^-(G) \sim n/16$.
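For small n the two-clique example can be checked by exhaustive search over all $n/2$-element subsets. The sketch below is only an illustration (the function names are ours, and the search is exponential in n); for n = 8 and n = 12 it gives values that are already of the stated orders $n^2/16$ and $n/16$.

```python
from fractions import Fraction
from itertools import combinations


def two_cliques(n):
    """Edge list of two disjoint cliques on n/2 vertices each (n even)."""
    half = n // 2
    left = list(combinations(range(half), 2))
    right = list(combinations(range(half, n), 2))
    return left + right


def d_plus_minus(n, edges):
    """Return (d+(G), d-(G)) by brute force over all n/2-element subsets."""
    e = len(edges)
    k = n // 2
    expected = Fraction(e * k * (k - 1), n * (n - 1))
    values = []
    for S in combinations(range(n), k):
        S = set(S)
        inside = sum(1 for u, v in edges if u in S and v in S)
        values.append(inside - expected)
    return max(values), -min(values)


for n in (8, 12):
    print(n, d_plus_minus(n, two_cliques(n)))
```

The maximum is attained by taking one whole clique and the minimum by taking half of each clique.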
The proofs of Theorems 1-3 and Theorem 4 can be found in Sections 2 and 3, respectively. The
last section contains some generalizations, related results and open problems. In particular, we will
introduce and discuss a new parameter of a graph called the bipartite discrepancy, which depends on
the deviance of the most irregular bipartitions.
2. Discrepancy of graphs.
Let G be a graph with n vertices and e edges, and let A and B be two disjoint subsets of
V (G ). Set
\[ \mathrm{dis}(A,B) = e(A,B) - e\,\frac{|A|\,|B|}{\binom{n}{2}}, \]
where e (A ,B ) denotes the number of edges in G running between A and B .
The following theorem is a straightforward generalization of a result in [5], [3].
Theorem 5. For every $\varepsilon>0$ there exists $\hat\varepsilon>0$ such that any graph G with n vertices and $e>n$ edges
contains two disjoint subsets A and B with the property that $|A|, |B| < \varepsilon n$ and
\[ |\mathrm{dis}(A,B)| > \hat\varepsilon\sqrt{en}. \]
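Proof. Assume, for simplicity, that n is even, $\varepsilon < 1/16$, and decompose V(G) into disjoint parts U and
V, $|U| = |V|$. Let A be a randomly chosen $\varepsilon n$-element subset of U, and set
\[ V(A) = \{ v \in V : |\mathrm{dis}(v,A)| > 10^{-2}\sqrt{\varepsilon e/n} \}. \]
Then
\[ \Pr\left[\, |\mathrm{dis}(v,A)| > 10^{-2}\sqrt{\varepsilon e/n} \,\right] > \tfrac12 . \]
Hence, the expected size of V(A) equals
\[ \sum_{v\in V} \Pr\left[\, |\mathrm{dis}(v,A)| > 10^{-2}\sqrt{\varepsilon e/n} \,\right] > \frac n4 . \]
On the other hand
\[ \frac n4 < E[\,|V(A)|\,] \le \frac n2 \Pr\left[\,|V(A)| > \frac n8\,\right] + \frac n8 \left(1 - \Pr\left[\,|V(A)| > \frac n8\,\right]\right), \]
implying
\[ \Pr\left[\,|V(A)| > \frac n8\,\right] > \frac13 . \]
Thus, one can choose a specific A and an $\varepsilon n$-element subset $B \subset V(A)$ such that
$\mathrm{dis}(v,A) > 10^{-2}\sqrt{\varepsilon e/n}$, or $\mathrm{dis}(v,A) < -10^{-2}\sqrt{\varepsilon e/n}$, holds for all $v \in B$
(since $|V(A)| > n/8 > 2\varepsilon n$, at least $\varepsilon n$ vertices of V(A) have discrepancies of the same sign).
In both cases A and B meet the requirements of the theorem with $\hat\varepsilon = 10^{-2}\varepsilon^{3/2}$.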
Corollary. For every $\varepsilon>0$ there exists a $\delta>0$ with the property that any graph G with n vertices and
$e>n$ edges contains an at most $2\varepsilon n$-element subset $R \subseteq V(G)$ such that
\[ |\mathrm{dis}(R)| > \delta\sqrt{en}. \]
Proof. It is sufficient to note that
\[ \mathrm{dis}(A\cup B) = \mathrm{dis}(A) + \mathrm{dis}(B) + \mathrm{dis}(A,B), \]
hence, if A and B satisfy the conditions in Theorem 5, then the absolute value of the discrepancy of at
least one of the sets A, B or $A\cup B$ exceeds $\hat\varepsilon\sqrt{en}/3$.
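The additivity used in this proof comes from $\binom{a+b}{2} = \binom a2 + \binom b2 + ab$ together with $e(A\cup B) = e(A)+e(B)+e(A,B)$ for disjoint A and B. A minimal numerical check is sketched below; the sample graph and the function names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def e_within(edges, S):
    """Edges with both endpoints in S."""
    return sum(1 for u, v in edges if u in S and v in S)


def e_between(edges, A, B):
    """Edges with one endpoint in A and the other in B (A, B disjoint)."""
    return sum(1 for u, v in edges
               if (u in A and v in B) or (u in B and v in A))


def dis_set(n, edges, S):
    """dis(S) = e(S) - e * C(|S|,2) / C(n,2)."""
    return e_within(edges, S) - Fraction(len(edges) * comb(len(S), 2), comb(n, 2))


def dis_pair(n, edges, A, B):
    """dis(A,B) = e(A,B) - e * |A||B| / C(n,2)."""
    return e_between(edges, A, B) - Fraction(len(edges) * len(A) * len(B), comb(n, 2))


# Illustrative graph: a 6-cycle with one chord.
n = 6
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3)]
A, B = {0, 1}, {3, 4}
lhs = dis_set(n, edges, A | B)
rhs = dis_set(n, edges, A) + dis_set(n, edges, B) + dis_pair(n, edges, A, B)
print(lhs, rhs)
```

Since the three terms on the right sum to dis(A ∪ B), the largest of the four absolute values |dis(A)|, |dis(B)|, |dis(A ∪ B)| is at least |dis(A,B)|/3, which is the pigeonhole step of the proof.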
Next we prove the Expansion-Retraction Theorem stated in the Introduction.
Proof of Theorem 3. Let $|R| = m$ and suppose for convenience that n is even. If $m \ge n/2$, then let S
be a randomly chosen $n/2$-element subset of R. The expected number of edges in G[S] is
\[ E[e(S)] = e(R)\,\frac{\binom{n/2}{2}}{\binom{m}{2}} \sim \frac14\, e(R) \left(\frac nm\right)^2, \]
implying
\[ E[\mathrm{dis}(S)] \sim \mathrm{dis}(R) \left(\frac{n}{2m}\right)^2. \]
Thus there exists a specific S with $|\mathrm{dis}(S)| \ge |\mathrm{dis}(R)|/4$.
Now assume $m < n/2$ and denote by $\bar R$ the complement of R. Let P be a randomly chosen
$n/2$-element subset of $\bar R$ and let Q be a random set consisting of R and $n/2 - m$ randomly chosen vertices
of $\bar R$. Denote $E_1 = E[e(P)]$ and $E_2 = E[e(Q)]$. We will establish a lower bound for
$\max(E_1,E_2)$ in the case of $D \ge 0$ and an upper bound for $\min(E_1,E_2)$ in the opposite case.
Clearly,
\[ E_1 \sim \frac14\, e(\bar R)\, \frac{n^2}{(n-m)^2} = F_1, \]
\[ E_2 \sim e(R) + e(R,\bar R)\,\frac{(n/2)-m}{n-m} + e(\bar R)\,\frac{((n/2)-m)^2}{(n-m)^2} = F_2. \]
Since $e(R,\bar R) = e - e(R) - e(\bar R)$, for fixed e and e(R), $F_1$ and $F_2$ are linear functions of
$x = e(\bar R)$. Therefore, $\min(\max(F_1,F_2))$ as well as $\max(\min(F_1,F_2))$ is achieved if $F_1 = F_2$.
Thus,
\[ \frac14\, x_0 \left(\frac{n}{n-m}\right)^2 = e(R) + \frac12\,(e - e(R) - x_0)\,\frac{n-2m}{n-m} + \frac14\, x_0 \left(\frac{n-2m}{n-m}\right)^2, \]
\[ x_0 = e(R) + \frac{n-2m}{n}\, e. \]
Substituting $e\,(m/n)^2 + D$ for e(R) we get
\[ F_1(x_0) = F_2(x_0) = \frac14\, e + \frac14\, D \left(\frac{n}{n-m}\right)^2. \]
This implies that for some specific $n/2$-element subset S of the form P or Q,
\[ |\mathrm{dis}(S)| \ge \left(\tfrac14 + o(1)\right) D. \]
Moreover, the signs of dis(S) and dis(R) are identical. Note, also, that the extreme value $\tfrac14$ in
Theorem 3 is only achieved if $|R|/n$ is nearly 0 or 1; otherwise the constant can be improved.
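The algebra behind the balance point $x_0$ can be confirmed with exact rational arithmetic. The sketch below uses the asymptotic expressions $F_1$, $F_2$ from the proof as written (with $e\,(m/n)^2$ standing in for the expected number of edges in R); the numerical values of n, m, e and D are arbitrary illustrative choices.

```python
from fractions import Fraction


def F1(x, n, m):
    """F1 = (1/4) * x * (n/(n-m))^2, with x = e(complement of R)."""
    return Fraction(1, 4) * x * Fraction(n, n - m) ** 2


def F2(x, n, m, e, eR):
    """F2 = e(R) + (e - e(R) - x) * a + x * a^2, a = ((n/2)-m)/(n-m)."""
    a = Fraction(n - 2 * m, 2 * (n - m))
    return eR + (e - eR - x) * a + x * a ** 2


def balance_point(n, m, e, eR):
    """x0 = e(R) + e * (n-2m)/n, the point where F1 and F2 coincide."""
    return eR + Fraction(e * (n - 2 * m), n)


n, m, e, D = 100, 20, 1000, 30           # illustrative values only
eR = e * Fraction(m, n) ** 2 + D          # e(R) = e (m/n)^2 + D
x0 = balance_point(n, m, e, eR)
common = Fraction(e, 4) + Fraction(D, 4) * Fraction(n, n - m) ** 2
print(F1(x0, n, m), F2(x0, n, m, e, eR), common)
```

All three printed values agree, and the excess over $e/4$ is $\frac D4 (n/(n-m))^2 \ge D/4$, with near equality only when m/n is small.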
Proof of Theorem 1. To obtain S, apply Theorem 3 to the set R constructed in the Corollary. Let T be a
randomly chosen $n/2$-element subset of V(G). Then, since $E[\mathrm{dis}(T)] = 0$,
\[ E[e(S) - e(T)] = E[\mathrm{dis}(S) - \mathrm{dis}(T)] = \mathrm{dis}(S), \]
yielding the result.
For the proof of Theorem 2 we need the following slightly generalized form of the Expansion-
Retraction Theorem.
Theorem 3′. Let G be a graph with n vertices, $\varepsilon$ and $\nu$ positive numbers, $\varepsilon < 1-\nu$, and assume that
\[ |\mathrm{dis}(R)| = D \]
for some subset $R \subset V(G)$ having at most $\varepsilon n$ elements. Then there exists a subset $S \subset V(G)$ with
$|S| = \nu n$ such that
\[ |\mathrm{dis}(S)| \ge (\nu \min(\nu, 1-\nu) + o(1))\, D, \]
where the o(1) term goes to 0 as D tends to infinity.
Proof of Theorem 2. Divide the vertex set of G into two disjoint equal parts U and V such that
$e(G[U]) \ge e/4$. Applying the Corollary to the graph G[U] with $\varepsilon = 1-2\mu$, we obtain that there exists
an at most $(1-2\mu)n$-element subset R of U with $|\mathrm{dis}(R)| > \delta\sqrt{\frac e4 \cdot \frac n2}$. By Theorem 3′, there is
$S \subset U$ with $|S| = 2\mu \cdot \frac12 n = \mu n$ and
\[ |\mathrm{dis}(S)| > (2\mu \min(2\mu, 1-2\mu) + o(1))\, \delta \sqrt{\frac{en}{8}} = D', \]
so we can choose another $\mu n$-element subset $S' \subset U$, such that
\[ |e(S) - e(S')| \ge D'. \]
Then, for any $\mu n$-element subset $T \subset V$, either $|e(S) - e(T)| > \frac12 D'$ or
$|e(S') - e(T)| > \frac12 D'$.
3. Sparse graphs.
In this section, we consider graphs with n vertices and cn edges, where $c \le 1$. The following
form of Turan’s theorem will be used.
Theorem 6. [7] Every graph with n vertices and e edges contains an independent set of size
$\ge \dfrac{n^2}{2e+n}$.
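One constructive way to reach the bound of Theorem 6 is the standard minimum-degree greedy heuristic: repeatedly pick a vertex of smallest remaining degree and delete it together with its neighbours. This is a swapped-in technique, not the argument of [7]; it is known to produce an independent set of size at least $\sum_v 1/(d(v)+1) \ge n^2/(2e+n)$. A minimal sketch, with illustrative function names:

```python
from fractions import Fraction


def greedy_independent_set(n, edges):
    """Min-degree greedy: pick a vertex of least remaining degree, drop its neighbours."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    alive = set(range(n))
    chosen = []
    while alive:
        # Ties are broken by vertex label so the result is deterministic.
        v = min(alive, key=lambda u: (len(adj[u] & alive), u))
        chosen.append(v)
        alive -= adj[v] | {v}
    return chosen


def turan_bound(n, e):
    """The guaranteed size n^2 / (2e + n) from Theorem 6."""
    return Fraction(n * n, 2 * e + n)


cycle6 = [(i, (i + 1) % 6) for i in range(6)]
J = greedy_independent_set(6, cycle6)
print(J, turan_bound(6, len(cycle6)))
```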
Proof of Theorem 4. If $c \le \frac12$, then by Turan’s theorem, we can find in G an independent set J of
size $\ge \dfrac{n^2}{2e+n} \ge \dfrac n2$. Obviously, $\mathrm{dis}(J) = -cn \times (\frac14 + o(1))$ and thus $d^-(n,c) = n(\frac c4 + o(1))$
for $0 \le c \le \frac12$.
To prove the second part of (*), we show that every graph with n vertices and e edges
($\frac n2 \le e \le n$) contains an independent set J of size $\ge \frac{2n-e}{3}$. Indeed, this is true for n = 2 and, due to
Turan’s theorem, it follows for every graph with n vertices and e = n edges. Let n > 2 and e < n.
We may assume without loss of generality that G has no isolated vertices. Then G must have a vertex
of degree 1. Let w be such a vertex and let z be adjacent to w. We delete z together with all
edges incident to it. The remaining graph has an isolated vertex w and a subgraph H with n−2 vertices
and $\le e-1$ edges. By induction, H contains an independent set Q of size
$\ge \frac{2(n-2)-(e-1)}{3} = \frac{2n-e}{3} - 1$. Thus, the independent set $J = Q \cup \{w\}$ contains $\ge \frac{2n-e}{3}$ vertices.
Having constructed J, we expand it to an $\frac n2$-element subset S by adding one by one the necessary
number of vertices in such a way that each addition brings at most one new edge. Such an expansion
certainly exists, since otherwise we would find a subset T such that
(1) $|T| > \frac n2$;
(2) every $x \in T$ is adjacent to at least two vertices in $V - T$.
This would imply that $e \ge 2|T| > n$, which is impossible. Thus, $S \supseteq J$ induces a subgraph with
$\le \frac n2 - \frac{2n-e}{3} = \frac{2e-n}{6}$ edges. This proves that both $d^-(G)$ and $d^-(n,c)$ are $\ge \frac{2-c}{12}\, n + o(n)$.
To see that $d^-(n,c) \le \frac{2-c}{12}\, n + o(n)$, take the union of $(1-c)n$ edges and $\frac{2c-1}{3}\, n$ triangles
(all are disjoint).
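The extremal construction can be examined by brute force on a small instance. The sketch below is an illustration only: the parameters n = 12 and c = 3/4 (three disjoint edges and two disjoint triangles) are our choice, and the function names are hypothetical. It checks that the sparsest $n/2$-element subset induces exactly $(2e-n)/6$ edges.

```python
from itertools import combinations


def edges_and_triangles(num_edges, num_triangles):
    """Disjoint union of single edges and triangles; returns (n, edge list)."""
    edges = []
    v = 0
    for _ in range(num_edges):
        edges.append((v, v + 1))
        v += 2
    for _ in range(num_triangles):
        edges += [(v, v + 1), (v + 1, v + 2), (v, v + 2)]
        v += 3
    return v, edges


def min_half_edges(n, edges):
    """Minimum of e(S) over all n/2-element subsets S (exhaustive search)."""
    best = len(edges)
    for S in combinations(range(n), n // 2):
        S = set(S)
        best = min(best, sum(1 for u, w in edges if u in S and w in S))
    return best


# c = 3/4, n = 12: (1-c)n = 3 edges and (2c-1)n/3 = 2 triangles.
n, edges = edges_and_triangles(3, 2)
e = len(edges)
print(n, e, min_half_edges(n, edges), (2 * e - n) // 6)
```

Here a largest independent set has 5 vertices (one from each component), so any sixth vertex contributes exactly one edge.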
Next we show (**). If $e \le \frac n4$, then, evidently, G has a subgraph with $\frac n2$ vertices which
contains all edges. This yields $d^+(n,c) \sim \frac{3c}{4}\, n$.
If $e > \frac n4$, then consider the connected components $G_1, G_2, \ldots, G_r$ of G. Let
$e(G_i) = v(G_i) - 1 + \delta_i$ $(i = 1, \ldots, r)$ and let $\delta_1 \ge \delta_2 \ge \ldots \ge \delta_r$. If k is the smallest i with $\delta_i = 0$, then
we assume that $v(G_k) \ge v(G_{k+1}) \ge \ldots \ge v(G_r)$. Let, also, $H = \bigcup_{i=1}^{k+1} G_i$ and $s^* = \sum_{i=1}^{k+1} v(G_i)$.
Obviously, $e(H) \ge s^* - 1$. Therefore, if $s^* \ge \frac n2$, then
\[ d^+(G) \ge \frac{2-c}{4}\, n + o(n). \]
In the case $s^* \le \frac n2$, we add to H some components $G_{k+2}, G_{k+3}, \ldots$ to get a graph, F, with
$\frac n2$ vertices (it is possible that the last component will be only partially included). Clearly, $e(F) \ge \frac e2$
and thus $d^+(n,c) \ge \frac c4\, n + o(n)$. In addition, $e(F) \ge \frac n4$, otherwise
\[ e(F) = \frac12 \sum_{x\in F} d_F(x) \le \frac n4 - 1 \]
would imply that F contains at least two isolated vertices, therefore $e(F) = e \ge \frac n4$.
So, if $c \ge \frac14$ then
\[ d^+(n,c) \ge \begin{cases} \dfrac{1-c}{4}\, n + o(n) & \text{if } \frac14 \le c < \frac12,\\[1ex] \dfrac c4\, n + o(n) & \text{if } \frac12 \le c \le 1. \end{cases} \]
To show that this bound is best possible, consider a graph with n vertices and e edges, which consists
of $p = n-e-1$ disjoint paths of length $\lfloor e/p \rfloor$, and another component, which is a path of length
$l = e - p\lfloor e/p \rfloor$ (in case $l > 0$).
Finally, note that (***) follows from (*) and (**).
4. Bipartite discrepancy.
For any graph G with n vertices and e edges, let the bipartite discrepancy of G be defined by
\[ \mathrm{bdis}(G) = \max\left( |\mathrm{dis}(S,T)| : S \cup T = V(G),\ |S| = \lfloor n/2 \rfloor,\ |T| = \lceil n/2 \rceil \right). \]
That is, bdis(G) is the maximum deviation of the number of edges running between two complementary
halves of V(G) from
\[ e\,\frac{\lfloor n/2 \rfloor \lceil n/2 \rceil}{\binom{n}{2}}, \]
i.e., from its expected value.
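For very small graphs bdis(G) can be computed by enumerating all balanced bipartitions. The sketch below assumes that the deviation is taken in absolute value and that the two parts have sizes $\lfloor n/2 \rfloor$ and $\lceil n/2 \rceil$; the function name `bdis` simply mirrors the notation and the search is exponential in n.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def bdis(n, edges):
    """Max |e(S,T) - e * |S||T| / C(n,2)| over balanced bipartitions (S, T)."""
    e = len(edges)
    s = n // 2
    expected = Fraction(e * s * (n - s), comb(n, 2))
    best = Fraction(0)
    for S in combinations(range(n), s):
        S = set(S)
        cross = sum(1 for u, v in edges if (u in S) != (v in S))
        best = max(best, abs(cross - expected))
    return best


print(bdis(4, [(0, 1), (2, 3)]))                          # two disjoint edges
print(bdis(4, list(combinations(range(4), 2))))           # complete graph K4
```

For the complete graph every balanced bipartition cuts the same number of edges, so the bipartite discrepancy vanishes; for two disjoint edges the extreme bipartition separates the two edges and cuts nothing.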
Conjecture 1. For any $0 < \varepsilon < \frac12$, there exists a $\delta$ such that
\[ \mathrm{bdis}(G) \ge \delta n^{3/2} \]
holds for every graph G with n vertices and $\frac12 \binom n2 \le e \le (1-\varepsilon)\binom n2$ edges.
Conjecture 1′. For any $0 < \varepsilon < \frac12$, there exists a $\hat\delta$ such that, if G is any graph with n vertices and
$\frac12 \binom n2 \le e \le (1-\varepsilon)\binom n2$ edges, and $w_1, w_2, \ldots, w_n$ are any weights assigned to the vertices of G, then
one can always find an $\lfloor n/2 \rfloor$-element subset $S \subset V(G)$ satisfying
\[ \Big| e(S) - \sum_{i\in S} w_i \Big| \ge \hat\delta n^{3/2}. \]
Proposition. Conjecture 1′ implies Conjecture 1.
Proof. Assume, for simplicity, that n is a multiple of 6, and let T 0 be an arbitrary set of n /3 vertices
of G . For any i ∈V (G )−T 0 set
\[ w_i = |\{ t \in T_0 : (i,t) \in E(G) \}| - \frac{3\, e(T_0)}{n}. \]
Applying Conjecture 1′ to the subgraph of G induced by $V(G) - T_0$, we can find an n/3-element subset
$S_0 \subseteq V(G)$, disjoint from $T_0$, with
\[ \Big| e(S_0) - \sum_{i\in S_0} w_i \Big| = |e(S_0) + e(T_0) - e(S_0,T_0)| \ge \hat\delta \left(\frac{2n}{3}\right)^{3/2}. \]
Now split $V(G) - S_0 - T_0$ arbitrarily into n/6 pairs $x_j, y_j$, and let S be a random set which contains
$S_0$ and exactly one vertex from each pair. Further, let $T = V(G) - S$. Then any edge of G with at
least one endpoint not in $S_0 \cup T_0$ has probability precisely $\frac12$ of being in e(S,T), unless it is an
edge of the form $(x_j, y_j)$. Thus
\[ E[e(S) + e(T) - e(S,T)] = e(S_0) + e(T_0) - e(S_0,T_0) - \Delta, \]
where $0 < \Delta \le n/12 = o(n^{3/2})$. Hence there exist S and T with
$|\mathrm{dis}(S,T)| = |e(S) + e(T) - e(S,T)| \ge \delta n^{3/2}$. Note that, in the special case when $w_i = \frac{e}{2n}$, the
truth of Conjecture 1′ follows from [5] or from the Corollary in Section 2.
Let $c_0$ denote the maximal positive c such that a random graph with n vertices and cn edges
has a partition of the vertex set into two subsets of sizes $\lfloor n/2 \rfloor$ and $\lceil n/2 \rceil$ respectively for which the
number of edges with endpoints in different parts is o(n). By [4], a random graph with n vertices
and cn edges consists of a "giant" component of size $\left(1 - \frac{x(c)}{2c}\right) n$ and small components of sizes
$O(\ln n)$, where x(c) is the solution satisfying $0 < x(c) < 1$ of the equation $x(c)e^{-x(c)} = 2c\,e^{-2c}$. For
$c = \ln 2$, the size of the "giant" component is $\frac n2$, implying that $c_0 \ge \ln 2$.
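The value c = ln 2 can be checked numerically: for $2c = \ln 4$ the right-hand side is $(\ln 4)/4 = (\ln 2)e^{-\ln 2}$, so $x(c) = \ln 2$ and the giant component has relative size $1 - \frac{\ln 2}{2\ln 2} = \frac12$. The sketch below solves the equation by bisection, assuming $2c > 1$ and using the fact that $xe^{-x}$ is increasing on (0,1); the function names are illustrative.

```python
import math


def conjugate_root(c, tol=1e-12):
    """Solve x * exp(-x) = 2c * exp(-2c) for 0 < x < 1 by bisection (needs 2c > 1)."""
    target = 2 * c * math.exp(-2 * c)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid * math.exp(-mid) < target:
            lo = mid   # x * exp(-x) is increasing on (0, 1)
        else:
            hi = mid
    return (lo + hi) / 2


def giant_fraction(c):
    """Relative size 1 - x(c)/(2c) of the giant component."""
    return 1 - conjugate_root(c) / (2 * c)


print(conjugate_root(math.log(2)), giant_fraction(math.log(2)))
```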
Conjecture 2. $c_0 = \ln 2$.
Conjecture 2 would follow from the following
Conjecture 3. For every $\varepsilon>0$, there are but $o((1+\varepsilon)^n)$ partitions of the vertex set of a random tree T
into two subsets of sizes $\lfloor n/2 \rfloor$ and $\lceil n/2 \rceil$ respectively, for which the number of edges with endpoints in
different parts is o(n).
References
[1] B. Bollobas, Random Graphs, Academic Press, London-New York, 1985.
[2] D. de Caen, A Note on the Probabilistic Approach to Turan’s Problem,
Journal of Combinatorial Theory, Series B, Vol. 34, No. 3, 1983.
[3] P. Erdos, The Art of Counting (Selected Writings), MIT Press,
Cambridge, Mass. - London, 1973.
[4] P. Erdos and A. Renyi, On the Evolution of Random Graphs, Mat.
Kutato Int. Kozl. 5, 17-60.
[5] P. Erdos and J. Spencer, Imbalances in k-colorations,
Networks 1 (1972), 379-385.
[6] P. Erdos and J. Spencer, Probabilistic Methods in Combinatorics,
Academic Press, New York - London and Akademiai Kiado, Budapest, 1974.
[7] P. Turan, On the Theory of Graphs, Colloq. Math. 3(1954), 19-30.