Chapter 1 On The Secure Hash Algorithm Family
1.1 Introduction
This report is on the Secure Hash Algorithm family, better known as the SHA hash
functions. We will try to analyse how these functions work, focussing especially on the
SHA-1 hash function. After dissecting the SHA-1 hash function itself, we will relate and
compare it to the other hash functions in the SHA family, SHA-0 and SHA-2.
Besides analysing the specific SHA functions, we will also treat the generic structure
of hash functions, as it is relevant for the security of a hash function. We will treat
the different aspects in some detail and illustrate how a naive implementation of hash
functions can be hazardous for security: some attacks can be mounted against any
hash function if it is not implemented correctly.
The main part of this report will focus on the alleged weaknesses found in the SHA-0
hash function by different researchers. We first explain how the SHA-0 function was
initially broken by treating the so-called differential collision search against this function.
After explaining how the idea works, we will explain why this attack is less successful
against the successor of SHA-0, SHA-1. Finally we will look at SHA-2 and summarize the
cryptanalysis results for the SHA family.
The last part will deal with the implications of broken hash functions, and how
dangerous these actually are. Although the weaknesses found may not seem all too
severe, these attacks can indeed have serious consequences, as we will see in the final
section.
A hash function is a function which takes an arbitrary length input and produces a
fixed length ‘fingerprint’ string. Common usage of such a function is as index into
1.1.2 Applications
The Secure Hash Algorithm (SHA) was developed by NIST in association with the
NSA and first published in May 1993 as the Secure Hash Standard. The first revision to
this algorithm was published in 1995 because of an unpublished flaw that had been found,
and was called SHA-1. The first version, later dubbed SHA-0, was withdrawn by the NSA.
The SHA hash function is similar to the MD4 hash function, but adds some complexity to
the algorithm and changes the block size used. SHA was originally intended as part
of the Digital Signature Standard (DSS), a scheme for signing data that needs a
hash function to do so.
In 1998, two French researchers first found a collision on SHA-0 with complexity
2^{61}¹, as opposed to the brute force complexity of 2^{80} [1]. This result was improved in
¹ As the reader might recall, complexity is defined as the number of hash operations needed
for a specific attack.
drastically improved by Wang et al., who could find a collision with complexity 2^{39} [13],
which is within practical reach. It took only five years for the initial SHA function to
be broken, and after another seven years, the best attack possible has only half of the
(logarithmic) complexity of the original hash function. Fortunately, the NSA already
foresaw this in 1995 and released SHA-1.
Cryptanalysis of SHA-1 proved to be much more difficult, as the full 80-round
algorithm was broken only in 2005, and this attack still has a complexity of 2^{63} [12].
This restricts the attack to the theoretical domain, as a complexity of 2^{63} is still
infeasible on present-day computers. Nonetheless, this collision attack requires fewer
than the 2^{80} computations needed for a brute-force attack on SHA-1. We will explain
why it took seven years longer to break SHA-1 in section 1.4.2.
In addition to the SHA-1 hash, NIST also published a set of more complex hash
functions for which the output ranges from 224 bits to 512 bits. These hash algorithms,
called SHA-224, SHA-256, SHA-384 and SHA-512 (sometimes referred to as SHA-2),
are more complex because of the non-linear functions added to the compression function.
As of January 2008, no attacks are known that are better than a brute force attack.
Nonetheless, since the design still shows significant similarity with the SHA-1 hash
algorithm, it is not unlikely that such attacks will be found in the (near) future. Because of
this, an open competition for the SHA-3 hash function was announced on November 2,
2007². The new standard is scheduled to be published in 2012. Fortunately, the SHA-2
hash functions produce longer hashes, making a feasible attack more difficult. Consider
for example the SHA-512 hash function, producing 512-bit hashes and thus having an
approximate complexity against collision attacks of 2^{256}. Even if the logarithmic
complexity were halved (from 256 to 128), this would still be out of reach for practical
purposes for the coming decade or so.
1.2 Description of the SHA Algorithms
1.2.1 SHA-1
SHA-1 takes as input a message with maximum length 2^{64} − 1 bits and returns a 160-bit
message digest. The input is processed in blocks of 512 bits each, and is padded using
the following scheme. First a single 1 bit is appended, followed by 0 bits until bit 448 of
the last block is reached. Finally the length of the message in bits is inserted in the last
64 bits of the block, with the most significant bits padded with 0's. The reason that a 1 is
appended first is that otherwise collisions would occur between a message and the same
message with some zeros appended at the end.
² http://csrc.nist.gov/groups/ST/hash/documents/FR_Notice_Nov07.pdf
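The padding scheme can be sketched in a few lines of Python. This is only an illustration for byte-aligned messages; the function name `sha1_pad` is our own choice and not part of any standard.

```python
def sha1_pad(message: bytes) -> bytes:
    """Pad a byte string as described above (byte-aligned messages only)."""
    bit_length = len(message) * 8
    # Append the single 1 bit, followed by seven 0 bits (one byte 0x80).
    padded = message + b"\x80"
    # Append 0 bytes until 8 bytes (64 bits) remain in the current block.
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    # Append the message length in bits as a 64-bit big-endian integer.
    padded += bit_length.to_bytes(8, "big")
    return padded
```

A 55-byte message still fits in one block, while a 56-byte message needs a second block because the 1 bit and the length field no longer fit.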
The intermediate results of each block are stored in five 32-bit registers denoted by
H0, ..., H4. These five registers are initialized with the following hexadecimal values:
H0 = 0x67452301
H1 = 0xEFCDAB89
H2 = 0x98BADCFE
H3 = 0x10325476
H4 = 0xC3D2E1F0
Next, the algorithm uses four constants K0, K1, K2 and K3, with values:
K0 = 0x5A827999
K1 = 0x6ED9EBA1
K2 = 0x8F1BBCDC
K3 = 0xCA62C1D6
Finally we define four functions: f_exp, f_if, f_maj and f_xor. The first takes the 512-bit
message block as argument; each of the other functions takes three 32-bit words (b, c, d)
as arguments.
The f_exp function expands the initial 512-bit input message M, consisting of 16 32-bit
words M_i with 0 ≤ i ≤ 15, to 80 32-bit words W_i with 0 ≤ i ≤ 79:
W_i = \begin{cases} M_i, & \text{if } 0 \le i \le 15 \\ (W_{i-3} \oplus W_{i-8} \oplus W_{i-14} \oplus W_{i-16}) \ll 1, & \text{if } 16 \le i \le 79 \end{cases}
where ≪ denotes a left rotation of the 32-bit word. The other three functions are
f_{if}(b, c, d) = (b \land c) \oplus (\lnot b \land d)
f_{maj}(b, c, d) = (b \land c) \oplus (b \land d) \oplus (c \land d)
f_{xor}(b, c, d) = b \oplus c \oplus d
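To make the three round functions concrete, the following Python sketch implements them on 32-bit words stored in ordinary integers. The constant `MASK32` and the function names are our own illustrative choices.

```python
MASK32 = 0xFFFFFFFF  # keeps results within 32 bits


def f_if(b: int, c: int, d: int) -> int:
    """Bitwise selection: where b has a 1 take the bit of c, otherwise the bit of d."""
    return ((b & c) ^ (~b & d)) & MASK32


def f_maj(b: int, c: int, d: int) -> int:
    """Bitwise majority: each output bit is the value held by at least two inputs."""
    return (b & c) ^ (b & d) ^ (c & d)


def f_xor(b: int, c: int, d: int) -> int:
    """Bitwise parity of the three inputs; this is the only linear function."""
    return b ^ c ^ d
```

For example, with b set to all ones f_if returns c unchanged, and with b set to zero it returns d.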
1.2.2 SHA-0
The design proposal for SHA-0 was retracted by the NSA shortly after publishing it.
The only difference between SHA-0 and SHA-1 is in the message expansion phase:
W_i = \begin{cases} M_i, & \text{if } 0 \le i \le 15 \\ W_{i-3} \oplus W_{i-8} \oplus W_{i-14} \oplus W_{i-16}, & \text{if } 16 \le i \le 79 \end{cases} \tag{1.1}
Note the absence of the left-rotate operation.
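The description of SHA-1 and SHA-0 given so far can be put together in a short Python sketch. It is meant as an illustration under our own naming (`sha`, `compress`, `expand` and the `rotate` flag are assumptions of this sketch), not as a reference implementation. The `rotate` flag switches between the SHA-1 expansion and the SHA-0 expansion of equation (1.1).

```python
import struct

MASK32 = 0xFFFFFFFF
IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
K = (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)


def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n positions."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def expand(block: bytes, rotate: bool) -> list:
    """Expand 16 message words to 80 words (rotate=True: SHA-1, False: SHA-0)."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 80):
        x = w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16]
        w.append(rotl(x, 1) if rotate else x)
    return w


def compress(state: tuple, block: bytes, rotate: bool = True) -> tuple:
    """Apply the 80 rounds to one 512-bit block and add the result to the state."""
    w = expand(block, rotate)
    a, b, c, d, e = state
    for i in range(80):
        if i < 20:
            f = (b & c) ^ (~b & d)            # f_if
        elif 40 <= i < 60:
            f = (b & c) ^ (b & d) ^ (c & d)   # f_maj
        else:
            f = b ^ c ^ d                     # f_xor
        t = (rotl(a, 5) + (f & MASK32) + e + w[i] + K[i // 20]) & MASK32
        a, b, c, d, e = t, a, rotl(b, 30), c, d
    return tuple((s + v) & MASK32 for s, v in zip(state, (a, b, c, d, e)))


def pad(message: bytes) -> bytes:
    """Append a 1 bit, zeros, and the 64-bit message length in bits."""
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    return padded + (len(message) * 8).to_bytes(8, "big")


def sha(message: bytes, rotate: bool = True) -> str:
    """Hash a byte string; rotate=True gives SHA-1, rotate=False gives SHA-0."""
    state = IV
    data = pad(message)
    for offset in range(0, len(data), 64):
        state = compress(state, data[offset:offset + 64], rotate)
    return struct.pack(">5I", *state).hex()
```

With `rotate=True` the output can be compared against `hashlib.sha1` from the Python standard library; with `rotate=False` the only change is the missing rotation in the expansion.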
SHA-0(M):
    (* Let M be the padded message to be hashed. All additions are modulo 2^32. *)
    for each 512-bit block B in M do
        W = f_exp(B);
        (* Initialize the registers with the constants. *)
        a = H0; b = H1; c = H2; d = H3; e = H4;
        for i = 0 to 79 do
            (* Apply the 80 rounds of mixing. *)
            if 0 ≤ i ≤ 19 then
                T = (a ≪ 5) + f_if(b, c, d) + e + W_i + K0;
            else if 20 ≤ i ≤ 39 then
                T = (a ≪ 5) + f_xor(b, c, d) + e + W_i + K1;
            else if 40 ≤ i ≤ 59 then
                T = (a ≪ 5) + f_maj(b, c, d) + e + W_i + K2;
            else if 60 ≤ i ≤ 79 then
                T = (a ≪ 5) + f_xor(b, c, d) + e + W_i + K3;
            e = d; d = c; c = b ≪ 30; b = a; a = T;
        (* After all the rounds, save the values in preparation of the next data block. *)
        H0 = a + H0; H1 = b + H1; H2 = c + H2; H3 = d + H3; H4 = e + H4;
    (* After all 512-bit blocks have been processed, return the hash. *)
    return concat(H0, H1, H2, H3, H4);
1.2.3 SHA-2
SHA-2 is a common name for four additional hash functions, also referred to as SHA-224,
SHA-256, SHA-384 and SHA-512. Their suffix originates from the bit length of
the message digest they produce. The versions with length 224 and 384 are obtained by
truncating the result from SHA-256 and SHA-512 respectively. SHA-256 uses a block
size of 512 bits and iterates 64 rounds, while SHA-512 uses a 1024-bit block size and
has 80 rounds. Furthermore, SHA-512 uses an internal word size of 64 bits instead
of the 32 bits used by all other SHA variants. The SHA-2 algorithms follow the same
structure of message expansion and iterated state update transformation as SHA-1, but
both message expansion and state update transformation are much more complex. We
will discuss SHA-256 in more detail to indicate differences between SHA-1 and SHA-2.
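SHA-256 uses sixty-four constants K_0, ..., K_63 of 32 bits each and eight registers
H_0, ..., H_7 to store intermediate results. Their values can be found in [6]. The function
definitions for SHA-256 are:
W_i = \begin{cases} M_i, & \text{if } 0 \le i \le 15 \\ \sigma_1(W_{i-2}) + W_{i-7} + \sigma_0(W_{i-15}) + W_{i-16}, & \text{if } 16 \le i \le 63 \end{cases} \tag{1.2}
with
\Sigma_0(x) = \mathrm{ROTR}^{2}(x) \oplus \mathrm{ROTR}^{13}(x) \oplus \mathrm{ROTR}^{22}(x)
\Sigma_1(x) = \mathrm{ROTR}^{6}(x) \oplus \mathrm{ROTR}^{11}(x) \oplus \mathrm{ROTR}^{25}(x)
\sigma_0(x) = \mathrm{ROTR}^{7}(x) \oplus \mathrm{ROTR}^{18}(x) \oplus \mathrm{SHR}^{3}(x)
\sigma_1(x) = \mathrm{ROTR}^{17}(x) \oplus \mathrm{ROTR}^{19}(x) \oplus \mathrm{SHR}^{10}(x)
where ROTR^a denotes cyclic rotation by a positions to the right, and SHR^a denotes a
logical shift by a positions to the right.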
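As an illustration, the four SHA-256 helper functions and the message expansion of equation (1.2) can be written in Python as follows. The names `rotr`, `shr`, `big_sigma0` and so on are our own; the constants K_i and the full state update are left out of this sketch.

```python
MASK32 = 0xFFFFFFFF


def rotr(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n positions."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def shr(x: int, n: int) -> int:
    """Logical shift right by n positions."""
    return x >> n


def big_sigma0(x: int) -> int:
    return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)


def big_sigma1(x: int) -> int:
    return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)


def small_sigma0(x: int) -> int:
    return rotr(x, 7) ^ rotr(x, 18) ^ shr(x, 3)


def small_sigma1(x: int) -> int:
    return rotr(x, 17) ^ rotr(x, 19) ^ shr(x, 10)


def expand_sha256(m: list) -> list:
    """Expand 16 message words to 64 words; additions are modulo 2^32."""
    w = list(m)
    for i in range(16, 64):
        w.append((small_sigma1(w[i - 2]) + w[i - 7]
                  + small_sigma0(w[i - 15]) + w[i - 16]) & MASK32)
    return w
```

Unlike the SHA-0 and SHA-1 expansions, this expansion mixes XOR with addition modulo 2^32, so it is no longer linear over GF(2).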
Figure 1.2: Schematic overview of a SHA-0/SHA-1 round.
Algorithm 1.3: The SHA-256 algorithm.
Figure 1.4: Schematic overview of a SHA-2 round. Note the added non-linear functions
in comparison with SHA-1.
1.3 Generic Attacks
1.3.1 Brute Force Attacks
When considering attacks on hash functions, there are several different attacks possible,
with varying difficulty. We define several aspects of a hash function below (see also [4]).
Definition 1.1 Hash function H is one-way if, for random key k and an n-bit string
w, it is hard for the attacker presented with k, w to find x so that H_k(x) = w.
Definition 1.2 Hash function H is second-preimage resistant if it is hard for the attacker
presented with a random key k and random string x to find y ≠ x so that
H_k(x) = H_k(y).
Definition 1.3 Hash function H is collision resistant if it is hard for the attacker
presented with a random key k to find x and y ≠ x such that H_k(x) = H_k(y).
It is clear that the last definition implies the second definition: if it is hard to find
a collision between two chosen strings x and y, it is even harder to find an x given a y
with the same hash. It is in general more difficult to find a relation between one-wayness
of a hash function and its collision resistance. If a function is one-way, it does not
mean it is difficult to find a collision. For example, take a function which takes a string
of arbitrary length and returns the last 10 characters of this string. Clearly, from these
ten characters, it is impossible to reconstruct the input string (if this string was longer
than 10 characters), but it is also easy to see that collisions can be generated without
any trouble.
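The toy function from this example is easily written down. In the Python sketch below, the function name and the two sample strings are purely illustrative.

```python
def last_ten(text: str) -> str:
    """Toy 'hash': return the last 10 characters of the input string."""
    return text[-10:]


# Two different inputs that share their last ten characters collide trivially.
first = "pay Alice 100 euros, reference 0123456789"
second = "pay Mallory 999 euros, reference 0123456789"
```

Both strings map to "0123456789", while the part of the input before the last ten characters cannot be recovered from the output.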
Now if we want to attack a hash function on the second definition, the (second-)preimage
attack, the best method that works on any hash function (i.e. a generic
attack) is an exhaustive search. Given a hash function H_k, finding x such that
H_k(x) = w for given w and k (with k an l-bit key and w an n-bit string) would on
average take 2^{n−1} operations (yielding a 50% chance of finding a preimage for w). If we
are dealing with a random H and treat it as a black box, a preimage attack is as difficult
as a second-preimage attack. This means that knowing a y with H_k(y) = w does not give
an advantage for finding x such that H_k(x) = H_k(y) = w.
On the other hand, if we are looking for a collision for a hash function H, things
are a lot easier. Because of the "birthday paradox" the complexity of such a problem
is only about 2^{n/2}. Given 23 random people in a room, the chance that there are two
people with the same birthday is a little over 50%, much higher than intuition would
suggest, hence the name. A simple explanation is that the chance two people do not
share their birthdays is 364/365. Now given p people, the chance that at least two people
share a birthday is given by
P = 1 - \prod_{k=1}^{p-1} \left( 1 - \frac{k}{365} \right) \quad \text{for } p < 366 \tag{1.3}
and for p = 23 this yields a chance of 50.7%. Of course for p ≥ 366 the chance is exactly
1.
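Equation (1.3) is easy to evaluate numerically. The following Python sketch does so; the function name and the `days` parameter are our own choices for illustration.

```python
def birthday_probability(people: int, days: int = 365) -> float:
    """Chance that at least two of `people` share a birthday, as in (1.3)."""
    if people > days:
        return 1.0  # pigeonhole principle: a shared birthday is certain
    p_all_distinct = 1.0
    for k in range(1, people):
        p_all_distinct *= 1 - k / days
    return 1 - p_all_distinct
```

Evaluating `birthday_probability(23)` gives approximately 0.507, while 22 people are not yet enough to pass the 50% mark.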
The birthday paradox can be used on cryptographic hashes because the output of a
hash function can be considered (pseudo-)random. Therefore if the hash is N bits long,
there are 2^N hashes, but after trying only a fraction of that, we have a high chance for
a collision. If we generalize equation (1.3) to range 2^N instead of 365, and count trials
with t, we get the following expression
P(t; 2^N) = 1 - \prod_{k=1}^{t-1} \left( 1 - \frac{k}{2^N} \right) \quad \text{for } t < 2^N \tag{1.4}
i.e. the result is that we only have to try about 2^{N/2} hashes before finding a collision.
This means that finding a collision is much simpler than a preimage attack. Again using
the birthday illustration, finding two people with the same birthday is relatively easy,
but finding someone else who shares your birthday is much harder.
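The birthday bound can be observed experimentally on a deliberately shortened hash. The Python sketch below truncates SHA-256 (from the standard `hashlib` module) to a small number of bits and hashes the decimal strings "0", "1", "2", ... until two of them collide. Truncating to 24 bits and using counter strings as messages are assumptions of this sketch, chosen only to keep the search fast.

```python
import hashlib


def truncated_hash(data: bytes, bits: int) -> int:
    """Return the first `bits` bits of the SHA-256 digest as an integer."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int):
    """Hash successive counter strings until two share a truncated hash."""
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode()
        value = truncated_hash(message, bits)
        if value in seen:
            return seen[value], message, counter + 1
        seen[value] = message
        counter += 1


first, second, trials = find_collision(24)
```

For a 24-bit hash the number of trials is typically a few thousand, in the order of 2^{12}, rather than the roughly 2^{23} attempts an exhaustive preimage search would need on average.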
To understand the impact of the different attacks possible, we will elaborate a bit on the
structure of hash functions. Most hash functions consist of two parts. The first part is
the compression function, which takes a key (k) and some fixed-length input message
(x) and outputs a hash. The second part is called the domain extender and links several
compression functions together in such a way that the total hash function can operate
on an input string of arbitrary length.
The compression function usually works on blocks of data, much like DES or AES
do. The compression function takes a piece of data n bits long and runs several rounds
of mixing on it. Before the data is hashed, it is padded so that the total length is an
integer multiple of the block size, as explained in section 1.2.1. Usually, the mixing is
done with the data itself as well as with some key of length k. The output is then a hash
of these two input vectors and, hopefully, the input vectors are not recoverable from the
output hash. If the compression function is not invertible and finding collisions is
not easier than with the birthday attack, then the compression function is strong and
usually the resulting hash function is too.
Before the data block is used in the (non-linear) functions, it is sometimes expanded.
In SHA, for example, the 512-bit input block (16 32-bit words) is expanded to 80 32-bit
words (2560 bits). Then in each of the 80 rounds, a different part of the expanded
message is used. This makes some kinds of attacks more difficult on the hash function,
as we shall see in 1.4.1. In fact, the only difference between SHA-0 and SHA-1 is the
message expansion, which makes the difference between a complexity of 2^{39} for SHA-0
versus 2^{69} for SHA-1.
In many hash functions, including SHA, the key used in the first round is some
initialisation vector. After the first round, the output of the hash function is used as
key for the next rounds. This construction is part of the Merkle-Damgård scheme,
explained below.
A domain extender that is used a lot, and also in the SHA family, is the Merkle-
Damgård domain extender, which works as follows. The compression function is used
on an initialisation vector (IV) used as key and the first block of the input message.
The output hash is then used as key for the next iteration where the second block of
the data is used as input. In this way, a compression function can be used on a message
of any length. Figure 1.5 shows the Merkle-Damgård scheme in action.
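The chaining described above can be sketched in Python with a toy compression function. Everything specific in this sketch is an assumption made for illustration: the 8-byte block size, the 4-byte chaining value, the all-zero IV, and the use of truncated SHA-256 as a stand-in compression function.

```python
import hashlib

BLOCK = 8              # toy block size in bytes
IV = b"\x00" * 4       # toy initialisation vector, used as the first key


def compress(key: bytes, block: bytes) -> bytes:
    """Toy compression function: maps (4-byte key, 8-byte block) to 4 bytes."""
    return hashlib.sha256(key + block).digest()[:4]


def md_pad(message: bytes) -> bytes:
    """Append a 1 bit, zeros up to a block boundary, and a final length block."""
    padded = message + b"\x80"
    padded += b"\x00" * (-len(padded) % BLOCK)
    return padded + len(message).to_bytes(BLOCK, "big")


def md_hash(message: bytes) -> bytes:
    """Merkle-Damgard iteration: each output is the key of the next call."""
    state = IV
    data = md_pad(message)
    for offset in range(0, len(data), BLOCK):
        state = compress(state, data[offset:offset + BLOCK])
    return state
```

The loop in `md_hash` is the whole domain extender: it contributes no mixing of its own, so the strength of the construction rests entirely on `compress`.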
Figure 1.5: The Merkle-Damgård domain extender.
There are however several attacks possible against this domain extender, regardless
of the hash function used. It should be noted that the domain extender does not
add security to the hash scheme. A requirement of a secure hash scheme is that the
compression function is safe (i.e. non-invertible and strongly collision resistant); only then
is the combination of compression function with domain extender secure.
MAC Weaknesses
An example of the way the Merkle-Damgård scheme can be attacked is the following.
Consider the signing of a message. Suppose we have a hash function H which works like
H_k(M) = t, with k the key, M the message and t the signature. If we sign our message
in this way and protect the key, one would assume that signing M is secure.
However, if the hash function uses the domain extender explained above (like SHA-1),
this is not true. Let k be 80 bits long, and M 256 bits. Consider now the scenario
where the adversary captures M and the 160-bit signature t. The tag captured is equal
to
t = \text{SHA-1}(k \| M) = C(k \| M \| \underbrace{100 \ldots 0}_{112 \text{ bits}} \| \underbrace{0 \ldots 101010000}_{64 \text{ bits}}), \tag{1.6}
with C() the compression function used in SHA-1 and || the concatenation operator. If
we now want to forge a signed message, we can easily construct a new
message M′ which is also 'signed' with k, without knowing k. Consider the message
M′ = k||M||100 . . . 101010000||T, with T arbitrary text we can choose to our liking.
Now when we hash this using SHA-1, this results in:
This hashing calls the compression function twice because the message length exceeds
the block length, and therefore the domain extender is employed. However, if we look
at the first iteration of the compression function, it is exactly equal to t! Since this is
known we can feed this to the compression function in the second round of the domain
extender as key, and then hash our own message T with that, and we have a message
signed with key k. Clearly, this method of signing messages is completely broken.
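The forgery can be reproduced with a short Python sketch. It re-implements the SHA-1 compression function so that the captured tag can be loaded as the chaining value; the function names (`extend`, `padding`, `compress`) are our own, and the attacker is assumed to know only the tag and the combined length of k and M in bytes.

```python
import struct

MASK32 = 0xFFFFFFFF
K = (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)


def rotl(x: int, n: int) -> int:
    return ((x << n) | (x >> (32 - n))) & MASK32


def compress(state: tuple, block: bytes) -> tuple:
    """SHA-1 compression function on one 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 80):
        w.append(rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    a, b, c, d, e = state
    for i in range(80):
        if i < 20:
            f = (b & c) ^ (~b & d)
        elif 40 <= i < 60:
            f = (b & c) ^ (b & d) ^ (c & d)
        else:
            f = b ^ c ^ d
        t = (rotl(a, 5) + (f & MASK32) + e + w[i] + K[i // 20]) & MASK32
        a, b, c, d, e = t, a, rotl(b, 30), c, d
    return tuple((s + v) & MASK32 for s, v in zip(state, (a, b, c, d, e)))


def padding(total_len: int) -> bytes:
    """SHA-1 padding for a message of total_len bytes."""
    zeros = (55 - total_len) % 64
    return b"\x80" + b"\x00" * zeros + (total_len * 8).to_bytes(8, "big")


def extend(tag_hex: str, original_len: int, suffix: bytes):
    """Given t = SHA-1(k||M) and len(k||M), return (glue, tag of k||M||glue||suffix)."""
    glue = padding(original_len)                # the padding that was hashed with k||M
    state = struct.unpack(">5I", bytes.fromhex(tag_hex))  # t becomes the next key
    forged_len = original_len + len(glue) + len(suffix)
    data = suffix + padding(forged_len)
    for offset in range(0, len(data), 64):
        state = compress(state, data[offset:offset + 64])
    return glue, struct.pack(">5I", *state).hex()
```

For an 80-bit key and a 256-bit message, `original_len` is 42 bytes and the returned glue is the 112-bit padding followed by the 64-bit length field from equation (1.6). The forged tag matches what the key holder would compute over M followed by the glue and the suffix.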
To better explain this attack, consider a colliding pair of messages N and N′. Now
consider the two messages T and T′ which are constructed as follows:
with M_i a block of data. If we now let a trusted party sign T, this signature will also
work for T′, since N and N′ collide. Now this may not seem very useful, but if we use
the conditional possibilities of for example the Postscript format, we can influence the
message shown depending on whether we embed N or N′:
If R1 and R2 are equal, Postscript executes the first instruction set, and the second
otherwise. Now if we use the poisoned blocks we can choose between these instruction
sets, and thus influence the output of the Postscript document. This attack is not only
theoretical, but has been carried out in practice by Magnus Daum and Stefan Lucks in
2005. They constructed two files with the same MD5 hash [2]. In 2006, Stevens, Lenstra
and de Weger constructed two colliding X.509 certificates [10].
The implication of the poisoned block attack is obvious. If you can find a collision
for a hash function using the Merkle-Damgård scheme, you can embed these blocks in
bigger files. When using some higher level formatting language (Postscript or another
language with conditionals), this block can be used to determine the flow of the
formatting, branching into two different outputs. The bottom line is: don't use broken hash
functions.
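The mechanism behind the poisoned block attack can be demonstrated on a toy Merkle-Damgård hash. In the Python sketch below, the 16-bit chaining value, the 8-byte block size and the truncated SHA-256 compression function are all assumptions chosen so that a collision is found within a fraction of a second; real attacks target the actual compression function of MD5 or SHA.

```python
import hashlib

BLOCK = 8              # toy block size in bytes
IV = b"\x00\x00"       # toy 16-bit initialisation vector


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function with a 16-bit output."""
    return hashlib.sha256(state + block).digest()[:2]


def md_hash(message: bytes) -> bytes:
    """Toy Merkle-Damgard hash with length padding."""
    data = message + b"\x80"
    data += b"\x00" * (-len(data) % BLOCK) + len(message).to_bytes(BLOCK, "big")
    state = IV
    for offset in range(0, len(data), BLOCK):
        state = compress(state, data[offset:offset + BLOCK])
    return state


def find_colliding_blocks():
    """Birthday search for two blocks N, N' that collide under the IV."""
    seen = {}
    counter = 0
    while True:
        block = counter.to_bytes(BLOCK, "big")
        out = compress(IV, block)
        if out in seen:
            return seen[out], block
        seen[out] = block
        counter += 1


n1, n2 = find_colliding_blocks()
suffix = b"the same trailing blocks in both files"
```

Because both blocks lead to the same chaining value, `md_hash(n1 + suffix)` and `md_hash(n2 + suffix)` are equal for any common suffix. This is why a single colliding block pair is enough to build two complete files with the same hash.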
The SHA-0 hash function produces 160-bit hashes, which means that a birthday attack needs
about 2^{80} hash operations to be successful. In 1998 this hash function was broken by
the two French researchers Chabaud and Joux [1]. In this section, their approach is
explained.
The method Chabaud and Joux used on SHA-0 can be summarized as follows:
• From these collisions, find collisions for the real hash function
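[Figure: schematic of the SHA-0 structure. The expansion E produces 2560 bits (80 blocks
of 32 bits), which are used in the state update on the registers A, B, C, ... with the
operations ROL5, ROL30, ADD, f and K.]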
Now for SHI-1, we can construct two colliding W's. First we choose any (random)
W, and after that we construct a second W′. Now for every bit negated between W and W′,
we apply the correction mechanism to W′. After we apply this correction over W′, these
two expanded messages have the same SHI-1 hash. Because for a change in round i we
need to change bits up to round i + 5, we cannot have differences in the
last five rounds, because these cannot be corrected.
Below are some graphics illustrating the above mechanism. We start with a change
in the first bit of the first round (located at the top right). After applying the corrections
on W, we get the result shown in Fig. 1.7(a) for the first ten 32-bit words of W. If we do
not apply the correction to the input message W, we get the error propagation shown
in Fig. 1.7(b). Displayed in that figure is the state of A for the first ten rounds. If one
looks closely, the pattern shown in Fig. 1.7(a) is visible, and in addition there is the
result of not correcting the error in time: it obviously gets out of hand.
Figure 1.7: (a) The mask necessary for the correction of the negation of bit 1 in round
1. (b) The result of negating bit 1 in round 1 if no correction is applied for SHI-1.
Now we know how to construct colliding W ’s for SHI-1 using a mask, but this
is actually the result of the expansion process, and thus not directly under control.
Only the first 16 32-bit words can be influenced directly, and the rest is generated in
accordance with formula (1.1). Fortunately, this function is linear in the sense that the
bits do not mix. Thus we want to go back from a mask which we can apply over W to
a (much shorter) mask which we can apply over M . This M will then be a colliding
message for SHI-1.
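The two properties used here, linearity of the SHA-0 expansion and the fact that bit positions do not mix, can be checked directly. The Python sketch below uses our own function names and random test messages generated with a fixed seed.

```python
import random


def expand_sha0(m: list) -> list:
    """SHA-0 message expansion of formula (1.1): 16 words in, 80 words out."""
    w = list(m)
    for i in range(16, 80):
        w.append(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16])
    return w


def bit_slice(words: list, j: int) -> list:
    """Extract bit position j of every word as a list of 0/1 values."""
    return [(word >> j) & 1 for word in words]


rng = random.Random(0)
m1 = [rng.getrandbits(32) for _ in range(16)]
m2 = [rng.getrandbits(32) for _ in range(16)]

# Linearity: expanding the XOR equals the XOR of the expansions.
xor_of_expansions = [x ^ y for x, y in zip(expand_sha0(m1), expand_sha0(m2))]
expansion_of_xor = expand_sha0([x ^ y for x, y in zip(m1, m2)])
```

Linearity means that a mask on W corresponds to a mask on M regardless of the message itself. The bit-slice property means that each of the 32 bit positions can be searched separately over its 16 input bits.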
As said above, the functions are linear, which means that we can exhaustively
search all possibilities: there are only 2^{16} for each bit, so in total there are
32 · 2^{16} = 2^{21} expansion functions we have to carry out, which is doable. After
iterating over all possible input messages, we find masks to apply over M such that a
collision is found. This completes breaking the SHI-1 hash function.
Of course the SHI-1 hash function is a very weak hash and is not used at all. Therefore
the next step is to insert the non-linear functions again and see what effect this has
on the error propagation.
Chabaud and Joux continue to define two more simplified versions of SHA-0, SHI-2
and SHI-3. They use these to study the behavior of the non-linear functions in SHA-0.
In SHI-2 they add the non-linear functions f_if and f_maj compared with SHI-1, and in
SHI-3 they re-introduce the non-linear ADD to update the state of variable a, but
replace f_if and f_maj again with f_xor (compare with SHA-0/1 in Section 1.2.1). They
analyze with what probability f_if and f_maj will behave as f_xor and find that, due
to the properties of f_if, no consecutive perturbations may occur before round 16.
Given a change in both c and d, the output of f_xor does not change, while the output of
f_if always changes. Since f_if is active from round 0 to 19, a perturbation in both round
16 and round 17 will be propagated to state variables c and d by round 20. Finally
they are able to find a valid pattern which behaves correctly under all constraints
with a probability of 1/2^{24}. Analyzing the ADD function we see that so-called 'carries'
occur, which means that a perturbation in a certain bit position might cause a bit in
the subsequent position to change.
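The effect of carries can be made visible with a small experiment. The Python sketch below (function name our own) adds a perturbation 2^j to a word and returns the XOR difference between the sum and the original word, that is, the set of bit positions that actually changed.

```python
MASK32 = 0xFFFFFFFF


def add_difference(x: int, j: int) -> int:
    """Bits that change when 2^j is added to x modulo 2^32."""
    return ((x + (1 << j)) & MASK32) ^ x


# No carry: bit 1 of x is 0, so only bit 1 changes.
clean = add_difference(0b0000, 1)

# Carry: bit 1 of x is already 1, so bit 2 changes as well.
with_carry = add_difference(0b0010, 1)

# Over all 8-bit values of x, exactly half give the clean one-bit difference.
clean_count = sum(1 for x in range(256) if add_difference(x, 1) == 0b10)
```

Here `clean` is 0b10 and `with_carry` is 0b110. A perturbation therefore stays confined to its own bit position only with probability 1/2 per addition, which is one source of the probabilistic nature of the attack.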
1.6 Conclusion
As we have illustrated, SHA-0 is completely broken. The best attacks are becoming
feasible to calculate within a reasonable time and as such, this function is completely
insecure. Fortunately this was already foreseen at the time when SHA-1 was published.
This hash remedies parts of the problems with SHA-0, but at this time there are already
attacks possible that are faster than the birthday attack. This is not very remarkable as
the hash functions are almost identical, the only difference being the message expansion.
It is therefore not unlikely that this hash function will also succumb.
The SHA-2 hash functions on the other hand have as of now not been broken. The
best attacks possible only work on a reduced version, which gives hope for the strength
of these functions. Again, the fact that this hash function is not broken yet is not
very remarkable, as it is much more complex than SHA-1, something that can easily
be seen by comparing Figures 1.4 and 1.2. Even if SHA-2 is broken though, it will be
quite some time until these attacks will be feasible. Since a SHA-0 hash is 160 bits, a
birthday attack has a complexity around 2^{80}, but since the SHA-2 functions produce
much longer hashes, the complexity of birthday attacks against these functions ranges
from 2^{112} to 2^{256}. Even if the latter were broken with the complexity reduced to
2^{128}, which would be quite a successful attack, it would still be much stronger
than the original SHA-1 hash function.
This leaves the authors to wonder why such relatively short hash functions are still
used. SHA-1 is very common and produces 160-bit hashes. Although a complexity of
2^{80} is not yet reachable, it is quite close to what is actually feasible. If the hash length
were doubled, the complexity would rise to 2^{160}, which is completely impossible to
attack. Even if it were broken severely, it would still pose no practical problems
for years to come. Furthermore, the cost of the additional hash length is small: a naive
doubling of the hash length would cost a factor of two more in computing time. If the
current rise of 64-bit processors is taken into account, optimized implementations could
perhaps even be faster.
In any case, the story of SHA seems to be a story true for any hash function. MD5,
the hash function used before SHA (and which is still used a lot today), also suffered
the same fate. First a seemingly unbreakable hash function is released, using all kinds
of complex functions to mix the input with some key. Then later some cracks begin to
appear in the design and finally the hash function is broken. It seems that widely used
hash functions are not much more than a collection of rounds, message expansion and
some complex non-linear functions consisting of AND, OR and XOR gates. The result is
something that looks impressive, but has never been proven to be secure.
This problem could be solved by using hash functions based on RSA or ElGamal,
which have been proven to be secure (if at least the discrete log is hard, which it very
probably is). The security of these hash functions would be beyond doubt and the only
worry we would face is that the keys at some point are too short and can be brute forced,
but this is of course true for any cryptographic function. The reason that these
number-theoretical routines are not used is of course that they are way too slow, and the
second best that people settle for is a seemingly complex function that runs fast on
computers.
The cat and mouse game is not at an end yet, as in ...
Bibliography
[1] Florent Chabaud and Antoine Joux. Differential collisions in SHA-0. In Advances in
Cryptology - CRYPTO '98, volume 1462, pages 253–261. Springer Berlin / Heidelberg,
February 1998.
[2] Magnus Daum and Stefan Lucks. Attacking hash functions by poisoned messages.
Eurocrypt 2005 presentation, 2005. http://www.cits.rub.de/MD5Collisions/.
[3] Krystian Matusiewicz, Josef Pieprzyk, Norbert Pramstaller, Christian Rechberger, and
Vincent Rijmen. Analysis of simplified variants of SHA-256. In WEWoRC 2005 - Western
European Workshop on Research in Cryptology, pages 123–134, 2005.
[4] Ilya Mironov. Hash functions: Theory, attacks, and applications. Technical report,
Microsoft Research, Silicon Valley Campus, November 2005.
[5] NIST. FIPS publication 180-1: Secure hash standard. Technical report, National Institute
of Standards and Technology (NIST), April 1995.
[6] NIST. FIPS publication 180-2: Secure hash standard. Technical report, National Institute
of Standards and Technology (NIST), August 2002.
[7] Vincent Rijmen. Current status of SHA-1. Technical report, Rundfunk und Telekom
Regulierungs-GmbH, Austria, February 2007.
[9] R. L. Rivest, A. Shamir, and L. Adleman. A method for obtaining digital signatures and
public-key cryptosystems. Commun. ACM, 21(2):120–126, 1978.
[10] Marc Stevens, Arjen Lenstra, and Benne de Weger. Target collisions for MD5 and colliding
X.509 certificates for different identities. Cryptology ePrint Archive, Report 2006/360,
2006. http://eprint.iacr.org/.
[11] Makoto Sugita, Mitsuru Kawazoe, Ludovic Perret, and Hideki Imai. Algebraic
cryptanalysis of 58-round SHA-1. In Fast Software Encryption, pages 349–365, 2007.
[12] Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu. Finding collisions in the full SHA-1. In
Advances in Cryptology - CRYPTO 2005, pages 17–36, 2005.
[13] Xiaoyun Wang, Hongbo Yu, and Yiqun Lisa Yin. Efficient collision search attacks on
SHA-0. In Advances in Cryptology - CRYPTO 2005, pages 1–16, 2005.