Introduction To Cryptography (2025)


Introduction to Cryptography

Alessandro Barenghi

Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB)


Politecnico di Milano
alessandro -dot- barenghi - at - polimi -dot- it

A. Barenghi 089165 - Computer Security


What is cryptography (alt. cryptology)?

Definition
• The study of techniques to allow secure communication and data storage in
presence of attackers

Features provided
• Confidentiality: data can be accessed only by chosen entities
• Integrity/freshness: detect/prevent tampering or replays
• Authenticity: data and their origin are certified
• Non-repudiation: data creator cannot repudiate created data
• Advanced features: proofs of knowledge/computation



The Problem to Solve

[Figure: Alice sends a "secret message" to Bob over an untrusted channel (the Internet); a threat agent sits on the channel.]

A Brief History of Cryptography
● From Greek: kryptos, hidden, and graphein,
to write (i.e., “art of secret writing”)
● Ancient history: writing itself was already a “secret technique”.
● Cryptography was born in ancient societies: as writing became more common, hidden writing became a need.
Cryptographic prehistory

As old as written communication


• Born for commercial (recipe for lacquer on clay
tablets) or military (Spartans) uses
• Designed by humans, for human computers
• Algorithms computed by hand, with pen and
paper



Ancient view

Original approach
• A battle of wits between
• cryptographers: ideate a secret
method to obfuscate a text
• cryptanalysts: figure out the
method, break the “cipher”
• Bellaso (1553) [1] separates the
encryption method from the key



Cryptographic history

1883 - Kerckhoffs's six principles for a good cipher (apparatus)

1 It must be practically, if not mathematically, unbreakable
2 It should be possible to make it public, even to the enemy
3 The key must be communicable without written notes and changeable whenever the correspondents want
4 It must be applicable to telegraphic communication
5 It must be portable, and should be operable by a single person
6 Finally, given the operating environment, it should be easy to use: it should neither impose excessive mental load nor require a large set of rules to be known


Cryptographic modern history

The advent of the machines


• Mechanical computation changes
cryptography
• First rotor machine in 1917 by Ed
Hebern
• Design “popularized” in WWII by
German Enigma
• Cryptanalysts at Bletchley Park (Turing among them) were credited by Eisenhower with a decisive effort in winning the war


When Math Won a War
● During WWII, Alan
Turing worked at
Bletchley Park to
break Axis ciphers,
in particular the
Enigma cipher.
● Birth of the first
universal computers
was stimulated by
this effort.
[Figure: the Bombe (replica)]
Key Concepts in Cryptography
● First formalized by Claude Shannon in his
1949 paper “Communication theory of
secrecy systems”.
● Cryptosystem: a system that takes a message (known as plaintext) as input and transforms it into a ciphertext with a reversible function that usually takes a key as a further input.
● The use of “text” is historical, and today we
mean “string of bits”.
Kerckhoffs' Principle

● The security of a cryptosystem relies only on the secrecy of the key, and never on the secrecy of the algorithm.
○ Auguste Kerckhoffs, “La cryptographie militaire”, 1883
● This means that:
○ In a secure cryptosystem we cannot retrieve the plaintext from the ciphertext without the key.
○ Also, we cannot retrieve the key by analyzing ciphertext-plaintext pairs.
○ Algorithms must always be assumed known to the attacker, no secret sauce!
Moving into the modern age

An end to the battle of wits


• Shannon (1949) [4] - Proves that a mathematically unbreakable cipher exists
• Nash (1955) [2] - Argues that computationally secure ciphers are ok
• Considers a cipher with a finite, $\lambda$-bit-long key
• Conjecture: if “parts of the key interact complexly [...] in the determination of their effects on the ciphertext”, the attacker's effort to break the cipher would be $O(2^{\lambda})$
• The owner of the key takes $O(\lambda^2)$ to compute the cipher
• The computational gap is insurmountable for large $\lambda$


Outline of the topics

In this course
• Definitions of ciphers as components with functionalities
• How to obtain confidentiality, integrity, data/origin authentication
• An overview of protocols (combinations of ciphers)
• Goal: be able to use cryptographic components properly

In Cryptography and Architectures for Computer Security


• Design and cryptanalysis techniques for ciphers
• Cryptographic protocols and their inner workings
• Efficient, side channel attack resistant implementations
• Goal: be able to engineer cryptographic components



Before we start...

https://xkcd.com/221/

A word on randomness
• Randomness (in this course) characterizes a generative process
• Stating: “00101 is a random string” actually makes little sense



Definitions

Data
• Plaintext space $P$: set of possible messages $ptx \in P$
• Old times: words in some human-readable alphabet; modern times: $\{0,1\}^{l}$
• Ciphertext space $C$: set of possible ciphertexts $ctx \in C$
• Usually $\{0,1\}^{l'}$, not necessarily $l = l'$ (ciphertexts may be larger)
• Key space $K$: set of possible keys
• $\{0,1\}^{\lambda}$; keys with special formats are derived from bitstrings


Definitions

Functions
• Encryption function $E : P \times K \to C$
• Decryption function $D : C \times K \to P$
• Correctness: for all $ptx \in P$, we need $k, k' \in K$ s.t. $D(E(ptx, k), k') = ptx$

[Figure: plaintext, encryption (with key), ciphertext, decryption (with key), decrypted plaintext]


Providing confidentiality

Goal
• Prevent anyone not authorized from being able to understand data

Possible attacker models


• The attacker simply eavesdrops (ciphertext-only attack)
• The attacker knows a set of possible plaintexts (known-plaintext attack)
• Limit case: the attacker chooses the set of plaintexts (chosen-plaintext attack)
• The attacker may tamper with the data and observe the reactions of a decryption-capable entity
• Limit case: the attacker sees the actual decrypted value


Symmetric Encryption
Confidentiality
Symmetric Encryption

[Figure: Alice encrypts the plaintext "Hello Bob" with the symmetric encryption function E under the secret key; the ciphertext travels over the untrusted channel; Bob recovers the plaintext with the symmetric decryption function D under the same secret key, which the two parties shared over a trusted channel.]

Symmetric Encryption
● The basic idea of encryption
○ Use key K to encrypt plaintext into ciphertext
○ Use the same key K to decrypt ciphertext into plaintext
● Synonyms: shared key encryption, secret key encryption
● Issue: how do we agree on the key?
○ Cannot send the key on the same channel as the message!
○ Out-of-band transmission mechanism needed
● Issue: scalability
● A symmetric algorithm is a cocktail...
First ingredient: substitution
Substitution: “replacing each byte with another”
Toy example (Caesar cipher)
○ replace each letter in a sentence with the one
following it by K positions in the alphabet
○ Example: “SECURE” becomes “VHFXUH” with K = 3

Many issues (it’s a toy example!):
○ if the cipher is known, with 25 attempts at most (13 on average) we have the key: the keyspace is too small.
○ repetitions and structure remain "visible" in the ciphertext: monoalphabetic ciphers are weak (frequency analysis, a ciphertext-only attack) https://en.wikipedia.org/wiki/Letter_frequency
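
The Caesar example and its tiny keyspace can be sketched in a few lines of Python. This is only an illustration: the function name `caesar` and the restriction to uppercase A-Z are assumptions made here, not part of the slides.

```python
import string

ALPHABET = string.ascii_uppercase  # assumption: messages use only A-Z


def caesar(text: str, k: int) -> str:
    """Shift every letter of `text` by k positions, wrapping around mod 26."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + k) % 26] if ch in ALPHABET else ch
        for ch in text
    )


ciphertext = caesar("SECURE", 3)
print(ciphertext)  # VHFXUH

# Brute force: there are only 25 non-trivial shifts to undo.
candidates = {k: caesar(ciphertext, -k) for k in range(1, 26)}
print(candidates[3])  # SECURE
```

An attacker who knows the algorithm only has to read the 25 candidates and pick the one that makes sense.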
Second ingredient: transposition
Transposition (or diffusion) means “swapping the values of given bits”

Toy example (matrix):
○ Write the plaintext by rows, read the ciphertext by columns
○ Key: K = (R, C) with R * C ~ len(msg)

[Figure: the plaintext "HALLO EVERYONE!" written by rows into a 3 x 5 matrix]

Many issues (it’s a toy example!)
Example - Diffusion
H A L L O
␣ E V E R
Y O N E !
(␣ denotes the space character)

m = HALLO EVERYONE!
k = (3,5)
c = H YAEOLVNLEEOR!
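
A minimal sketch of the matrix transposition, assuming the message is padded with spaces when it is shorter than R * C. The function names are hypothetical.

```python
def transpose_encrypt(msg: str, rows: int, cols: int, pad: str = " ") -> str:
    """Write msg by rows into a rows x cols grid, then read it by columns."""
    if len(msg) > rows * cols:
        raise ValueError("message does not fit in the grid")
    padded = msg.ljust(rows * cols, pad)
    grid = [padded[r * cols:(r + 1) * cols] for r in range(rows)]
    return "".join(grid[r][c] for c in range(cols) for r in range(rows))


def transpose_decrypt(ctx: str, rows: int, cols: int) -> str:
    """Reading a rows x cols grid by columns is undone by the same
    procedure applied to the transposed (cols x rows) grid."""
    return transpose_encrypt(ctx, cols, rows)


c = transpose_encrypt("HALLO EVERYONE!", 3, 5)
print(c)                           # H YAEOLVNLEEOR!
print(transpose_decrypt(c, 3, 5))  # HALLO EVERYONE!
```

The letters of the plaintext are all still there, only moved, which is why letter frequencies alone no longer reveal the structure.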
Example - Diffusion

H A L L O

m = HALLO
k = (3,5)
c = H A L L O (the letters stay in order, separated only by padding)
R * C >> len(msg)
15 >> 5
Second ingredient: transposition
Transposition (or diffusion) means “swapping the values of given bits”

Toy example (matrix):
○ Write the plaintext by rows, read the ciphertext by columns
○ Key: K = (R, C) with R * C ~ len(msg)
Many issues (it’s a toy example!):
○ Keyspace still relatively small
○ But repetitions and structure are gone
○ We now really need to test all possible structures
Perfectly secure cipher

Definition
• In a perfect cipher, for all $ptx \in P$ and $ctx \in C$,
$\Pr(ptx_{sent} = ptx) = \Pr(ptx_{sent} = ptx \mid ctx_{sent} = ctx)$
• In other words: seeing a ciphertext $c \in C$ gives us no information on what the plaintext corresponding to $c$ could be

Question
• The definition is not constructive! Does a perfect cipher exist?
• If yes, what does it look like?


Perfectly secure cipher

Theorem (Shannon 1949)


Any symmetric cipher $\langle P, K, C, E, D \rangle$ with $|P| = |K| = |C|$ is perfectly secure if and only if
• every key is used with probability $\frac{1}{|K|}$
• a unique key maps a given plaintext into a given ciphertext:
$\forall (ptx, ctx) \in P \times C,\ \exists!\, k \in K$ s.t. $E(ptx, k) = ctx$

A simple working example

• Assume $P$, $K$, $C$ to be sets of binary strings. The encryption function draws a uniformly random, fresh key $k$ out of $K$ each time it is called and computes $ctx = ptx \oplus k$
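
The definition of perfect secrecy can be checked exhaustively on a tiny instance of the xor cipher. The sketch below assumes 2-bit messages and an arbitrary, deliberately non-uniform plaintext distribution (`prior`), both chosen here only for illustration.

```python
from fractions import Fraction
from itertools import product

BITS = 2
space = range(2 ** BITS)  # plaintexts, keys and ciphertexts: 2-bit strings

# Hypothetical plaintext distribution, deliberately non-uniform.
prior = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
key_prob = Fraction(1, len(space))  # every key is used with probability 1/|K|

# joint[(ptx, ctx)] = Pr(ptx is sent and ctx is observed)
joint = {}
for ptx, k in product(space, space):
    ctx = ptx ^ k
    joint[(ptx, ctx)] = joint.get((ptx, ctx), 0) + prior[ptx] * key_prob

for ctx in space:
    pr_ctx = sum(joint[(p, ctx)] for p in space)
    for ptx in space:
        posterior = joint[(ptx, ctx)] / pr_ctx  # Pr(ptx sent | ctx observed)
        assert posterior == prior[ptx]

print("posterior equals prior for every (ptx, ctx) pair")
```

Exact fractions are used instead of floats so that the equality in the definition is tested without rounding.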



Anticausal implementations

A concrete apparatus
• Gilbert Vernam actually patented a telegraphic machine implementing $ptx \oplus k$ on Baudot code in 1919
• Joseph Mauborgne suggested the use of a random tape containing $k$
• Using Vernam’s encrypting machine with Mauborgne’s suggestion implements a perfect cipher


The One Time Pad: Perfect Cipher
● XOR of a message m and a random key k of the same size as m: len(k) = len(m)
○ The key is pre-shared and consumed while writing. It can never be re-used!
● The OTP is a minimal perfect cipher
○ Minimal because |K| = |M|
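
A short sketch of the one time pad and of what goes wrong when the pad is reused. The messages are made up for the example, and `secrets.token_bytes` stands in for a truly random pre-shared pad.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise xor of two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("the pad must be exactly as long as the message")
    return bytes(x ^ y for x, y in zip(a, b))


m1 = b"ATTACK AT DAWN"
m2 = b"RETREAT AT TEN"
key = secrets.token_bytes(len(m1))  # stand-in for the pre-shared random pad

c1 = xor_bytes(m1, key)
assert xor_bytes(c1, key) == m1  # decryption is the same xor

# Mistake: the same pad is used for a second message.
c2 = xor_bytes(m2, key)
# The key cancels out: the eavesdropper learns m1 xor m2 without knowing k.
assert xor_bytes(c1, c2) == xor_bytes(m1, m2)
```

The last assertion is the reason the pad has to be consumed: two ciphertexts under the same pad leak the xor of the two plaintexts.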
Perfectly secure ≠ perfectly usable

Key storage/management
• storing key material and changing keys
is a nightmare
• perfect cipher broken in practice due
to key theft/reuse
• generating random keys was also an
issue (and caused breaks)


Imperfections and Brute Force
● Real-world algorithms are not perfect (|K|<|M|),
and so can be broken
○ each ciphertext-plaintext pair leaks a small amount of
information (because the key is re-used)
● Only thing unknown is the key (Kerckhoffs)
○ Remember: the algorithm itself is known!
● Brute forcing is possible for any real world cipher
○ Try all possible keys, until one produces an output that
“makes sense”.
● Perfect ciphers (one time pads) are not
vulnerable to Brute Force
○ because trying all the (random) keys will yield all the
(possible) plaintexts, which are all equally likely (= no clue)
Non Perfect Cipher Example

M = UGO (21 7 15)
K = +3
C = XJR (24 10 18)
Bruteforcing: the attacker tries k = -1, -2, -3, ..., -25 and, for each k, shifts all letters...
K = -1 ... M1 = WIQ
K = -2 ... M2 = VHP
K = -3 ... M3 = UGO
“Toy” Perfect Cipher Example
M = UGO (21 7 15)
K = +1 +2 +3
C = VIR (22 9 18)
Bruteforcing... For each letter try all shifts: k = (-1,-1,-1), (-1,-1,-2), ..., (-26,-26,-26)
M1 = ADA
M2 = XKR
M3 = ELM
M4 = UGO
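
The point of the toy example can be made concrete: with one independent shift per letter, every three-letter plaintext is consistent with the ciphertext VIR under some key. The helper names below are hypothetical.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def shift_word(word: str, shifts: tuple) -> str:
    """Shift each letter by its own amount (one key element per letter)."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + s) % 26] for ch, s in zip(word, shifts)
    )


def key_between(ptx: str, ctx: str) -> tuple:
    """The unique per-letter key mapping ptx into ctx."""
    return tuple(
        (ALPHABET.index(c) - ALPHABET.index(p)) % 26 for p, c in zip(ptx, ctx)
    )


ctx = shift_word("UGO", (1, 2, 3))
print(ctx)  # VIR

# Every guess is "explained" by some key, so brute force gives no clue.
for guess in ("ADA", "XKR", "ELM", "UGO"):
    k = key_between(guess, ctx)
    assert shift_word(guess, k) == ctx
    print(guess, k)
```

Brute forcing the key space therefore just enumerates all possible plaintexts, with nothing to single out the right one.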
Cryptanalysis: Breaking Ciphers
A real (non perfect) cryptosystem is broken if there is a way to break it that is faster than brute forcing.
Types of attacks:
● Ciphertext-only attack: analyst has only ciphertexts
● Known plaintext attack: analyst has a set of plaintext-ciphertext pairs
● Chosen plaintext attack: analyst can choose plaintexts and obtain their respective ciphertexts
Example: can you break this?
- I give you a ZIP-compressed file encrypted with a (secret) 4-byte key
- I tell you how I encrypted it: the algorithm should not be secret (by Kerckhoffs):
- C = K xor M
- Example:
- K(hex) = AA BB CC DD (repeat the key)
- M(hex) = 50 4B 03 04 BA DA 55 55 .. .. .. (and so on)
XOR
- C(hex) = FA F0 CF D9 10 61 99 88 .. .. .. .. .. .. ..

- I give you a ZIP file encrypted with a key: can you recover the key w/o bruteforcing?
The Zip Example
Algorithm: C = K xor M
- K(hex) = AA BB CC DD .. .. .. .. .. .. .. (repeat the key)
- M(hex) = 50 4B 03 04 BA DA 55 55 .. .. .. (and so on)
XOR
- C(hex) = FA F0 CF D9 10 61 99 88 .. .. .. .. .. .. ..

PERFECT OR NOT?
NOT PERFECT -> len(k) < len(M) -> k is reused

Can you break this?
- C = K xor M
- K = M xor C
- K = X X X X.... xor FA F0 CF D9....
The Zip Example
Algorithm: C = K xor M
- K(hex) = AA BB CC DD .. .. .. .. .. .. .. (repeat the key)
- M(hex) = 50 4B 03 04 BA DA 55 55 .. .. .. (and so on)
XOR
- C(hex) = FA F0 CF D9 10 61 99 88 .. .. .. .. .. .. ..

- K = M xor C
- A ZIP file normally starts with the signature bytes 50 4B 03 04, so the first 4 bytes of M are known
- K = 50 4B 03 04 xor FA F0 CF D9 = AA BB CC DD
- ATTACK? KNOWN PLAINTEXT ATTACK
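
The known plaintext attack on the repeating-key xor can be reproduced with the bytes from the slide. `xor_repeating` is a hypothetical helper name; the 8-byte plaintext is just the fragment shown above, not a complete ZIP file.

```python
from itertools import cycle


def xor_repeating(data: bytes, key: bytes) -> bytes:
    """xor data with the key repeated as many times as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


ZIP_MAGIC = bytes.fromhex("504B0304")        # known first bytes of the plaintext

key = bytes.fromhex("AABBCCDD")              # the secret, 4 bytes long
plaintext = bytes.fromhex("504B0304BADA5555")
ciphertext = xor_repeating(plaintext, key)
print(ciphertext.hex(" ").upper())           # FA F0 CF D9 10 61 99 88

# Attack: K = M xor C on the 4 bytes whose plaintext is known.
recovered = xor_repeating(ciphertext[:4], ZIP_MAGIC)
print(recovered.hex(" ").upper())            # AA BB CC DD

# Because the key repeats, it now decrypts the whole file.
assert xor_repeating(ciphertext, recovered) == plaintext
```

No key is ever guessed: one xor over four known bytes is enough.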
Computationally secure cryptography

A more practical assumption


• Build a cipher so that a successful attack is also able to solve a hard
computational problem efficiently
• Solve a generic nonlinear Boolean simultaneous equation set
• Factor large integers / find discrete logarithms
• Decode a random code/find shortest lattice vector

Can we avoid assumptions?


• Open question: prove an exponential lower bound for the time taken to solve a hard problem whose solutions are efficiently verifiable
• it would turn a computational security assumption into a theorem
• ... and prove $P \neq NP$ as a corollary


Factor Large Integers (hints)

If p and q are two large primes:


● computing n = p * q is easy
● but given n it is painfully slow to get p and q
○ naive approach: trial division, basically “try all primes until you get to the smaller between p and q”
○ the best known classical algorithms (quadratic sieve, number field sieve) are faster than that, but still far from polynomial time

Here “slow/difficult” means “computationally very intensive”: for all practical purposes the problem requires a search over an infeasibly large set of candidate factors
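
The asymmetry can be felt even at toy sizes. The sketch below uses trial division, the naive method, on two small primes chosen here for illustration; real moduli are hundreds of digits long and are attacked with far better algorithms, which are still impractical at those sizes.

```python
def smallest_factor(n: int) -> int:
    """Trial division: return the smallest prime factor of n (n itself if prime)."""
    if n % 2 == 0:
        return 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return d
        d += 2
    return n


# Two small primes picked for the example (the 10,000th and 100,000th primes).
p, q = 104_729, 1_299_709
n = p * q                       # easy direction: a single multiplication

found = smallest_factor(n)      # hard direction: tens of thousands of divisions
assert found == p and n // found == q
```

The work grows with the size of the smaller factor, so each extra decimal digit in p multiplies the number of divisions by about ten.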
Discrete logarithm (hints)
● If $y = a^x$ then $x = \log_a y$ (Math 101)
● given x, a, p,
○ it is easy to compute $y = a^x \bmod p$,
○ but knowing y (along with a and p), it is difficult to compute x

Different problem than factorization, but it can be shown that they are related
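
A small sketch of the two directions, with toy parameters picked only for illustration (the prime `p`, the base `a` and the exponent `x` are assumptions). Python's built-in three-argument `pow` does fast modular exponentiation; the reverse direction is done here by exhaustive search.

```python
p = 104_729   # a small prime, hypothetical toy parameter
a = 12        # hypothetical base
x = 71_234    # the secret exponent

y = pow(a, x, p)   # easy: square-and-multiply, a few dozen multiplications


def brute_force_dlog(y: int, a: int, p: int) -> int:
    """Try exponents 0, 1, 2, ... until a^e mod p equals y."""
    acc = 1
    for e in range(p - 1):
        if acc == y:
            return e
        acc = acc * a % p
    raise ValueError("y is not a power of a modulo p")


x_found = brute_force_dlog(y, a, p)   # hard: up to p - 1 multiplications
assert pow(a, x_found, p) == y
```

The recovered exponent may differ from `x` when the base has a small multiplicative order, which is why the check is on the resulting power rather than on the exponent itself.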
Proving computational security

Outline of the method


1 Define the ideal attacker behaviour
2 Assume a given computational problem is hard
3 Prove that any non ideal attacker solves the hard problem

How to represent attacker and properties?

• Attacker represented as a program able to call given libraries
• Libraries implement the cipher at hand
• Define the security property as the answer to a given question
• The attacker wins the game if it breaks the security property more often than what is possible through a random guess
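
The "attacker as a program calling a library" view can be sketched as a small game. Everything below is an illustrative assumption: the library exposes a toy deterministic cipher (xor with a fixed key), and the question asked to the attacker is which of two chosen messages was encrypted.

```python
import secrets


def make_library():
    """Return the encryption routine of a toy deterministic cipher
    (xor with a fixed secret key). The attacker may call it freely."""
    key = secrets.token_bytes(16)

    def encrypt(ptx: bytes) -> bytes:
        return bytes(p ^ k for p, k in zip(ptx, key))

    return encrypt


def attacker(encrypt, m0: bytes, m1: bytes, challenge: bytes) -> int:
    """Guess which message was encrypted by re-encrypting m0 and comparing."""
    return 0 if encrypt(m0) == challenge else 1


def play(rounds: int = 1000) -> float:
    """Run the game and return the fraction of rounds won by the attacker."""
    wins = 0
    for _ in range(rounds):
        encrypt = make_library()
        m0, m1 = b"YES".ljust(16), b"NO".ljust(16)
        b = secrets.randbelow(2)             # the game's secret coin flip
        challenge = encrypt((m0, m1)[b])
        wins += attacker(encrypt, m0, m1, challenge) == b
    return wins / rounds


print(play())  # 1.0, far above the 0.5 of a random guess
```

A random guess wins half of the time; this attacker always wins, so the toy cipher fails the property being tested.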



Cryptographically Safe Pseudorandom Number Generators

Motivation and assumption


• We want to use a finite-length key and a Vernam cipher
• We somehow need to “expand” the key
• We assume that the attacker can only perform $poly(\lambda)$ computations

Definition
A CSPRNG is a deterministic function $prng : \{0,1\}^{\lambda} \to \{0,1\}^{\lambda + l}$ whose output cannot be distinguished from a uniform random sampling of $\{0,1\}^{\lambda + l}$ in $O(poly(\lambda))$. $l$ is the CSPRNG stretch.
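
A sketch of "expand the key, then use Vernam". SHAKE-256 from Python's `hashlib` is used here purely as a stand-in for a candidate CSPRNG; that choice, the 16-byte seed and the function names are assumptions of this example, not the construction discussed in the slides.

```python
import hashlib
import secrets


def prng(seed: bytes, out_len: int) -> bytes:
    """Deterministically stretch a short seed into out_len bytes.
    Stand-in: SHAKE-256 plays the role of a candidate CSPRNG."""
    return hashlib.shake_256(seed).digest(out_len)


def vernam(data: bytes, seed: bytes) -> bytes:
    """xor the data with the stream expanded from the seed
    (the same call encrypts and decrypts)."""
    stream = prng(seed, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


seed = secrets.token_bytes(16)   # the short key
msg = b"a message much longer than the 16-byte seed itself"

ctx = vernam(msg, seed)
assert vernam(ctx, seed) == msg
assert prng(seed, 64) == prng(seed, 64)   # deterministic in the seed
```

The key is now much shorter than the message, so the scheme is no longer perfect: its security rests on the stream being indistinguishable from random.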



CSPRNGs in practice

Existence
• In practice, we have only candidate CSPRNGs
• We have no proof that a function prng exists
• Proving that a CSPRNG exists directly implies $P \neq NP$

Practical constructions
• Building a CSPRNG “from scratch” is possible, but it is not the way they are commonly built (not efficient)
• Practically built with another building block: PseudoRandom Permutations (PRPs)
• defined starting from PseudoRandom Functions (PRFs)


Random Functions (RFs)

Randomly drawing a function


• Consider the set $F = \{f : \{0,1\}^{in} \to \{0,1\}^{out};\ in, out \in \mathbb{N}\}$
• A uniformly randomly sampled $f \xleftarrow{\$} F$ can be encoded by a $2^{in}$-entry table, each entry $out$ bits wide. $|F| = (2^{out})^{2^{in}}$

Toy example in = 2, out = 1

• $F = \{f : \{0,1\}^{2} \to \{0,1\}^{1}\}$ is the set of the 16 Boolean functions w/ two inputs
• Each one is represented by a 4-entry truth table
• Intuitively, the functions are 16 as there are $2^{4} = 16 = (2^{1})^{2^{2}}$ tables
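
The toy case can be enumerated directly. The sketch lists every truth table for in = 2, out = 1 and then samples one uniformly, which is all that "drawing a random function" means at this size; variable names are chosen here for the example.

```python
import secrets
from itertools import product

IN_BITS, OUT_BITS = 2, 1

inputs = list(product((0, 1), repeat=IN_BITS))        # the 2^in table rows
# One table = one choice of output value for each row.
tables = list(product(range(2 ** OUT_BITS), repeat=len(inputs)))

assert len(tables) == (2 ** OUT_BITS) ** (2 ** IN_BITS) == 16

# Familiar gates are just particular tables.
xor_table = tuple(a ^ b for a, b in inputs)
and_table = tuple(a & b for a, b in inputs)
assert xor_table in tables and and_table in tables

# Uniformly sampling f from F = picking one of the 16 tables at random.
random_f = dict(zip(inputs, secrets.choice(tables)))
print(random_f)
```

The count explodes quickly: already for in = out = 8 there are 256^256 tables, which is why real systems need a compact, seed-described substitute.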



Pseudorandom Functions (PRFs)

Definition
• A function $prf_{seed} : \{0,1\}^{in} \to \{0,1\}^{out}$ taking an input and a $\lambda$-bit seed.
• The entire $prf_{seed}$ is described by the value of the seed
• It cannot be told apart from a random $f \in \{f : \{0,1\}^{in} \to \{0,1\}^{out}\}$ in $poly(\lambda)$
• That is, if they give you $a \in \{f : \{0,1\}^{in} \to \{0,1\}^{out}\}$, you can’t tell which one of the following is true
• $a = prf_{seed}(\cdot)$ with $seed \xleftarrow{\$} \{0,1\}^{\lambda}$
• $a \xleftarrow{\$} F$, where $F = \{f : \{0,1\}^{in} \to \{0,1\}^{out}\}$
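
To make the "function described by its seed" idea tangible, the sketch below uses HMAC-SHA256 from Python's standard library as a stand-in for a candidate PRF. That choice is an assumption of this example; the slides do not name a concrete PRF here.

```python
import hashlib
import hmac
import secrets


def prf(seed: bytes, x: bytes) -> bytes:
    """Candidate PRF: the seed selects which function is applied to x.
    Stand-in: HMAC-SHA256, with a 256-bit output."""
    return hmac.new(seed, x, hashlib.sha256).digest()


seed_a = secrets.token_bytes(16)
seed_b = secrets.token_bytes(16)

# Same seed, same input: same output (it is a function).
assert prf(seed_a, b"input") == prf(seed_a, b"input")
# A different seed selects a different, unrelated-looking function.
assert prf(seed_a, b"input") != prf(seed_b, b"input")
# Outputs always have the same fixed width.
assert len(prf(seed_a, b"")) == len(prf(seed_a, b"x" * 1000)) == 32
```

Without the seed, an observer who only sees input-output pairs is expected to have no efficient way to tell them apart from the rows of a randomly drawn table.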



Pseudorandom Permutations (PRPs)

Pseudorandom Permutation definition


• A bijective PRF: $prf_{seed} : \{0,1\}^{len} \to \{0,1\}^{len}$

Wrapping your mind around it

• It is uniquely identified by the value of the seed
• It cannot be told apart from a RF in $poly(\lambda)$
• It’s a permutation of all the possible $\{0,1\}^{len}$ strings

Operatively speaking
• acts on a block of bits and outputs another one of the same size
• the output “looks unrelated” to the input
• its action is fully identified by the seed
• Useful to think of the seed as a key


Real world PRPs

The issue
• No formally proven PRP exists, yet
• again, its existence would imply $P \neq NP$

Typical construction
1 Compute a small bijective Boolean function f of input and key
2 Compute f again between the previous output and the key
3 Repeat 2 until you’re satisfied
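
The three steps can be sketched on an 8-bit toy block. Everything here is made up for illustration and is in no way secure: the S-box is a fixed public shuffle, the round function is "xor the key, substitute, rotate", and the same key is reused in every round.

```python
import random

_rng = random.Random(2025)        # fixed and public: the S-box is not a secret
SBOX = list(range(256))
_rng.shuffle(SBOX)
INV_SBOX = [0] * 256
for i, v in enumerate(SBOX):
    INV_SBOX[v] = i


def rotl8(x: int, r: int) -> int:
    return ((x << r) | (x >> (8 - r))) & 0xFF


def rotr8(x: int, r: int) -> int:
    return ((x >> r) | (x << (8 - r))) & 0xFF


def round_f(x: int, k: int) -> int:
    """Small bijective function of input and key."""
    return rotl8(SBOX[x ^ k], 3)


def round_f_inv(y: int, k: int) -> int:
    return INV_SBOX[rotr8(y, 3)] ^ k


def encrypt(x: int, key: int, rounds: int = 8) -> int:
    for _ in range(rounds):       # "repeat until you're satisfied"
        x = round_f(x, key)
    return x


def decrypt(y: int, key: int, rounds: int = 8) -> int:
    for _ in range(rounds):
        y = round_f_inv(y, key)
    return y


key = 0x5A
# For a fixed key the result is a permutation of all 256 blocks.
assert sorted(encrypt(x, key) for x in range(256)) == list(range(256))
assert all(decrypt(encrypt(x, key), key) == x for x in range(256))
```

Because every round is invertible, the whole iteration is invertible too, which is what makes decryption possible.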



In the real world...
Practical solution: public scrutiny
• Modern PRPs are the outcome of public contests
• Cryptanalytic techniques provide ways (= $poly(\lambda)$ tests) to detect biases in their outputs: good designs are immune

PRPs a.k.a. Block ciphers

• Concrete PRPs go by the historical name of block ciphers
• Considered broken if, with less than $2^{\lambda}$ operations, they can be told apart from a PRP, e.g., via:
• Deriving the input corresponding to an output without the key
• Deriving the key identifying the PRP, or reducing the number of plausible ones
• Identifying non-uniformities in their outputs
• The key length $\lambda$ is chosen to be large enough so that computing $2^{\lambda}$ guesses is not practically feasible


Keyspace and Brute Forcing
Keyspace generally measured in bits
● Attack time is exponential in the number of bits (e.g., 33 bits need twice the time of 32)
● Need to balance computational power vs key length.
[Figure: keyspace vs. time for brute forcing]
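
The exponential growth is easy to tabulate. The guessing rate below is a made-up figure used only to get orders of magnitude; it is an assumption of this sketch, not a measurement.

```python
GUESSES_PER_SECOND = 10 ** 12          # hypothetical attacker speed
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_exhaust(bits: int) -> float:
    """Time to try every key of the given length at the assumed rate."""
    return 2 ** bits / GUESSES_PER_SECOND / SECONDS_PER_YEAR


for bits in (32, 33, 56, 80, 128, 256):
    print(f"{bits:3d} bits: {years_to_exhaust(bits):.3e} years")

# One more bit doubles the work, whatever the attacker's speed.
assert years_to_exhaust(33) == 2 * years_to_exhaust(32)
```

Making the attacker a thousand times faster only buys back about 10 bits of key length.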
Quantifying computational unfeasibility

Boolean operations vs. energy needed to heat water from 20 °C to 100 °C

• $2^{65}$ ops ≈ an Olympic swimming pool
• $2^{80}$ ops ≈ the annual rainfall on the Netherlands
• $2^{114}$ ops ≈ all water on Earth

Practically acceptable unfeasibility

• Legacy level security: at least $2^{80}$ Boolean operations
• 5 to 10 years security: at least $2^{128}$ Boolean operations
• Long term security: at least $2^{256}$ Boolean operations


Widespread block ciphers

Advanced Encryption Standard (AES)


• 128 bit block, three key lengths: 128, 192 and 256 bits
• Selected by NIST on 2000-10-02, after a 3-year public contest, out of 15 candidates; re-standardized by ISO/IEC
• ARMv8 and AMD64 include dedicated instructions accelerating its computation (hitting 3+ GB/s)

Data Encryption Algorithm (DEA, a.k.a. DES)

• Legacy standard by NIST (then NBS), 1977; the key is too short (56b)
• Patch via triple encryption, $\lambda = 112$ equivalent security
• Still found in some legacy systems, officially deprecated


Case Study: DES
Originally designed by IBM (1973-1974)

Its core function relies on S-boxes (substitutions applied to the data after mixing it with the key)

It uses a 56 bit key ($2^{56}$ keyspace)


Case Study: DES vs. NSA
In 1976 it becomes a US standard; its S-boxes are "redesigned" by the NSA

● Late 1980s: differential cryptanalysis discovered
● 1993: shown that the original S-boxes would have made DES vulnerable to differential cryptanalysis, whereas the NSA-designed S-boxes were specifically hardened against it.
● Wait! Wasn't differential cryptanalysis unknown until the late 1980s? Mmmmmmaybe the NSA knew about differential cryptanalysis in the 70s.
An Electronic CodeBook (ECB)

A first attempt to encryption with PRPs


• It’s ok to encrypt a plaintext ≤ block size with a block cipher
• An extension to multiple blocks could be split-and-encrypt
• Is it good (equivalent to Vernam fed with a CSPRNG)?

[Figure: ECB mode. Each plaintext block ptx_0, ptx_1, ..., ptx_n is encrypted independently by Enc under the same key k, giving ctx_0, ctx_1, ..., ctx_n]
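
Split-and-encrypt has a well-known weakness: equal plaintext blocks give equal ciphertext blocks. The sketch below shows it with a toy block cipher, a 4-round Feistel network built on HMAC-SHA256, which is only a stand-in for a real PRP such as AES (Python's standard library does not ship one). All names are hypothetical.

```python
import hashlib
import hmac

BLOCK = 16  # bytes


def toy_enc(key: bytes, block: bytes) -> bytes:
    """Toy 16-byte block cipher: 4-round Feistel network with HMAC-SHA256
    as round function. A stand-in for a real PRP, not a secure design."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right


def ecb_encrypt(key: bytes, ptx: bytes) -> bytes:
    """Split-and-encrypt: every block is enciphered independently."""
    if len(ptx) % BLOCK:
        raise ValueError("plaintext length must be a multiple of the block size")
    return b"".join(
        toy_enc(key, ptx[i:i + BLOCK]) for i in range(0, len(ptx), BLOCK)
    )


key = b"k" * 16
ptx = b"ATTACK AT DAWN!!" * 2 + b"RETREAT AT NOON!"
ctx = ecb_encrypt(key, ptx)
blocks = [ctx[i:i + BLOCK] for i in range(0, len(ctx), BLOCK)]

# The repetition in the plaintext is visible in the ciphertext.
assert blocks[0] == blocks[1]
assert blocks[0] != blocks[2]
```

An eavesdropper learns which blocks repeat without touching the key, something a Vernam cipher fed by a CSPRNG would not reveal.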



Counter (CTR) mode
Getting it right
• The boxed construction is provably a PRNG if Enc is a PRP
• There is nothing special in the starting point of the counters

[Figure: CTR mode. The counter values 000...0, 000...1, ..., 000...n are encrypted by Enc under the key k (the boxed construction); each output is combined with ptx_0, ptx_1, ..., ptx_n to give ctx_0, ctx_1, ..., ctx_n]


Raising the requirements

Confidentiality achieved ... for CoA


• Up to now, the attacker knew only ciphertext material

Confidentiality against Chosen Plaintext Attacks (CPAs)


• Our attacker knows a set of plaintexts which can be encrypted
• He wants to understand which one is being encrypted
• Ideal attacker: cannot tell which plaintext was encrypted out of two he chose (having the same length)
• Feels strange, but it happens with:
• management data packets in network protocols (e.g., ICMP)
• telling apart encrypted commands sent to a remote host


Achieving CPA Security

No deterministic encryption
• The CTR mode of operation, as described so far, is insecure against CPA
• The encryption is deterministic: same ptx → same ctx

Decryptable nondeterministic encryption

1 Rekeying: change the key for each block with a ratchet
2 Randomize the encryption: add (removable) randomness to the encryption (change mode of employing PRP)
3 Numbers used ONCE (NONCEs): in the CTR case, pick a NONCE as the counter starting point. The NONCE is public


Symmetric ratcheting

Getting it right
• The construction takes its name from the mechanical component: it is not possible to roll back the procedure once you delete the value carried by the green arrows
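
Since the diagram is not reproduced here, the following is only one plausible shape of a symmetric ratchet, under stated assumptions: each step derives a per-message key and the next chain key from the current chain key with a one-way function (HMAC-SHA256 with two fixed labels, both chosen for this sketch), and the old chain key is then discarded.

```python
import hashlib
import hmac


def ratchet_step(chain_key: bytes):
    """Derive (next_chain_key, message_key) from the current chain key.
    Both derivations are one-way, so neither output reveals chain_key."""
    message_key = hmac.new(chain_key, b"message", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"chain", hashlib.sha256).digest()
    return next_chain_key, message_key


chain_key = b"\x00" * 32          # hypothetical initial shared secret
message_keys = []
for _ in range(3):
    chain_key, mk = ratchet_step(chain_key)   # the old chain key is overwritten
    message_keys.append(mk)

# Every block or message gets a different key.
assert len(set(message_keys)) == 3
```

Once an old chain key has been deleted, someone who captures the current state can derive future keys but cannot step the ratchet backwards to recover earlier ones.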



CPA-Secure Counter (CTR) mode
Getting it right
• Picking the counter start as a NONCE generates different bitstreams to be xor-ed with the ptx each time
• The same plaintext encrypted twice is turned into two different, random-looking ciphertexts

[Figure: CPA-secure CTR mode. The values nonce + 0, nonce + 1, ..., nonce + n are encrypted by Enc under the key k; each output is xor-ed with ptx_0, ptx_1, ..., ptx_n to give ctx_0, ctx_1, ..., ctx_n]
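
A sketch of the nonce-based counter mode. Assumptions made here: the keystream block for each counter value comes from truncated HMAC-SHA256 (a stand-in for Enc under key k, since CTR only uses the forward direction), and the nonce is a random 128-bit integer sent in the clear next to the ciphertext.

```python
import hashlib
import hmac
import secrets

BLOCK = 16  # bytes


def keystream_block(key: bytes, counter: int) -> bytes:
    """Stand-in for Enc_k(counter): truncated HMAC-SHA256 of the counter."""
    return hmac.new(key, counter.to_bytes(BLOCK, "big"), hashlib.sha256).digest()[:BLOCK]


def ctr_xor(key: bytes, nonce: int, data: bytes) -> bytes:
    """xor data with the stream Enc(nonce + 0), Enc(nonce + 1), ..."""
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        ks = keystream_block(key, (nonce + i // BLOCK) % 2 ** 128)
        out += bytes(d ^ s for d, s in zip(data[i:i + BLOCK], ks))
    return bytes(out)


def encrypt(key: bytes, ptx: bytes):
    nonce = secrets.randbits(128)       # fresh for every message, public
    return nonce, ctr_xor(key, nonce, ptx)


def decrypt(key: bytes, nonce: int, ctx: bytes) -> bytes:
    return ctr_xor(key, nonce, ctx)


key = secrets.token_bytes(16)
msg = b"the same plaintext, encrypted twice"
n1, c1 = encrypt(key, msg)
n2, c2 = encrypt(key, msg)

assert decrypt(key, n1, c1) == msg and decrypt(key, n2, c2) == msg
assert c1 != c2   # different nonces give unrelated-looking ciphertexts
```

The receiver needs the nonce to rebuild the same stream, which is why it travels with the ciphertext and does not need to be secret.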


A good point to remember...
Malleability and active attackers

Malleability
• Making changes to the ciphertext (not knowing the key) maps to predictable
changes in the plaintext
• Think about AES-CTR and AES-ECB
• Can be creatively abused to build decryption attacks
• Can be turned into a feature (homomorphic encryption)

How to avoid malleability


• Design an intrinsically non malleable scheme (non trivial)
• Add a mechanism ensuring data integrity (against attackers)
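
Malleability of a stream-style encryption (such as CTR) can be shown without any real cipher: a random byte string stands in for the keystream, and the message format is invented for the example. The attacker knows where the amount sits in the message but never sees the keystream.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


ptx = b"PAY 0100 EUR TO BOB"
keystream = secrets.token_bytes(len(ptx))   # stand-in for the CTR keystream
ctx = xor_bytes(ptx, keystream)

# Attacker: flips exactly the bits that turn "0100" into "9100".
delta = xor_bytes(b"0100", b"9100")
tampered = bytearray(ctx)
for i, d in enumerate(delta):
    tampered[4 + i] ^= d                    # the amount starts at offset 4

# The receiver decrypts a different, perfectly well-formed message.
assert xor_bytes(bytes(tampered), keystream) == b"PAY 9100 EUR TO BOB"
```

Nothing in the decryption step signals that the ciphertext was modified, which is the gap an integrity mechanism has to close.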



Providing data integrity

Confidentiality $\not\Rightarrow$ Integrity
• Up to now our encryption schemes provide confidentiality
• Changes in the ciphertext are undetected (at best)

Message Authentication Codes (MAC)


• Add a small piece of information (tag) allowing us to test the integrity of the encrypted message itself
• Adding it to the plaintext and then encrypting is not good
• Nomenclature misleads: MACs do not provide data authentication


MACs

Definition
• A MAC is constituted by a pair of functions:
• compute_tag(string, key): returns the tag for the input string
• verify_tag(string, tag, key): returns true or false
• Ideal attacker model:
• knows as many message-tag pairs as he wants
• cannot forge a valid tag for a message for which he does not know it already
• forgery also includes tag splicing from valid messages
• N.B. the tag creating entity and the verifying entity must both know the same secret key
• The tag verifier is able to create a valid tag too
• ... and there goes the non-repudiation property
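
The two functions of the definition can be sketched with Python's standard `hmac` module. Using HMAC-SHA256 is an assumption of this example (the slides build a MAC from a block cipher instead); the interface is the same.

```python
import hashlib
import hmac
import secrets


def compute_tag(string: bytes, key: bytes) -> bytes:
    """Return the tag for the input string (HMAC-SHA256 as a stand-in MAC)."""
    return hmac.new(key, string, hashlib.sha256).digest()


def verify_tag(string: bytes, tag: bytes, key: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(compute_tag(string, key), tag)


key = secrets.token_bytes(32)
msg = b"user=alice;role=user"
tag = compute_tag(msg, key)

assert verify_tag(msg, tag, key)
assert not verify_tag(b"user=alice;role=admin", tag, key)   # tampered message
assert not verify_tag(msg, tag, secrets.token_bytes(32))    # wrong key
```

Verification simply recomputes the tag, so whoever can verify can also create valid tags: this is the reason a MAC cannot give non-repudiation.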



How to build a MAC? the CBC-MAC

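[Figure: CBC-MAC. The blocks ptx_0, ptx_1, ..., ptx_n are processed by a chain of Enc calls under the same key k; the output of the last Enc is the tag]

Building a MAC with a PRP (block cipher)

• The CBC-MAC is secure for prefix free messages (why?)
• Encrypting the tag once more fixes (provably) the issue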
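
One way to see why prefix-freeness matters is the classic extension forgery on plain CBC-MAC. In this sketch the block cipher is replaced by truncated HMAC-SHA256 (CBC-MAC only uses the forward direction), and the chaining, xor with the previous output followed by Enc, is the standard CBC-MAC definition assumed here.

```python
import hashlib
import hmac

BLOCK = 16  # bytes


def enc(key: bytes, block: bytes) -> bytes:
    """Stand-in for the block cipher Enc_k (forward direction only)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_mac(key: bytes, msg: bytes) -> bytes:
    """Plain CBC-MAC: chain the blocks, the tag is the last Enc output."""
    if not msg or len(msg) % BLOCK:
        raise ValueError("message must be a non-empty multiple of the block size")
    state = bytes(BLOCK)
    for i in range(0, len(msg), BLOCK):
        state = enc(key, xor(state, msg[i:i + BLOCK]))
    return state


key = b"0123456789abcdef"
m = b"sixteen byte msg"          # a one-block message with a known tag
t = cbc_mac(key, m)

# Forgery without the key: m is a prefix of the forged message.
forged = m + xor(m, t)
assert cbc_mac(key, forged) == t
```

After the first block the state equals t, so the second block xors back to m and the chain ends on t again: the attacker obtains a valid tag for a message never authenticated by the key owner.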



Practical MAC uses

Browser cookies
• HTTP cookies are a “note to self” for the HTTP server [a]
• The note should not be tampered with between server reads
• Solution: server runs compute_tag(cookie, k) and stores both as the pair (cookie, tag)
[a] You can find a two-slide cookies refresher at slides 40-41 of https://polimi365-my.sharepoint.com/:b:/r/personal/10032133_polimi_it/Documents/FCI/2-Livello_Applicativo_v2020.pdf

Later in the course


• Mitigating SYN-based denial of service attacks (SYN Cookies)
• Time-based two-factor authentication mechanisms (TOTP/HOTP)



Compressing for the sake of efficiency

Testing integrity
• Testing the integrity of a file requires us to compare it bit by bit with an intact
copy or read it entirely to compute a MAC
• It would be fantastic to test only short, fixed length strings independently from
the file size, representing the file itself
• Major roadblock: there is a lower bound to the number of bits to encode a given
content without information loss
• Can we build something close to the ideal scenario?



What is a Hash Function
A function H( ) that maps arbitrary-length input x on fixed-length output, h (digest)
● Needs to be fast

[Figure: the sender computes h = H("Hello Bob") and sends h over a trusted channel, while the plaintext travels over an untrusted channel; the receiver computes h' = H(received plaintext) and checks whether h = h': yes means OK, no means KO]
● Collisions: codomain “smaller” than domain.

Computationally infeasible to find:


○ input x such that H(x) = h (given a specific hash)
■ preimage attack resistance
○ input y s.t. y ≠ x and H(y) = H(x), with a given x
■ second preimage attack resistance
○ pairs of inputs {x, y} s.t. H(x) = H(y)
■ collision resistance
Attacks to Hash Functions (1)
Hash functions may be broken.
1. Arbitrary collision or (1st or 2nd) preimage attack:
Given a specific hash h, the attacker can find
x such that H(x) = h
or, equivalently, given a specific input x can find
y such that y ≠ x and H(y) = H(x)
faster than brute forcing.

With an n-bit hash function, a random match
is expected after about $2^{n-1}$ attempts.
Attacks to Hash Functions (2)
2. Simplified collision attack:
The attacker can generate colliding pairs
{x, y} s.t. H(x) = H(y)
faster than brute forcing.

Random collisions are expected after about $2^{n/2}$ attempts
because of the birthday paradox:
- given n randomly chosen people, some pairs will have same birthday
- probability = 100% if n = 367
- probability = 99.9% if n = 70 people
- probability = 50% if n = 23 people, and so on...
- vs. very low chances that some of you are born on a specific date
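The percentages above can be checked with a few lines. This is a sketch under the usual simplifying assumption of 365 equally likely birthdays (so certainty is reached at 366 people rather than the 367 needed when 29 February is counted); the function name is made up for the example.

```python
def birthday_collision_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for i in range(people):
        # The i-th person must avoid the i birthdays already taken.
        p_all_distinct *= max(days - i, 0) / days
    return 1.0 - p_all_distinct


for n in (23, 70, 366):
    print(n, round(birthday_collision_probability(n), 4))
```

The same reasoning applied to a hash with 2^n possible digests is what gives the roughly 2^(n/2) attempts for a collision.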
Cryptographic hashes

A pseudo-unique labeling function


• A cryptographic hash is a function $H : \{0,1\}^* \to \{0,1\}^l$ for which the following
problems are computationally hard
1 given $d = H(s)$ find $s$ (1st preimage)
2 given $s$, $d = H(s)$ find $r \neq s$ with $H(r) = d$ (2nd preimage)
3 find $r, s$; $r \neq s$, with $H(s) = H(r)$ (collision)
• Ideal behaviour of a concrete cryptographic hash (here $d$ denotes the digest length in bits):
1 finding 1st preimage takes $O(2^d)$ hash computations guessing $s$
2 finding 2nd preimage takes $O(2^d)$ hash comp.s guessing $r$
3 finding a collision takes $\approx O(2^{d/2})$ hash computations
• The output bitstring of a hash is known as a digest



Concrete hash functions

What to use
• SHA-2 was privately designed (NSA), $d \in \{256, 384, 512\}$
• SHA-3 followed a public design contest (similar to AES), selected among ≈ 60
candidates, $d \in \{256, 384, 512\}$
• Both currently unbroken and widely standardized (NIST, ISO)

What not to use


• SHA-1: $d = 160$, collision-broken [6] (obtainable in $\approx 2^{61}$ op.s)
• MD-5: horribly broken [7]. Collisions in $2^{11}$, public tools online [5]
• In particular, collisions with arbitrary input prefixes in $\approx 2^{40}$
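A short sketch of the recommended functions as exposed by Python's standard `hashlib` module: the digest length is fixed regardless of the input size, and a one-character change in the input yields an unrelated digest. The sample messages are made up.

```python
import hashlib

msg = b"Hello Bob"

# Digest length depends only on the function, not on the input.
for name in ("sha256", "sha3_256", "sha512"):
    digest = hashlib.new(name, msg).digest()
    print(name, len(digest) * 8, "bits")

big = b"x" * 1_000_000
short_digest = hashlib.sha256(msg).digest()
big_digest = hashlib.sha256(big).digest()

# A tiny change in the input gives a completely different digest.
d1 = hashlib.sha256(b"Hello Bob").hexdigest()
d2 = hashlib.sha256(b"Hello bob").hexdigest()
print(d1)
print(d2)
```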



Uses for hash functions

Pseudonymized match
• Store/compare hashes instead of values (e.g., Signal contact discovery)

MACs
• Building MACs: generate tag hashing together the message and a secret string,
verify tag recomputing the same hash
• A field-proven way of combining message and secret is HMAC
• Standardized (RFC 2104, NIST FIPS 198)
• Uses a generic hash function as a plug-in, combination denoted as HMAC-<hash name>
• HMAC-SHA1 (!), HMAC-SHA2 and HMAC-SHA3 are ok
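A minimal sketch of tagging and checking a value with HMAC-SHA256 through Python's standard `hmac` module, using a browser cookie as the protected value. The function names, the key handling and the cookie contents are assumptions made for illustration only.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; it never leaves the server.
SERVER_KEY = secrets.token_bytes(32)


def issue_cookie(value: bytes, key: bytes = SERVER_KEY) -> tuple:
    """Return the pair (cookie, tag) the server hands to the browser."""
    tag = hmac.new(key, value, hashlib.sha256).digest()
    return value, tag


def verify_cookie(value: bytes, tag: bytes, key: bytes = SERVER_KEY) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, value, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


cookie, tag = issue_cookie(b"user=alice;role=user")
print(verify_cookie(cookie, tag))                     # untouched cookie
print(verify_cookie(b"user=alice;role=admin", tag))   # tampered cookie
```

Since the verifier holds the same key as the tag creator, it could have produced the tag itself: the tag authenticates the data but gives no non-repudiation.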

Forensic use
• Write down only the hash of the disk image you obtained in official documents



Game changing ideas

Features we would like to have


• Agreeing on a short secret over a public channel
• Confidentially sending a message over a public authenticated channel without
sharing a secret with the recipient
• Actual data authentication

Solution: asymmetric cryptosystems


• Before 1976: rely on human carriers / physical signatures
• DH key agreement (1976) / Public key encryption (1977)
• Digital signatures (1977)



Asymmetric Cryptosystems
Confidentiality (plus something more)
The Diffie-Hellman key agreement

Goal
• Make two parties share secret value w/ only public messages

Attacker model
• Can eavesdrop anything, but not tamper
• The Computational Diffie-Hellman assumption should hold

CDH Assumption
• Let $(G, \cdot) \equiv \langle g \rangle$ be a finite cyclic group, and two numbers $a, b$ sampled unif. from
$\{0, \ldots, |G| - 1\}$ ($\lambda = \mathrm{len}(a) \approx \log_2 |G|$)
• given $g^a, g^b$ finding $g^{ab}$ costs more than $\mathrm{poly}(\log |G|)$
• Best current attack approach: find either b or a (discrete log problem)



Structure
Key agreement between Alice and Bob
• Alice: picks $a \xleftarrow{\$} \{0, \ldots, |G| - 1\}$, sends $g^a$ to Bob
• Bob: picks $b \xleftarrow{\$} \{0, \ldots, |G| - 1\}$, sends $g^b$ to Alice
• Alice: gets $g^b$ from Bob and computes $(g^b)^a$
• Bob: gets $g^a$ from Alice and computes $(g^a)^b$
• $(G, \cdot)$ is commutative $\Rightarrow (g^b)^a = (g^a)^b$, we’re done!

Groups used in practice


• A subgroup $(G, \cdot)$ of $(\mathbb{Z}_n^*, \cdot)$ (integers mod $n$), breaking CDH takes
$\min\left( O\left(e^{k (\log n)^{1/3} (\log \log n)^{2/3}}\right),\ O(2^{\lambda/2}) \right)$

• EC points w/ dedicated addition, breaking CDH takes $O(2^{\lambda/2})$



Example: Diffie-Hellman Agreement
● Used by Alice and Bob to agree on a secret
over an insecure channel
○ two people talk in the middle of the classroom,
everybody hears them, but at the end only those two
people know a secret, and nobody else
● One-way trapdoor/hard problem: discrete
logarithm
○ If $y = a^x$ then $x = \log_a y$ (Math 101)
○ given x, a, p, it is easy to compute $y = a^x \bmod p$, but
knowing y, it is difficult to compute x
○ Here “difficult” means “computationally very
intensive”, for all practical purposes the problem
requires bruteforce over all possible values of x
How does D-H work (1) - Example

Pick p prime and a, a primitive root of p; both are public

Primitive root: a number a such that raising it to
any number between 1 and (p - 1), mod p, we
obtain each number between 1 and (p - 1)
● Example: 3 is a primitive root of 7 because
○ $3^1 \bmod 7 = 3$, $3^2 \bmod 7 = 2$, $3^3 \bmod 7 = 6$
○ $3^4 \bmod 7 = 4$, $3^5 \bmod 7 = 5$, $3^6 \bmod 7 = 1$

So let a = 3, p = 7 known to everyone in the system
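The primitive-root definition above can be checked by brute force for tiny primes. A sketch, with a made-up function name; it is only meant for toy values such as p = 7.

```python
def is_primitive_root(a: int, p: int) -> bool:
    """Brute-force check: do the powers a^1..a^(p-1) mod p hit every value in 1..p-1?"""
    powers = {pow(a, k, p) for k in range(1, p)}
    return powers == set(range(1, p))


print([pow(3, k, 7) for k in range(1, 7)])  # the six powers of 3 mod 7
print(is_primitive_root(3, 7))
print(is_primitive_root(2, 7))              # 2 only generates {1, 2, 4}
```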


How does D-H work (2) - Keys
Secret number (undisclosed): each picks a
number X in [1, 2, ..., (p-1)]
● Alice: $X_A = 3$
● Bob: $X_B = 1$

Public number (disclosed to everyone): they
compute:
● Alice: $Y_A = 3^3 \bmod 7 = 6$
● Bob: $Y_B = 3^1 \bmod 7 = 3$
How does D-H work (3)

[Figure: Alice sends YA = 6 to Bob and Bob sends YB = 3 to Alice over the untrusted channel. Alice keeps XA and Bob keeps XB; an eavesdropper only sees 6 and 3.]
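How does D-H work (4) - Secret
At this point, they can compute a secret K

● Since $(a^{X_A})^{X_B} \bmod p = (a^{X_B})^{X_A} \bmod p$

● Alice computes $K = Y_B^{X_A} \bmod p = 3^3 \bmod 7 = 6$

● Bob computes $K = Y_A^{X_B} \bmod p = 6^1 \bmod 7 = 6$

Anybody else can listen, but cannot compute K

● Because they lack the secret numbers $X_A$ and $X_B$.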
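The toy exchange with p = 7, a = 3, XA = 3 and XB = 1 can be replayed in a few lines. This is only a sketch with the slide's numbers: a group this small offers no security at all, and the variable names are chosen for the example.

```python
p, a = 7, 3          # public parameters: prime modulus and primitive root

x_alice, x_bob = 3, 1              # secret numbers, never transmitted

y_alice = pow(a, x_alice, p)       # public number Alice sends
y_bob = pow(a, x_bob, p)           # public number Bob sends

# Each side raises the other's public number to its own secret.
k_alice = pow(y_bob, x_alice, p)
k_bob = pow(y_alice, x_bob, p)

print(y_alice, y_bob)    # what an eavesdropper sees
print(k_alice, k_bob)    # the shared secret, computed independently
```

Both sides obtain the same K because exponentiation in the group commutes; the eavesdropper would have to recover XA or XB from the public numbers.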
Public key encryption
● Concept: a cipher that uses two keys
○ What is encrypted with key1 can be decrypted only with
key2 (and not with key1 itself), and vice versa.
● Also called “public key cryptography”
○ Idea: one of the two keys is kept private by the subject,
and the other can be publicly disclosed.
● It solves the problem of key exchange
● We will not describe their maths in depth
○ They use a one-way function with a trapdoor
• The private key "cannot" be retrieved from the public key
• It should be easy to compute the public from the private
○ They are usually computationally intensive
Public key encryption: Key Exchange

[Figure: Alice holds the key pair (SA private, PA public) and Bob holds the key pair (SB private, PB public). Over the untrusted channel each sends "here is my public key" to the other, so Alice ends up with Bob's public key PB and Bob with Alice's public key PA.]


Public key encryption: Confidentiality

Trust assumption: only Bob knows his private key SB

[Figure: Alice encrypts the plaintext "Hello Bob" with the asymmetric encryption function E under Bob's public key PB; the ciphertext travels over the untrusted channel; Bob recovers the plaintext with the asymmetric decryption function D under his private key SB.]
Exercise: what is this instead?
Trust assumption: everybody knows Alice's public key PA, only Alice knows her private key SA

[Figure: the plaintext "Hello Bob" is processed by the asymmetric encryption function E under Alice's private key SA; the result travels over the untrusted channel and is processed by the asymmetric decryption function D under Alice's public key PA, giving back the plaintext.]
Public Key Encryption

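[Figure: key(pair) generation outputs a public key and a private key. Asymmetric encryption takes the plaintext and the public key and outputs the ciphertext; asymmetric decryption takes the ciphertext and the private key and outputs the plaintext.]

Components
• Different keys are employed in encryption/decryption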
• It is computationally hard to:
• Decrypt a ciphertext without the private key
• Compute the private key given only the public key



Widespread Asymmetric encryption ciphers

Rivest, Shamir, Adleman (RSA), 1977


• 2048 to 4096 bit message and key sizes
• Patented after the invention, patent now expired
• No ciphertext expansion
• Incidentally, the encryption with a fixed key is a PRP

Elgamal encryption scheme, 1985


• Keys either in the kbit range or hundreds of bits long, depending on the variant
• Not encumbered by patents
• The ciphertext is twice the size of the plaintext
• Widely used as an RSA alternative where patents were a concern



The RSA Algorithm (hints) - 1

Hard problem: factorization

If p and q are two large primes:


● computing n = p * q is easy
● but given n it is painfully slow to get p and q
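● naive approach (trial division): basically “try all primes
until you get to the smaller between p and q”; sieving methods (quadratic sieve, number field sieve) are much faster, but still super-polynomial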
● Different problem than mod-log (D-H), but it
can be shown that they are related
The RSA Algorithm (hints) - 2
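● With the best known algorithms, factoring n takes super-polynomial (sub-exponential) time in the number of
bits of n
● The number of modular multiplications needed for encryption grows
linearly in the number of bits of the exponent
■ square-and-multiply algorithm in hardware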
● At the moment of writing:
○ 512-bit RSA factored within 4 hours on EC2 for <
$100: http://seclab.upenn.edu/projects/faas/faas.pdf
○ Largest publicly demonstrated factorization of an RSA modulus:
829 bits (RSA-250, 2020)
○ key sizes of 2048 bits and above are currently considered safe
○ 2048 or 4096 typical choices
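A sketch of the square-and-multiply algorithm mentioned above, exercised on a textbook-size RSA key. The primes 61 and 53, the exponent 17 and the message 65 are toy values chosen for the example; a modulus this small can be factored by hand, and real RSA also needs padding, which is omitted here.

```python
def square_and_multiply(base: int, exponent: int, modulus: int) -> int:
    """Modular exponentiation, one squaring per exponent bit (modulus > 1)."""
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:                      # current bit set: multiply
            result = (result * base) % modulus
        base = (base * base) % modulus        # always square
        exponent >>= 1
    return result


# Toy key pair: computing n = p * q is easy, recovering p and q from n is the hard part.
p, q = 61, 53
n = p * q
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (Python 3.8+)

m = 65
c = square_and_multiply(m, e, n)          # encryption
recovered = square_and_multiply(c, d, n)  # decryption
print(n, d, c, recovered)
```

The loop runs once per bit of the exponent, which is where the linear growth in the number of multiplications comes from.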
A cautionary note on security margins

Computational hardness
• Up to now, enumeration of the secret parameter was the best possible attack
• This is ok for modern block ciphers → best attack: $O(2^\lambda)$
• Asymmetric cryptosystems rely on hard problems for which bruteforcing the secret
parameter is not the best attack
• Factoring a $\lambda$ bit number takes $O\left(e^{k \lambda^{1/3} (\log \lambda)^{2/3}}\right)$
• Comparing bit-sizes of the security parameters instead of actual complexities is
really wrong
• Concrete bit-sizes for $\lambda$ depending on the cipher: www.keylength.org



Key Lengths: caveat
● Key length measured in bits both in
symmetric and asymmetric algorithms
● However, they measure different things
○ Symmetric: number of decryption attempts
○ Asymmetric: number of key-breaking attempts
● Therefore:
○ You can compare symmetric algorithms based on
the key (e.g., CAST-128 bit “weaker” than AES-256)
○ You cannot directly compare asymmetric
algorithms based on key length.
○ More importantly, never compare directly
asymmetric vs. symmetric key lengths!
○ https://www.keylength.com/en/4/
Key Encapsulation

Assumption
• A public channel between Alice and Bob is available
• For the moment, the attacker model is “eavesdrop only”

Sharing a secret without agreement


• Alice: generates a keypair $(k_{pri}, k_{pub})$, sends $k_{pub}$ to Bob
• Bob: gets $s \xleftarrow{\$} \{0, 1\}^\lambda$, encrypts it with $k_{pub}$, sends ctx to Alice
• Alice: decrypts ctx with $k_{pri}$, recovers $s$
• Note: Bob alone decides the value of the shared secret s
• Repeat the procedure with swapped roles and combine the two secrets to achieve
similar guarantees to a key agreement



Efficiency considerations

Can’t I skip a step?


• Employing an asymmetric cryptosystem Bob encrypts a text for Alice without the
need of sharing a secret beforehand
• In principle, Bob and Alice could employ only an asymmetric cryptosystem to
communicate
• In practice this approach would be extremely inefficient
• Asymmetric cryptosystems are from 10× to 1000× slower than their symmetric
counterparts



Best of both worlds
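[Figure: the sender draws a random symmetric key. The plaintext is encrypted with it (symmetric encryption), producing the ciphertext; the random symmetric key is encrypted with the recipient's public encryption key (asymmetric encryption), producing the encrypted random key. The message being sent is the encrypted random key together with the ciphertext.]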

Hybrid encryption schemes


• Asymmetric schemes to provide key transport/agreement
• Symmetric schemes to encrypt the bulk of the data
• All modern secure transport protocols built around this idea
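A toy sketch of the hybrid idea, kept self-contained with the standard library only. Everything cryptographic here is an illustrative stand-in and an assumption of this example: the asymmetric part is unpadded textbook RSA over two known Mersenne primes (far too small for real use), and the symmetric part is an improvised SHA-256 counter keystream, not a vetted cipher.

```python
import hashlib
import secrets

# Toy RSA key pair for the recipient (NOT secure: tiny, unpadded).
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key || counter) blocks."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + (offset // 32).to_bytes(8, "big")).digest()
        out.extend(c ^ k for c, k in zip(data[offset:offset + 32], pad))
    return bytes(out)


def hybrid_encrypt(plaintext: bytes, n: int, e: int):
    sym_key = secrets.token_bytes(16)                      # random symmetric key
    wrapped = pow(int.from_bytes(sym_key, "big"), e, n)    # slow asymmetric step, tiny input
    return wrapped, keystream_xor(sym_key, plaintext)      # fast symmetric step, bulk data


def hybrid_decrypt(wrapped: int, ciphertext: bytes, n: int, d: int) -> bytes:
    sym_key = pow(wrapped, d, n).to_bytes(16, "big")       # unwrap the symmetric key
    return keystream_xor(sym_key, ciphertext)


message = b"Hello Bob, this is the bulk of the data." * 10
wrapped_key, ctx = hybrid_encrypt(message, N, E)
print(hybrid_decrypt(wrapped_key, ctx, N, D) == message)
```

The asymmetric operation touches only the 16-byte key, whatever the size of the message.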
Authenticating data

Motivations
• To build a secure hybrid encryption scheme we need to be sure that the public key
the sender uses is the one of the recipient
• We’d like to be able to verify the authenticity of a piece of data without a
pre-shared secret

Digital signatures
• Provide strong evidence that data is bound to a specific user
• No shared secret is needed to check (validate) the signature
• Proper signatures cannot be repudiated by the user
• They are asymmetric cryptographic algorithms
• formally proven that you cannot get non repudiation otherwise



Digital signatures

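[Figure: key(pair) generation outputs a signature key (private) and a verification key (public). Sign takes the message and the signature key and outputs the signed message; Verify takes the signed message and the verification key and outputs OK / not OK.]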

Computationally hard problems


• Sign a message without the signature key
• this includes splicing signatures from other messages
• Compute the signature key given only the verification key
• Derive the signature key from signed messages



Message Authentication
Trust assumption: only Alice knows her private key SA; everybody knows Alice's public key PA

[Figure: Alice processes the plaintext "Hello Bob" with the asymmetric encryption function E under her private key SA; the result travels over the untrusted channel; anyone can process it with the asymmetric decryption function D under Alice's public key PA and recover the plaintext.]
Digital signature: Authentication and Integrity
Trust assumption: only Alice knows her private key SA; everybody knows Alice's public key PA

[Figure: Alice computes the hash h = H("Hello Bob") and processes it with E under her private key SA, obtaining the signature; signature and plaintext travel over the untrusted channel. The receiver recovers h from the signature with D under Alice's public key PA, recomputes h' = H(received plaintext), and compares: h = h' means integrity OK, otherwise KO.]
Widespread Signature schemes

Rivest, Shamir, Adleman (RSA), 1977


• Unique case: the same hard-to-invert function to build an asymmetric encryption
scheme and a signature (different message processing!)
• Signing definitely slower than verification (≈ 300×)
• Standardized in NIST DSS (FIPS 186-4)

Digital Signature Algorithm (DSA)


• Derived from tweaking signature schemes by Schnorr and Elgamal
• Also standardized in NIST DSS (FIPS 186-4)
• Signature and verification take roughly the same time



Digital signature uses

Authenticating digital documents


• For performance reasons, sign the hash of the document instead of the document
• Signature properties now guaranteed only if both signature and hash algorithms are
not broken
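A toy sketch of hash-then-sign: the document is hashed, and only the digest goes through the slow private-key operation. The scheme below is unpadded textbook RSA over two known Mersenne primes, with the digest reduced modulo a small n; these are assumptions made to keep the example short and runnable, and none of it is secure or a standardized signature scheme.

```python
import hashlib

# Toy RSA signing key pair (NOT secure: tiny modulus, no padding).
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q
E = 65537                              # public verification exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private signature exponent


def digest_as_int(message: bytes) -> int:
    """SHA-256 digest of the message, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes, d: int = D, n: int = N) -> int:
    return pow(digest_as_int(message), d, n)


def verify(message: bytes, signature: int, e: int = E, n: int = N) -> bool:
    return pow(signature, e, n) == digest_as_int(message)


document = b"Hello Bob"
signature = sign(document)
print(verify(document, signature))        # genuine document
print(verify(b"Hello bob", signature))    # altered document
```

If either the hash or the signature primitive is broken, the guarantee on the document is lost, as noted above.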

Authenticating users
• Alternative to password-based login to a system
• The server has the user’s public verification key (e.g. deposited at account creation)
• The server asks the client to sign a long randomly generated bitstring (challenge)
• If the client returns a correctly signed challenge, it has proven its identity to the
server



The public key binding problem

Cautionary note
• Both in asymmetric encryption and digital signatures, the public key must be
bound to the correct user identity
• If public keys are not authentic:
• A MITM attack is possible on asymmetric encryption
• Anyone can produce a signature on behalf of anyone else
• The public key authenticity is guaranteed with... another signature
• We need someone to sign the public-key/identity pair
• We need a format to distribute signed pairs



A case of identity
● A digital signature ensures that plaintext was
authored by someone.
● Not really! It ensures it was encrypted with
a certain key...it says nothing about “who” is
using that private key
● Ditto for using public key for encryption!
● Exchange of public keys must be secured
(either out of band, or otherwise)
● PKI (Public Key Infrastructure) associates
keys with identity on a wide scale
PKI
● A PKI uses a trusted third party called a
certification authority (CA)
● The CA digitally signs files called digital
certificates, which bind an identity to a
public key
○ Identity = “Distinguished Name (DN)”
○ As defined in the X.509 standard (most used one)
● Now we can recognize a number of
subjects...
Digital certificates

Digital certificates
• They bind a public key to a given identity, which is:
• for humans: an ASCII string
• for machines: either the CNAME or IP address
• They specify the intended use for the public key contained
• Avoids ambiguities when a key format is ok for both an encryption and a signature
algorithm
• They contain a time interval in which they are valid
• Most widely deployed format is described in ITU X.509



Bob's Digital Certificate (1)

[Figure: Bob, who holds the key pair (PB, SB), presents his ID card and his public key PB to the CA.]
Bob's Digital Certificate (2)

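Trust assumption: only the CA knows its private key SCA

[Figure: the certificate contains the identity (DN) “Bob” and the public key PB; the CA hashes them with H and processes the hash with the asymmetric encryption function E under SCA, producing the CA's digital signature, which is attached to the certificate.]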
Retrieving Bob's Certificate
[Figure: Alice asks “I need Bob's public key” and obtains PB. Trust assumption: only Bob knows his private key SB. Alice encrypts "Hello Bob" with E under PB, sends it over the untrusted channel, and Bob decrypts it with D under SB.]
Zoom in: Is the public key valid?
[Figure: Alice holds PB and Bob's certificate, made of the identity (DN) “Bob”, the public key PB and the CA's digital signature. Is the signature valid?]
PKI
● Now we can recognize a number of
subjects...provided that we can obtain the
public key of the CA
Zoom in: Is the public key valid?

[Figure: to check the CA's digital signature on Bob's certificate (identity (DN) “Bob”, public key PB), Alice says “I need CA's public key”; that key comes in another certificate, whose signature requires yet another CA's public key, and so on.]

Where is this going to end?


The Certificate Chain
"Quis custodiet custodes?"

The CA needs a private key to sign a certificate


● The public key...must be in a certificate.

Someone else needs to sign that certificate


● And so on…at some point this needs to stop!
Certification authorities

Who signs the certificates


• The certificate signer is a trusted third party, the CA
• The CA public key is authenticated... with another certificate
• Up to a self-signed certificate which has to be trusted a priori

Certificate chain (each certificate carries: subject, subject public key, issuer, signature made by issuer):
• Subject CA1.com, issuer CA1.com (self-signed, kept in trusted storage)
• Subject CA2.com, issuer CA1.com
• Subject foo.com, issuer CA2.com
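A structural sketch of walking a chain such as foo.com → CA2.com → CA1.com up to a root held in trusted storage. The `Cert` class and its fields are hypothetical, and real signature verification is replaced by a stand-in field `signed_with` naming the key that made the signature; only the chain logic is illustrated, not X.509 parsing or actual cryptography.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    public_key: str    # placeholder for real key material
    signed_with: str   # stand-in for "signature verifies under this key"


def chain_is_trusted(leaf, bundled, trusted_roots, max_depth=8) -> bool:
    """Follow issuer links; succeed only on reaching a locally trusted root."""
    candidates = list(trusted_roots) + list(bundled)
    current = leaf
    for _ in range(max_depth):
        if current in trusted_roots:          # root must come from local storage
            return True
        issuer = next(
            (c for c in candidates
             if c.subject == current.issuer and c.public_key == current.signed_with),
            None,
        )
        if issuer is None or issuer == current:   # dead end or untrusted self-signed cert
            return False
        current = issuer
    return False


root = Cert("CA1.com", "CA1.com", "key-CA1", "key-CA1")
inter = Cert("CA2.com", "CA1.com", "key-CA2", "key-CA1")
leaf = Cert("foo.com", "CA2.com", "key-foo", "key-CA2")

# A self-made "root" reusing a real CA's name, shipped along with the leaf.
fake_root = Cert("CA1.com", "CA1.com", "key-EVIL", "key-EVIL")
fake_leaf = Cert("mallory.example", "CA1.com", "key-mallory", "key-EVIL")

print(chain_is_trusted(leaf, [inter], [root]))
print(chain_is_trusted(fake_leaf, [fake_root], [root]))
```

The second call fails because the self-signed certificate bundled with the leaf is not in the local trusted storage, even though it carries the name of a real CA.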



We Need a Trusted Element:
Root of trust
Top-level CA (root CA, source CA)

Uses a self-signed certificate


● cannot be verified: it's a trusted element
● Basically a document that says “I am myself”
Certification authorities hierarchy

Each certificate carries: subject, subject public key, issuer, signature made by the issuer.
• Root CA: subject CA1.com, issuer CA1.com, signature made by CA1.com
• Subsidiary CAs: subjects CA2.com, CA3.com, CA4.com; issuer CA1.com, signature made by CA1.com
• End entities: subjects foo.com, bar.com, baz.com; issuer CA2.com, signature made by CA2.com



How to distribute the trusted element?

An authority releases it
● the state
● a regulator
● the organization management

Root CA certificates already installed (de facto standard)


Do you trust your operating system? Do you trust the list of root certificates that ship with it?
[Figure: a recent browser's certificate storage]

Decentralizing trust (e.g., PGP web-of-trust)


Decentralizing Trust: Web of Trust
[Figure: peers tell each other “Let me sign your cert”, building a web of signatures among themselves; one newcomer notes “No one signed my cert yet”.]


Certificate Revocation Issues
● Signatures cannot be revoked (destroyed).
● Certificates need to be revoked at times.
● Certificate Revocation Lists (CRL)
Verification Sequence for Certificates

Any missing check = vulnerability!


1. Does the signature validate the document?
○ Hash verification as we have seen
2. Is the public key the one on the certificate?
3. Is the certificate the one of the subject?
○ Problems with homonymous subjects, DN
4. Is the certificate validated by the CA?
○ Validate the entire certification chain, up to the root
5. Is the root certificate trusted?
○ Need to be already in possession of the root cert
6. Is the certificate in a CRL?
○ How do we get to the CRL if we are not online?
Case study: Italian “legal” digital
signatures framework
Introduced in Italy with D.P.R. 513/97
● many modifications, in particular when
implementing EU regulations

Original Italian scheme: a list of “screened” CAs

Result: each CA created their own digital


signature application (i.e., trusted element)
Attacking Digital Signature Applications

Digital signature stronger than handwritten signature

● Written documents can be modified, written


signatures can be copied.
● Digital signature value tied to content, and
cannot be forged unless the algorithm is broken
● However, a digital signature is brittle: if a fake
is forged, it cannot be told apart from a real one
Crypto: OK – Software Design: KO
Italian signature standards use strong,
unbroken cryptographic algorithms!

However, vulnerabilities did emerge

Do you remember the “bank vault door in a tent”?


Bug 1: Fields of pain
● Bug notified on 9/9/2002
● The software of several CAs (originally DiKe
by Infocamere was the subject of scrutiny)
allowed users to sign Word documents with
dynamic fields or macros without notice
● A macro does not change the bit sequence of
the document, so the signature does not
change with the visualized content
● Examples and stuff on Prof. Zanero’s home:
http://home.deib.polimi.it/zanero/bug-firma.html
(Italian only, sorry for that!)
Example
Reactions
● The CAs responded that this was “intended
behavior” and that it did not violate the law
● However:
○ Microsoft, on 30/1/03, released an Office patch to allow
disabling macros via API.
○ Nowadays, all signing software shows a big alert when signing
an Office document.
○ New legislation explicitly excludes modifiable and
scriptable formats (but recommends PDF)
● The issue is actually much deeper
○ Decoders of complex formats should also be validated
○ Research field of “what you see is what you sign”
Horror story 1 overview

The importance of being static


• 2002-09-09: several pieces of software performing digital signatures with legal
value in Italy allowed users to sign MS Word documents containing macros
• Macros can dynamically change the displayed text in a document according to,
e.g., the current date
• Striking mismatch between what was thought to be signed (the visualized
document) and the actual signed object (a program-document blend)
• Current standards for digital signatures on human-intended documents
(PAdES, CAdES) target PDF and XML formats
• n.b.: PDFs may embed JavaScript, PDF/A files do not



Bug 2: Firma&Cifra
● Firma&Cifra was the digital signature
application by PostECom
● Bug found by anonymous on 20/03/2003:
http://www.interlex.it/docdigit/sikur159.htm
● Result: creation and verification of a
signature with a fake certificate
● Also in this case: no cryptographic algorithm
was broken to perform the show
Vulnerability Description
● In order to verify a signature, we need
the author's certificate and the certificate chain
○ Theoretically, all available online
○ To allow offline verification, everything included
with the document, in a PKCS#7 envelope
● Verification of root certificates must use
preinstalled ones
○ Most software comes with them
○ The root certificate storage is a critical point!
● Firma&Cifra trusts the root certificate in
the PKCS#7 envelope, and it even imports
it in the secure storage area.
The Exploit: Arsène Lupin signature
1. Generate a fake root certificate with the same name
as a real one (e.g., PostECom itself)
2. Use this to generate a fake user certificate (in our
example Arsène Lupin)
3. Use Arsène Lupin's certificate to sign theft and
burglary confessions.
4. Include the fake root cert to the PKCS#7 envelope.

[Figure: the fake root certificate signs the fake certificate “Arsène Lupin”, which signs the document; the fake root certificate, the fake certificate and the signed document go into the PKCS#7 envelope sent to the recipient.]
Les jeux sont faits: Lupin's Certificate

[Figure: the fake certificate “Arsène Lupin”, the fake root certificate, the signature and the document.]

Best comment by Postecom: this is "by design"


(yep: wrong design, but still design!)
Horror story 2 overview

Trust with care


• 2003-02-20: Firma&Cifra was the digital signature application by PostECom
• When presented with any certificate bundled with a signed document, it
considered the certificate authentic and added it to its trusted storage
• Signature forgery as easy as: 1) create your own CA certificate, 2) sign your
target-user certificate, 3) sign the document
• Take away point: which certificates reside in your (applications’) trusted storage
determine who you trust



Putting it all together

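[Figure: Certification Authority actions: the recipient's (public) encryption key and identity are signed with the (private) signing key of the CA, producing the recipient certificate. Sender actions: a random symmetric key encrypts the plaintext (symmetric encryption) into the ciphertext, and the recipient public encryption key encrypts the random symmetric key (asymmetric encryption) into the encrypted random key. The message being sent contains the encrypted random key and the ciphertext.]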


This way of communicating is the mainstay of modern secure comm. protocols
(TLS, OpenVPN, IPSec)



Directions in modern cryptography

Issues to solve, features to realize


• What if we have a quantum computer?
• Some computationally hard problems are no longer hard
• Move away from cryptosystems based on factoring/dlog
• Alternatives available and being standardized (2022-04)
• What if we want to compute on encrypted data?
• It can be done, but it’s moderately-to-horribly inefficient
• What if the attacker has physical access to the device computing the cipher (or
some way of remotely measuring it)?
• Take into account side channel information in the attacker model



Fundamentals of Information Theory

What is Shannon’s information theory?


• Shannon’s [3] way to mathematically frame communication
• A way to quantify information

What do we need this for (in this course)?


• Quantitatively frame “luck” and “guessing”



Basic Definitions

[Figure: Information source → Encoder → Channel → Decoder → Information destination]

Basics
• A communication takes place between two endpoints
• sender: made of an information source and an encoder
• receiver: made of an information destination and a decoder
• Information is carried by a channel in the form of a sequence of symbols of a finite
alphabet



Transmitting and receiving

Losing uncertainty = Acquiring information


• The receiver gets information only through the channel
• it will be uncertain on what the next symbol is, until the symbol arrives
• thus we model the sender as a random variable
• Acquiring information is modeled as getting to know an outcome of a random
variable X
• the amount of information depends on the distribution of X
• intuitively: the closer is X to a uniform distribution, the higher the amount of
information I get from knowing an outcome
• Encoding maps each outcome as a finite sequence of symbols
• More symbols should be needed when more information is sent



Measuring uncertainty: Entropy

Desirable properties
• Non negative measure of uncertainty
• “combining uncertainties” should map to adding entropies

Definition
• Let X be a discrete r.v. with n outcomes in $\{x_0, \ldots, x_{n-1}\}$ with $\Pr(X = x_i) = p_i$
for all $0 \le i \le n-1$
• The entropy of X is $H(X) = \sum_{i=0}^{n-1} -p_i \log_b(p_i)$
• The measurement unit of entropy depends on the base b of the logarithm: typical
case for b = 2 is bits



Examples

https://xkcd.com/1210/

X : Uniformly random 6 letters word


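• X is a sequence of 6 unif. random letters ($26^6 \approx 3.1 \cdot 10^8$)
• $H(X) \approx \sum_{i=0}^{3.1 \cdot 10^8 - 1} -\frac{1}{3.1 \cdot 10^8} \log_2\left(\frac{1}{3.1 \cdot 10^8}\right) \approx 28.2$ b

• X is a uniform pick from 6-letters English words


• $H(X) \approx \sum_{i=0}^{6299} -\frac{1}{6300} \log_2\left(\frac{1}{6300}\right) \approx 12.6$ b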
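The two figures can be reproduced numerically. A sketch with made-up function names; the count of roughly 6300 six-letter English words is taken from the slide as given, not verified here.

```python
import math


def entropy(probabilities, base: float = 2.0) -> float:
    """Shannon entropy: sum of -p_i * log_b(p_i) over outcomes with p_i > 0."""
    return sum(-p * math.log(p, base) for p in probabilities if p > 0)


def uniform_entropy(outcomes: int, base: float = 2.0) -> float:
    """Entropy of a uniform r.v.: the sum collapses to log_b(number of outcomes)."""
    return math.log(outcomes, base)


print(round(uniform_entropy(26 ** 6), 1))   # 6 uniformly random letters
print(round(uniform_entropy(6300), 1))      # uniform pick among ~6300 words
print(entropy([0.25] * 4))                  # small explicit check: 4 equally likely outcomes
```

For a uniform distribution the general formula and the closed form log2(n) coincide, which is why the huge sum never has to be enumerated.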



Shannon’s noiseless coding theorem

Statement (informal)
It is possible to encode the outcomes of n i.i.d. random variables, each one with
entropy H(X), into no less than nH(X) bits in total. If fewer than nH(X) bits are used,
some information will be lost.

Consequences
• Arbitrary compression of bitstrings is impossible without loss
• Cryptographic hashes must discard some information
• Guessing a piece of information (= one outcome of X) is at least as hard as
guessing a H(X) bit long bitstring
• overlooking for a moment the effort of decoding the guess



Min-Entropy

A practical mismatch
• It is possible to have distributions with the same entropy whose most likely outcome is far from equally easy to guess

Plucking low-hanging fruits


• We define the min-entropy of X as $H_\infty(X) = -\log(\max_i p_i)$
• Intuitively: it’s the entropy of a r.v. with uniform distribution, where the
probability of each outcome is $\max_i p_i$
• Guessing the most common outcome of X is at least as hard as guessing a $H_\infty(X)$
bit long bitstring



Example

A very biased r.v.


Consider $X$: $X = 0^{128}$ with $\Pr = \frac{1}{2}$; $X = a$, $a \in 1\{0,1\}^{127}$, with $\Pr = \frac{1}{2^{128}}$

Intuition and quantification


• Predicting an outcome shouldn’t be too hard: just say $0^{128}$
• $H(X) = \frac{1}{2}\left(-\log_2 \frac{1}{2}\right) + 2^{127} \cdot \frac{1}{2^{128}}\left(-\log_2 \frac{1}{2^{128}}\right) = 64.5$ b
• $H_\infty(X) = -\log_2\left(\frac{1}{2}\right) = 1$ b
• Min-entropy tells us that guessing the most common output is as hard as guessing
a single-bit string
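The two values for the biased variable can be recomputed directly. A sketch: the 2^127 unlikely outcomes all share the same probability, so their contribution is computed in closed form instead of being enumerated; the variable names are chosen for the example.

```python
import math


def min_entropy(max_probability: float) -> float:
    """Min-entropy: -log2 of the probability of the most likely outcome."""
    return -math.log2(max_probability)


p_zero = 0.5                # Pr(X = 0^128)
p_other = 2.0 ** -128       # Pr of each string starting with 1
count_other = 2 ** 127      # number of such strings

# Shannon entropy: one likely outcome plus 2^127 equally unlikely ones.
shannon = (p_zero * -math.log2(p_zero)
           + count_other * p_other * -math.log2(p_other))

print(shannon)               # bits of Shannon entropy
print(min_entropy(p_zero))   # bits of min-entropy
```

A 64.5-bit Shannon entropy suggests a hard guessing problem, while the min-entropy shows that a single guess succeeds half of the time.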



The Systems Perspective
“You have probably seen the door to a bank
vault…10-inch thick, hardened steel, with large
bolts…We often find the digital equivalent of
such a vault door installed in a tent. The people
standing around it are arguing over how thick
the door should be, rather than spending their
time looking at the tent.”

(Niels Ferguson & Bruce Schneier, Practical Cryptography)


Conclusions
● Perfect ciphers vs. real world: brute-forcing
○ Broken vs. unbroken ciphers: need for transparency
○ Key length matters
○ Symmetric and asymmetric algorithms, and hash functions
○ PKI and CAs and their complexity

● We saw several case studies of attacks against crypto applications
○ They had everything to do with systems security, without even touching the
algorithms themselves

Bibliography I

Giovan Battista Bellaso.
La cifra del Sig. Giovan Battista Bellaso, 1553.

John Nash.
Personal communication to the US National Security Agency, Feb. 1955.

Claude E. Shannon.
A mathematical theory of communication.
Bell Syst. Tech. J., 27(3 and 4):379–423 and 623–656, 1948.

Claude E. Shannon.
Communication theory of secrecy systems.
Bell Syst. Tech. J., 28(4):656–715, 1949.

Marc Stevens.
The HashClash project, 2009.

Marc Stevens, Elie Bursztein, Pierre Karpman, Ange Albertini, and Yarik Markov.
The first collision for full SHA-1.
In Jonathan Katz and Hovav Shacham, editors, Advances in Cryptology - CRYPTO 2017 - 37th Annual International Cryptology Conference,
Santa Barbara, CA, USA, August 20-24, 2017, Proceedings, Part I, volume 10401 of Lecture Notes in Computer Science, pages 570–596.
Springer, 2017.



Bibliography II

Marc Stevens, Alexander Sotirov, Jacob Appelbaum, Arjen K. Lenstra, David Molnar, Dag Arne Osvik, and Benne de Weger.
Short chosen-prefix collisions for MD5 and the creation of a rogue CA certificate.
In Shai Halevi, editor, Advances in Cryptology - CRYPTO 2009, 29th Annual International Cryptology Conference, Santa Barbara, CA, USA,
August 16-20, 2009. Proceedings, volume 5677 of Lecture Notes in Computer Science, pages 55–69. Springer, 2009.



Further reading: a practical attack based on MD5 collisions
● It is known that MD5 allows chosen-prefix collisions under certain constraints
○ Here the attack is used to create two valid CA certificates with the same signature:
http://www.win.tue.nl/~bdeweger/CollidingCertificates/
○ Extended to threaten CAs in 2008:
http://www.win.tue.nl/hashclash/rogue-ca/
● An evolution of the technique was used in Flame, a nasty piece of malware used
against several Middle Eastern targets
○ http://trailofbits.files.wordpress.com/2012/06/flame-md5.pdf
