Cryptanalysis
Cryptanalysis
Cryptanalysis
webcrypt
Prof. Zeph Grunschlag
webcrypt
2
webcrypt
webcrypt - software that I created for the
purpose of learning about cryptanalysis of
some classical ciphers
• Easy to add functionality
• Included modules:
• Substitution
• Affine
• Vigenère
• Hill
• Enigma
3
Encrypting with
webcrypt
1. Go to web-page or file you wish to encrypt
2. Click “Crypto”
3. Choose cipher (click on radio-button)
4. Enter key
5. Click “Encrypt”
6. Save resulting file locally
4
Key-types
1. Substitution: keyword
• CAPTAINAHAB
∗
CAPTINHB
2. Affine: a_b with a ∈ Z26, b ∈ Z26
• 7_11
3. Vigenère: keyword repeated in keystream
• PHAT PHATPHATPHAT...
4. Hill: d_keyword with d the matrix dimension
• 3_traderjoe
19 3 9
17 4 14
5. Enigma r1r2r3_c1_c2_c3_c4_c5 0 17 4
• hog_az_by_cx_dw_ev
5
Cryptanalysis Tools
• Available under “Analyze”. Similar in usage
to “Crypto” - specify analysis tool, enter
possible value, click “Analyze”
Available tools:
• Frequency analysis, Digrams, Trigrams,
Quadragrams compute basic statistics.
6
Attack Models
• known ciphertext: Eve only sees ciphertexts
• known plaintext: Eve knows a limited
number of ciphertext-plaintext pairs
7
Cryptanalysis Examples
Will describe how to crack
8
Cracking Substitution
General ad-hoc approach:
1. Keep (plain→cipher) and (cipher →plain)
lists for discovered letters
2. Use frequency analysis but be aware that
statistical robustness decreases as you go
further down the list
3. Use webcrypt’s “!” option to obtain
partial decrypts
4. Use weaknesses in keyword-based
encryption to your advantage
9
Frequency Analysis
Compare frequencies
expected (from analysis expected ACTUAL
of entire Moby Dick) e N
t B
with frequencies found in
a S
ciphertext. List from o E
most common to least. n K
i Q
e, t, a occur each more
s T
than 8%. O thru H all h F
near 7% hard to r P
distinguish. N,B,S similar. l A
Guess {e,t,a} → {N,B,S}
as unordered sets
10
Digraph Analysis
Reasonable to expect
mapping between most expected ACTUAL
frequent letters: e→N.
th SE
Confirmed by looking at
top two digraphs. he EN
Assume: (e,t,h)→(N,S,E) in BJ
(as ordered tuples).
er EB
Next: “BJ” at 2.4% stands
above crowd, similar to an JL
“in” at 2.1%. Guess
(i,n)→(B,J).
11
Trigraph Analysis
Map top plain-trigraphs: expected KNOWN
plain a b c d e f g h i j k l m n o p q r s t u v w x y z
CIPHER T N L E B J Q S
CIPHERA B C D E F G H I J K L M N O P Q R S T U V WX Y Z
plain i h n g e s t a
13
Exploit Keyword
Method
plain a b c d e f g h i j k l m n o p q r s t u v w x y z
CIPHER T N L E B J Q S
Assuming moderate sized keyword implies
that after keyword:
• cipher-alphabet monotonic
• cipher letter ≤ plain letter
• many simple sequences
• cipher letter = plain letter rest equal
plain a b c d e f g h i j k l m n o p q r s t u v w x y z
CIPHER T N L E B J K MO P Q S U V WX Y Z
14
Partial Decrypt
Summarize with transformation tables:
plain a b c d e f g h i j k l m n o p q r s t u v w x y z
CIPHER T N L E B J KMO P Q S U VWX Y Z
15
Guess from context
Lines 6-9
• partial decrypt:
XVING HISWH OXESP INXXE XXOXY SOTHA TWHEN HISVA STWRI NXXEX
XOREH EAXRO SESOX ETWEN TYORX OREXE ETOUT OXTHE WATER THENO
WRISI NGSWE XXSWI THAXX THEIR XONXX UENTW AVESX AZZXI NGXYX
ROXEA GAINS TITVI NXIXT IVEXY TOSSI NGTHE IRSHI VEREX SPRAY
• cipher text:
FVBJL EBQWE KFNQM BJAFN ARKAY QKSET SWENJ EBQVT QSWPB JDFNA
GKPNE NTAPK QNQKH NSWNJ SYKPH KPNGN NSKUS KGSEN WTSNP SENJK
WPBQB JLQWN FFQWB SETFF SENBP IKJGF UNJSW TVNQA TZZFB JLFYR
PKDNT LTBJQ SBSVB JABIS BVNFY SKQQB JLSEN BPQEB VNPNA QMPTY
17
Cracking Hill
Simple known plaintext attack (no webcrypt)
1. Letting you assume dimension = 3
2. Set up system of equations between
i = yi
plaintexts and ciphertexts: K · x
.. .. ..
3. Find three blocks such that X = x1x2x3
.. .. ..
is invertible.
4. Solve matrix
equation
K · X = Y for K
.. .. ..
where Y = y1y2y3
.. .. ..
• Generalizes directly to unknown dimension
18
Find invertible known
plaintext matrix
• the blue whale is the biggest creature
• the blu ewh ale ist heb
• xi =(19,7,5)’
19 1 4
, (1,11,20)’ , (4,22,7)’ , (0,11,4)’...
• det 75 1122
20 7
= -6496 not invertible
19 1 0
mod 26
19
Solve System
Corresponding texts:
10 2 5 3 2516
• Solution to K · X = Y is K = Y X
−1
20
Cracking Vigenère
21
Index of Coincidence
Friedman’s method based on:
DEF: The index of coincidence of the
string x = x1x2 . . . xn is denoted by Ic(x) and
is the probability that a pair of randomly
chosen characters in x will be equal.
THM: Ic(x) is minimized when all letters
occur with equal probability.
23
Maximum Interpolation
DEF: The interpolation of two probability
distributions (pi), (qi)is defined to be the sum
! piqi
i
THM: Suppose that (pi) is a permutation of (qi)
Then the interpolation of (pi), (qi) is
maximized when ∀i, pi = qi
COR: Suppose (pi) is the expected frequency
of English letters and (qi) is the frequency of a
period-p sub-ciphertext which has been
cipher-shifted by -k. Then the interpolation
is maximized when k is a correct key char.
24
Offset Analysis
webcrypt’s offset analysis with period 7 results:
Text
25
0.070
K = topmost
0.053
0.035
0.018 A B C D E F G
H I J K L M N
O P Q R S T U
V W X Y Z
0
OFFSET 0
OFFSET 1
OFFSET 2
OFFSET 3
OFFSET 4
OFFSET 5
OFFSET 6
26
Cracking Enigma
1. Draw a Turing graph by comparing the
known plain-text to the cipher-text
2. Enter 2-connected component information
into webcrypt’s Bombe to discover the
rotor settings
3. Apply webcrypt’s Plug Analysis repeatedly
do discover letter pairs one at a time.
Note: It is possible to weaken the known
plaintext attack to a known sub-plaintext
attack, using Enigma design flaw.
27
Turing Graph
• Vertices V = Z26
• Labeled edges of the form ({u,v}, i ) i = 1, 2, . . . , n
defined when ei(u) = v
28
From Known Plaintext
to Partial Turing Graph
step 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 20 1 2 3 4 5 6 7 8 9
plain t h e b l u e w h a l e i s t h e b i g g e s t c r e a t
cipher Z D H P S M T K R w S S X L P w Z F Y O L N P Q A A P C L
• e.g., e12(e) = S implies the edge E 12 S
• Idempotent property
M
6
U
A
graph undirected
O
20
G
21
L
F
18 5,11,14
29 B C
4 25,28
10 8
P A W K
15 27 23 26 16
24 7 12 9 2
Q T E S R H D
17 22
1
Z N
3
29
Partial Turing Graph
6
M U
A O
20
G
21
L
F
18 5,11,14
29 B C
4 25,28
10 8
P A W K
15 27 23 26 16
24 7 12 9 2
Q T E S R H D
17 22
1
Z N
3
30
2-Conn. Components
1. {E, P, S, T, Z, L}
2. {C, A, W, R, H}
L
5,11,14
29 C
25,28
10
P A W
15 27 23 26 16
7 12 9
T E S R H
17
1
Z
31
Turing Graph Theorem
• K = (R,P) denotes an Enigma key with rotor
setting R and plugboard setting P
33
Describe all cycles
6 non-spanning-tree
edges
∴
P
6 cycles
T E S
34
webcrypt notation
Component
• “:” after first edge - all L
cycles must start from
that edge
• “-” between edges
• “|” between each 29 5 11 14
cycle, first edge
suppressed P
RESULT: (source=E) 15 27 23
27:23-12|15-7|15-1-17 T 7 E 12 S
|23-5-29-7|23-5-11-12 17
1
|23-5-14-12 Z
35
webcrypt notation
Entire graph
• Separate components with “_”
RESULT: (remove new-line for Bombe input)
27:23-12|15-7|15-1-17|23-5-29-7|23-5-11-12|
23-5-14-12_25:28|28-10-16-9-26
sources
L
5,11,14
29 C
25,28
10
P A W
15 27 23 26 16
7 12 9
T E S R H
17
1
Z
36
Disovered: “NGE”
37
Plug Analysis
1. Decrypt ciphertext using discovered rotors
but no plugs (Key = “NGE”)
• frequency analysis non-random text
• most common letter D must be plugged
to another as shouldn’t be most common
2. Plug Analysis(NGE_D) applied to
ORIGINAL CIPHERTEXT
• Lists least-frequent/most-frequent ratio
resulting from attempting D-X plug
combinations for all possible X
• X=E most likely plugged (min. ratio)
3. Go-to #1 above with new guess “NGE_DE”
38
Plug Analysis - loops
1. Freq. anls. after decrypting with “NGE”
a. D must be plugged
b. Plug anls. original file-value “NGE_D”
c. Discovered D E setting
2. Freq. anls. after decrypting with “NGE_DE”
a. L must be plugged
b. Plug anls. with value “NGE_DE_L”
c. Discovered L O setting
3. Freq. anls. “NGE_DE_LO” decrypt
a. G must be plugged
b. Plug anls. with value “NGE_DE_LO_G”
c. Discovered G A setting
39
1.Plug Analysis - loops
2.
3. continued
4. Freq. anls. “NGE_DE_LO_GA” decrypt
a. Z must be plugged (too frequent)
b. Plug anls. value “NGE_DE_LO_GA_Z”
c. Discovered Z N setting
5. Freq. anls. “NGE_DE_LO_GA_ZN”
a. M must be plugged (6.4% vs. 2.4%)
b. Plug anls. “NGE_DE_LO_GA_ZN_M”
c. Discovered M I setting
CRACK: Key is “NGE_DE_LO_GA_ZN_MI”
40