Cryptanalysis

Download as pdf or txt
Download as pdf or txt
You are on page 1of 40

Cryptanalysis with

webcrypt
Prof. Zeph Grunschlag
webcrypt
2
webcrypt
webcrypt - software that I created for the
purpose of learning about cryptanalysis of
some classical ciphers
• Easy to add functionality
• Included modules:
• Substitution
• Affine
• Vigenère
• Hill
• Enigma
3
Encrypting with
webcrypt
1. Go to web-page or file you wish to encrypt
2. Click “Crypto”
3. Choose cipher (click on radio-button)
4. Enter key
5. Click “Encrypt”
6. Save resulting file locally

4
Key-types
1. Substitution: keyword
• CAPTAINAHAB

CAPTINHB
2. Affine: a_b with a ∈ Z26, b ∈ Z26
• 7_11
3. Vigenère: keyword repeated in keystream
• PHAT PHATPHATPHAT...
4. Hill: d_keyword with d the matrix dimension
 
• 3_traderjoe
19 3 9
17 4 14
5. Enigma r1r2r3_c1_c2_c3_c4_c5 0 17 4

• hog_az_by_cx_dw_ev
5
Cryptanalysis Tools
• Available under “Analyze”. Similar in usage
to “Crypto” - specify analysis tool, enter
possible value, click “Analyze”
Available tools:
• Frequency analysis, Digrams, Trigrams,
Quadragrams compute basic statistics.

• Friedman test and Offset analysis - Vigenère


• Bombe and plug analysis - Enigma
Note: recommend fixed width fonts

6
Attack Models
• known ciphertext: Eve only sees ciphertexts
• known plaintext: Eve knows a limited
number of ciphertext-plaintext pairs

• chosen plaintext: Eve tricks Alice into


encrypting a limited number plaintexts of
Eve’s choice

• chosen ciphertext: Eve tricks Bob into


decrypting a limited number ciphertexts
that has created

7
Cryptanalysis Examples
Will describe how to crack

• Substitution and Vigenere with known


ciphertext attacks
• Hill and Enigma with known plaintext
attacks

• Affine is a special case of Substitution


• Ciphertext examples used located at:
~zeph/4261/studentdirs/assignments/zeph/

8
Cracking Substitution
General ad-hoc approach:
1. Keep (plain→cipher) and (cipher →plain)
lists for discovered letters
2. Use frequency analysis but be aware that
statistical robustness decreases as you go
further down the list
3. Use webcrypt’s “!” option to obtain
partial decrypts
4. Use weaknesses in keyword-based
encryption to your advantage
9
Frequency Analysis
Compare frequencies
expected (from analysis expected ACTUAL
of entire Moby Dick) e N
t B
with frequencies found in
a S
ciphertext. List from o E
most common to least. n K
i Q
e, t, a occur each more
s T
than 8%. O thru H all h F
near 7% hard to r P
distinguish. N,B,S similar. l A
Guess {e,t,a} → {N,B,S}
as unordered sets
10
Digraph Analysis
Reasonable to expect
mapping between most expected ACTUAL
frequent letters: e→N.
th SE
Confirmed by looking at
top two digraphs. he EN

Assume: (e,t,h)→(N,S,E) in BJ
(as ordered tuples).
er EB
Next: “BJ” at 2.4% stands
above crowd, similar to an JL
“in” at 2.1%. Guess
(i,n)→(B,J).
11
Trigraph Analysis
Map top plain-trigraphs: expected KNOWN

• #2 BJ? matches with the SEN


#2 of cipher-trigraphs ing BJ?
BJL g→L and ?E?

• #4 SE? matches with tha SE?


first SE- appearance his EB?
SET a→T her EN?

• #5 EB? matches with hat E?S


EBQ s→Q
12
Partial Alphabets

Summarize with transformation tables:

plain a b c d e f g h i j k l m n o p q r s t u v w x y z
CIPHER T N L E B J Q S

CIPHERA B C D E F G H I J K L M N O P Q R S T U V WX Y Z
plain i h n g e s t a

13
Exploit Keyword
Method
plain a b c d e f g h i j k l m n o p q r s t u v w x y z
CIPHER T N L E B J Q S
Assuming moderate sized keyword implies
that after keyword:

• cipher-alphabet monotonic
• cipher letter ≤ plain letter
• many simple sequences
• cipher letter = plain letter rest equal
plain a b c d e f g h i j k l m n o p q r s t u v w x y z
CIPHER T N L E B J K MO P Q S U V WX Y Z

14
Partial Decrypt
Summarize with transformation tables:
plain a b c d e f g h i j k l m n o p q r s t u v w x y z
CIPHER T N L E B J KMO P Q S U VWX Y Z

CIPHERA B CD E F GH I J K L MNO P QR S T U VWX Y Z


plain i h n o g p e q r s t a u v w x y z

Decrypt alphabet (insert x for unknown):


xixxhxxxxnogpeqrsxtauvwxyz
Pre-pend by “!” for webcrypt partial decrypt
option, and encrypt with this key.

15
Guess from context
Lines 6-9

• partial decrypt:
XVING HISWH OXESP INXXE XXOXY SOTHA TWHEN HISVA STWRI NXXEX
XOREH EAXRO SESOX ETWEN TYORX OREXE ETOUT OXTHE WATER THENO
WRISI NGSWE XXSWI THAXX THEIR XONXX UENTW AVESX AZZXI NGXYX
ROXEA GAINS TITVI NXIXT IVEXY TOSSI NGTHE IRSHI VEREX SPRAY

• cipher text:
FVBJL EBQWE KFNQM BJAFN ARKAY QKSET SWENJ EBQVT QSWPB JDFNA
GKPNE NTAPK QNQKH NSWNJ SYKPH KPNGN NSKUS KGSEN WTSNP SENJK
WPBQB JLQWN FFQWB SETFF SENBP IKJGF UNJSW TVNQA TZZFB JLFYR
PKDNT LTBJQ SBSVB JABIS BVNFY SKQQB JLSEN BPQEB VNPNA QMPTY

conclude: whole→WEKFN, vindictively→VBJABISBVNFY


(c,d,l) →(I,A,F)
16
Partial Decrypt

Summarize with transformation tables:


plain a b c d e f g h i j k l m n o p q r s t u v w x y z
CIPHER T R I A N L E B CD F J KMO P Q S U VWX Y Z

CIPHERA B CD E F GH I J K L MNO P QR S T U VWX Y Z


plain d i j k h l c n o g p e q r s b t a u v w x y z

Can now guess keyword...

17
Cracking Hill
Simple known plaintext attack (no webcrypt)
1. Letting you assume dimension = 3
2. Set up system of equations between
i = yi
plaintexts and ciphertexts: K · x
.. .. ..
3. Find three blocks such that X = x1x2x3
.. .. ..
is invertible.
4. Solve matrix
 equation
 K · X = Y for K
.. .. ..
where Y = y1y2y3
.. .. ..
• Generalizes directly to unknown dimension
18
Find invertible known
plaintext matrix
• the blue whale is the biggest creature
• the blu ewh ale ist heb
• xi =(19,7,5)’
19 1 4
 , (1,11,20)’ , (4,22,7)’ , (0,11,4)’...

• det  75 1122
20 7
 = -6496 not  invertible
19 1 0
 mod 26

• Try another matrix: det  75 1111 = -3317.


Mod 26 11. 20 4

19
Solve System
Corresponding texts:

• plain = the blu ewh ale ist


• CIPHER = NSK RMC NZY ISF RBM
• NSK,RMC,ISF=(13,18,10)’,(17,12,2)’,(8,18,5)’
   
• Y = 181218 X −1 mod 26 = 1914 7 
1317 8 10 2 1

10 2 5 3 2516
• Solution to K · X = Y is K = Y X
−1

20
Cracking Vigenère

1. Obtain keyword size using Friedman


analysis
2. Apply offset analysis to obtain keyword
directly

21
Index of Coincidence
Friedman’s method based on:
DEF: The index of coincidence of the
string x = x1x2 . . . xn is denoted by Ic(x) and
is the probability that a pair of randomly
chosen characters in x will be equal.
THM: Ic(x) is minimized when all letters
occur with equal probability.

• If x is every p’th ciphertext letter for the


actual period p THEN letters not equally
likely so Ic(x) significantly bigger than min.
22
Friedman Analysis
Period Avg. Ic(x)
1 0.0438
2 0.0435
3 0.0442
Results of applying (with 4 0.0429
max period = 15) to our 5 0.0445
6 0.0436
ciphertext: 7 0.0621
8 0.0428
9 0.0435
10 0.0442
Choose the smallest period. 11 0.043
12 0.0427
13 0.0446
14 0.0613
15 0.0445

23
Maximum Interpolation
DEF: The interpolation of two probability
distributions (pi), (qi)is defined to be the sum
! piqi
i
THM: Suppose that (pi) is a permutation of (qi)
Then the interpolation of (pi), (qi) is
maximized when ∀i, pi = qi
COR: Suppose (pi) is the expected frequency
of English letters and (qi) is the frequency of a
period-p sub-ciphertext which has been
cipher-shifted by -k. Then the interpolation
is maximized when k is a correct key char.
24
Offset Analysis
webcrypt’s offset analysis with period 7 results:

Text

25
0.070
K = topmost
0.053

0.035

0.018 A B C D E F G
H I J K L M N
O P Q R S T U
V W X Y Z
0
OFFSET 0

OFFSET 1

OFFSET 2

OFFSET 3

OFFSET 4

OFFSET 5

OFFSET 6
26
Cracking Enigma
1. Draw a Turing graph by comparing the
known plain-text to the cipher-text
2. Enter 2-connected component information
into webcrypt’s Bombe to discover the
rotor settings
3. Apply webcrypt’s Plug Analysis repeatedly
do discover letter pairs one at a time.
Note: It is possible to weaken the known
plaintext attack to a known sub-plaintext
attack, using Enigma design flaw.

27
Turing Graph

DEF: Suppose we are given an Enigma


machine under a particular key K. The n-step
Turing graph for K is a labeled graph with

• Vertices V = Z26
• Labeled edges of the form ({u,v}, i ) i = 1, 2, . . . , n
defined when ei(u) = v

28
From Known Plaintext
to Partial Turing Graph
step 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 20 1 2 3 4 5 6 7 8 9
plain t h e b l u e w h a l e i s t h e b i g g e s t c r e a t
cipher Z D H P S M T K R w S S X L P w Z F Y O L N P Q A A P C L
• e.g., e12(e) = S implies the edge E 12 S

• Idempotent property
M
6
U
A
graph undirected
O
20
G
21
L

F
18 5,11,14
29 B C
4 25,28
10 8
P A W K
15 27 23 26 16
24 7 12 9 2
Q T E S R H D
17 22
1
Z N
3

29
Partial Turing Graph
6
M U
A O
20
G
21
L

F
18 5,11,14
29 B C
4 25,28
10 8
P A W K
15 27 23 26 16
24 7 12 9 2
Q T E S R H D
17 22
1
Z N
3

30
2-Conn. Components
1. {E, P, S, T, Z, L}
2. {C, A, W, R, H}
L

5,11,14
29 C
25,28
10
P A W
15 27 23 26 16
7 12 9
T E S R H
17
1
Z

31
Turing Graph Theorem
• K = (R,P) denotes an Enigma key with rotor
setting R and plugboard setting P

• TG(R,P,n) denotes the n-step Turing graph for


key (R,P)
THM: For any two plugboard settings P and P’
but fixed initial rotor setting R, the resulting
Turing graphs TG(R,P,n) and TG(R,P’,n) are
isomorphic as edge-labeled graphs.
COR: The 2-connected components of a
discovered partial Turing graph are contained in
the Turing graph with same rotor but no plugs.
32
Discover rotor setting
• 2-connected graph info typically rich enough
to exclude all other rotor settings
• Tree-like part of graph devoid of information
• Brute force search through all 17,576
settings usually finds unique setting (or at
most a handful)
• This is the idea behind Turing’s (and
webcrypt’s) Bombe
• For best results, list as much structure about
2-connected component as possible - i.e. all
the cycles starting from component rep.

33
Describe all cycles

6 non-spanning-tree
edges


P

6 cycles
T E S

34
webcrypt notation
Component
• “:” after first edge - all L
cycles must start from
that edge
• “-” between edges
• “|” between each 29 5 11 14
cycle, first edge
suppressed P
RESULT: (source=E) 15 27 23
27:23-12|15-7|15-1-17 T 7 E 12 S
|23-5-29-7|23-5-11-12 17
1
|23-5-14-12 Z

35
webcrypt notation
Entire graph
• Separate components with “_”
RESULT: (remove new-line for Bombe input)
27:23-12|15-7|15-1-17|23-5-29-7|23-5-11-12|
23-5-14-12_25:28|28-10-16-9-26
sources
L

5,11,14
29 C
25,28
10
P A W
15 27 23 26 16
7 12 9
T E S R H
17
1
Z
36
Disovered: “NGE”

37
Plug Analysis
1. Decrypt ciphertext using discovered rotors
but no plugs (Key = “NGE”)
• frequency analysis non-random text
• most common letter D must be plugged
to another as shouldn’t be most common
2. Plug Analysis(NGE_D) applied to
ORIGINAL CIPHERTEXT
• Lists least-frequent/most-frequent ratio
resulting from attempting D-X plug
combinations for all possible X
• X=E most likely plugged (min. ratio)
3. Go-to #1 above with new guess “NGE_DE”
38
Plug Analysis - loops
1. Freq. anls. after decrypting with “NGE”
a. D must be plugged
b. Plug anls. original file-value “NGE_D”
c. Discovered D E setting
2. Freq. anls. after decrypting with “NGE_DE”
a. L must be plugged
b. Plug anls. with value “NGE_DE_L”
c. Discovered L O setting
3. Freq. anls. “NGE_DE_LO” decrypt
a. G must be plugged
b. Plug anls. with value “NGE_DE_LO_G”
c. Discovered G A setting
39
1.Plug Analysis - loops
2.
3. continued
4. Freq. anls. “NGE_DE_LO_GA” decrypt
a. Z must be plugged (too frequent)
b. Plug anls. value “NGE_DE_LO_GA_Z”
c. Discovered Z N setting
5. Freq. anls. “NGE_DE_LO_GA_ZN”
a. M must be plugged (6.4% vs. 2.4%)
b. Plug anls. “NGE_DE_LO_GA_ZN_M”
c. Discovered M I setting
CRACK: Key is “NGE_DE_LO_GA_ZN_MI”

40

You might also like