Archive

Archive for the ‘cryptography’ Category

Python: Cryptography decoding a Caesar shift (frequency analysis)

January 4th, 2010 mat No comments

Due to the simple nature of the Caesar cipher, it could easily be brute forced by trying all possible 25 keys and then looking by eye to see if the plaintext was revealed (this too can be automated by checking for common English words to see if the solution was probable). However the much more elegant method of frequency analysis can be used.

Below is a table of the frequency of letters in the English language:

Letter Frequency (percent) Frequency (decimal) Normalised Frequency
a 8.17% 0.08167 0.64297
b 1.49% 0.01492 0.11746
c 2.78% 0.02782 0.21902
d 4.25% 0.04253 0.33483
e 12.70% 0.12702 1.00000
f 2.23% 0.02228 0.17541
g 2.02% 0.02015 0.15864
h 6.09% 0.06094 0.47977
i 6.97% 0.06966 0.54842
j 0.15% 0.00153 0.01205
k 0.77% 0.00772 0.06078
l 4.03% 0.04025 0.31688
m 2.41% 0.02406 0.18942
n 6.75% 0.06749 0.53133
o 7.51% 0.07507 0.59101
p 1.93% 0.01929 0.15187
q 0.10% 0.00095 0.00748
r 5.99% 0.05987 0.47134
s 6.33% 0.06327 0.49811
t 9.06% 0.09056 0.71296
u 2.76% 0.02758 0.21713
v 0.98% 0.00978 0.07700
w 2.36% 0.02360 0.18580
x 0.15% 0.00150 0.01181
y 1.97% 0.01974 0.15541
z 0.07% 0.00074 0.00583

And shown graphically:

English Letter Frequency

English Letter Frequency

Using the following code we can use frequency analysis to find the solution to ciphertext created using the Caesar shift demonstrated previously (see Caesar shift and Caesar shift using makestrans).

from Numeric import *
from string import maketrans

def translator(text,alphabet,key):
	trantab = maketrans(alphabet,key)
	return text.translate(trantab)

def caesar_decode(ciphertext,s):
	alpha="abcdefghijklmnopqrstuvwxyz"
	return translator(ciphertext,alpha,alpha[-s:]+alpha[:-s])

class frequency_analysis:
	def __init__(self, ciphertext):
		self.cor=[0.64297,0.11746,0.21902,0.33483,1.00000,0.17541,
		0.15864,0.47977,0.54842,0.01205,0.06078,0.31688,0.18942,
		0.53133,0.59101,0.15187,0.00748,0.47134,0.49811,0.71296,
		0.21713,0.07700,0.18580,0.01181,0.15541,0.00583]
		self.ciphertext=ciphertext.lower()
		self.freq()
		self.min_error()
		self.key=self.minimum[0]
		self.solution=caesar_decode(self.ciphertext,self.minimum[0])

	def freq(self):
		self.arr=zeros(26,Float64)
		for l in self.ciphertext:
			x=ord(l)
			if (x>=97 and x<=122):
				self.arr[x-97]+=1.0
		self.arr/=max(self.arr)

	def error(self):
		e=0
		for i in range(0,len(self.arr)):
			e+=abs(self.arr[i]-self.cor[i])**2
		return e

	def min_error(self):
		self.minimum=[0,10000]
		for rot in range(0,25):
			e=self.error()
			print rot,e
			if e<self.minimum[1]:
				self.minimum[1]=e
				self.minimum[0]=rot
			x=self.arr[-1]
			del self.cor[-1]
			self.cor.insert(0,x)

ciphertext="ymjwj fwj ybt ydujx tk jshwduynts: tsj ymfy bnqq "+\
"uwjajsy dtzw xnxyjw kwtr wjfinsl dtzw infwd fsi tsj ymfy bnqq "+\
"uwjajsy dtzw ltajwsrjsy. ymnx nx f ajwd nrutwyfsy qjxxts yt "+\
"wjrjgjwjxujhnfqqd ktw fyyfhpx zxnsl kwjvzjshd fsfqdxnx bmnhm "+\
"wjvznwj qtsljw ufxxflj tk yjcy ns twijw yt fhmnjaj gjyyjw wjxzqyx."
FA=frequency_analysis(ciphertext)
print FA.solution

This code will calculate the error in statistical frequency for each letter squared to generate an error for each possible rotation. Using a sufficiently long piece of ciphertext this code should accurately reveal the Caesar rotation use. The table below shows the error for each rotation:


Rotation Error
0 4.11797847386
1 3.05305477067
2 3.70059678828
3 3.66330931218
4 3.5078619579
5 0.361318100755
6 3.17289666386
7 3.66072641654
8 3.39769855873
9 1.74854802027
10 2.92550921273
11 2.67524757297
12 2.86847189573
13 3.06980318397
14 2.56886153328
15 2.17180117031
16 2.24503724763
17 2.95579718798
18 1.74002183444
19 1.83328601011
20 1.74779021766
21 2.71332097813
22 1.5409364067
23 1.83209213494
24 1.54904808883

The lowest error is for 5 rotations (correctly so) with an error of 0.361318100755, the next lowest error is 22 rotations with an error of 1.5409364067. This is ~4.3x difference, which gives a very large degree of confidence to our solution and below is the deciphered text.

there are two types of encryption: one that will prevent your sister from reading your diary and one that will prevent your government. this is a very important lesson to remeberespecially for attacks using frequency analysis which require longer passage of text in order to achieve better results.

Future
The frequency analysis presented here can be used along with some other techniques in order to crack the viginere cipher.

Categories: cryptography, python Tags: ,

Python: Cryptograph using maketrans for substitution and Caesar ciphers

December 22nd, 2009 mat 1 comment

I’ve rewritten the functions in my previous two posts (caesar and substitution ciphers) using the maketrans function from the strings module in python (With thanks to wumzi for pointing this out).

maketrans takes an input alphabet and an output alphabet and can then be used on a string. This can be used to greatly simplify the two ciphers I produced previously. Example below:

input_alphabet="abcde"
output_alphabet="12345"
trantab = maketrans(input_alphabet,output_alphabet)
text="translate abcdefg"
print text.translate(trantab)
# This will output:
# tr1nsl1t5 12345fg

This nows means my code can be rewritten in just a fraction of what it was before:

from random import shuffle
from string import maketrans

alphabet="abcdefghijklmnopqrstuvwxyz" + \
	 "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ"+ \
         ":.;,?!@#$%&()+=-*/_<> []{}`~^"

def translator(text,alphabet,key):
	trantab = maketrans(alphabet,key)
	return text.translate(trantab)

def caesar_encode(plaintext,s,alphabet):
	return translator(plaintext,alphabet,alphabet[s:]+alphabet[:s])

def caesar_decode(ciphertext,s,alphabet):
	return translator(ciphertext,alphabet,alphabet[-s:]+alphabet[:-s])

def substitution_encode(plaintext,alphabet):
	randarray=range(0,len(alphabet))
	shuffle(randarray)
	key=""
	for i in range(0,len(alphabet)):
		key+=alphabet[randarray[i]]

	return translator(plaintext,alphabet,key),key

def substitution_decode(ciphertext,key,alphabet):
	return translator(ciphertext,key,alphabet)

# Example useage
plaintext="The wheels on the bus go round and round. round" + \
" and round. round and round. The wheels on the bus go round and"+ \
" round, all through the town!"

print
print "SUBSTITUTION"
ciphertext,key=substitution_encode(plaintext,alphabet)
print "Key: ", key
print "Plaintext:", plaintext
print "Cipertext:", ciphertext
print "Decoded  :", substitution_decode(ciphertext,key,alphabet)

print
print "CAESAR SHIFT"
ciphertext=caesar_encode(plaintext,5,alphabet)
print "Key: ", 5
print "Plaintext:", plaintext
print "Cipertext:", ciphertext
print "Decoded  :", caesar_decode(ciphertext,5,alphabet)

This will output the following

SUBSTITUTION
Key: &fywQ.%!lmx_sRGu:{<(5jqAXvMFgk]SIY[4iK+3P8pcV$2da@DT/ZnOJ*E>-r},BH16zUb#L?N`e7C ~9t)oW^=;h0
Plaintext: The wheels on the bus go round and round. round and round.round and round. The wheels on the bus go round and round, all through the town!
Cipertext: O!Q)q!QQ_<)GR)(!Q)f5<)%G){G5Rw)&Rw){G5Rw,){G5Rw)&Rw){G5Rw,{G5Rw)&Rw){G5Rw,)O!Q)q!QQ_<)GR)(!Q)f5<)%G){G5Rw)&Rw){G5RwH)&__)(!{G5%!)(!Q)(GqR6
Decoded : The wheels on the bus go round and round. round and round.round and round. The wheels on the bus go round and round, all through the town!

CAESAR SHIFT
Key: 5
Plaintext: The wheels on the bus go round and round. round and round.round and round. The wheels on the bus go round and round, all through the town!
Cipertext: Ymj`2mjjqx`ts`ymj`gzx`lt`wtzsi`fsi`wtzsi@`wtzsi`fsi`wtzsi@wtzsi`fsi`wtzsi@`Ymj`2mjjqx`ts`ymj`gzx`lt`wtzsi`fsi`wtzsi$`fqq`ymwtzlm`ymj`yt2s&
Decoded : The wheels on the bus go round and round. round and round.round and round. The wheels on the bus go round and round, all through the town!

Conclusion
maketrans is awesome :)

Categories: cryptography, python Tags: ,

Python: Cryptography Substitution Cipher improving on the Caesar cipher

December 22nd, 2009 mat 1 comment

This post builds upon the Caesar shift presented previously; converting it to a full substitution cipher. The substitution cipher will practically remove bruteforce style methods of defeating the encryption and provide a basis for more complicated ciphers.

Subsitution Cipher
Because a Caesar shift only rotates the alphabet there are only 25 possible unique solutions, this leaves the cipher quite vulnerable to brute force. If rather than just rotating the alphabet and keeping it ‘linear’ we can shuffle it to create a substitution cipher. This improves the number of possible solutions to a shocking
(26! – 1) = 4.03291461126605635584e26 (unfortunatly, the substitution cipher is alot weaker than it seems as it is vunerable to several different cryptanalysis attacks).

Substitution Cipher Diagram

Substitution Cipher Diagram

We start with the simple Caesar shift function with the following changes:

  • A new array is introduced filled with the numbers 0 – the length of the alphabet. This array is shuffled using shuffle from the random module. The alphabet substitution dictionary is created using this array to decide which letters go where (this is probably clearer in the code than in my explanation).
  • With the Caesar shift we only needed to know the number of rotations in order to decrypt the text, now we need a full list of the letter substitutions. This is stored as the key and will be needed in order to decode our substitution cipher.
  • We also now store the alphabet outside of the function so that it can be used in a decode function. This function was added along with some example usage to make the full process more understandable
from random import shuffle

alphabet="abcdefghijklmnopqrstuvwxyz"

def substitution(alphabet,plaintext):

	# Create array to use to randomise alphabet position
	randarray=range(0,len(alphabet))
	shuffle(randarray)

	key=""

	#Create our substitution dictionary
	dic={}
	for i in range(0,len(alphabet)):
		key+=alphabet[randarray[i]]
		dic[alphabet[i]]=alphabet[randarray[i]]

	#Convert each letter of plaintext to the corrsponding
	#encrypted letter in our dictionary creating the cryptext
	ciphertext=""
	for l in plaintext:
		if l in dic:
			l=dic[l]
		ciphertext+=l
	for i in alphabet:
		print i,
	print
	for i in key:
		print i,
	print
	return ciphertext,key

# This function decodes the ciphertext using the key and creating
# the reverse of the dictionary created in substitution to retrieve
# the plaintext again
def decode(alphabet,ciphertext,key):

	dic={}
	for i in range(0,len(key)):
		dic[key[i]]=alphabet[i]

	plaintext=""
	for l in ciphertext:
		if l in dic:
			l=dic[l]
		plaintext+=l

	return plaintext

# Example useage
plaintext="the cat sat on the mat"
ciphertext,key=substitution(plaintext)
print "Key: ", key
print "Plaintext:", plaintext
print "Cipertext:", ciphertext
print "Decoded  :", decode(ciphertext,key)

Running this will output the following (This will be different on each run due to the use of random to generate the key).

Key: miylbsowutgdkfvjepqhazrncx
Plaintext: the cat sat on the mat
Cipertext: hwb ymh qmh vf hwb kmh
Decoded : the cat sat on the mat

Improvement by using additional characters
Adding additional characters into the substitution will it more difficult to solve. For example if we change our alphabet from:

alphabet=”abcdefghijklmnopqrstuvwxyz”

If we include capital letters, numbers from 0 -9 and special characters:

alphabet=”ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890″+ \
“:.;,?!@#$%&()+=-*/_<> []{}`~^”+ \
“abcdefghijklmnopqrstuvwxyz”

We increase the number of possible solutions from (26!-1) to (63!-1) which is 1.98260831540444006412e87. This also has the added benefit of making the encrypted text alot cooler and harder to guess at by eye (unfortunately still very easy to see what character represents space).

Key: MXjAarLzWqFePI7E botO5f1kym29RZd3Sh8JTiQGKVDp6YBCsU4nucNgwlx0vH
Plaintext: this plaintext will be much more of a challenge to decode compared to the Caesar shift cipher
Cipertext: tzWoHEeMWIta1tHfWeeHXaHPOjzHP7baH7rHMHjzMeeaILaHt7HAaj7AaHj7PEMbaAHt7HtzaHjMaoMbHozWrtHjWEzab
Decoded : this plaintext will be much more of a challenge to decode compared to the caesar shift cipher

Future
The substitution cipher is a lot more secure than Caesar shift cipher but unfortunately is very insecure towards frequency analysis. In future posts I will address using frequency analysis and methods to prevent this type of attack as well as improving on this cipher by creating multiple-dicitionary based ciphers to create Vigenère style ciphers.

I imagine most people reading this will enjoy the simple challenge of solving some encrypted text. I have used this code to make some ciphertext, try and decode it! (extra points for knowing where it is from):

^’VtuBbtv3vut1u-w.G^’Vt&vnu-tZBtZnuIvwvtwn
-qbtuB6GN3vutbqBS-qt.BSt&wB~vtV.tqv1wbG}u5t
~nDDv5tVvG}u5tbBwvtVvtbBt@nvIvZG}u5tbqwv6tv
3vw.t@nvIvtnubBt1tUnwvG}Ztbqv.t&Swuv5tnbtqS
wbt&vI1SZvt^t61ZtZBtq1@@.tUBwt.BSm)B6tbqvZv
t@BnubZtBUt51b1tV1~vt1t&v1SbnUSDtDnuvG}u5t6
v’wvtBSbtBUt&vb1GQv’wvtwvDv1Znu-tButbnVvG

(note: newlines were placed to make it fit and do not represent a character)

Update: Made code a little cleaner by moving dictionary outside of functions.

Categories: cryptography, python Tags: ,

Python: Cryptography Caesar shift encryption (shift cipher)

December 21st, 2009 mat 5 comments

I have always had a keen interest in cryptography and rather than give a brief history of cryptography I will recommend reading Simon Singh’s The code book or for a modern and hands on approach Applied Cryptography by Bruce Schneier (Who also made a brilliant book on security, more of descriptive approach but very interesting Secrets and Lies: Digital Security in a Networked World).

This post aims to detail the creation (in python) of one of the simplest forms of encryption; the simple Caesar shift (or shift cipher). The Caesar shift takes the normal alphabet and maps it to a an identical alphabet with a rotation. The cipher will be written in such a way that it can be easily expanded on to create more complex encryption schemes with little modification.

Caesar shift alphabet

Caesar shift alphabet diagram

The above image shows a diagrammatic representation of a Caesar shift of 3 (alphabet transposed onto a rotation of itself with a displacement of 3). Below shows the entire alphabet in plaintext and in ciphertext, followed by a simple sentence in plaintext and in cipher text.

Plaintext: abcdefghijklmnopqrstuvwxyz
Ciphertext: defghijklmnopqrstuvwxyzabc
Plaintext: the cat sat on the mat
Ciphertext: wkh fdw vdw rq wkh pdw

Below is the code to convert plaintext into ciphertext along with an example of the usage:

def caesar(plaintext,shift):

	alphabet=["a","b","c","d","e","f","g","h","i","j","k","l",
	"m","n","o","p","q","r","s","t","u","v","w","x","y","z"]

	#Create our substitution dictionary
	dic={}
	for i in range(0,len(alphabet)):
		dic[alphabet[i]]=alphabet[(i+shift)%len(alphabet)]

	#Convert each letter of plaintext to the corrsponding
	#encrypted letter in our dictionary creating the cryptext
	ciphertext=""
	for l in plaintext.lower():
		if l in dic:
			l=dic[l]
		ciphertext+=l

	return ciphertext

#Example useage
plaintext="the cat sat on the mat"
print "Plaintext:", plaintext
print "Cipertext:",caesar(plaintext,3)
#This will result in:
#Plaintext: the cat sat on the mat
#Cipertext: wkh fdw vdw rq wkh pdw

Explanation
Here we have written a function that is given the plaintext and an arbitary rotation (including negative) and returns a ciphertext. The function creates a dictionary, mapping each letter to another letter that is ’shift’ letters away from it in the alphabet (the modulus (%) is used so to wrap the alphabet back to the start if it falls off the end). It then converts each letter of the plain text into the corresponding encrypted letter using the dictionary.

Future
In the next post I will discuss methods to improve the Caesar shift and how to turn it into a full substitution cipher (where the alphabet is shuffled rather than rotated). The Caesar shift is very susceptible to brute forcing and frequency analysis which I shall explain and create a program to defeat these encryptions in a future post.

// unused langs // // // //