Archive

Posts Tagged ‘python’

Saving settings in python with YAML (an XML alternative)

February 25th, 2010 mat 2 comments

When transferring information or settings between computers and programs is is useful to use something simple and program independent. For a recent project I need to send a settings file between a PHP script and a python program, I needed something that would support trees so an INI file was out of the question, and I naturally thought of using XML, however when looking for more information I stumbled upon YAML (seemingly forgotten alternative to XML).

YAML is a recursive acronym that stands for YAML Ain’t Markup Language and below is a simplified example taken from the YAML site. The layout is similar to python with indents representing the nested layers. Semicolons are used to seperate variable names and value pairs.

name   : Joe Bloggs
product:
    - sku         : BL394D
      quantity    : 4
      description : Basketball
      price       : 450.00
    - sku         : BL4438H
      quantity    : 1
      description : Super Hoop
      price       : 2392.00
tax  : 251.42
total: 4443.5

To use YAML in python you will need the PyYAML library avaliable here or in via your package manager in linux. (I.E. “sudo apt-get install http://pyyaml.org/wiki/PyYAML” in ubuntu/debian).

The code for then using YAML is very simple, first we import the yaml library, then we open our settings file we have created and then use yaml’s load feature to load our settings into a python dictionary, and then finally printing it out so we can see what it contains

import yaml
f=open('settings.cfg')
settings=yaml.load(f)
print settings

Using the example YAML file above saved to settings.cfg this python script will output the following when run:

{'product': [{'sku': 'BL394D', 'price': 450.0, 'description':
'Basketball', 'quantity': 4}, {'sku': 'BL4438H', 'price': 2392.0,
'description': 'Super Hoop', 'quantity': 1}], 'total': 4443.5, 'tax':
251.41999999999999, 'name': 'Joe Bloggs'}

We can now access all the information in the file very easily (I.E. settings['name'] will give us the name Joe Bloggs).

YAML supports much much more than what is discussed here and further information can be found on the YAML website.

Categories: python Tags: , ,

Advancing PyMan (using python’s pygame to recreate pacman)

February 18th, 2010 mat No comments

A while back I followed a few tutorials on creating a pacman game in python using pygame. I took the resulting code from these tutorials and added several enhancements. Unfortunately I haven’t had enough time to continue with the code, so hopefully this code is of use to someone learning python to play with and further enhance (Who knows might end up feature complete!).

Screen shot of PyMan in action

Screen shot of PyMan in action

Enhancements over original code

  • start menu
  • sounds
  • multiple ghosts
  • not having to hold down a directional button constantly
  • can press a direction before a turn and will turn at next opertunity (like real pacman)
  • probably more stuff that I’ve forgotten about by now

Source
Source Files – The source and files, to run the game just run “python PyMan.py”

References
I believe the site where I got the original code from was learningpython.com (not 100% sure) and the following links to the tutorials:
Tutorial 1
Tutorial 2
Tutorial 3

Categories: python Tags: ,

Python: interfacing with an arduino

February 3rd, 2010 mat 2 comments

So what is an arduino?
An arduino is an open source open hardware programmable controller with several inputs and outputs. The image below shows an Ardunio Dicemella.

Ardunio Dicemella Annotated Photo

Ardunio Dicemella Annotated Photo

It (Arduino Dicemella) has 14 digital input/output pins (of which 6 can be used as PWM outputs), 6 analog inputs, a 16 MHz crystal oscillator, a USB connection, a power jack, an ICSP header, and a reset button. It contains everything needed to support the microcontroller; simply connect it to a computer with a USB cable or power it with a AC-to-DC adapter or battery to get started.

They are very useful for people who know how to program but have little experience with hardware interaction.

Programming the arduino
This post will not contain in-depth detail on how to program the arduino, instead focussing briefly on setting up serial (over serial or usb cable) communications in order to talk to a python script. The arduino can be programmed via a IDE provided by the creators in a C-style hardware language.

Code example

int ledPin = 13;            // choose the pin for the LED
int inputPin = 2;          // choose the input pin (for a pushbutton)
int val = 0;                // variable for reading the pin status

void setup() {
  pinMode(ledPin, OUTPUT);      // declare LED as output
  pinMode(inputPin, INPUT);     // declare pushbutton as input
}

void loop(){
  val = digitalRead(inputPin);  // read input value
  if (val == HIGH) {            // check if the input is HIGH
    digitalWrite(ledPin, HIGH);  // turn LED ON
  } else {
    digitalWrite(ledPin, LOW); // turn LED OFF
  }
}
Arduino LED switch circuit off

Arduino LED switch circuit off

Arduino LED switch circuit on

Arduino LED switch circuit on

Now we add a few lines to enable the writing of information from our arduino over the serial connection. We first need to set up the transfer speed in our setup (Serial.begin(9600);). Then we can simply send messages over serial using Serial.print(“message\n”);. You can choose between print and println with the difference been that the latter automatically appends the newline char, so we would use the former to write multiple things to the same line. Below is our modified code:

Serial write example

int ledPin = 13;           // choose the pin for the LED
int inputPin = 2;         // choose the input pin (for a pushbutton)
int val = 0;               // variable for reading the pin status

void setup() {
  pinMode(ledPin, OUTPUT);      // declare LED as output
  pinMode(inputPin, INPUT);     // declare pushbutton as input
  Serial.begin(9600);
  Serial.print("Program Initiated\n");
}

void loop(){
  val = digitalRead(inputPin);  // read input value
  if (val == HIGH) {            // check if the input is HIGH
    digitalWrite(ledPin, HIGH);  // turn LED ON
    Serial.print("LED Activated\n");
  } else {
    digitalWrite(ledPin, LOW); // turn LED OFF
  }
}

We now add into this code the ability to receive information via serial. Below is the modified example which removes the action of the button and replaces it by activating the LED when ‘Y’ is sent via serial.

Serial read example

int ledPin = 13;  // choose the pin for the LED
int val = 0;      // variable for reading the pin status
char msg = '  ';   // variable to hold data from serial

void setup() {
  pinMode(ledPin, OUTPUT);      // declare LED as output
  Serial.begin(9600);
  Serial.print("Program Initiated\n");
}

void loop(){
        // While data is sent over serial assign it to the msg
	while (Serial.available()>0){
		msg=Serial.read();
	}

  // Turn LED on/off if we recieve 'Y'/'N' over serial
  if (msg=='Y') {
    digitalWrite(ledPin, HIGH);  // turn LED ON
    Serial.print("LED Activated\n");
    msg=' ';
  } else if (msg=='N') {
    digitalWrite(ledPin, LOW); // turn LED OFF
  }
}

Interaction with python

First we import the serial library to python in order to communicate with the arduino (this includes talking over usb).

import serial

We then attempt to connect to our arduino on /dev/ttyUSB0, using try and except to catch an exception if we are unable to find the arduino on USB0. The 9600 corresponds to the baud rate (speed of communication) that we are using with the arduino and should be the same as set in the program on the arduino otherwise your communication may appear garbled.

try:
	arduino = serial.Serial('/dev/ttyUSB0', 9600)
except:
	print "Failed to connect on /dev/ttyUSB0"

The address will be /dev/ttyUSB# where # is replaced by a number for arduinos connected via usb and /dev/ttyS# where # is replaced by a number for arduinos connected via serial. If you are not sure of the location of your arduino, it can be found in the arduino IDE or you can write some python to scroll through possible locations until a response is found

locations=['/dev/ttyUSB0','/dev/ttyUSB1','/dev/ttyUSB2','/dev/ttyUSB3',
'/dev/ttyS0','/dev/ttyS1','/dev/ttyS2','/dev/ttyS3']

for device in locations:
	try:
		arduino = serial.Serial(device, 9600)
	except:
		print "Failed to connect on",device

You may need to be careful as other devices can be connected. For example if I try to connect to /dev/ttyS0 I will connect to the wacom tablet on my laptop.

Once you have connected to your arduino successfully you can write information to it using write and read information sent from it using read (you will need to import time to use the sleep function). If your arduino does not send any messages via serial then attempting to readline will result in your program hanging until it receives a message.

try:
	arduino.write('Y')
	time.sleep(1)
	print arduino.readline()
except:
	print "Failed to send!"

So the python code should now look like the following and we should be able to control the LED over serial.

import serial
import time

locations=['/dev/ttyUSB0','/dev/ttyUSB1','/dev/ttyUSB2','/dev/ttyUSB3',
'/dev/ttyS0','/dev/ttyS1','/dev/ttyS2','/dev/ttyS3']  

for device in locations:
	try:
		print "Trying...",device
		arduino = serial.Serial(device, 9600)
		break
	except:
		print "Failed to connect on",device   

try:
    arduino.write('Y')
    time.sleep(1)
    print arduino.readline()
except:
    print "Failed to send!"

The above will send the character ‘Y’ (Y for Yes please turn on the LED) to the arduino wait for 1 second and then read from the arduino which will have hopefully posted a response to our ‘Y’. Using the program on this should turn the LED on, and report LED Activated back via serial to our python program. This should be enough for people to get started with ardunios and communicating with them in python.

References

  • Arduino – The arduino website with everything you are likely to need (programming examples and reference guide, and hardware information)
  • Arduino tutorial – a basic and easy to understand tutorial on programming the arduino
  • Python port of arduino-serial.c – By John Wiseman from which I based my program.
  • original arduino-serial.c – by Tod E. Kurt.
  • Sparkfun – Here is a good place to purchase ardunio and other electronics parts. Try coolcomponents if your from the uk like me
  • Dealextreme – Hong Kong based retailer that sells a lot of cheap DIY electronics and also has worldwide free delivery with no min spend (crazy). Does take about two weeks to arrive though (uk).
Categories: arduino, python Tags: , ,

Python: Cryptography decoding a Caesar shift (frequency analysis)

January 4th, 2010 mat No comments

Due to the simple nature of the Caesar cipher, it could easily be brute forced by trying all possible 25 keys and then looking by eye to see if the plaintext was revealed (this too can be automated by checking for common English words to see if the solution was probable). However the much more elegant method of frequency analysis can be used.

Below is a table of the frequency of letters in the English language:

Letter Frequency (percent) Frequency (decimal) Normalised Frequency
a 8.17% 0.08167 0.64297
b 1.49% 0.01492 0.11746
c 2.78% 0.02782 0.21902
d 4.25% 0.04253 0.33483
e 12.70% 0.12702 1.00000
f 2.23% 0.02228 0.17541
g 2.02% 0.02015 0.15864
h 6.09% 0.06094 0.47977
i 6.97% 0.06966 0.54842
j 0.15% 0.00153 0.01205
k 0.77% 0.00772 0.06078
l 4.03% 0.04025 0.31688
m 2.41% 0.02406 0.18942
n 6.75% 0.06749 0.53133
o 7.51% 0.07507 0.59101
p 1.93% 0.01929 0.15187
q 0.10% 0.00095 0.00748
r 5.99% 0.05987 0.47134
s 6.33% 0.06327 0.49811
t 9.06% 0.09056 0.71296
u 2.76% 0.02758 0.21713
v 0.98% 0.00978 0.07700
w 2.36% 0.02360 0.18580
x 0.15% 0.00150 0.01181
y 1.97% 0.01974 0.15541
z 0.07% 0.00074 0.00583

And shown graphically:

English Letter Frequency

English Letter Frequency

Using the following code we can use frequency analysis to find the solution to ciphertext created using the Caesar shift demonstrated previously (see Caesar shift and Caesar shift using makestrans).

from Numeric import *
from string import maketrans

def translator(text,alphabet,key):
	trantab = maketrans(alphabet,key)
	return text.translate(trantab)

def caesar_decode(ciphertext,s):
	alpha="abcdefghijklmnopqrstuvwxyz"
	return translator(ciphertext,alpha,alpha[-s:]+alpha[:-s])

class frequency_analysis:
	def __init__(self, ciphertext):
		self.cor=[0.64297,0.11746,0.21902,0.33483,1.00000,0.17541,
		0.15864,0.47977,0.54842,0.01205,0.06078,0.31688,0.18942,
		0.53133,0.59101,0.15187,0.00748,0.47134,0.49811,0.71296,
		0.21713,0.07700,0.18580,0.01181,0.15541,0.00583]
		self.ciphertext=ciphertext.lower()
		self.freq()
		self.min_error()
		self.key=self.minimum[0]
		self.solution=caesar_decode(self.ciphertext,self.minimum[0])

	def freq(self):
		self.arr=zeros(26,Float64)
		for l in self.ciphertext:
			x=ord(l)
			if (x>=97 and x<=122):
				self.arr[x-97]+=1.0
		self.arr/=max(self.arr)

	def error(self):
		e=0
		for i in range(0,len(self.arr)):
			e+=abs(self.arr[i]-self.cor[i])**2
		return e

	def min_error(self):
		self.minimum=[0,10000]
		for rot in range(0,25):
			e=self.error()
			print rot,e
			if e<self.minimum[1]:
				self.minimum[1]=e
				self.minimum[0]=rot
			x=self.arr[-1]
			del self.cor[-1]
			self.cor.insert(0,x)

ciphertext="ymjwj fwj ybt ydujx tk jshwduynts: tsj ymfy bnqq "+\
"uwjajsy dtzw xnxyjw kwtr wjfinsl dtzw infwd fsi tsj ymfy bnqq "+\
"uwjajsy dtzw ltajwsrjsy. ymnx nx f ajwd nrutwyfsy qjxxts yt "+\
"wjrjgjwjxujhnfqqd ktw fyyfhpx zxnsl kwjvzjshd fsfqdxnx bmnhm "+\
"wjvznwj qtsljw ufxxflj tk yjcy ns twijw yt fhmnjaj gjyyjw wjxzqyx."
FA=frequency_analysis(ciphertext)
print FA.solution

This code will calculate the error in statistical frequency for each letter squared to generate an error for each possible rotation. Using a sufficiently long piece of ciphertext this code should accurately reveal the Caesar rotation use. The table below shows the error for each rotation:


Rotation Error
0 4.11797847386
1 3.05305477067
2 3.70059678828
3 3.66330931218
4 3.5078619579
5 0.361318100755
6 3.17289666386
7 3.66072641654
8 3.39769855873
9 1.74854802027
10 2.92550921273
11 2.67524757297
12 2.86847189573
13 3.06980318397
14 2.56886153328
15 2.17180117031
16 2.24503724763
17 2.95579718798
18 1.74002183444
19 1.83328601011
20 1.74779021766
21 2.71332097813
22 1.5409364067
23 1.83209213494
24 1.54904808883

The lowest error is for 5 rotations (correctly so) with an error of 0.361318100755, the next lowest error is 22 rotations with an error of 1.5409364067. This is ~4.3x difference, which gives a very large degree of confidence to our solution and below is the deciphered text.

there are two types of encryption: one that will prevent your sister from reading your diary and one that will prevent your government. this is a very important lesson to remeberespecially for attacks using frequency analysis which require longer passage of text in order to achieve better results.

Future
The frequency analysis presented here can be used along with some other techniques in order to crack the viginere cipher.

Categories: cryptography, python Tags: ,

Python: Cryptograph using maketrans for substitution and Caesar ciphers

December 22nd, 2009 mat 1 comment

I’ve rewritten the functions in my previous two posts (caesar and substitution ciphers) using the maketrans function from the strings module in python (With thanks to wumzi for pointing this out).

maketrans takes an input alphabet and an output alphabet and can then be used on a string. This can be used to greatly simplify the two ciphers I produced previously. Example below:

input_alphabet="abcde"
output_alphabet="12345"
trantab = maketrans(input_alphabet,output_alphabet)
text="translate abcdefg"
print text.translate(trantab)
# This will output:
# tr1nsl1t5 12345fg

This nows means my code can be rewritten in just a fraction of what it was before:

from random import shuffle
from string import maketrans

alphabet="abcdefghijklmnopqrstuvwxyz" + \
	 "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ"+ \
         ":.;,?!@#$%&()+=-*/_<> []{}`~^"

def translator(text,alphabet,key):
	trantab = maketrans(alphabet,key)
	return text.translate(trantab)

def caesar_encode(plaintext,s,alphabet):
	return translator(plaintext,alphabet,alphabet[s:]+alphabet[:s])

def caesar_decode(ciphertext,s,alphabet):
	return translator(ciphertext,alphabet,alphabet[-s:]+alphabet[:-s])

def substitution_encode(plaintext,alphabet):
	randarray=range(0,len(alphabet))
	shuffle(randarray)
	key=""
	for i in range(0,len(alphabet)):
		key+=alphabet[randarray[i]]

	return translator(plaintext,alphabet,key),key

def substitution_decode(ciphertext,key,alphabet):
	return translator(ciphertext,key,alphabet)

# Example useage
plaintext="The wheels on the bus go round and round. round" + \
" and round. round and round. The wheels on the bus go round and"+ \
" round, all through the town!"

print
print "SUBSTITUTION"
ciphertext,key=substitution_encode(plaintext,alphabet)
print "Key: ", key
print "Plaintext:", plaintext
print "Cipertext:", ciphertext
print "Decoded  :", substitution_decode(ciphertext,key,alphabet)

print
print "CAESAR SHIFT"
ciphertext=caesar_encode(plaintext,5,alphabet)
print "Key: ", 5
print "Plaintext:", plaintext
print "Cipertext:", ciphertext
print "Decoded  :", caesar_decode(ciphertext,5,alphabet)

This will output the following

SUBSTITUTION
Key: &fywQ.%!lmx_sRGu:{<(5jqAXvMFgk]SIY[4iK+3P8pcV$2da@DT/ZnOJ*E>-r},BH16zUb#L?N`e7C ~9t)oW^=;h0
Plaintext: The wheels on the bus go round and round. round and round.round and round. The wheels on the bus go round and round, all through the town!
Cipertext: O!Q)q!QQ_<)GR)(!Q)f5<)%G){G5Rw)&Rw){G5Rw,){G5Rw)&Rw){G5Rw,{G5Rw)&Rw){G5Rw,)O!Q)q!QQ_<)GR)(!Q)f5<)%G){G5Rw)&Rw){G5RwH)&__)(!{G5%!)(!Q)(GqR6
Decoded : The wheels on the bus go round and round. round and round.round and round. The wheels on the bus go round and round, all through the town!

CAESAR SHIFT
Key: 5
Plaintext: The wheels on the bus go round and round. round and round.round and round. The wheels on the bus go round and round, all through the town!
Cipertext: Ymj`2mjjqx`ts`ymj`gzx`lt`wtzsi`fsi`wtzsi@`wtzsi`fsi`wtzsi@wtzsi`fsi`wtzsi@`Ymj`2mjjqx`ts`ymj`gzx`lt`wtzsi`fsi`wtzsi$`fqq`ymwtzlm`ymj`yt2s&
Decoded : The wheels on the bus go round and round. round and round.round and round. The wheels on the bus go round and round, all through the town!

Conclusion
maketrans is awesome :)

Categories: cryptography, python Tags: ,

Python: Cryptography Substitution Cipher improving on the Caesar cipher

December 22nd, 2009 mat 1 comment

This post builds upon the Caesar shift presented previously; converting it to a full substitution cipher. The substitution cipher will practically remove bruteforce style methods of defeating the encryption and provide a basis for more complicated ciphers.

Subsitution Cipher
Because a Caesar shift only rotates the alphabet there are only 25 possible unique solutions, this leaves the cipher quite vulnerable to brute force. If rather than just rotating the alphabet and keeping it ‘linear’ we can shuffle it to create a substitution cipher. This improves the number of possible solutions to a shocking
(26! – 1) = 4.03291461126605635584e26 (unfortunatly, the substitution cipher is alot weaker than it seems as it is vunerable to several different cryptanalysis attacks).

Substitution Cipher Diagram

Substitution Cipher Diagram

We start with the simple Caesar shift function with the following changes:

  • A new array is introduced filled with the numbers 0 – the length of the alphabet. This array is shuffled using shuffle from the random module. The alphabet substitution dictionary is created using this array to decide which letters go where (this is probably clearer in the code than in my explanation).
  • With the Caesar shift we only needed to know the number of rotations in order to decrypt the text, now we need a full list of the letter substitutions. This is stored as the key and will be needed in order to decode our substitution cipher.
  • We also now store the alphabet outside of the function so that it can be used in a decode function. This function was added along with some example usage to make the full process more understandable
from random import shuffle

alphabet="abcdefghijklmnopqrstuvwxyz"

def substitution(alphabet,plaintext):

	# Create array to use to randomise alphabet position
	randarray=range(0,len(alphabet))
	shuffle(randarray)

	key=""

	#Create our substitution dictionary
	dic={}
	for i in range(0,len(alphabet)):
		key+=alphabet[randarray[i]]
		dic[alphabet[i]]=alphabet[randarray[i]]

	#Convert each letter of plaintext to the corrsponding
	#encrypted letter in our dictionary creating the cryptext
	ciphertext=""
	for l in plaintext:
		if l in dic:
			l=dic[l]
		ciphertext+=l
	for i in alphabet:
		print i,
	print
	for i in key:
		print i,
	print
	return ciphertext,key

# This function decodes the ciphertext using the key and creating
# the reverse of the dictionary created in substitution to retrieve
# the plaintext again
def decode(alphabet,ciphertext,key):

	dic={}
	for i in range(0,len(key)):
		dic[key[i]]=alphabet[i]

	plaintext=""
	for l in ciphertext:
		if l in dic:
			l=dic[l]
		plaintext+=l

	return plaintext

# Example useage
plaintext="the cat sat on the mat"
ciphertext,key=substitution(plaintext)
print "Key: ", key
print "Plaintext:", plaintext
print "Cipertext:", ciphertext
print "Decoded  :", decode(ciphertext,key)

Running this will output the following (This will be different on each run due to the use of random to generate the key).

Key: miylbsowutgdkfvjepqhazrncx
Plaintext: the cat sat on the mat
Cipertext: hwb ymh qmh vf hwb kmh
Decoded : the cat sat on the mat

Improvement by using additional characters
Adding additional characters into the substitution will it more difficult to solve. For example if we change our alphabet from:

alphabet=”abcdefghijklmnopqrstuvwxyz”

If we include capital letters, numbers from 0 -9 and special characters:

alphabet=”ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890″+ \
“:.;,?!@#$%&()+=-*/_<> []{}`~^”+ \
“abcdefghijklmnopqrstuvwxyz”

We increase the number of possible solutions from (26!-1) to (63!-1) which is 1.98260831540444006412e87. This also has the added benefit of making the encrypted text alot cooler and harder to guess at by eye (unfortunately still very easy to see what character represents space).

Key: MXjAarLzWqFePI7E botO5f1kym29RZd3Sh8JTiQGKVDp6YBCsU4nucNgwlx0vH
Plaintext: this plaintext will be much more of a challenge to decode compared to the Caesar shift cipher
Cipertext: tzWoHEeMWIta1tHfWeeHXaHPOjzHP7baH7rHMHjzMeeaILaHt7HAaj7AaHj7PEMbaAHt7HtzaHjMaoMbHozWrtHjWEzab
Decoded : this plaintext will be much more of a challenge to decode compared to the caesar shift cipher

Future
The substitution cipher is a lot more secure than Caesar shift cipher but unfortunately is very insecure towards frequency analysis. In future posts I will address using frequency analysis and methods to prevent this type of attack as well as improving on this cipher by creating multiple-dicitionary based ciphers to create Vigenère style ciphers.

I imagine most people reading this will enjoy the simple challenge of solving some encrypted text. I have used this code to make some ciphertext, try and decode it! (extra points for knowing where it is from):

^’VtuBbtv3vut1u-w.G^’Vt&vnu-tZBtZnuIvwvtwn
-qbtuB6GN3vutbqBS-qt.BSt&wB~vtV.tqv1wbG}u5t
~nDDv5tVvG}u5tbBwvtVvtbBt@nvIvZG}u5tbqwv6tv
3vw.t@nvIvtnubBt1tUnwvG}Ztbqv.t&Swuv5tnbtqS
wbt&vI1SZvt^t61ZtZBtq1@@.tUBwt.BSm)B6tbqvZv
t@BnubZtBUt51b1tV1~vt1t&v1SbnUSDtDnuvG}u5t6
v’wvtBSbtBUt&vb1GQv’wvtwvDv1Znu-tButbnVvG

(note: newlines were placed to make it fit and do not represent a character)

Update: Made code a little cleaner by moving dictionary outside of functions.

Categories: cryptography, python Tags: ,

Python: Cryptography Caesar shift encryption (shift cipher)

December 21st, 2009 mat 5 comments

I have always had a keen interest in cryptography and rather than give a brief history of cryptography I will recommend reading Simon Singh’s The code book or for a modern and hands on approach Applied Cryptography by Bruce Schneier (Who also made a brilliant book on security, more of descriptive approach but very interesting Secrets and Lies: Digital Security in a Networked World).

This post aims to detail the creation (in python) of one of the simplest forms of encryption; the simple Caesar shift (or shift cipher). The Caesar shift takes the normal alphabet and maps it to a an identical alphabet with a rotation. The cipher will be written in such a way that it can be easily expanded on to create more complex encryption schemes with little modification.

Caesar shift alphabet

Caesar shift alphabet diagram

The above image shows a diagrammatic representation of a Caesar shift of 3 (alphabet transposed onto a rotation of itself with a displacement of 3). Below shows the entire alphabet in plaintext and in ciphertext, followed by a simple sentence in plaintext and in cipher text.

Plaintext: abcdefghijklmnopqrstuvwxyz
Ciphertext: defghijklmnopqrstuvwxyzabc
Plaintext: the cat sat on the mat
Ciphertext: wkh fdw vdw rq wkh pdw

Below is the code to convert plaintext into ciphertext along with an example of the usage:

def caesar(plaintext,shift):

	alphabet=["a","b","c","d","e","f","g","h","i","j","k","l",
	"m","n","o","p","q","r","s","t","u","v","w","x","y","z"]

	#Create our substitution dictionary
	dic={}
	for i in range(0,len(alphabet)):
		dic[alphabet[i]]=alphabet[(i+shift)%len(alphabet)]

	#Convert each letter of plaintext to the corrsponding
	#encrypted letter in our dictionary creating the cryptext
	ciphertext=""
	for l in plaintext.lower():
		if l in dic:
			l=dic[l]
		ciphertext+=l

	return ciphertext

#Example useage
plaintext="the cat sat on the mat"
print "Plaintext:", plaintext
print "Cipertext:",caesar(plaintext,3)
#This will result in:
#Plaintext: the cat sat on the mat
#Cipertext: wkh fdw vdw rq wkh pdw

Explanation
Here we have written a function that is given the plaintext and an arbitary rotation (including negative) and returns a ciphertext. The function creates a dictionary, mapping each letter to another letter that is ’shift’ letters away from it in the alphabet (the modulus (%) is used so to wrap the alphabet back to the start if it falls off the end). It then converts each letter of the plain text into the corresponding encrypted letter using the dictionary.

Future
In the next post I will discuss methods to improve the Caesar shift and how to turn it into a full substitution cipher (where the alphabet is shuffled rather than rotated). The Caesar shift is very susceptible to brute forcing and frequency analysis which I shall explain and create a program to defeat these encryptions in a future post.

Python: Wordwheel / WordCube solver

December 13th, 2009 mat No comments

Often in newspapers there is a wordwheel or some variant, whereby you have to find as many words greater than 3 letters long, containing the centre word and using the letters no more than once. I have created a webpage that generates a “WordCube” daily for people to peruse at their leisure (www.stealthcopter.com/wordcube). This post contains the code and explanation of the solutions to wordcube’s (and all other word<insert shape here>).

WordCube from http://www.stealthcopter.com/wordcube for 12/12/2009
Example WordCube image for the 12th December 2009 from www.stealthcopter.com/wordcube/2009/12/12

Below is a function I wrote to check if an input was a valid anagram (or partial anagram, as it isn’t essential to use every letter). The function works by cycling over each letter of word we are testing (word), and checks if the letter is valid (checked against chkword). If the letter is valid then it removes the letter from the original word and moves to the next letter until we run out of letters (returns True) or if the letter is invalid (returns False).

def anagramchk(word,chkword):
	for letter in word:
		if letter in chkword:
			chkword=chkword.replace(letter, '', 1)
		else:
			return False
	return True

f=open('english', 'r')
word=raw_input('Input letters (starting with mandatory letter) :')
minlen=4
count=0
for l in f:
	l=l.strip()
	if len(l)<=len(word) and len(l)>=minlen and word[0] in l and anagramchk(l,word):
		if len(l)==len(word):
			print l,'  <-- Long word'
		else:
			print l
		count+=1
f.close()
print count

This will output a list of the possible words, along with a total. The results can be seen for the WordCube in the example above here (To prevent spoiling it if you’d like to have a go at it yourself).

As always I’d be interested to see if anyone knows any faster methods or any other general improvements or comments.

The dictionary file can be found here (not perfect):
here

Categories: python Tags: , , ,

Python: Highest common factor of two numbers

December 8th, 2009 mat 6 comments

Finding the highest common factor is useful in mathematics and cryptography.

It would be computationally expensive to calculate all the common factors of two numbers and then compare to find the highest factor in common. Instead (as usual) there is a mathematical theory that can be used to speed up this process, in this case it’s the Euclidean algorithm.

The Euclidean algorithm subtracts the smaller number from the larger number, then using this new number and the smaller number repeats this process until the numbers are equal and hence the subtraction goes to zero. This is the highest common factor.

Example:
Find the highest common factor of 252 and 105

252-105=147
147-105=42
105-42=63
63-42=21
42-21=21
21-21=0

Below is the python code to preform this

def hcf(no1,no2):
	while no1!=no2:
		if no1>no2:
			no1-=no2
		elif no2>no1:
			no2-=no1
	return no1

This method is quite quick even and scales well for larger numbers. If you know of a faster method or can see any improvements to make then please let me know.

Python: Anagram solver

November 18th, 2009 mat No comments

Finding the solutions to an anagram can be enjoyable and used as a sign of mental agility, with examples from long running television series (countdown) to online gambling skill games. It also makes an interesting tutorial for learning some simple python skills.

We start off by writing a function to check if a given word is an anagram of another word. This will be used to check a list of dictionary words against a given word in order to find its anagrams. The function accepts two string inputs, the word we are testing (word), and the given anagram to check it against (chkword).

The function loops over every letter in the word and if it is not present in chkword then we know that it cannot be an anagram of the chkword. Otherwise it is valid and that letter is removed from the chkword so that each letter may only be used once. For example checking “netting” with “testing” would fail on the second n.

def anagramchk(word,chkword):
	for letter in word:
		if letter in chkword:
			chkword=chkword.replace(letter, '', 1)
		else:
			return 0
	return 1

Now we need to get the anagram from the user. The following using raw_input to read what the user types in the console and assign it to a variable, with the optional argument used to print text so the user knows they need to do something.

wordin=raw_input('Enter anagram:\n')

We can now join both these together and loop over our dictionary file. We strip each line to remove any extra space and gumph at the end of the lines. If the length of a line is less than 4 then it is ignored,this can be changed to whatever minimum you like, so if you want it to be the same length as the input you could replace it with len(line)==len(wordin). We then use the function already written earlier to check if the word is a valid anagram and print the word if it is. Then we finish by closing the file.

f=open('english.txt', 'r')
for line in f:
	line=line.strip()
	if len(line)>=4:
		if anagramchk(line,wordin):
			print line
f.close()

Hopefully this has been helpful to someone. As ever if you have any suggestions or improvements then please leave a comment. Below is the complete version of the code

def anagramchk(word,chkword):
	for letter in word:
		if letter in chkword:
			chkword=chkword.replace(letter, '', 1)
		else:
			return 0
	return 1

wordin=raw_input('Enter anagram:')

f=open('english.txt', 'r')
for line in f:
	line=line.strip()
	if len(line)>=4:
		if anagramchk(line,wordin):
			print line
f.close()

The dictionary file I used was one I found in /usr/share/dict/british-english (most linux distro’s should have this or similar) it can also be found here

Categories: python Tags: , ,
// unused langs // // // //