Skip to content

Posts tagged ‘strings’

Find words with the most anagrams efficiently using python

Sep 2 10
by mat

Following my previous post about 9 letter anagrams I am posting the final code I have created taking into account suggestions/snippets from Michael, Toby and Martin. Added two variables to make it nice and easy to modify what to look for.

Code

# -*- coding: utf-8 -*-
from time import time
from collections import defaultdict  

ag_len = 10 # Anagram word length
ag_min = 2  # Min # of anagrams
dictionary_path = '/usr/share/dict/british-english'
tic = time()

wd = defaultdict(set)  
for l in open (dictionary_path, 'r'):
	l=l.strip()
	if ag_len==len(l):
		wd["".join(sorted(l))].add (l)
	
for ws, wl in wd.iteritems():
	if len ( wl ) >= ag_min:
		print " ".join ( wl )


toc = time()
print toc-tic,'s' 

Explanation
The dictionary file is filtered by length into a dictionary. The key for the dictionary is the letter of the word sorted in order, IE:

"".join(sorted('arranging')) = 'aagginnrr'

With the value as the unsorted word. Because words that are an anagram of each other will be identical when sorted this means that using the add method with a dictionary will cause any anagram to share the same key. Eg:

When the dictionary gets to megatons it will create a new key in the dicitonary like so:
{'aegmnost': set(['megatons'])}

Then to magnetos
{'aegmnost': set(['magnetos', 'megatons'])}

Then to montages:
{'aegmnost': set(['magnetos', 'megatons', 'montages'])}

Then we loop over all the items in the dictionary we created and see if the length of the values is greater than the minimum value we are looking for.

All done, a very elegant and simple method to find words with several anagrams for a given word length.

Results

I was going to post the interesting 10 letter anagrams I found however I couldn’t find any with more than 2 anagrams with the dictionary I was using.

There is a 11 letter tripple anagram:

anthologies anthologise theologians

and some 8 letter with 4 or more anagrams:

painters pertains pantries repaints
resident nerdiest inserted trendies
salesmen lameness nameless maleness
strainer restrain terrains retrains trainers
altering triangle relating integral alerting
rangiest ingrates angriest gantries
parroted predator teardrop prorated
iterates teariest treatise treaties
trounces counters recounts construe

Python: Wordwheel / WordCube solver

Dec 13 09
by mat

Often in newspapers there is a wordwheel or some variant, whereby you have to find as many words greater than 3 letters long, containing the centre word and using the letters no more than once. I have created a webpage that generates a “WordCube” daily for people to peruse at their leisure (www.stealthcopter.com/wordcube). This post contains the code and explanation of the solutions to wordcube’s (and all other word<insert shape here>).

WordCube from http://www.stealthcopter.com/wordcube for 12/12/2009
Example WordCube image for the 12th December 2009 from www.stealthcopter.com/wordcube/2009/12/12

Below is a function I wrote to check if an input was a valid anagram (or partial anagram, as it isn’t essential to use every letter). The function works by cycling over each letter of word we are testing (word), and checks if the letter is valid (checked against chkword). If the letter is valid then it removes the letter from the original word and moves to the next letter until we run out of letters (returns True) or if the letter is invalid (returns False).

def anagramchk(word,chkword):
	for letter in word:
		if letter in chkword:
			chkword=chkword.replace(letter, '', 1)
		else:
			return False
	return True

f=open('english', 'r')
word=raw_input('Input letters (starting with mandatory letter) :')
minlen=4
count=0
for l in f:
	l=l.strip()
	if len(l)<=len(word) and len(l)>=minlen and word[0] in l and anagramchk(l,word):
		if len(l)==len(word):
			print l,'  <-- Long word'
		else:
			print l
		count+=1
f.close()
print count

This will output a list of the possible words, along with a total. The results can be seen for the WordCube in the example above here (To prevent spoiling it if you’d like to have a go at it yourself).

As always I’d be interested to see if anyone knows any faster methods or any other general improvements or comments.

The dictionary file can be found here (not perfect):
here

Python: Anagram solver

Nov 18 09
by mat

Finding the solutions to an anagram can be enjoyable and used as a sign of mental agility, with examples from long running television series (countdown) to online gambling skill games. It also makes an interesting tutorial for learning some simple python skills.

We start off by writing a function to check if a given word is an anagram of another word. This will be used to check a list of dictionary words against a given word in order to find its anagrams. The function accepts two string inputs, the word we are testing (word), and the given anagram to check it against (chkword).

The function loops over every letter in the word and if it is not present in chkword then we know that it cannot be an anagram of the chkword. Otherwise it is valid and that letter is removed from the chkword so that each letter may only be used once. For example checking “netting” with “testing” would fail on the second n.

def anagramchk(word,chkword):
	for letter in word:
		if letter in chkword:
			chkword=chkword.replace(letter, '', 1)
		else:
			return 0
	return 1

Now we need to get the anagram from the user. The following using raw_input to read what the user types in the console and assign it to a variable, with the optional argument used to print text so the user knows they need to do something.

wordin=raw_input('Enter anagram:\n')

We can now join both these together and loop over our dictionary file. We strip each line to remove any extra space and gumph at the end of the lines. If the length of a line is less than 4 then it is ignored,this can be changed to whatever minimum you like, so if you want it to be the same length as the input you could replace it with len(line)==len(wordin). We then use the function already written earlier to check if the word is a valid anagram and print the word if it is. Then we finish by closing the file.

f=open('english.txt', 'r')
for line in f:
	line=line.strip()
	if len(line)>=4:
		if anagramchk(line,wordin):
			print line
f.close()

Hopefully this has been helpful to someone. As ever if you have any suggestions or improvements then please leave a comment. Below is the complete version of the code

def anagramchk(word,chkword):
	for letter in word:
		if letter in chkword:
			chkword=chkword.replace(letter, '', 1)
		else:
			return 0
	return 1

wordin=raw_input('Enter anagram:')

f=open('english.txt', 'r')
for line in f:
	line=line.strip()
	if len(line)>=4:
		if anagramchk(line,wordin):
			print line
f.close()

The dictionary file I used was one I found in /usr/share/dict/british-english (most linux distro’s should have this or similar) it can also be found here

Python: crossword solver + dictionary file

Sep 3 09
by mat

This is a quick and dirty crossword solver that I wrote in python:

word=raw_input('Crossword Solver \nuse * as a wildcard: ')
f=open('dic.txt', 'r')
for line in f:
	line=line.strip()
        if len(line)==len(word):
		good=1
		pos=0
		for letter in word:
			if not letter=='*':
				if not letter==line[pos]:
					good=0
			pos+=1
		if good==1:
			print line
f.close()

Example usage:

Crossword Solver
use * as a wildcard: *arn*val
carnival

The dictionary file I used is 608.2Kb with 80,368 english words and avaliable here