Cracking real world salted MD5 passwords in python with several dictionaries
Recently a friend (who will remain unnamed for obvious reasons) asked me to penetration test a website he had created. I found a very simple exploit: I could upload an avatar, but the file was never checked to ensure it was an image, so I uploaded a PHP script I had written and began exploring the server. I printed out all of the usernames, passwords and salts from the database to see how many of the 1,109 passwords could be easily cracked.
The passwords were stored as MD5 hashes with a random 6 character alphanumeric salt. To create the MD5 hash of the password, the salt was prefixed to the password and then the combination was hashed. Because the salts are stored right next to the hashes and MD5 is very fast to compute, we can employ a simple bruteforce/dictionary attack on the passwords. I will start with how I created the wordlists, then show the results I obtained to keep your interest, and finally show my Python code.
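As a minimal sketch of the scheme just described (Python 3; the `salted_md5` name and the sample salt are made up for illustration), the stored value is the MD5 of the salt followed by the password, and a dictionary attack just recomputes it for each guess:

```python
import hashlib

def salted_md5(salt, password):
    """Return the hex MD5 digest of salt + password (salt prefixed)."""
    return hashlib.md5((salt + password).encode("utf-8")).hexdigest()

# Hypothetical salt; the real ones were random 6 character alphanumeric strings.
stored = salted_md5("a1B2c3", "password")

# A dictionary attack recomputes the hash for every candidate word.
for guess in ["123456", "qwerty", "password"]:
    if salted_md5("a1B2c3", guess) == stored:
        print("cracked:", guess)
```

The salt stops precomputed tables from being reused across users, but it does nothing to slow down guessing one user's password at a time.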
I already had two reasonably sized dictionaries that I use for different things like wordcube. I ran John the Ripper on my double-sized dictionary to create lots of common permutations of each word, such as a capital first letter or a number appended to the end. To do this you run john with the following parameters, where dic.txt is the input dictionary and dic_plus_rules.txt is the output from john with all of the additions it has made.
john --wordlist=dic.txt --rules --stdout > dic_plus_rules.txt
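To give a feel for what such rules produce, here is a rough Python 3 sketch of the two permutations mentioned above (capital first letter, a digit appended). It is a simplification and not John the Ripper's actual rule engine; the `mangle` helper is hypothetical:

```python
def mangle(word):
    """Yield a few simple variants of word (illustrative only)."""
    yield word
    yield word.capitalize()
    for digit in "0123456789":
        yield word + digit
        yield word.capitalize() + digit

variants = list(mangle("dribble"))
print(len(variants))    # one input word becomes 22 candidates
print(variants[:4])
```

Even these two rules multiply the wordlist size by 22, which is why the rule-expanded dictionaries take so much longer to run.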
I also downloaded two wordlists from Openwall: one is a list of ~3,100 common passwords, and one labelled ALL has a large number of words (~4 million) in various languages. Because text is highly compressible, the files are available as small gzip files. ALL is 11.5 MB, which unzips to 41.4 MB, and the password list is 12 KB, which unzips to 21.8 KB. There are also more wordlists available for different languages, but the ALL file includes these.
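Since the lists compress so well, one option is to leave them gzipped and stream the words straight out of the archive. A small Python 3 sketch, assuming one word per line; the `iter_words` helper and the Latin-1 decoding are assumptions (Latin-1 can decode any byte, which is convenient for mixed-language lists):

```python
import gzip

def iter_words(path):
    """Yield one candidate word per line from a gzip-compressed wordlist."""
    with gzip.open(path, "rt", encoding="latin-1") as handle:
        for line in handle:
            word = line.rstrip("\r\n")
            if word:            # skip blank lines
                yield word
```

Because it is a generator, the 41.4 MB list never has to be held in memory all at once.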
The size of all of the wordlists I used is shown below:
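|Wordlist||Words|
|Openwall Common Passwords||3,158|

The results for each wordlist, out of the 1,109 passwords, were:
|Wordlist||Passwords cracked||Percentage||Time|
|Openwall Common Passwords||112||10.10%||7s|
|Openwall All||210||18.94%||2.45hrs (8829s)|
|Total Passwords Obtained||254||22.90%||~5hrs|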
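The percentages are simply the cracked count over the 1,109 hashes, and the timings give a rough idea of throughput. A quick Python 3 check using the figures quoted in this post (the 7s timing is rounded, so the rate is only a ballpark):

```python
TOTAL_HASHES = 1109

def percent_cracked(cracked, total=TOTAL_HASHES):
    """Percentage of the password hashes recovered."""
    return 100.0 * cracked / total

for name, cracked in [("Openwall Common Passwords", 112),
                      ("Openwall All", 210),
                      ("Total Passwords Obtained", 254)]:
    print("%-26s %6.2f%%" % (name, percent_cracked(cracked)))

# Every word is hashed once per salt, so the work is words x hashes.
common_words = 3158      # size of the Openwall common passwords list
seconds = 7.0            # time taken for that list
hashes_per_second = common_words * TOTAL_HASHES / seconds
print("roughly %.0f hashes per second" % hashes_per_second)
```

Because every user has a different salt, each candidate word costs 1,109 MD5 computations rather than one, which is where most of the run time goes.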
Here are some of the more amusingly bad passwords; the number in brackets shows the frequency of the password.
Crap passwords: 123456 (18), password (4), 1234567 (4), 123456789 (3), 12345678 (2), 12345 (2), abc123 (2), asdfgh (2), nintendo (2), 123123, abcd1234, abcdefg, qwerty
Self-describing passwords: catholic, cowboy, creator, doger, ginger, killer, maggot, player, princess, skater, smallcock, smooth, super, superman, superstar, tester, veggie, winner, wolverine
Some other passwords: bananas, cheese, cinnamon, hampster, DRAGON, dribble1, poopie, poopoo
    # -*- coding: utf-8 -*-
    # pymd5cracker.py (Python 2)
    import hashlib, sys
    from time import time

    # Change to command-line switches when you have the time!
    hash_file = "hash2.csv"
    wordlist = "mass_rules.txt"

    # Read the hash file entered
    try:
        hashdocument = open(hash_file, "r")
    except IOError:
        print "Invalid file."
        raw_input()
        sys.exit()
    else:
        # Read the values separated by colons into a list of lists.
        # Assumed column layout: username:hash:salt
        hashes = []
        for line in hashdocument:
            line = line.replace("\n", "")
            inp = line.split(":")
            if line.count(":") < 2:
                inp.append("")  # no salt field, so use an empty salt
            hashes.append(inp)
        hashdocument.close()

    # Read wordlist in
    try:
        wordlistfile = open(wordlist, "r")
    except IOError:
        print "Invalid file."
        raw_input()
        sys.exit()

    tested = 0
    cracked = 0
    tic = time()
    for line in wordlistfile:
        line = line.replace("\n", "")
        tested += 1
        for i in range(0, len(hashes)):
            m = hashlib.md5()
            # The salt is prefixed to the candidate word before hashing
            m.update(hashes[i][2] + line)
            word_hash = m.hexdigest()
            if word_hash == hashes[i][1]:
                cracked += 1
                hashes[i].append(line)
                print hashes[i][0], " : ", line, "\t(", time()-tic, "s)"
        # Show progress every 1000 passwords tested
        if tested % 1000 == 0:
            print "Cracked: ", cracked, " (", tested, ") ", line
    wordlistfile.close()

    # Save the output of this program so we can use it again
    # with another program/dictionary, adding the password
    # to each line we have solved.
    crackout = open("pycrackout.txt", "w")
    for i in hashes:
        crackout.write(":".join(i) + "\n")
    crackout.close()

    print "Passwords found: ", cracked, "/", len(hashes)
    print "Wordlist Words :", tested
    print "Hashes computed: ", len(hashes)*tested
    print "Total time taken: ", time()-tic, 's'
- Play with more dictionaries
- Speed up code:
  - Add multi-threading: My experience with multi-threading in Python is that it doesn't work well for CPU-intensive tasks. If you know otherwise, please let me know.
  - Have a look at PyCUDA to see if I can use my graphics card to speed up the code significantly (another type of multi-threading really...) without having to change language like in my previous post on CUDA MD5 cracking
  - Remove hash once found to stop pointless checking
- Add command line switches to allow it to be used like a real program
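Two of the items above, dropping a hash once it has been cracked and adding command line switches, could look roughly like this in Python 3. This is a sketch rather than a finished tool: the `crack`, `build_parser` and `main` names, the option names, and the username:hash:salt column order are all assumptions.

```python
import argparse
import hashlib

def crack(entries, words):
    """entries: list of (username, md5_hex, salt). Returns {entry index: password}."""
    remaining = list(enumerate(entries))
    found = {}
    for word in words:
        still_unknown = []
        for index, (user, digest, salt) in remaining:
            # latin-1 maps characters to bytes 1:1, so the wordlist's raw bytes are hashed
            candidate = hashlib.md5((salt + word).encode("latin-1")).hexdigest()
            if candidate == digest.lower():
                found[index] = word         # cracked: never test this hash again
            else:
                still_unknown.append((index, (user, digest, salt)))
        remaining = still_unknown
        if not remaining:
            break                           # nothing left to crack
    return found

def build_parser():
    parser = argparse.ArgumentParser(description="Salted MD5 dictionary attack")
    parser.add_argument("hash_file", help="colon-separated username:hash:salt lines")
    parser.add_argument("wordlist", help="one candidate password per line")
    parser.add_argument("-o", "--output", default="pycrackout.txt",
                        help="where to write the results")
    return parser

def main(argv=None):
    args = build_parser().parse_args(argv)
    with open(args.hash_file, encoding="latin-1") as handle:
        # Pad short lines so every entry has exactly three fields
        entries = [tuple((line.rstrip("\n").split(":") + ["", ""])[:3])
                   for line in handle if line.strip()]
    with open(args.wordlist, encoding="latin-1") as handle:
        found = crack(entries, (line.rstrip("\n") for line in handle))
    with open(args.output, "w", encoding="latin-1") as out:
        for index, (user, digest, salt) in enumerate(entries):
            out.write(":".join([user, digest, salt, found.get(index, "")]) + "\n")
    print("Passwords found: %d/%d" % (len(found), len(entries)))

# To use this as a script, call main() from an `if __name__ == "__main__":` block.
```

Shrinking the list of remaining hashes only saves a little while the crack rate is around 20%, but it costs nothing and lets the loop stop early if everything is found.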