Posts tagged ‘CUDA’
Installing PyCuda on ubuntu 10.04
Using graphics cards to handle massively parallel tasks can now be done in Python thanks to PyCuda, a Python module that allows interaction with the CUDA libraries/binaries provided by Nvidia. To get this working in Ubuntu 10.04 Lucid Lynx I followed this guide: http://wiki.tiker.net/PyCuda/Installation/Linux/Ubuntu.
Prep / Installation:
First install CUDA. Most places will tell you that CUDA is incompatible with gcc 4.4, but this is not true if you make a few small changes to the common.mk file; please see this post about installing CUDA in Ubuntu.
Install the required packages (or use package manager):
sudo apt-get install python-numpy -y
sudo apt-get install build-essential python-dev python-setuptools libboost-python-dev -y
Download pycuda from: http://pypi.python.org/pypi/pycuda
Untar the archive:
tar xzvf pycuda-0.94rc.tar.gz
Configure, make and install:
cd pycuda-0.94rc
./configure.py --cuda-root=/usr/local/cuda --cudadrv-lib-dir=/usr/lib --boost-inc-dir=/usr/include --boost-lib-dir=/usr/lib --boost-python-libname=boost_python-mt --boost-thread-libname=boost_thread-mt
make -j 4
sudo python setup.py install
Problems
I had some problems with the compilation because it complained that pytools was missing. This was resolved by removing and reinstalling python-setuptools:
sudo apt-get remove python-setuptools
sudo apt-get install python-setuptools
Running the example
You should now be able to run the hellogpu.py demo in the examples folder and use the download-examples-from-wiki.py to download further demos from the pycuda wiki.
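Before running the demos it can help to confirm that Python can actually see the modules involved. The snippet below is a minimal sketch, assuming a current Python 3 interpreter rather than the Python that shipped with Ubuntu 10.04; the helper name `check_modules` is made up for illustration and is not part of PyCuda.

```python
import importlib.util

# Top-level modules a working PyCuda install relies on. pytools is the one
# that was reported missing in the Problems section above.
REQUIRED_MODULES = ("numpy", "pytools", "pycuda")


def check_modules(names=REQUIRED_MODULES):
    """Return {module_name: True/False} without importing the modules."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


for name, found in check_modules().items():
    print(f"{name:8s} {'ok' if found else 'MISSING'}")
```

A module reported as MISSING here only tells you that the import would fail; it says nothing about whether the GPU driver itself is working.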
Next
I plan to learn how to use PyCuda and aim to make an MD5 cracker as discussed previously (here and here).
Do you want to crack MD5 hashes at a rate of ~300 MHash/s without a massive rainbow table? Do you have a CUDA-enabled GFX card? If you said yes or maybe to these questions then read on for a brief introduction on how to compile and run a CUDA-accelerated MD5 cracker (coded by Benjamin “Titan” Vernoux).
Pre-Requisites and Downloading
- Own a CUDA-enabled GFX card. If you have an NVIDIA graphics card from the past year or so this is most likely the case.
- Download and Install the CUDA toolkit
- Download MD5 GPU crack from http://bvernoux.free.fr (windows and Linux)
Building in Ubuntu 10.04
Extract the archive and do a make on the source code. When doing this I came across two problems that can be fixed by modifying the common.mk file.
Problem 1: (cannot be declared weak)
$ make
/usr/include/string.h:43: error: inline function ‘void* memcpy(void*, const void*, size_t)’ cannot be declared weak
/usr/include/string.h:64: error: inline function ‘void* memset(void*, int, size_t)’ cannot be declared weak
/usr/include/bits/string3.h:49: error: inline function ‘void* memcpy(void*, const void*, size_t)’ cannot be declared weak
/usr/include/bits/string3.h:78: error: inline function ‘void* memset(void*, int, size_t)’ cannot be declared weak
/opt/cuda/bin/../include/common_functions.h:59: error: inline function ‘void* memset(void*, int, size_t)’ cannot be declared weak
/opt/cuda/bin/../include/common_functions.h:62: error: inline function ‘void* memcpy(void*, const void*, size_t)’ cannot be declared weak
/opt/cuda/bin/../include/math_functions.h:422: error: inline function ‘int __signbit(double)’ cannot be declared weak
/opt/cuda/bin/../include/math_functions.h:427: error: inline function ‘int __signbitf(float)’ cannot be declared weak
/opt/cuda/bin/../include/math_functions.h:440: error: inline function ‘int __signbitl(long double)’ cannot be declared weak
/usr/include/bits/mathcalls.h:350: error: inline function ‘int __signbit(double)’ cannot be declared weak
/usr/include/bits/mathcalls.h:350: error: inline function ‘int __signbitf(float)’ cannot be declared weak
/usr/include/bits/mathcalls.h:350: error: inline function ‘int __signbitl(long double)’ cannot be declared weak
/usr/include/bits/mathinline.h:36: error: inline function ‘int __signbitf(float)’ cannot be declared weak
/usr/include/bits/mathinline.h:42: error: inline function ‘int __signbit(double)’ cannot be declared weak
/usr/include/bits/mathinline.h:48: error: inline function ‘int __signbitl(long double)’ cannot be declared weak
Solution 1
# Debug/release configuration
ifeq ($(dbg),1)
COMMONFLAGS += -g
NVCCFLAGS += -D_DEBUG
BINSUBDIR := debug
LIBSUFFIX := D
else
##############Change the following line to have -O0 instead of -O2
COMMONFLAGS += -O0
BINSUBDIR := release
LIBSUFFIX :=
NVCCFLAGS += --compiler-options -fno-strict-aliasing
CXXFLAGS += -fno-strict-aliasing
CFLAGS += -fno-strict-aliasing
endif
Problem 2: (lcudart)
$ make
/usr/bin/ld: skipping incompatible /opt/cuda/lib/libcudart.so when searching for -lcudart
/usr/bin/ld: skipping incompatible /opt/cuda/lib/libcudart.so when searching for -lcudart
/usr/bin/ld: cannot find -lcudart
collect2: ld returned 1 exit status
make: *** [bin/linux/release/gpu_md5_crack_0.2.3] Error 1
Solution 2
############## Change lib to lib64 if using a 64 bit operating system
LIB := -L$(CUDA_INSTALL_PATH)/lib64 -L$(LIBDIR) -L$(COMMONDIR)/lib64/$(OSLOWER) -L$(NVIDIA_SDK_PATH)/lib
Remember that you should “make clean” in-between each attempt to compile.
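The lib versus lib64 choice in Solution 2 depends on whether the system is 32 or 64 bit. As a rough sketch, the same decision can be written in Python; note that it inspects the pointer size of the running interpreter, which is only a stand-in for the operating system's word size, and that `cuda_lib_dir` is a hypothetical helper, not something shipped with CUDA.

```python
import struct


def cuda_lib_dir(cuda_root="/opt/cuda", bits=None):
    """Pick <root>/lib64 for a 64-bit build and <root>/lib otherwise."""
    if bits is None:
        # Pointer size of this Python interpreter, in bits (assumption:
        # it matches the word size of the operating system).
        bits = struct.calcsize("P") * 8
    return f"{cuda_root}/lib64" if bits == 64 else f"{cuda_root}/lib"


print(cuda_lib_dir())
```

The default of /opt/cuda matches the paths in the error messages above; pass a different root if your toolkit lives elsewhere.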
Benchmarking
Once it has compiled nicely you can give it a test drive with its built-in benchmark (here with an NVIDIA GTX 260 GFX card). Just run it with the -b option:
./gpu_md5_crack_0.2.3 -b
GPU_MD5_Crack v0.2.3 09 July 2009 LGPL for BackTrack 4.
Copyright (C) 2009 TitanMKD (titanmkd@gmail.com).
Benchmark Start
Using default CUDA GPU device:0
Cuda device ID:0, Device name:GeForce GTX 260, supporting CUDA:1.3,
multiProcessorCount:27, clockRate:1466.00 MHz, TotalMem:895.31 MB
******* Test 0 Start *******
Expected Password: 1234567890
MD5 Hash:e807f1fcf82d132f9bb018ca6738a19f, Start Password:1200000000, Total pwd to check:1000000000
Charset used 0:0123456789
MD5 brute force started
MD5 Cracked pwd=1234567890 hash=e807f1fcf82d132f9bb018ca6738a19f
Instant 200.02 Mhash/s(40.00 ms)
Average 190.49 Mhash/s, Total Time:0.21s(210.00 ms)
MD5 brute force finished
******* Test 0 End *******
******* Test 1 Start *******
Expected Password: azerty
MD5 Hash:ab4f63f9ac65152575886860dde480a1, Start Password:, Total pwd to check:1000000000
Charset used 1:abcdefghijklmnopqrstuvwxyz
MD5 brute force started
MD5 Cracked pwd=azerty hash=ab4f63f9ac65152575886860dde480a1
Instant 200.02 Mhash/s(40.00 ms)
Average 240.02 Mhash/s, Total Time:0.10s(100.00 ms)
MD5 brute force finished
******* Test 1 End *******
******* Test 2 Start *******
Expected Password: azer09
MD5 Hash:41b9cabe6033932eb3037fc933060adc, Start Password:, Total pwd to check:1000000000
Charset used 2:abcdefghijklmnopqrstuvwxyz0123456789
MD5 brute force started
Progress 5%, Pwd:6lmea, Instant 280.02 Mhash/s(28.57 ms)
MD5 Cracked pwd=azer09 hash=41b9cabe6033932eb3037fc933060adc
Instant 266.69 Mhash/s(30.00 ms)
Average 287.20 Mhash/s, Total Time:0.39s(390.00 ms)
MD5 brute force finished
******* Test 2 End *******
******* Test 3 Start *******
Expected Password: AZBVSD
MD5 Hash:fd049008572788d60140aaead79336cc, Start Password:, Total pwd to check:1000000000
Charset used 3:ABCDEFGHIJKLMNOPQRSTUVWXYZ
MD5 brute force started
MD5 Cracked pwd=AZBVSD hash=fd049008572788d60140aaead79336cc
Instant 266.69 Mhash/s(30.00 ms)
Average 240.02 Mhash/s, Total Time:0.10s(100.00 ms)
MD5 brute force finished
******* Test 3 End *******
******* Test 4 Start *******
Expected Password: AZ09AA
MD5 Hash:7a552dd9cdd49acc5320bad9c29c9722, Start Password:, Total pwd to check:1000000000
Charset used 4:ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
MD5 brute force started
Progress 5%, Pwd:6LMEA, Instant 266.69 Mhash/s(30.00 ms)
MD5 Cracked pwd=AZ09AA hash=7a552dd9cdd49acc5320bad9c29c9722
Instant 266.69 Mhash/s(30.00 ms)
Average 280.02 Mhash/s, Total Time:0.40s(400.00 ms)
MD5 brute force finished
******* Test 4 End *******
******* Test 5 Start *******
Expected Password: zaZAab
MD5 Hash:aef49f70bb7b923b8bc0a018f916ef64, Start Password:zCAAAA, Total pwd to check:1000000000
Charset used 5:ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
MD5 brute force started
Progress 17%, Pwd:zaDpoA, Instant 280.02 Mhash/s(28.57 ms)
MD5 Cracked pwd=zaZAab hash=aef49f70bb7b923b8bc0a018f916ef64
Instant 266.69 Mhash/s(30.00 ms)
Average 283.10 Mhash/s, Total Time:0.65s(650.00 ms)
MD5 brute force finished
******* Test 5 End *******
******* Test 6 Start *******
Expected Password: za0ZA9
MD5 Hash:062cc3b1302759722f48ac0b95b75803, Start Password:zaAAAA, Total pwd to check:1000000000
Charset used 6:ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
MD5 brute force started
MD5 Cracked pwd=za0ZA9 hash=062cc3b1302759722f48ac0b95b75803
Instant 266.69 Mhash/s(30.00 ms)
Average 266.69 Mhash/s, Total Time:0.06s(60.00 ms)
MD5 brute force finished
******* Test 6 End *******
******* Test 7 Start *******
Expected Password: a^-*|
MD5 Hash:cf7dcf4c3eeb6255668393242fcce273, Start Password:a0000, Total pwd to check:1000000000
Charset used 7: !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
MD5 brute force started
MD5 Cracked pwd=a^-*| hash=cf7dcf4c3eeb6255668393242fcce273
Instant 266.69 Mhash/s(30.00 ms)
Average 266.69 Mhash/s, Total Time:0.15s(150.00 ms)
MD5 brute force finished
******* Test 7 End *******
Benchmark End
So from the benchmark you can see that the average rate is between roughly 190 and 290 Mhash/s, which is about 250,000,000 hash attempts per second! AMAZING!!!
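To make concrete what a single "hash attempt" is, here is a plain CPU brute-force loop in Python. It is only an illustration of the idea, not the GPU tool's implementation, and it runs at a tiny fraction of the rates shown above; the function name `brute_force_md5` is my own.

```python
import hashlib
from itertools import product


def brute_force_md5(target_hex, charset, max_len):
    """Try every string over charset up to max_len; return the match or None."""
    for length in range(1, max_len + 1):
        for candidate in product(charset, repeat=length):
            word = "".join(candidate)
            if hashlib.md5(word.encode("ascii")).hexdigest() == target_hex:
                return word
    return None


# Small demo: lowercase passwords of up to 3 characters are fewer than
# 20,000 candidates, so this finishes almost instantly even on the CPU.
target = hashlib.md5(b"gpu").hexdigest()
print(brute_force_md5(target, "abcdefghijklmnopqrstuvwxyz", 3))
```

Each pass through the inner loop is one hash attempt; the GPU version gets its speed by running huge numbers of these attempts in parallel.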
Number of combinations for different alphabets
| Length | 0-9 | a-z | a-z0-9 | a-zA-Z | a-zA-Z0-9 |
|---|---|---|---|---|---|
| 1 | 10 | 26 | 36 | 52 | 62 |
| 2 | 100 | 676 | 1,296 | 2,704 | 3,844 |
| 3 | 1,000 | 17,576 | 46,656 | 140,608 | 238,328 |
| 4 | 10,000 | 456,976 | 1,679,616 | 7,311,616 | 14,776,336 |
| 5 | 100,000 | 11,881,376 | 60,466,176 | 380,204,032 | 916,132,832 |
| 6 | 1,000,000 | 308,915,776 | 2,176,782,336 | 19,770,609,664 | 56,800,235,584 |
| 7 | 10,000,000 | 8,031,810,176 | 78,364,164,096 | 1,028,071,702,528 | 3,521,614,606,208 |
| 8 | 100,000,000 | 208,827,064,576 | 2,821,109,907,456 | 53,459,728,531,456 | 218,340,105,584,896 |
| 9 | 1,000,000,000 | 5,429,503,678,976 | 101,559,956,668,416 | 2,779,905,883,635,712 | 13,537,086,546,263,552 |
| 10 | 10,000,000,000 | 141,167,095,653,376 | 3,656,158,440,062,976 | 144,555,105,949,057,024 | 839,299,365,868,340,224 |
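Every entry in the table is simply the alphabet size raised to the password length. A short Python sketch reproduces any row; Python integers do not lose precision, so even the 9 and 10 character rows come out exact. The dictionary and function names are just for illustration.

```python
# Alphabet sizes matching the column headers of the table above.
ALPHABETS = {"0-9": 10, "a-z": 26, "a-z0-9": 36, "a-zA-Z": 52, "a-zA-Z0-9": 62}


def combinations(alphabet_size, length):
    """Number of distinct passwords of exactly `length` characters."""
    return alphabet_size ** length


for length in (6, 8, 10):
    row = [f"{combinations(size, length):,}" for size in ALPHABETS.values()]
    print(length, *row, sep=" | ")
```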
Estimated average time (in seconds) to crack at 250 MHash/s, assuming the password is found after searching half of the combinations
| Length | 0-9 | a-z | a-z0-9 | a-zA-Z | a-zA-Z0-9 |
|---|---|---|---|---|---|
| 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 3 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 4 | 0.00 | 0.00 | 0.00 | 0.01 | 0.03 |
| 5 | 0.00 | 0.02 | 0.12 | 0.76 | 1.83 |
| 6 | 0.00 | 0.62 | 4.35 | 39.54 | 113.60 |
| 7 | 0.02 | 16.06 | 156.73 | 2,056.14 | 7,043.23 |
| 8 | 0.20 | 417.65 | 5,642.22 | 106,919.46 | 436,680.21 |
| 9 | 2.00 | 10,859.01 | 203,119.91 | 5,559,811.77 | 27,074,173.09 |
| 10 | 20.00 | 282,334.19 | 7,312,316.88 | 289,110,211.90 | 1,678,598,731.74 |
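The figures in the table correspond to half of the combinations divided by the hash rate; for example, 10 digits give 10,000,000,000 / 2 / 250,000,000 = 20 seconds. Below is a small sketch of that arithmetic, with the worst case (every candidate tried) alongside for comparison. The 250 MHash/s rate is the rough benchmark figure, and the function names are my own.

```python
HASH_RATE = 250_000_000  # hashes per second, the rough benchmark figure


def average_crack_seconds(alphabet_size, length, rate=HASH_RATE):
    """Expected seconds if the match turns up halfway through the keyspace."""
    return alphabet_size ** length / 2 / rate


def worst_case_seconds(alphabet_size, length, rate=HASH_RATE):
    """Seconds needed to try every candidate of exactly this length."""
    return alphabet_size ** length / rate


days = average_crack_seconds(62, 8) / 86400
print(f"8 chars of a-zA-Z0-9: about {days:.1f} days on average")
```

Both functions only count passwords of exactly the given length; searching all shorter lengths first adds a little on top.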
Full calculations available here: MD5 hash cracking time using GPU accelerated brute forcing
What now?
Well, you can crack MD5 hashes at an extremely accelerated rate, so enjoy doing so responsibly (let your morals guide you). You could also explore the source code and make additions as you see fit. I am planning on modifying it to allow an extra parameter so that prefixes can be added if you already know how the password starts. This can be the case when someone has prefixed the password with a known salt.
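The prefix idea can be sketched on the CPU in a few lines of Python. This is a hypothetical illustration of the planned feature, not an existing option of the GPU tool: only the unknown tail is brute-forced, and the known prefix is prepended to every candidate before hashing.

```python
import hashlib
from itertools import product


def crack_with_prefix(target_hex, prefix, charset, suffix_len):
    """Brute-force only the unknown suffix that follows a known prefix."""
    for tail in product(charset, repeat=suffix_len):
        word = prefix + "".join(tail)
        if hashlib.md5(word.encode("ascii")).hexdigest() == target_hex:
            return word
    return None


# Demo with a made-up salt "salt_" and a 2-character unknown part.
target = hashlib.md5(b"salt_ab").hexdigest()
print(crack_with_prefix(target, "salt_", "abc", 2))
```

The search space is set by the suffix length alone, so a long known prefix costs nothing extra in candidates.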
I’ve heard a lot about CUDA, such as how it is 10,000% faster at cracking wireless passwords than a conventional program/hardware, but never really got around to testing it out before now. This post details the steps required to compile and set up the CUDA 2.3 SDK and toolkit on Ubuntu 9.10.
Downloads
You are required to have a relatively new version of the Nvidia graphics driver already installed. First download the CUDA toolkit and CUDA SDK from the Nvidia CUDA 2.3 download page.
Install the toolkit
# Make file executable
chmod +x cudatoolkit_2.3_linux_64_ubuntu9.04.run
# Run it as superuser
sudo ./cudatoolkit_2.3_linux_64_ubuntu9.04.run
You now need to edit your .bashrc file in your home directory to include the paths (so your CUDA binaries can be found by the system)
export PATH=${PATH}:/usr/local/cuda/bin
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/lib64
Note: if you are using 32-bit then "lib64" should be replaced with just "lib".
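After opening a new shell it is worth checking that the two exports took effect. The following is a minimal sketch using only the Python standard library; `cuda_env_report` is a made-up helper, and the default library directory is the 64-bit path from the exports above.

```python
import os
import shutil


def cuda_env_report(environ=None, lib_dir="/usr/local/cuda/lib64"):
    """Report whether nvcc is on PATH and lib_dir is in LD_LIBRARY_PATH."""
    environ = os.environ if environ is None else environ
    lib_paths = environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    return {
        # Full path to nvcc, or None if it is not found on PATH.
        "nvcc": shutil.which("nvcc", path=environ.get("PATH", "")),
        "lib_dir_listed": lib_dir in lib_paths,
    }


print(cuda_env_report())
```

If "nvcc" comes back as None or "lib_dir_listed" is False, the .bashrc changes have probably not been picked up by the current shell yet.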
Install the SDK
# Make file executable
chmod +x cudasdk_2.3_linux.run
# Run it as normal user
./cudasdk_2.3_linux.run
You should now have a NVIDIA_GPU_Computing_SDK folder in your home directory. Change directory into the C folder inside this one.
cd NVIDIA_GPU_Computing_SDK/C
In this folder is a makefile which will compile the Nvidia SDK and all the demos. For this to work in Ubuntu 9.10 (x64) you will need to install several dependencies. Installing these before attempting to make will save you a lot of time; if you are getting errors, please scroll down to the Problems section to see if they are already covered.
# Install the necessary libraries
sudo apt-get install freeglut3 freeglut3-dev libx11-dev mesa-common-dev libxmu6
Making and running demos
You can then run the make command. Once it has run, all of the executables will be placed in NVIDIA_GPU_Computing_SDK/C/bin/linux/release. We can check that our computer has a usable CUDA device installed by running the deviceQuery program:
cd ~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release
./deviceQuery
This should output something similar to the following:
# ./deviceQuery
CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA
Device 0: "GeForce GTX 260"
CUDA Driver Version: 2.30
CUDA Runtime Version: 2.30
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 938803200 bytes
Number of multiprocessors: 27
Number of cores: 216
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.47 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)
Test PASSED
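If you want to use these values in a script, the "Key: value" layout is easy to parse. Here is a minimal Python sketch working on a few lines copied from the output above; the parser is my own helper, not part of the SDK. It also shows where the core count comes from: devices of compute capability 1.x have 8 cores per multiprocessor, so 27 x 8 = 216.

```python
def parse_device_query(text):
    """Turn 'Key: value' lines of deviceQuery output into a dict of strings."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            info[key.strip()] = value.strip()
    return info


sample = """\
Device 0: "GeForce GTX 260"
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Number of multiprocessors: 27
Number of cores: 216
"""
info = parse_device_query(sample)
# Compare the reported core count with multiprocessors * 8.
print(info["Number of cores"], int(info["Number of multiprocessors"]) * 8)
```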
Now that we can see CUDA is successfully installed and a suitable device is found, we can run some of Nvidia’s more aesthetically pleasing demos:
./fluidsGL
./smokeParticles
./particles
./postProcessGL
Problems
GL/glu.h (Nvidia forum link)
make[1]: Leaving directory `/home/mat/NVIDIA_GPU_Computing_SDK/C/common'
make[1]: Entering directory `/home/mat/NVIDIA_GPU_Computing_SDK/C/common'
In file included from ./../common/inc/paramgl.h:24,
from src/paramgl.cpp:19:
./../common/inc/GL/glut.h:60:20: error: GL/glu.h: No such file or directory
make[1]: *** [obj/release/paramgl.cpp.o] Error 1
make[1]: Leaving directory `/home/mat/NVIDIA_GPU_Computing_SDK/C/common'
make: *** [lib/libparamgl.so] Error 2
sudo apt-get install freeglut3 freeglut3-dev libx11-dev mesa-common-dev
/usr/include/bits/mathcalls.h:350: error: inline function ‘int __signbitf(float)’ cannot be declared weak
/usr/include/bits/mathcalls.h:350: error: inline function ‘int __signbitl(long double)’ cannot be declared weak
/usr/include/bits/mathinline.h:36: error: inline function ‘int __signbitf(float)’ cannot be declared weak
/usr/include/bits/mathinline.h:42: error: inline function ‘int __signbit(double)’ cannot be declared weak
/usr/include/bits/mathinline.h:48: error: inline function ‘int __signbitl(long double)’ cannot be declared weak
/usr/local/cuda/bin/../include/math_functions.h:442: error: inline function ‘int __signbitl(long double)’ cannot be declared weak
make[1]: *** [obj/release/particleSystem.cu.o] Error 255
make[1]: Leaving directory `/home/mat/NVIDIA_GPU_Computing_SDK/C/src/particles'
make: *** [src/particles/Makefile.ph_build] Error 2
The problem is caused by having gcc 4.4 installed rather than 4.3. It is possible to install the older version of the compiler, but it is simpler to modify common/common.mk and add the following extra flag (Nvidia forum link):
# Change:
NVCCFLAGS += --compiler-options -fno-strict-aliasing
# To:
NVCCFLAGS += --compiler-options -fno-strict-aliasing --compiler-options -fno-inline
and change the optimisation level from -O2 to -O0:
# Change:
COMMONFLAGS += -O2
# To:
COMMONFLAGS += -O0
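If you would rather not edit common.mk by hand, the two changes can be applied with a short script. This is a sketch under the assumption that the two lines appear in the file exactly as quoted above, apart from spacing; the function only transforms text, so reading and writing the file is left to you.

```python
def patch_common_mk(text):
    """Apply both edits to the text of common.mk; patched text is left alone."""
    strict = "--compiler-options -fno-strict-aliasing"
    patched = []
    for line in text.splitlines():
        # Collapse runs of spaces/tabs so alignment in the file does not matter.
        normalised = " ".join(line.split())
        if normalised == "NVCCFLAGS += " + strict:
            line = line.rstrip() + " --compiler-options -fno-inline"
        elif normalised == "COMMONFLAGS += -O2":
            line = line.replace("-O2", "-O0")
        patched.append(line)
    return "\n".join(patched) + "\n"


sample = "COMMONFLAGS += -O2\nNVCCFLAGS += --compiler-options -fno-strict-aliasing\n"
print(patch_common_mk(sample), end="")
```

Running it a second time changes nothing, so it is safe to apply to a file that has already been edited.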
The two remaining errors you may encounter are very similar and arise from missing libraries:
libxi (Nvidia forum link)
/usr/bin/ld: cannot find -lXi
collect2: ld returned 1 exit status
make[1]: *** [../../bin/linux/release/particles] Error 1
sudo apt-get install libxi-dev
libxmu (Nvidia forum link)
/usr/bin/ld: cannot find -lXmu
collect2: ld returned 1 exit status
make[1]: *** [../../bin/linux/release/particles] Error 1
sudo apt-get install libxmu-dev libxmu6
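Both fixes follow the same pattern: the name after -l in the linker error maps to a -dev package. As a small sketch, that lookup can be automated for a saved build log; the mapping only contains the two libraries covered in this post, and the helper name is made up.

```python
import re

# Library name after -l mapped to the Ubuntu package installed above.
PACKAGE_FOR_LIB = {"Xi": "libxi-dev", "Xmu": "libxmu-dev"}


def packages_for_ld_errors(log):
    """Suggest packages for each 'cannot find -lNAME' line in a linker log."""
    names = re.findall(r"cannot find -l(\S+)", log)
    return sorted({PACKAGE_FOR_LIB[n] for n in names if n in PACKAGE_FOR_LIB})


log = "/usr/bin/ld: cannot find -lXi\ncollect2: ld returned 1 exit status\n"
print(packages_for_ld_errors(log))
```

Libraries that are not in the mapping, such as cudart from the earlier post, are ignored rather than guessed at.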





