Skip to content

Posts tagged ‘CUDA’

Installing PyCuda on ubuntu 10.04

Jul 8 10
by mat

Using graphics cards to hand massively parallel tasks can now be realised in python, thanks to PyCuda a module for python which allows interaction with the CUDA libaries/binaries provided by Nvidia. To get this working in ubuntu 10.04 lucid lynx I followed this guide http://wiki.tiker.net/PyCuda/Installation/Linux/Ubuntu.

Prep / Installation:

First install CUDA, most places will tell you that CUDA is incompatible with gcc-4.3 but this is not true if you make a few small changes to the configuration.mk file, please see this post about installing CUDA in ubuntu

Install the required packages (or use package manager):

sudo apt-get install python-numpy -y
sudo apt-get install build-essential python-dev python-setuptools libboost-python-dev -y

Download pycuda from: http://pypi.python.org/pypi/pycuda

Untar the achive:

tar xzvf pycuda-0.94rc.tar.gz

Configure, make and install:

cd pycuda-0.94rc
./configure.py –cuda-root=/usr/local/cuda –cudadrv-lib-dir=/usr/lib –boost-inc-dir=/usr/include –boost-lib-dir=/usr/lib –boost-python-libname=boost_python-mt –boost-thread-libname=boost_thread-mt
make -j 4
sudo python setup.py install

Problems

I had some problems with the compliation because it was complaining about pytools missing, this was resolved by removing and reinstalling python-setuptools:

sudo apt-get remove python-setuptools
sudo apt-get install python-setuptools

Running the example

You should now be able to run the hellogpu.py demo in the examples folder and use the download-examples-from-wiki.py to download further demos from the pycuda wiki.

Next
I plan to learn how to use PyCuda and aim to make a MD5 cracker as discussed previously (here and here)

Cracking MD5 hashes (or passwords) ultra-fast with GPU acceleration

Jun 24 10
by mat

Do you want to crack MD5 hashes in at a rate of ~300MHash/s without a massive rainbow table? Do you have a CUDA enabled GFX card? If you said yes or maybe to these questions then read on for a brief introduction on how to compile and run a CUDA accelerated MD5 cracker (coded by Benjamin “Titan” Vernoux ).

Pre-Requisites and Downloading

Building in Ubuntu 10.04

Extract the archive and do a make on the source code. When doing this I came across two problems that can be fixed by modifying the common.mk file.

Problem 1: (cannot be declared weak)

$ make
/usr/include/string.h:43: error: inline function ‘void* memcpy(void*, const void*, size_t)’ cannot be declared weak
/usr/include/string.h:64: error: inline function ‘void* memset(void*, int, size_t)’ cannot be declared weak
/usr/include/bits/string3.h:49: error: inline function ‘void* memcpy(void*, const void*, size_t)’ cannot be declared weak
/usr/include/bits/string3.h:78: error: inline function ‘void* memset(void*, int, size_t)’ cannot be declared weak
/opt/cuda/bin/../include/common_functions.h:59: error: inline function ‘void* memset(void*, int, size_t)’ cannot be declared weak
/opt/cuda/bin/../include/common_functions.h:62: error: inline function ‘void* memcpy(void*, const void*, size_t)’ cannot be declared weak
/opt/cuda/bin/../include/math_functions.h:422: error: inline function ‘int __signbit(double)’ cannot be declared weak
/opt/cuda/bin/../include/math_functions.h:427: error: inline function ‘int __signbitf(float)’ cannot be declared weak
/opt/cuda/bin/../include/math_functions.h:440: error: inline function ‘int __signbitl(long double)’ cannot be declared weak
/usr/include/bits/mathcalls.h:350: error: inline function ‘int __signbit(double)’ cannot be declared weak
/usr/include/bits/mathcalls.h:350: error: inline function ‘int __signbitf(float)’ cannot be declared weak
/usr/include/bits/mathcalls.h:350: error: inline function ‘int __signbitl(long double)’ cannot be declared weak
/usr/include/bits/mathinline.h:36: error: inline function ‘int __signbitf(float)’ cannot be declared weak
/usr/include/bits/mathinline.h:42: error: inline function ‘int __signbit(double)’ cannot be declared weak
/usr/include/bits/mathinline.h:48: error: inline function ‘int __signbitl(long double)’ cannot be declared weak

Solution 1

# Debug/release configuration
ifeq ($(dbg),1)
COMMONFLAGS += -g
NVCCFLAGS += -D_DEBUG
BINSUBDIR := debug
LIBSUFFIX := D
else
##############Change the following line to have -O0 instead of -O2
COMMONFLAGS += -O0
BINSUBDIR := release
LIBSUFFIX :=
NVCCFLAGS += –compiler-options -fno-strict-aliasing
CXXFLAGS += -fno-strict-aliasing
CFLAGS += -fno-strict-aliasing
endif

Problem 2: (lcudart)

$ make
/usr/bin/ld: skipping incompatible /opt/cuda/lib/libcudart.so when searching for -lcudart
/usr/bin/ld: skipping incompatible /opt/cuda/lib/libcudart.so when searching for -lcudart
/usr/bin/ld: cannot find -lcudart
collect2: ld returned 1 exit status
make: *** [bin/linux/release/gpu_md5_crack_0.2.3] Error 1

Solution 2

############## Change lib to lib64 if using a 64 bit operating system
LIB := -L$(CUDA_INSTALL_PATH)/lib64 -L$(LIBDIR) -L$(COMMONDIR)/lib64/$(OSLOWER) -L$(NVIDIA_SDK_PATH)/lib

Remember that you should “make clean” in-between each attempt to compile.

Benchmarking

Once it has compiled nicely you can give it a testdrive with its build in benchmark (with an NVIDIA 260 GFX card). Just run with the -b option:

./gpu_md5_crack_0.2.3 -b
GPU_MD5_Crack v0.2.3 09 July 2009 LGPL for BackTrack 4.
Copyright (C) 2009 TitanMKD (titanmkd@gmail.com).

Benchmark Start
Using default CUDA GPU device:0
Cuda device ID:0, Device name:GeForce GTX 260, supporting CUDA:1.3,
multiProcessorCount:27, clockRate:1466.00 MHz, TotalMem:895.31 MB
******* Test 0 Start *******
Expected Password: 1234567890
MD5 Hash:e807f1fcf82d132f9bb018ca6738a19f, Start Password:1200000000, Total pwd to check:1000000000
Charset used 0:0123456789
MD5 brute force started

MD5 Cracked pwd=1234567890 hash=e807f1fcf82d132f9bb018ca6738a19f
Instant 200.02 Mhash/s(40.00 ms)
Average 190.49 Mhash/s, Total Time:0.21s(210.00 ms)
MD5 brute force finished
******* Test 0 End *******

******* Test 1 Start *******
Expected Password: azerty
MD5 Hash:ab4f63f9ac65152575886860dde480a1, Start Password:, Total pwd to check:1000000000
Charset used 1:abcdefghijklmnopqrstuvwxyz
MD5 brute force started

MD5 Cracked pwd=azerty hash=ab4f63f9ac65152575886860dde480a1
Instant 200.02 Mhash/s(40.00 ms)
Average 240.02 Mhash/s, Total Time:0.10s(100.00 ms)
MD5 brute force finished
******* Test 1 End *******

******* Test 2 Start *******
Expected Password: azer09
MD5 Hash:41b9cabe6033932eb3037fc933060adc, Start Password:, Total pwd to check:1000000000
Charset used 2:abcdefghijklmnopqrstuvwxyz0123456789
MD5 brute force started
Progress 5%, Pwd:6lmea, Instant 280.02 Mhash/s(28.57 ms)
MD5 Cracked pwd=azer09 hash=41b9cabe6033932eb3037fc933060adc
Instant 266.69 Mhash/s(30.00 ms)
Average 287.20 Mhash/s, Total Time:0.39s(390.00 ms)
MD5 brute force finished
******* Test 2 End *******

******* Test 3 Start *******
Expected Password: AZBVSD
MD5 Hash:fd049008572788d60140aaead79336cc, Start Password:, Total pwd to check:1000000000
Charset used 3:ABCDEFGHIJKLMNOPQRSTUVWXYZ
MD5 brute force started

MD5 Cracked pwd=AZBVSD hash=fd049008572788d60140aaead79336cc
Instant 266.69 Mhash/s(30.00 ms)
Average 240.02 Mhash/s, Total Time:0.10s(100.00 ms)
MD5 brute force finished
******* Test 3 End *******

******* Test 4 Start *******
Expected Password: AZ09AA
MD5 Hash:7a552dd9cdd49acc5320bad9c29c9722, Start Password:, Total pwd to check:1000000000
Charset used 4:ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
MD5 brute force started
Progress 5%, Pwd:6LMEA, Instant 266.69 Mhash/s(30.00 ms)
MD5 Cracked pwd=AZ09AA hash=7a552dd9cdd49acc5320bad9c29c9722
Instant 266.69 Mhash/s(30.00 ms)
Average 280.02 Mhash/s, Total Time:0.40s(400.00 ms)
MD5 brute force finished
******* Test 4 End *******

******* Test 5 Start *******
Expected Password: zaZAab
MD5 Hash:aef49f70bb7b923b8bc0a018f916ef64, Start Password:zCAAAA, Total pwd to check:1000000000
Charset used 5:ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
MD5 brute force started
Progress 17%, Pwd:zaDpoA, Instant 280.02 Mhash/s(28.57 ms)
MD5 Cracked pwd=zaZAab hash=aef49f70bb7b923b8bc0a018f916ef64
Instant 266.69 Mhash/s(30.00 ms)
Average 283.10 Mhash/s, Total Time:0.65s(650.00 ms)
MD5 brute force finished
******* Test 5 End *******

******* Test 6 Start *******
Expected Password: za0ZA9
MD5 Hash:062cc3b1302759722f48ac0b95b75803, Start Password:zaAAAA, Total pwd to check:1000000000
Charset used 6:ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
MD5 brute force started

MD5 Cracked pwd=za0ZA9 hash=062cc3b1302759722f48ac0b95b75803
Instant 266.69 Mhash/s(30.00 ms)
Average 266.69 Mhash/s, Total Time:0.06s(60.00 ms)
MD5 brute force finished
******* Test 6 End *******

******* Test 7 Start *******
Expected Password: a^-*|
MD5 Hash:cf7dcf4c3eeb6255668393242fcce273, Start Password:a0000, Total pwd to check:1000000000
Charset used 7: !”#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
MD5 brute force started

MD5 Cracked pwd=a^-*| hash=cf7dcf4c3eeb6255668393242fcce273
Instant 266.69 Mhash/s(30.00 ms)
Average 266.69 Mhash/s, Total Time:0.15s(150.00 ms)
MD5 brute force finished
******* Test 7 End *******

Benchmark End

So from the benchmark you can see that we are getting between 200 and 300 Mhash/s, that is about 250,000,000 hash attempts per second! AMAZING!!!

Number of combinations for different alphabets

Length 0-9 a-z a-z0-9 a-zA-Z a-zA-Z0-9
1 10 26 36 52 62
2 100 676 1,296 2,704 3,844
3 1,000 17,576 46,656 140,608 238,328
4 10,000 456,976 1,679,616 7,311,616 14,776,336
5 100,000 11,881,376 60,466,176 380,204,032 916,132,832
6 1,000,000 308,915,776 2,176,782,336 19,770,609,664 56,800,235,584
7 10,000,000 8,031,810,176 78,364,164,096 1,028,071,702,528 3,521,614,606,208
8 100,000,000 208,827,064,576 2,821,109,907,456 53,459,728,531,456 218,340,105,584,896
9 1,000,000,000 5,429,503,678,976 101,559,956,668,416 2,779,905,883,635,710 13,537,086,546,263,600
10 10,000,000,000 141,167,095,653,376 3,656,158,440,062,980 144,555,105,949,057,000 839,299,365,868,340,000

Estimated time (in seconds) to crack (at 250MHash/s)

Length 0-9 a-z a-z0-9 a-zA-Z a-zA-Z0-9
1 0.00 0.00 0.00 0.00 0.00
2 0.00 0.00 0.00 0.00 0.00
3 0.00 0.00 0.00 0.00 0.00
4 0.00 0.00 0.00 0.01 0.03
5 0.00 0.02 0.12 0.76 1.83
6 0.00 0.62 4.35 39.54 113.60
7 0.02 16.06 156.73 2,056.14 7,043.23
8 0.20 417.65 5,642.22 106,919.46 436,680.21
9 2.00 10,859.01 203,119.91 5,559,811.77 27,074,173.09
10 20.00 282,334.19 7,312,316.88 289,110,211.90 1,678,598,731.74

Full calculations avaliable here: MD5 hash cracking time using GPU accelerated brute forcing

What now?
Well you can crack MD5’s at an extremely accelerated rate, so enjoy doing so responsibly (let your morals guide you :P). You could also explore the source code and make additions as you see fit, I am planning on modifying it to allow an extra parameter so that prefixes can be added if you already know how the password starts. This can be the case when someone has prefixed the password with a known salt.

Compiling and running CUDA 2.3 SDK and toolkit on ubuntu 9.10 x64 (64-bit)

Feb 20 10
by mat

I’ve heard a lot about CUDA, such as how it is 10,000% faster at cracking wireless passwords over a conventional program/hardware, but never really got around to testing it out before now. This post details the steps required to compile and setup CUDA 2.3 SDK and toolkit on ubuntu 9.10.

Downloads
You are required to have an Nvidia graphics driver (relatively new version) already installed. First download the CUDA toolkit and CUDA sdk from the Nvidia CUDA 2.3 download page.

Install the toolkit

# Make file executable
chmod +x cudatoolkit_2.3_linux_64_ubuntu9.04.run
# Run it as superuser
sudo ./cudatoolkit_2.3_linux_64_ubuntu9.04.run

You now need to edit your .bashrc file in your home directory to include the paths (so your CUDA binaries can be found by the system)

export PATH=${PATH}:/usr/local/cuda/bin
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/lib64

Note if you are using 32bit then “lib64″ should be replaced with just “lib”

Install the SDK

# Make file executable
chmod +x cudasdk_2.3_linux.run
# Run it as normal user
./cudasdk_2.3_linux.run

You should now have a NVIDIA_GPU_Computing_SDK folder in your home directory. Change directory into the C folder inside this one.

cd NVIDIA_GPU_Computing_SDK/C

In this folder is a make file which will compile all the Nvidia SDK and all the demos, in order for this to work in ubuntu 9.10 (x64) you will need to install several dependencies. By installing these before attempting to make will save you a lot of time, if you are getting errors please scroll down to the problems section to see if they are already covered.

# Install the necessary libraries
sudo apt-get install freeglut3 freeglut3-dev libx11-dev mesa-common-dev libxmu6

Making and running demos

You can then run the make command, once this is ran all of the executables will be placed in NVIDIA_GPU_Computing_SDK/C/bin/linux/released . We can check that our computer has an useable CUDA device install by running the deviceQuery program:

cd ~/NVIDIA_GPU_Computing_SDK/C/bin/linux/released
./deviceQuery

This should output something similar to the following:

# ./deviceQuery
CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA

Device 0: "GeForce GTX 260"
  CUDA Driver Version:                           2.30
  CUDA Runtime Version:                          2.30
  CUDA Capability Major revision number:         1
  CUDA Capability Minor revision number:         3
  Total amount of global memory:                 938803200 bytes
  Number of multiprocessors:                     27
  Number of cores:                               216
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 16384
  Warp size:                                     32
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          262144 bytes
  Texture alignment:                             256 bytes
  Clock rate:                                    1.47 GHz
  Concurrent copy and execution:                 Yes
  Run time limit on kernels:                     Yes
  Integrated:                                    No
  Support host page-locked memory mapping:       Yes
  Compute mode:                                  Default (multiple host threads can use this device simultaneously)

Test PASSED

Now that we can see CUDA is successfully installed and a suitable device is found we can run some of nvidia’s more ascetically pleasing demos:

./fluidsGL

CUDA SDK example fluidsGL on ubuntu 9.10 x64

CUDA SDK example fluidsGL on ubuntu 9.10 x64

./smokeParticles

CUDA SDK example smokeparticles on ubuntu 9.10 x64

CUDA SDK example smokeparticles on ubuntu 9.10 x64

./particles

CUDA SDK example particles on ubuntu 9.10 x64

CUDA SDK example particles on ubuntu 9.10 x64

./postProcessGL

CUDA SDK example postProcessGL on ubuntu 9.10 x64 (teapot)

CUDA SDK example postProcessGL on ubuntu 9.10 x64 (teapot)

Problems


libxi (Nvidia forum link)

make[1]: Leaving directory `/home/mat/NVIDIA_GPU_Computing_SDK/C/common'
make[1]: Entering directory `/home/mat/NVIDIA_GPU_Computing_SDK/C/common'
In file included from ./../common/inc/paramgl.h:24,
                 from src/paramgl.cpp:19:
./../common/inc/GL/glut.h:60:20: error: GL/glu.h: No such file or directory
make[1]: *** [obj/release/paramgl.cpp.o] Error 1
make[1]: Leaving directory `/home/mat/NVIDIA_GPU_Computing_SDK/C/common'
make: *** [lib/libparamgl.so] Error 2
sudo apt-get install freeglut3 freeglut3-dev libx11-dev mesa-common-dev
/usr/include/bits/mathcalls.h:350: error: inline function ‘int __signbitf(float)’ cannot be declared weak
/usr/include/bits/mathcalls.h:350: error: inline function ‘int __signbitl(long double)’ cannot be declared weak
/usr/include/bits/mathinline.h:36: error: inline function ‘int __signbitf(float)’ cannot be declared weak
/usr/include/bits/mathinline.h:42: error: inline function ‘int __signbit(double)’ cannot be declared weak
/usr/include/bits/mathinline.h:48: error: inline function ‘int __signbitl(long double)’ cannot be declared weak
/usr/local/cuda/bin/../include/math_functions.h:442: error: inline function ‘int __signbitl(long double)’ cannot be declared weak
make[1]: *** [obj/release/particleSystem.cu.o] Error 255
make[1]: Leaving directory `/home/mat/NVIDIA_GPU_Computing_SDK/C/src/particles'
make: *** [src/particles/Makefile.ph_build] Error 2

The problem is due to having gcc 4.4 installed rather than 4.3, it is possible to install the older version of this compiler but it is simpler to modify common/common.mk and add the following extra flag (Nvidia forum link):

# Change:
NVCCFLAGS += --compiler-options -fno-strict-aliasing
# To:
NVCCFLAGS += --compiler-options -fno-strict-aliasing --compiler-options -fno-inline

and change the -O2

# Change:
COMMONFLAGS += -O2
# To: 
COMMONFLAGS += -O0

The two remaining errors you may encounter are very similar and arrise from missing libraries:

libxi (Nvidia forum link)

/usr/bin/ld: cannot find -lXi
collect2: ld returned 1 exit status
make[1]: *** [../../bin/linux/release/particles] Error 1
sudo apt-get install libxi-dev

libxmu (Nvidia forum link)

/usr/bin/ld: cannot find -lXmu
collect2: ld returned 1 exit status
make[1]: *** [../../bin/linux/release/particles] Error 1
sudo apt-get install libxmu-dev libxmu6