I'm using Python's somoclu library to train a self-organising map. The library allows users to perform the training either on the CPU (Intel Core i7-8700) or on the GPU (GeForce GTX 1080 Ti).
I noticed that the CPU was running the script faster than the GPU, so I ran a sweep varying the number of datapoints and the size of the map to see whether at some point the GPU would outperform the CPU. This was the script:
import numpy as np
import somoclu
import time
m = 3 # Number of dimensions
points = [5000, 30000, 80000, 150000, 300000] # Number of datapoints
iterMax = 200 # Max number of iterations
mapSize = [4, 32, 64, 128] # Dimensions of SOM
np.random.seed(0)
#%% SOM
for n in points:
    for size in mapSize:
        y = np.random.rand(n, m) # Input data
        # With CPU
        t = time.clock() # Start time
        som = somoclu.Somoclu(size,
                              size,
                              compactsupport = False,
                              kerneltype = 0)
        som.train(y.astype(np.float32), epochs = iterMax)
        elapsedTime = time.clock() - t
        # With GPU
        t = time.clock() # Start time
        som = somoclu.Somoclu(size,
                              size,
                              compactsupport = False,
                              kerneltype = 1)
        som.train(y.astype(np.float32), epochs = iterMax)
        elapsedTime = time.clock() - t
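The script above does not show the CSV step itself; a minimal sketch of how the two timings could be written out with the standard csv module (the file name and the separate variables elapsedTimeCPU / elapsedTimeGPU are my own additions, not part of the original script):

import csv

# Hypothetical: assumes the two timings above were kept in separate
# variables instead of overwriting elapsedTime, and appends one row per run.
with open("som_times.csv", "a", newline="") as f:
    csv.writer(f).writerow([n, size, elapsedTimeCPU, elapsedTimeGPU])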
I saved the times in a CSV, and this is what I got:
CPU (s)                GPU (s)
2.7632589999999997 5.935387999999999
60.340638 82.796062
228.292085 305.75625900000006
861.3243 1141.331934
11.692982999999913 24.568256999999903
330.17140100000006 443.82112400000005
1354.677431 1749.3110039999992
5559.308704 6990.034151000002
29.3726179999976 47.36881999999969
913.3250950000001 1163.5942189999987
3703.653313999999 4615.292857
14868.418703000003 18635.051464000004
37.40133600000263 68.64375999999902
1699.020611 2141.047305
6925.692426000009 8645.564134
27887.844171999997 illegal memory access was encountered
As you can see, the CPU outperforms the GPU in every single case (on top of that, the GPU version crashed when running the script with 150000 datapoints and a 64x64 map). How is this possible? What is the advantage of using the GPU to train the SOM then?
EDIT:
I tried the same library in R, and in that language the GPU outperforms the CPU. So apparently it is just a Python issue, but I'm not enough of an expert in programming to figure out what is happening. I believe the kernel being run is the same, so it's just the interface that changes. Let's see if this helps somebody find out why, in Python, the CPU is running faster than the GPU.
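One way to check where the extra time goes would be to profile a single GPU run with the standard cProfile module; this is only a sketch using the same setup as the script above, not something I've done yet:

import cProfile
import numpy as np
import somoclu

y = np.random.rand(30000, 3).astype(np.float32)
som = somoclu.Somoclu(32, 32, compactsupport=False, kerneltype=1)  # GPU kernel

# If almost all the time is attributed to the low-level training call itself,
# the overhead is inside the kernel rather than in the Python wrapper.
cProfile.run("som.train(y, epochs=200)", sort="cumulative")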
According to Figure 5 in the somoclu paper (linked below), the GPU was faster. However, the paper did not include extensive benchmarking, so I can only suggest that on your machine the CPU is simply more capable. You could study the paper and run a test closer to its setup for comparison. The paper describes its benchmark environment as follows:
"To ensure replicability of the results, we benchmarked with publicly available cluster GPU instances provided by Amazon Web Services. The instance type was cg1.4xlarge (https://aws.amazon.com/ec2/instance-types/), equipped with 22 GiB of memory, two Intel Xeon X5570 quad-core CPUs, and two NVIDIA Tesla M2050 GPUs, running Ubuntu 12.04."
Somoclu: An Efficient Parallel Library for Self-Organizing Maps. Available from: https://www.researchgate.net/publication/236635216_Somoclu_An_Efficient_Parallel_Library_for_Self-Organizing_Maps
It seems that both your CPU and your GPU are more powerful than the AWS benchmark.
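As a starting point for such a comparison, here is a sketch of a larger run closer in spirit to the paper's benchmark; the map size, dimensionality, and epoch count below are illustrative choices of mine, not the paper's exact settings:

import time
import numpy as np
import somoclu

# Illustrative sizes only; not the exact configuration used in the paper.
n_points, n_dims, n_rows, n_cols, n_epochs = 100000, 50, 100, 160, 10
y = np.random.rand(n_points, n_dims).astype(np.float32)

for kernel, label in [(0, "CPU"), (1, "GPU")]:
    som = somoclu.Somoclu(n_cols, n_rows, compactsupport=False, kerneltype=kernel)
    t = time.perf_counter()  # perf_counter() avoids the deprecated time.clock()
    som.train(y, epochs=n_epochs)
    print(label, "kernel:", time.perf_counter() - t, "seconds")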