
CUDA ERROR: initialization error when using parallel in python

I use CUDA for my code, but it still runs slowly, so I changed it to run in parallel using multiprocessing (pool.map) in Python. But now I get CUDA ERROR: initialization error.

This is the function:

def step_M(self, iter_training):
    # Unpack the (g, p, em_iters) tuple and the list of (start, end) index ranges
    gpe, e_tuple_list = iter_training
    g = gpe[0]
    p = gpe[1]
    em_iters = gpe[2]

    # Collect the row indices and data slices covered by the index ranges
    e_tuple_list = sorted(e_tuple_list, key=lambda tup: tup[0])
    data = self.X[e_tuple_list[0][0]:e_tuple_list[0][1]]
    cluster_indices = np.array(range(e_tuple_list[0][0], e_tuple_list[0][1], 1), dtype=np.int32)
    for i in range(1, len(e_tuple_list)):
        d = e_tuple_list[i]
        cluster_indices = np.concatenate((cluster_indices, np.array(range(d[0], d[1], 1), dtype=np.int32)))
        data = np.concatenate((data, self.X[d[0]:d[1]]))

    # Train on the selected subset (this is where the CUDA work happens)
    g.train_on_subset(self.X, cluster_indices, max_em_iters=em_iters)
    return g, cluster_indices, data

And here is the calling code:

from multiprocessing import Pool

pool = Pool()
iter_bic_list = pool.map(self.step_M, iter_training.items())

The iter_training dictionary looks like this: (screenshot not reproduced here)

And this is the error (screenshot not reproduced here). Could you help me fix it? Thank you.

asked Sep 01 '25 by Hudo


2 Answers

I found this to be a problem with CUDA placing a mutex on the process ID. When you use the multiprocessing module, a subprocess with a separate PID is spawned, and it cannot access the GPU because of that mutex.

A quick workaround that I found to work is using the threading module instead of the multiprocessing module.

So basically, the same PID that loads the network onto the GPU should be the one that uses it.
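As a minimal sketch (assuming the asker's step_M method and iter_training dictionary from the question), the process-based Pool can be swapped for a thread-based pool from the standard library, so all GPU work stays in the PID that initialized CUDA:

from multiprocessing.pool import ThreadPool  # thread-based alternative to Pool

# Threads share the parent's PID, so the CUDA context created there stays usable.
pool = ThreadPool(processes=4)
iter_bic_list = pool.map(self.step_M, iter_training.items())
pool.close()
pool.join()

Python threads are subject to the GIL, but since the heavy work here runs on the GPU, the GIL is usually not the bottleneck.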

answered Sep 02 '25 by sagarwal


I went through the same problem. The reason behind it is that if you create a CUDA context before the fork(), you cannot use it within the child process. The cudaSetDevice(0); call attempts to share the CUDA context that was implicitly created in the parent process when cudaGetDeviceCount(); was called.

Two possible solutions for the above problem:

1. Perform all CUDA-related operations in either the parent or the child process, but not both. (Make sure no CUDA call is made in the parent process if the work happens in the child process; for example, torch.cuda.is_available() and torch.manual_seed both count as CUDA calls in PyTorch 0.2.0.)

2. Change the process start method to spawn by calling multiprocessing.set_start_method('spawn') before creating any process.

For example:

import multiprocessing

# Must be called once, before any process is created
multiprocessing.set_start_method('spawn')

p1 = multiprocessing.Process(target=fun, args=(param1,))
p1.start()
p1.join()
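For the asker's pool.map-based code, the same idea can be applied by creating the Pool from a 'spawn' context. The sketch below uses a hypothetical worker fun to stay self-contained:

import multiprocessing

def fun(param1):
    # All CUDA work would happen here, inside a freshly spawned worker
    return param1

if __name__ == '__main__':
    # Each worker starts a fresh interpreter and initializes its own CUDA context
    ctx = multiprocessing.get_context('spawn')
    with ctx.Pool(processes=2) as pool:
        results = pool.map(fun, [1, 2, 3, 4])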
answered Sep 02 '25 by SHIVPOOJAN SAINI