
CUDA ERROR: initialization error when using parallel in python

I use CUDA for my code, but it still runs slowly, so I changed it to run in parallel using multiprocessing (pool.map) in Python. But now I get CUDA ERROR: initialization error.

This is the function:

def step_M(self, iter_training):
    # Unpack the (g, p, em_iters) tuple and the list of (start, end) index ranges
    gpe, e_tuple_list = iter_training
    g = gpe[0]
    p = gpe[1]
    em_iters = gpe[2]

    # Collect the row indices and data slices covered by the index ranges
    e_tuple_list = sorted(e_tuple_list, key=lambda tup: tup[0])
    data = self.X[e_tuple_list[0][0]:e_tuple_list[0][1]]
    cluster_indices = np.array(range(e_tuple_list[0][0], e_tuple_list[0][1], 1), dtype=np.int32)
    for i in range(1, len(e_tuple_list)):
        d = e_tuple_list[i]
        cluster_indices = np.concatenate((cluster_indices, np.array(range(d[0], d[1], 1), dtype=np.int32)))
        data = np.concatenate((data, self.X[d[0]:d[1]]))

    # Train on the selected subset (this is where the CUDA work happens)
    g.train_on_subset(self.X, cluster_indices, max_em_iters=em_iters)
    return g, cluster_indices, data

And here is the calling code:

from multiprocessing import Pool

pool = Pool()
iter_bic_list = pool.map(self.step_M, iter_training.items())

The iter_training dictionary looks like this: (screenshot not reproduced here)

And this is the error (screenshot not reproduced here). Could you help me fix it? Thank you.

asked Sep 01 '25 by Hudo


2 Answers

I found this to be a problem with CUDA placing a mutex on the process ID. When you use the multiprocessing module, a subprocess with a separate PID is spawned, and it cannot access the GPU because of that mutex.

A quick workaround that I found to work is using the threading module instead of the multiprocessing module.

So basically, the same PID that loads the network onto the GPU should be the one that uses it.
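As a minimal sketch (assuming the asker's step_M method and iter_training dictionary from the question), the process-based Pool can be swapped for a thread-based pool from the standard library, so all GPU work stays in the PID that initialized CUDA:

from multiprocessing.pool import ThreadPool  # thread-based alternative to Pool

# Threads share the parent's PID, so the CUDA context created there stays usable.
pool = ThreadPool(processes=4)
iter_bic_list = pool.map(self.step_M, iter_training.items())
pool.close()
pool.join()

Python threads are subject to the GIL, but since the heavy work here runs on the GPU, the GIL is usually not the bottleneck.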

answered Sep 02 '25 by sagarwal


I went through the same problem. The reason behind it is that if you create a CUDA context before the fork(), you cannot use it within the child process. The cudaSetDevice(0); call attempts to share the CUDA context that was implicitly created in the parent process when cudaGetDeviceCount(); was called.

Two possible solutions for the above problem:

1. Perform all CUDA-related operations in either the parent or the child process, but not both. (Make sure no CUDA call is made in the parent process if the work happens in the child process; for example, torch.cuda.is_available() and torch.manual_seed both count as CUDA calls in PyTorch 0.2.0.)

2. Change the process start method to spawn by calling multiprocessing.set_start_method('spawn') before creating any process.

For example:

import multiprocessing

# Must be called once, before any process is created
multiprocessing.set_start_method('spawn')

p1 = multiprocessing.Process(target=fun, args=(param1,))
p1.start()
p1.join()
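For the asker's pool.map-based code, the same idea can be applied by creating the Pool from a 'spawn' context. The sketch below uses a hypothetical worker fun to stay self-contained:

import multiprocessing

def fun(param1):
    # All CUDA work would happen here, inside a freshly spawned worker
    return param1

if __name__ == '__main__':
    # Each worker starts a fresh interpreter and initializes its own CUDA context
    ctx = multiprocessing.get_context('spawn')
    with ctx.Pool(processes=2) as pool:
        results = pool.map(fun, [1, 2, 3, 4])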
answered Sep 02 '25 by SHIVPOOJAN SAINI