
scikit-learn's GridSearchCV stops working when n_jobs>1

I previously asked a question here and came up with the following lines of code:

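from sklearn import neighbors
from sklearn.grid_search import GridSearchCV  # module path in scikit-learn 0.14.x

parameters = [{'weights': ['uniform'], 'n_neighbors': [5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]}]
clf = GridSearchCV(neighbors.KNeighborsRegressor(), parameters, n_jobs=4)
clf.fit(features, rewards)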

But when I ran this, another problem appeared that is unrelated to the previously asked question. Python terminates with the following OS crash report:

Process:         Python [1327]
Path:            /Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python
Identifier:      Python
Version:         2.7.2.5 (2.7.2.5.r64662-trunk)
Code Type:       X86-64 (Native)
Parent Process:  Python [1316]
Responsible:     Sublime Text 2 [308]
User ID:         501

Date/Time:       2014-08-12 10:27:24.640 +0200
OS Version:      Mac OS X 10.9.4 (13E28)
Report Version:  11
Anonymous UUID:  D10CD8B7-221F-B121-98D4-4574A1F2189F

Sleep/Wake UUID: 0B9C4AE0-26E6-4DE8-B751-665791968115

Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000110

VM Regions Near 0x110:
--> 
__TEXT                 0000000100000000-0000000100001000 [    4K] r-x/rwx SM=COW  /Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python

Application Specific Information:
*** multi-threaded process forked ***
crashed on child side of fork pre-exec

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libdispatch.dylib               0x00007fff91534c90 dispatch_group_async_f + 141
1   libBLAS.dylib                   0x00007fff9413f791 APL_sgemm + 1061
2   libBLAS.dylib                   0x00007fff9413cb3f cblas_sgemm + 1267
3   _dotblas.so                     0x0000000102b0236e dotblas_matrixproduct + 5934
4   org.activestate.ActivePython27  0x00000001000c552d PyEval_EvalFrameEx + 23949
5   org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
6   org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
7   org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
8   org.activestate.ActivePython27  0x000000010003d390 function_call + 176
9   org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
10  org.activestate.ActivePython27  0x00000001000c098a PyEval_EvalFrameEx + 4586
11  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
12  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
13  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
14  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
15  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
16  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
17  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
18  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
19  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
20  org.activestate.ActivePython27  0x00000001000c098a PyEval_EvalFrameEx + 4586
21  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
22  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
23  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
24  org.activestate.ActivePython27  0x000000010001d36d instancemethod_call + 365
25  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
26  org.activestate.ActivePython27  0x0000000100077dfa slot_tp_call + 74
27  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
28  org.activestate.ActivePython27  0x00000001000c098a PyEval_EvalFrameEx + 4586
29  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
30  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
31  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
32  org.activestate.ActivePython27  0x00000001000c098a PyEval_EvalFrameEx + 4586
33  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
34  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
35  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
36  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
37  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
38  org.activestate.ActivePython27  0x000000010001d36d instancemethod_call + 365
39  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
40  org.activestate.ActivePython27  0x0000000100077a28 slot_tp_init + 88
41  org.activestate.ActivePython27  0x0000000100074e25 type_call + 245
42  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
43  org.activestate.ActivePython27  0x00000001000c267d PyEval_EvalFrameEx + 11997  
44  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
45  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
46  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
47  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
48  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
49  org.activestate.ActivePython27  0x000000010001d36d instancemethod_call + 365
50  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
51  org.activestate.ActivePython27  0x0000000100077a28 slot_tp_init + 88
52  org.activestate.ActivePython27  0x0000000100074e25 type_call + 245
53  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
54  org.activestate.ActivePython27  0x00000001000c267d PyEval_EvalFrameEx + 11997
55  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
56  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
57  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
58  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
59  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98   
60  org.activestate.ActivePython27  0x000000010001d36d instancemethod_call + 365
61  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
62  org.activestate.ActivePython27  0x0000000100077dfa slot_tp_call + 74
63  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
64  org.activestate.ActivePython27  0x00000001000c267d PyEval_EvalFrameEx + 11997
65  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
66  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
67  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
68  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
69  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
70  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
71  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
72  org.activestate.ActivePython27  0x00000001000c7bf6 PyEval_EvalCode + 54
73  org.activestate.ActivePython27  0x00000001000ed31e PyRun_FileExFlags + 174
74  org.activestate.ActivePython27  0x00000001000ed5d9 PyRun_SimpleFileExFlags + 489
75  org.activestate.ActivePython27  0x00000001001041dc Py_Main + 2940
76  org.activestate.ActivePython27.app  0x0000000100000ed4 0x100000000 + 3796

Thread 0 crashed with X86 Thread State (64-bit):
rax: 0x0000000000000100  rbx: 0x00007fff7cd43640  rcx: 0x0000000000000000  rdx: 0x0000000105e00000
rdi: 0x0000000000000008  rsi: 0x0000000105e01000  rbp: 0x00007fff5fbfa370  rsp: 0x00007fff5fbfa350
r8: 0x0000000000000001   r9: 0x0000000105e00000  r10: 0x0000000105e01000  r11: 0x0000000000000000
r12: 0x000000010ba10530  r13: 0x000000010b000000  r14: 0x00000001066d1970  r15: 0x00007fff915311af
rip: 0x00007fff91534c90  rfl: 0x0000000000010206  cr2: 0x0000000000000110

Logical CPU:     2
Error Code:      0x00000006
Trap Number:     14

.........
VM Region Summary:
ReadOnly portion of Libraries: Total=183.7M resident=97.0M(53%) swapped_out_or_unallocated=86.7M(47%)
Writable regions: Total=1.3G written=142.8M(11%) resident=503.6M(39%) swapped_out=0K(0%) unallocated=791.7M(61%)

When I replace the second line of my code with:

clf = GridSearchCV(neighbors.KNeighborsRegressor(), parameters, n_jobs=1)

then everything works fine, except that it no longer runs in parallel.

My operating system is OSX 10.9.4

My python version is 2.7.8 |Anaconda 2.0.1 (x86_64)| (default, Jul 2 2014, 15:36:00) [GCC 4.2.1 (Apple Inc. build 5577)]

My scikit-learn version is 0.14.1

My numpy version is 1.8.1

And my scipy version is 0.14.0

My question: does anybody have an idea how to make GridSearchCV run with more than one job?

EDIT:

I have realized that this error actually happens only for some of my input data sets. Unfortunately, the problematic data sets (their X matrices) are too big to copy here. The input features are basically tf-idf vectors, and the y values are floats >= 0, specifically:

[60.0, 7.0, 12.0, 21.0, 5.5, 3.0, 0.0, 2.5, 11.0, 3.0, 16.0, 2.0, 0.0, 4.5, 2.5, 6.0, 9.5, 2.5, 15.0, 7.0, 8.0, 13.0, 14.0, 8.0, 3.5, 6.0, 22.5, 7.0, 4.0, 3.5, 4.5, 6.0, 5.5, 7.0, 2.0, 0.0, 0.0, 0.0, 14.5, 8.0, 7.5, 2.5, 11.5, 1.0, 3.0, 14.5, 10.0, 14.5, 8.0, 8.0, 7.0, 2.5, 3.5, 3.0, 13.5, 7.0, 6.5, 2.5, 9.0, 8.0, 11.0, 17.5, 12.5, 4.5, 5.5, 8.0, 2.0, 7.0, 4.0, 1.5, 3.0, 21.5, 4.5, 4.0, 7.0, 9.0, 13.5, 8.0, 10.5, 4.5, 1.5, 11.5, 7.5, 11.5, 4.5, 5.0, 7.0, 9.5, 4.0, 4.0, 6.0, 3.5, 4.5, 7.5, 3.5, 3.5, 3.5, 6.0, 5.0, 5.5, 25.0, 6.5, 5.0, 2.0, 2.0, 10.5, 0.0, 6.5, 19.0, 9.0, 1.0, 1.5, 1.0, 0.0, 1.0, 4.5, 2.5, 17.5, 39.5, 7.5, 5.5, 8.0, 1.0, 6.0, 12.0, 10.0, 5.5, 19.0, 4.5, 1.5, 25.5, 4.0, 10.0, 18.5, 9.5, 10.5, 2.5, 6.0, 1.0, 10.0, 8.5, 12.5, 13.5, 5.0, 6.5, 11.0, 4.5, 8.0, 7.5, 11.5, 14.5, 9.0, 3.0, 1.5, 3.5, 5.5, 2.5, 12.5, 6.5, 5.5, 5.0, 0.0, 8.0, 3.0, 14.5, 5.0, 14.0, 7.0, 13.5, 12.5, 4.0, 1.5, 6.5, 10.5, 9.0, 16.5, 4.0, 4.0, 15.0, 11.5, 2.5, 8.5, 3.0, 5.0, 4.0, 8.5, 6.0, 5.0, 5.0, 5.0, 5.5, 8.0, 11.0, 4.0, 0.0, 5.5, 0.0, 4.5, 1.5, 0.0, 6.5, 11.0, 2.5, 8.0, 15.5, 5.5, 4.5, 5.0, 4.0, 5.5, 10.5, 7.5, 6.5, 8.5, 2.5, 1.5, 1.5, 18.0, 15.0, 14.0, 9.5, 5.5, 7.5, 14.5, 2.5, 5.0, 60.0, 6.5, 14.5, 6.5, 4.0, 1.5, 2.0, 4.0, 27.0, 3.0, 5.0, 4.0, 2.5, 1.0, 1.5, 1.5, 9.0, 4.0, 8.5, 4.0, 4.0, 0.0, 1.5, 7.5, 1.5, 7.5, 1.0, 28.5, 15.5, 7.5, 1.0, 2.5, 2.5, 2.5, 16.0, 5.5, 8.5, 4.0, 2.5, 5.0, 2.5, 6.0, 11.0, 10.0, 4.5, 6.5, 8.0, 6.0, 4.5, 15.5, 4.0, 5.0]

The version with 1 job works for all of my input data sets, even for this one.

ziky90 asked Aug 12 '14 08:08



2 Answers

libdispatch.dylib, part of Grand Central Dispatch (GCD), is used internally by Accelerate, OS X's built-in BLAS implementation, whenever you call numpy.dot. The GCD runtime does not work in a process created by the POSIX fork syscall unless an exec syscall follows, which makes any Python program that uses the multiprocessing module prone to crashing. That matches the crash report in the question: the child process died in dispatch_group_async_f, called from cblas_sgemm, "on child side of fork pre-exec". sklearn's GridSearchCV uses the Python multiprocessing module for parallelization.

Under Python 3.4 and later, you can force Python multiprocessing to use the forkserver start method instead of the default fork mode to work around this problem, for instance at the beginning of the main file of your program:

if __name__ == "__main__":
    import multiprocessing as mp
    mp.set_start_method('forkserver')
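Not every platform offers forkserver (Windows, for example, only provides spawn), so a slightly more defensive variant of the snippet above may be useful. The following is only a sketch: choose_start_method is a hypothetical helper of mine, not part of multiprocessing or scikit-learn, and the preference order is an assumption.

```python
import multiprocessing as mp


def choose_start_method(preferred=("forkserver", "spawn")):
    """Return the first preferred start method this platform supports.

    Hypothetical helper for illustration; not a library function.
    """
    available = mp.get_all_start_methods()
    for method in preferred:
        if method in available:
            return method
    # None of the preferred methods exist: fall back to the platform default,
    # which get_all_start_methods() lists first.
    return available[0]


if __name__ == "__main__":
    # Must run before any worker process is created.
    # force=True only keeps the sketch from raising when it is re-run
    # in an interpreter where a start method was already set.
    mp.set_start_method(choose_start_method(), force=True)
    print("start method:", mp.get_start_method())
```

Both forkserver and spawn avoid running the parent's already initialized GCD state in a forked child, which is the situation described above.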

Alternatively, you can rebuild numpy from source and make it link against ATLAS or OpenBLAS instead of OSX Accelerate. The numpy developers are working on binary distributions that include either ATLAS or OpenBLAS by default.
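Before rebuilding, it may be worth checking which BLAS your numpy is actually linked against. A minimal sketch, assuming numpy.show_config() prints its build information to standard output (the exact format differs between numpy versions); numpy_build_config is a hypothetical helper name:

```python
import contextlib
import io


def numpy_build_config():
    """Return numpy's build configuration as text, or None if numpy is absent."""
    try:
        import numpy as np
    except ImportError:
        return None
    buffer = io.StringIO()
    # show_config() prints rather than returns, so capture stdout.
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return buffer.getvalue()


if __name__ == "__main__":
    config = numpy_build_config()
    print(config if config is not None else "numpy is not installed")
```

If the output mentions Accelerate or vecLib, the build uses the OS X library discussed above; mentions of openblas or atlas suggest it does not.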

ogrisel answered Nov 07 '22 17:11



That worked perfectly for me as well (upgrading was a bit of a drag, but this was the only fix, of many attempted, that worked in my case). For any other IPython notebook users out there, the best way to work this in is to add it to the notebook configuration (you'll get an error if you try to run it directly in a notebook). The commands can be added like this:

# in ipython_notebook_config.py
c.IPKernelApp.exec_lines = ['import multiprocessing', 'multiprocessing.set_start_method("forkserver")']
Eric Czech answered Nov 07 '22 17:11
