I have been trying to use Python's multiprocessing module to achieve parallelism on a computationally expensive task.
My code runs, but it doesn't run in parallel. I have been reading the multiprocessing manual page and forums to find out why, and I haven't figured it out yet.
I think the problem may be related to some kind of lock when executing the other modules that I created and imported.
Here is my code:
main.py:
##import my modules
import time
import prepare_data
import filter_part
import wrapper_part
import utils
from myClasses import ML_set
from myClasses import data_instance

n_proc = 5

def main():
    if __name__ == '__main__':
        ##only the main process should run this
        data = prepare_data.import_data()  ##read data from file
        data = prepare_data.remove_and_correct_outliers(data)
        data = prepare_data.normalize_data_range(data)
        features = filter_part.filter_features(data)

        start_t = time.time()
        ##parallelism will be used on this part
        best_subset = wrapper_part.wrapper(n_proc, data, features)
        print time.time() - start_t

main()
wrapper_part.py:
##my modules
import multiprocessing as mp
from myClasses import ML_set
from myClasses import data_instance
import utils

def wrapper(n_proc, data, features):
    p_work_list = utils.divide_features(n_proc-1, features)
    n_train, n_test = utils.divide_data(data)
    workers = []
    for i in range(0, n_proc-1):
        print "sending process:", i
        p = mp.Process(target=worker_classification, args=(i, p_work_list[i], data, features, n_train, n_test))
        workers.append(p)
        p.start()
    for worker in workers:
        print "waiting for join from worker"
        worker.join()
    return

def worker_classification(id, work_list, data, features, n_train, n_test):
    print "Worker ", id, " starting..."
    best_acc = 0
    best_subset = []
    while work_list != []:
        test_subset = work_list[0]
        del(work_list[0])
        train_set, test_set = utils.cut_dataset(n_train, n_test, data, test_subset)
        _, acc = classification_decision_tree(train_set, test_set)
        if acc > best_acc:
            best_acc = acc
            best_subset = test_subset
    print id, " found best subset -> ", best_subset, " with accuracy: ", best_acc
All the other modules don't use the multiprocessing module and work fine. At this stage I'm just testing parallelism, not even trying to get the results back, so there isn't any communication between processes and no shared-memory variables. Some variables are used by every process, but they are defined before the processes are spawned, so as far as I know each process gets its own copy of them.
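To sanity-check that assumption, here is a minimal standalone sketch (not part of my project; worker and shared_list are made up for illustration) showing that each child works on its own copy of an argument:

import multiprocessing as mp

def worker(idx, shared_list):
    ##the child receives a pickled (or copy-on-write) copy of shared_list;
    ##mutating it here does not affect the parent's list
    shared_list.append(idx)
    print "child", idx, "sees:", shared_list

if __name__ == '__main__':
    data = [0]
    procs = [mp.Process(target=worker, args=(i, data)) for i in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print "parent still sees:", data  ##prints [0]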
As output for 5 processes I get this:
importing data from file...
sending process: 0
sending process: 1
Worker 0 starting...
0 found best subset -> [2313] with accuracy: 60.41
sending process: 2
Worker 1 starting...
1 found best subset -> [3055] with accuracy: 60.75
sending process: 3
Worker 2 starting...
2 found best subset -> [3977] with accuracy: 62.8
waiting for join from worker
waiting for join from worker
waiting for join from worker
waiting for join from worker
Worker 3 starting...
3 found best subset -> [5770] with accuracy: 60.07
55.4430000782
It took around 55 seconds for 4 processes to execute the parallel part. Testing this with only 1 process, the execution time is 16 seconds:
importing data from file...
sending process: 0
waiting for join from worker
Worker 0 starting...
0 found best subset -> [5870] with accuracy: 63.32
16.4409999847
I'm running this on Python 2.7 and Windows 8.
EDIT
I tested my code on Ubuntu and it worked, so I guess something is wrong between Windows 8 and Python. Here is the output on Ubuntu:
importing data from file...
size trainset: 792 size testset: 302
sending process: 0
sending process: 1
Worker 0 starting...
sending process: 2
Worker 1 starting...
sending process: 3
Worker 2 starting...
waiting for join from worker
Worker 3 starting...
2 found best subset -> [5199] with accuracy: 60.93
1 found best subset -> [3198] with accuracy: 60.93
0 found best subset -> [1657] with accuracy: 61.26
waiting for join from worker
waiting for join from worker
waiting for join from worker
3 found best subset -> [5985] with accuracy: 62.25
6.1428809166
I'll start using Ubuntu for testing from now on, but I would still like to know why the code doesn't work on Windows.
Make sure to read the Windows guidelines in the multiprocessing manual: https://docs.python.org/2/library/multiprocessing.html#windows
Especially "Safe importing of main module":
Instead one should protect the "entry point" of the program by using if __name__ == '__main__': as follows:
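The example given there is along these lines (paraphrased from the linked page):

import multiprocessing as mp

def foo():
    print 'hello'

if __name__ == '__main__':
    p = mp.Process(target=foo)
    p.start()
    p.join()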
You violated this rule in the first code snippet shown above, so I did not look further than that. Hopefully the solution to the problems you observe is as simple as including this protection.
The reason why this is important: on Unix-like systems, child processes are created by forking. In that case, the operating system creates an exact copy of the process that performs the fork, so all state is inherited from the parent by the child; for instance, all functions and classes are already defined.
On Windows, there is no such system call. Python needs to perform the quite heavy task of creating a fresh Python interpreter session in the child and re-creating, step by step, the state of the parent; for instance, all functions and classes need to be defined again. That is why heavy import machinery is going on under the hood of a Python multiprocessing child on Windows. This machinery starts when the child imports the main module. In your case, that implies a call to main() in the child. For sure, you do not want that.
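Concretely, the fix would be to move the guard out of main() and wrap the module-level call instead, roughly like this:

def main():
    data = prepare_data.import_data()
    data = prepare_data.remove_and_correct_outliers(data)
    data = prepare_data.normalize_data_range(data)
    features = filter_part.filter_features(data)

    start_t = time.time()
    best_subset = wrapper_part.wrapper(n_proc, data, features)
    print time.time() - start_t

if __name__ == '__main__':
    ##only the parent process executes this; on Windows, children
    ##that re-import this module skip the call to main()
    main()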
You might find this tedious. I find it impressive that the multiprocessing module manages to provide an interface for the same functionality on two such different platforms. Really, with respect to process handling, POSIX-compliant operating systems and Windows are so different that it is inherently difficult to come up with an abstraction that works on both.
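If you want to see the difference for yourself, a small hypothetical test script makes it visible: the unguarded module-level print runs once under fork on Linux, but additionally once per child under Windows:

import multiprocessing as mp
import os

##unguarded module-level code: executed once on Linux (fork),
##but re-executed in every child on Windows (fresh interpreter + import)
print "module loaded in process", os.getpid()

def work():
    print "child", os.getpid(), "working"

if __name__ == '__main__':
    procs = [mp.Process(target=work) for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()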