I'm trying to model a biochemical process, and I structured my question as an optimization problem that I solve using differential_evolution from scipy.
So far, so good, I'm pretty happy with the implementation of a simplified model with 15-19 parameters.
I expanded the model and now, with 32 parameters, it is taking way too long. Not totally unexpected, but still an issue, hence the question.
I've seen:
- an almost identical question for R, "Parallel differential evolution"
- and a github issue https://github.com/scipy/scipy/issues/4864 on the topic
but I would like to stay in Python (the model is within a Python pipeline), and the pull request has not led to an officially accepted solution yet, although some options have been suggested.
Also, I can't parallelize the code within the function to be optimised, because it is a series of sequential calculations, each requiring the result of the previous step. The ideal option would be something that evaluates some individuals in parallel and returns them to the population.
Summing up:
- Is there any option within scipy that allows parallelization of differential_evolution that I dumbly overlooked? (Ideal solution)
- Is there a suggestion for an alternative algorithm in scipy that is either (way) faster in serial or possible to parallelize?
- Is there any other good package that offers parallelized differential evolution functions? Or other applicable optimization methods?
- Sanity check: am I overloading DE with 32 parameters, and do I need to radically change approach?
PS
I'm a biologist, and formal math/statistics isn't really my strength, so any formula-to-English translation would be hugely appreciated :)
PPS
As an extreme option I could try to migrate to R, but I can't code C/C++ or other languages.
Scipy's differential_evolution can now be run in parallel extremely easily, by specifying the workers argument:
workers int or map-like callable, optional
If workers is an int the population is subdivided into workers sections and evaluated in parallel (uses multiprocessing.Pool). Supply -1 to use all available CPU cores. Alternatively supply a map-like callable, such as multiprocessing.Pool.map for evaluating the population in parallel. This evaluation is carried out as workers(func, iterable). This option will override the updating keyword to updating='deferred' if workers != 1. Requires that func be pickleable.
New in version 1.2.0.
scipy.optimize.differential_evolution documentation
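As a minimal sketch of the workers option described above (not the asker's actual model): the objective below is a made-up stand-in, and the names TARGET, objective and run_de are hypothetical, chosen only for illustration. It assumes scipy 1.2.0 or later.

```python
import numpy as np
from scipy.optimize import differential_evolution

# Hypothetical "true" parameter values, used only to build a toy objective.
TARGET = np.array([0.5, -1.0, 2.0, 0.0])


def objective(params):
    """Toy stand-in for a sequential model: squared error to TARGET."""
    error = 0.0
    for value, target in zip(params, TARGET):
        error += (value - target) ** 2  # each step adds to the running result
    return error


def run_de(workers=-1):
    """Minimise the toy objective, evaluating the population with `workers` processes."""
    bounds = [(-5.0, 5.0)] * len(TARGET)
    return differential_evolution(
        objective,
        bounds,
        workers=workers,      # -1 uses all available CPU cores
        updating="deferred",  # parallel runs use this mode; set explicitly here
    )


if __name__ == "__main__":
    # The guard matters on platforms that start worker processes by spawning.
    result = run_de(workers=-1)
    print(result.x, result.fun)
```

The objective is defined at module level so that it can be pickled and sent to the worker processes. For a function this cheap the process overhead will likely outweigh any gain; the option should pay off mainly when a single evaluation is slow, as with a large model.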
Thanks to @jp2011 for pointing to pygmo
First, it is worth noting the difference from pygmo 1, since the first link on Google still points to the older version.
Second, multiprocessing islands are available only for Python 3.4+.
Third, it works. The processes I started when I first asked the question are still running as I write, while the pygmo archipelago, running an extensive test of all 18 DE variations available in saDE, finished in less than 3 h. The compiled version using Numba, as suggested here https://esa.github.io/pagmo2/docs/python/tutorials/coding_udp_simple.html will probably finish even earlier. Chapeau.
I personally find it a bit less intuitive than the scipy version, given the need to build a new class (vs a single function in scipy) to define the problem, but that is probably just a personal preference. Also, the mutation/crossover parameters are defined less clearly, which might be a bit obscure for someone approaching DE for the first time.
But, since serial DE in scipy just isn't cutting it, welcome pygmo(2).
Additionally, I found a couple of other options claiming to parallelize DE. I didn't test them myself, but they might be useful to someone stumbling on this question.
Platypus, focused on multiobjective evolutionary algorithms: https://github.com/Project-Platypus/Platypus
Yabox: https://github.com/pablormier/yabox
From Yabox's creator, a detailed yet, IMHO, crystal-clear explanation of DE: https://pablormier.github.io/2017/09/05/a-tutorial-on-differential-evolution-with-python/
I've been having exactly the same problem. Perhaps you could try pygmo, which supports different optimisation algorithms (including DE) and has a model for parallel computation. However, I'm finding that its community is not as big as scipy's. Their tutorials, documentation, and examples are of good quality, and one can get things to work from them.