I'm developing a Python project for dealing with computer simulations, and I'm also developing a GUI for it. (The core logic itself does not require a GUI.) The GUI toolkit I use is wxPython, but I think my question is general enough not to depend on it.
The way the GUI currently works is that it starts the core logic package (called garlicsim) in the same process and the same thread as the GUI. This works, but I understand it's a problematic approach, because if the core logic needs to do some hard computation, the GUI will hang, which I consider unacceptable.
What should I do?
I heard about the option of launching the core logic on a separate process from the GUI. This sounds interesting, but I have a lot of questions about this.
Do I use the multiprocessing package or the subprocess package to launch the new process?
You might find some inspiration here: http://wiki.wxpython.org/LongRunningTasks. However, it covers multithreading, not multiprocessing.
The basic idea
You could even drive the I/O communication through a socket, which would make it easy to manage the simulation over a network.
Edit: I just saw the multiprocessing package (new in Python 2.6) that you mentioned. It seems like a nice pick; you could then use queues to communicate between the processes. This is a tighter coupling, so choose based on your needs.
To answer the specific questions.
"Do I use the multiprocessing package or the subprocess package to launch the new process?"
Use multiprocessing.
"How do I have easy access to the simulation data from the GUI process?"
You don't have access to the simulation process's objects, if that's what you're asking. The simulation is a separate process. You can start it, stop it, and -- most importantly -- make requests via a queue of commands that go to the simulator.
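As a rough illustration of the command-queue idea, here is a minimal sketch using the standard multiprocessing module. The names simulation_worker and main, the "report" and "stop" commands, and the integer counter standing in for real simulation state are all made up for this example; they are not part of garlicsim.

```python
import multiprocessing as mp
import queue


def simulation_worker(commands, results):
    """Run a toy simulation, obeying commands sent by the GUI process.

    Works with any pair of queue-like objects (get_nowait / put).
    """
    state = 0  # hypothetical stand-in for the real simulation state
    while True:
        # Look for a pending command without blocking the simulation.
        try:
            command = commands.get_nowait()
        except queue.Empty:
            command = None
        if command == "stop":
            results.put(("stopped", state))
            return
        if command == "report":
            results.put(("state", state))
        state += 1  # stand-in for one step of real computation


def main():
    commands = mp.Queue()
    results = mp.Queue()
    worker = mp.Process(target=simulation_worker, args=(commands, results))
    worker.start()
    commands.put("report")
    print(results.get(timeout=10))  # ("state", <steps done so far>)
    commands.put("stop")
    print(results.get(timeout=10))  # ("stopped", <final step count>)
    worker.join()


if __name__ == "__main__":
    main()
```

The GUI process only ever touches the two queues, so it never blocks on the computation itself. In a real GUI the blocking results.get calls would be replaced by non-blocking polling from an event handler.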
"The user should be able to browse through the timeline of the simulation easily and smoothly. How can this be done?"
This is just design. Single process, multiple processes, multiple threads don't have any impact on this question at all.
Each simulation must have some parameters, it must start, it must produce a log (or timeline). That has to be done no matter what library you use to start and stop the simulation.
The output from the simulation -- which is input to your GUI -- can be done a million ways.
Database. The simulation timeline could be inserted into a SQLite database and queried by the GUI. This doesn't work out terribly well because SQLite doesn't have really clever locking. But it does work.
File. The simulation timeline is written to a file. The GUI reads the file. This works out really, really well.
Request/Reply. The simulation has multiple threads, one of which is dequeueing commands and responding by -- for example -- sending back the timeline up to the moment, or stopping the simulation or changing parameters and restarting it.
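Of the options above, the file-based one might look roughly like the following sketch. It assumes a JSON-lines file where each line is one timeline entry; the function names append_step and read_new_steps and the entry layout are invented for illustration.

```python
import json
import os
import tempfile


def append_step(path, step, state):
    """Simulation side: append one timeline entry as a JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"step": step, "state": state}) + "\n")


def read_new_steps(path, offset=0):
    """GUI side: return entries written since `offset`, plus the new offset."""
    entries = []
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a line that is only partly written
            entries.append(json.loads(line))
            offset = f.tell()
    return entries, offset


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "timeline.jsonl")
    append_step(path, 0, {"x": 0.0})
    append_step(path, 1, {"x": 0.5})
    entries, offset = read_new_steps(path)
    print(len(entries))  # 2
    append_step(path, 2, {"x": 1.0})
    entries, offset = read_new_steps(path, offset)
    print(entries)  # [{'step': 2, 'state': {'x': 1.0}}]
```

Because the GUI remembers the byte offset it last read up to, each poll only parses entries that are new, and browsing the timeline becomes a matter of indexing into what has already been read.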
The simplest approach that can work for you here is to launch the computation in a separate thread and communicate data between this thread and the GUI using Queue objects. These are thread-safe and very convenient for inter-thread communication.
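A minimal sketch of the thread-plus-queue approach, using only the standard library. The functions run_simulation and drain are hypothetical names, the running sum stands in for real work, and the None sentinel is just one possible way to signal completion.

```python
import queue
import threading


def run_simulation(steps, results):
    """Worker thread: push each computed state onto the queue."""
    state = 0
    for step in range(steps):
        state += step  # stand-in for an expensive computation
        results.put((step, state))
    results.put(None)  # sentinel: the simulation has finished


def drain(results):
    """GUI side: collect whatever is ready without ever blocking."""
    items = []
    while True:
        try:
            items.append(results.get_nowait())
        except queue.Empty:
            return items


if __name__ == "__main__":
    results = queue.Queue()
    worker = threading.Thread(target=run_simulation, args=(4, results))
    worker.start()
    worker.join()  # a real GUI would not join; it would keep polling
    print(drain(results))  # [(0, 0), (1, 1), (2, 3), (3, 6), None]
```

In a GUI, drain would typically be called from a timer or idle handler, so the event loop stays responsive while the worker computes.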
Other solutions are more complex: you may end up running the simulation in a completely separate "server" process that communicates with the main GUI over sockets.
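To give a feel for the socket variant, here is a small sketch of a line-based JSON request/reply exchange over a local TCP socket. The protocol (a "get" command with a "step" field), the function names, and the list used as a timeline are all assumptions made for this example. For brevity the "server" runs in a thread here; in the setup described above it would be a separate process.

```python
import json
import socket
import threading


def serve_one_client(server_sock, timeline):
    """Simulation side: answer line-based JSON requests from one client."""
    conn, _ = server_sock.accept()
    with conn, conn.makefile("r", encoding="utf-8") as reader, \
            conn.makefile("w", encoding="utf-8") as writer:
        for line in reader:  # ends when the client closes the connection
            request = json.loads(line)
            if request.get("cmd") == "get":
                reply = {"state": timeline[request["step"]]}
            else:
                reply = {"error": "unknown command"}
            writer.write(json.dumps(reply) + "\n")
            writer.flush()


def query(address, requests):
    """GUI side: send each request and wait for its one-line reply."""
    with socket.create_connection(address) as sock, \
            sock.makefile("r", encoding="utf-8") as reader, \
            sock.makefile("w", encoding="utf-8") as writer:
        replies = []
        for request in requests:
            writer.write(json.dumps(request) + "\n")
            writer.flush()
            replies.append(json.loads(reader.readline()))
        return replies


if __name__ == "__main__":
    timeline = [0.0, 0.5, 1.0]  # hypothetical simulation states
    server = socket.create_server(("127.0.0.1", 0))  # port 0: OS picks one
    thread = threading.Thread(target=serve_one_client, args=(server, timeline))
    thread.start()
    print(query(server.getsockname(), [{"cmd": "get", "step": 2}]))
    thread.join()
    server.close()
```

The extra complexity buys you the option of running the simulation on another machine, at the cost of designing and maintaining a wire protocol.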
Unfortunately, although you're right that the choice of GUI doesn't affect the answer, the best approach to this problem will depend a lot on what exactly your simulation data is doing.
For example, if it generates sequential data then it can feed it to your GUI via a thread-safe or process-safe queue. But if it mutates the whole data and your GUI needs to be able to see a snapshot at any given time, that might be too expensive to solve by sending the whole state along the queue and might require a mutex-style approach instead to share access to the data structure. So the nature of the work done on your data is paramount here.
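For the second case, where the whole data structure is mutated in place, a mutex-style design could be sketched as follows. This uses threads and threading.Lock for simplicity; the SharedState class and its list of cells are hypothetical stand-ins for real simulation data.

```python
import threading


class SharedState:
    """Simulation state that is mutated in place and read via snapshots."""

    def __init__(self, size):
        self._lock = threading.Lock()
        self._cells = [0] * size
        self._steps = 0

    def advance(self):
        """Simulation thread: mutate the whole structure under the lock."""
        with self._lock:
            for i in range(len(self._cells)):
                self._cells[i] += 1
            self._steps += 1

    def snapshot(self):
        """GUI thread: copy a consistent view while holding the lock."""
        with self._lock:
            return self._steps, list(self._cells)


if __name__ == "__main__":
    shared = SharedState(5)
    worker = threading.Thread(
        target=lambda: [shared.advance() for _ in range(1000)]
    )
    worker.start()
    steps, cells = shared.snapshot()
    # Every cell always matches the step count: no half-updated views.
    print(cells == [steps] * 5)  # True
    worker.join()
```

The lock guarantees the GUI never sees a half-updated state, but the copy made in snapshot is exactly the cost that paragraph warns about, so it is worth measuring for large states.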
As for whether to use multiprocessing or subprocess, that depends on whether or not a completely separate program handles the data. The former is for doing multiprocessing in the style of multithreading: different parts of the same program run in multiple processes. The latter is for when one program wants to run another (which could be a copy of the same program, but usually is not). Again, it's hard to know which is the best approach for your specific situation, although it does sound like you could have the core logic as a command-line application and communicate via pipes, sockets, etc.
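If the core logic were wrapped as a command-line program, talking to it over pipes with subprocess might look like this sketch. The child program here is a made-up stand-in (it just squares numbers read from stdin), and start_core and request are hypothetical helper names.

```python
import subprocess
import sys

# Hypothetical stand-in for a command-line front end to the core logic:
# it reads one integer per line on stdin and prints its square on stdout.
CHILD_SOURCE = """
import sys
for line in sys.stdin:
    print(int(line) ** 2, flush=True)
"""


def start_core():
    """Launch the child program with pipes attached to stdin and stdout."""
    return subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )


def request(proc, value):
    """Send one line to the child and read its one-line reply."""
    proc.stdin.write(f"{value}\n")
    proc.stdin.flush()
    return int(proc.stdout.readline())


if __name__ == "__main__":
    # Leaving the with block closes the pipes and waits for the child.
    with start_core() as core:
        print(request(core, 7))  # 49
```

Both sides must flush after every message; otherwise buffered output can leave each process waiting on the other.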