
Why does multiprocessing Julia break my module imports?

My team is trying to run a library (Cbc with JuMP) across multiple processes using the julia -p # argument. Our code lives in a Julia package, and it runs fine with julia --project, but only with one process. Specifying both at once (julia --project -p 8) breaks the project: running using PackageName afterwards results in an error. We also intend to compile this with the PackageCompiler library, so getting it to work with a project is necessary.

Our project is in a folder with a src directory, a Project.toml, and a Manifest.toml. The src directory contains main.jl and Solver.jl.

Project.toml contains:

name = "Solver"
uuid = "5a323fe4-ce2a-47f6-9022-780aeeac18fe"
authors = ["..."]
version = "0.1.0"

Normally, our project works fine starting this way (single threaded):

julia --project
julia> using Solver
julia> include("src/main.jl")

If we add the -p 8 argument when starting Julia, we get an error upon typing using Solver:

ERROR: On worker 2:
ArgumentError: Package Solver [5a323fe4-ce2a-47f6-9022-780aeeac18fe] is required but does not seem to be installed:
 - Run `Pkg.instantiate()` to install all recorded dependencies.

We tried running using Pkg; Pkg.instantiate(); using Solver, but this doesn't help: another error occurs later (at the include("src/main.jl") step):

ERROR: LoadError: On worker 2:
ArgumentError: Package Solver not found in current path:
- Run `import Pkg; Pkg.add("Solver")` to install the Solver package.

Following that suggestion produces yet another error:

ERROR: The following package names could not be resolved:
 * Solver (not found in project, manifest or registry)
Please specify by known `name=uuid`.

Why does this module import work fine in single process mode, but not with -p 8?

Thanks in advance for your consideration

Joshua Stowell asked Mar 30 '20 16:03


2 Answers

First, it is important to note that you are NOT using multi-threaded parallelism; you are using distributed parallelism. When you start Julia with -p 2, you launch two additional worker processes that do not share memory with the master process. Additionally, the project is only loaded in the master process, which is why the workers cannot see anything in the project. You can learn more about the different kinds of parallelism that Julia offers in the official documentation.
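The separation between processes can be seen directly. The following is a minimal sketch, assuming a fresh Julia session started without -p; the variable name counter is made up for illustration. It adds two local workers and asks each one whether a global defined on the master process exists there.

```julia
using Distributed

# Add two local worker processes; the master process always has id 1.
pids = addprocs(2)

# Define a global on the master process only (hypothetical name).
counter = 42

# Evaluate `isdefined(Main, :counter)` inside each worker's own Main module.
check = :(isdefined(Main, :counter))
seen_on_workers = [remotecall_fetch(Core.eval, pid, Main, check) for pid in pids]

println("master id: ", myid())
println("worker ids: ", pids)
println("counter defined on workers? ", seen_on_workers)
```

Each worker reports on its own memory, so anything that was only set up on the master process (including the active project) has to be set up on the workers explicitly.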

To load the environment in all the workers, you can add this to the beginning of your file.

using Distributed
addprocs(2; exeflags="--project")
@everywhere using Solver
@everywhere include("src/main.jl")

Then remove the -p 2 part from the command you launch Julia with. This loads the project on all processes. The @everywhere macro tells all processes to perform the given task. This part of the docs explains it.
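To check that the workers actually picked up the environment, one option is to ask each worker for its active project. This is a sketch under the assumption that the session was started with julia --project from the project folder. Passing the project path explicitly in exeflags is an illustrative variation on the snippet above, not something the original setup requires.

```julia
using Distributed

# Folder containing the Project.toml that this session is using.
project_dir = dirname(Base.active_project())

# Start two workers and point them at the same project explicitly.
pids = addprocs(2; exeflags="--project=$(project_dir)")

# Ask every worker which project file it has active.
worker_projects = [remotecall_fetch(Base.active_project, pid) for pid in pids]

for (pid, proj) in zip(pids, worker_projects)
    println("worker ", pid, " -> ", proj)
end
```

If a worker prints a different path than the master's project, then using Solver on that worker is likely to fail in the way shown in the question.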

Be aware, however, that parallelism doesn't work automatically, so if your software is not written with distributed parallelism in mind, it may not get any benefit from the newly launched workers.

aramirezreyes answered Nov 17 '22 01:11


There is an issue with Julia when an uncompiled module exists and several parallel processes try to compile it at the same time for the first use.

Hence, if you are running your own module across many processes on a single machine, you always need to run it in the following way (this assumes that the Julia process is started in the folder where your project is located):

using Distributed, Pkg
@everywhere using Distributed, Pkg
Pkg.activate(".")
@everywhere Pkg.activate(".")
using YourModuleName
@everywhere using YourModuleName

I think this approach is undocumented, but I found experimentally that it is the most robust. If you do not use this pattern, sometimes (not always!) a compilation race occurs and strange things tend to happen.

Note that if you are running a distributed cluster, you need to modify the code above to run the initialization on a single worker from each node and then on all workers.
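One way the per-node staging could look is sketched below. This is an assumption about how to implement the idea, not a tested cluster recipe: two local workers stand in for cluster workers, LinearAlgebra stands in for YourModuleName, and the names host_of and leaders are made up for illustration.

```julia
using Distributed

# Two local workers stand in for a cluster here; on a real cluster the
# workers would be added with a machine list or a cluster manager.
addprocs(2)

# Record which host each worker runs on.
host_of = Dict(pid => remotecall_fetch(gethostname, pid) for pid in workers())

# Pick the lowest worker id on each host as that node's "leader".
leaders = Int[]
seen_hosts = Set{String}()
for pid in sort(workers())
    if !(host_of[pid] in seen_hosts)
        push!(seen_hosts, host_of[pid])
        push!(leaders, pid)
    end
end

# Stage 1: one worker per node loads the module first.
# LinearAlgebra is only a stand-in for YourModuleName.
@everywhere leaders using LinearAlgebra

# Stage 2: every worker loads it.
@everywhere using LinearAlgebra
```

On a single machine all workers share one hostname, so leaders contains a single worker id and stage 1 runs on that worker only.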

Przemyslaw Szufel answered Nov 17 '22 03:11