
Boost python multiple modules in one shared object

I'm trying to use Boost.Python to create a package that contains several modules.

The reasoning is that we want to expose a very large API, and it makes sense to group it into different modules, both for ease of use and to keep Python's memory usage down. On the other hand, we are forced (for reasons beyond the scope of this question) to compile everything into a single shared object.

So I use Boost.Python to create a package that exports several modules, as follows:

#include <boost/python.hpp>

// class1 is assumed to be declared elsewhere
void exportClass1()
{
    namespace bp = boost::python;
    // map the class1 namespace to a sub-module
    // make "from myPackage.class1 import <whatever>" work
    bp::object class1Module(bp::handle<>(bp::borrowed(PyImport_AddModule("myPackage.class1"))));
    // make "from myPackage import class1" work
    bp::scope().attr("class1") = class1Module;
    // set the current scope to the new sub-module
    bp::scope class1_scope = class1Module;

    // export stuff in the class1 namespace
    bp::class_<class1>("class1", bp::init<>())
        // ... class specifics go here ...
        ;

    // other classes of module class1 go here as well
}

BOOST_PYTHON_MODULE(myPackage)
{
    namespace bp = boost::python;

    // specify that this module is actually a package
    bp::object package = bp::scope();
    package.attr("__path__") = "myPackage";

    exportClass1();
    exportClass2();
    // ... further export functions go here ...

}

This code works.

The main problem is memory consumption. The exposed API is very large, so loading the entire package consumes roughly 65MB of RAM just for the declarations, before the user of the package has done anything.

This is of course unacceptable, given that loading a single module should consume maybe 1-3MB of RAM.

In Python, if I run:

from myPackage.myModule import *

OR

from myPackage.myModule import someClass

the memory consumption immediately skyrockets to 65MB.
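One way to put a number on this is to compare the process's peak resident set size before and after the import. The following is only a minimal sketch: the helper names are made up for illustration, it relies on the Unix-only resource module (whose ru_maxrss value is reported in kilobytes on Linux), and "json" is just a placeholder for the module being measured.

```python
import importlib
import resource


def peak_rss_kib():
    """Peak resident set size of this process (kilobytes on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def import_cost_kib(module_name):
    """Growth in peak RSS observed while importing module_name."""
    before = peak_rss_kib()
    importlib.import_module(module_name)
    return peak_rss_kib() - before


# "json" is a placeholder; substitute the sub-module you want to measure,
# e.g. "myPackage.myModule" from the question.
cost = import_cost_kib("json")
print(f"import grew peak RSS by about {cost} KiB")
```

Because this reads the peak value, it only shows growth, and a module that was already imported earlier in the process will report close to zero.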

After doing either of these imports, if I inspect sys.modules I see all the modules in my package listed as "known". However, if I run:

from myPackage.myModule import class1
c = class2()

I get an error:

NameError: name 'class2' is not defined

So it seems I get the worst of both worlds: on the one hand I consume memory as if I had imported everything from my package, and on the other hand the classes are not actually imported.
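The two observations are not in conflict: a module being present in sys.modules and a name being bound in the importing namespace are separate things. The pure-Python sketch below imitates what the package's init function does (registering every sub-module in sys.modules, as PyImport_AddModule does in the C++ code). The names demoPackage, moduleA and moduleB are hypothetical stand-ins, not part of the real package.

```python
import sys
import types

# Hypothetical stand-in for the compiled extension package.
package = types.ModuleType("demoPackage")
package.__path__ = []  # an (empty) __path__ marks the module as a package
sys.modules["demoPackage"] = package


def add_submodule(name, class_name):
    """Register demoPackage.<name> holding one empty class."""
    full_name = "demoPackage." + name
    module = types.ModuleType(full_name)
    setattr(module, class_name, type(class_name, (), {}))
    sys.modules[full_name] = module  # what PyImport_AddModule would do
    setattr(package, name, module)   # make "from demoPackage import <name>" work
    return module


# Like the init function, register *all* sub-modules up front.
add_submodule("moduleA", "class1")
add_submodule("moduleB", "class2")

# Import a single name into a fresh namespace.
namespace = {}
exec("from demoPackage.moduleA import class1", namespace)

class1_bound = "class1" in namespace                     # name was imported
class2_bound = "class2" in namespace                     # name was never bound
module_b_loaded = "demoPackage.moduleB" in sys.modules   # but its module exists

print(class1_bound, class2_bound, module_b_loaded)
```

In this sketch only class1 ends up in the importing namespace, even though both sub-modules already exist in sys.modules. That matches the NameError for class2 above: the whole package was initialised, but only the requested name was bound.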

Any ideas on how to solve this, so that importing a specific module loads only that module rather than reading the whole package into Python's memory (which both takes time and consumes a lot of valuable memory)?

Max Shifrin asked Mar 12 '15 09:03


1 Answer

So this was much simpler than I assumed.

The code above is also correct for calls of the form:

from myPackage.myModule import class1
c = class1()

What prevented this from running correctly was the system path setup. The shared object was not placed in a location on the Python path, and the folder it was in did not contain an __init__.py.

As soon as the shared object was placed in the correct site-packages folder, which of course has an __init__.py, the example above works correctly.
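A quick way to check for this kind of path problem is to ask the import system where it would load a module from. This is a minimal sketch using the standard importlib.util.find_spec; the helper names locate and describe are made up for illustration, and myPackage is the package name from the question.

```python
import importlib.util
import sys


def locate(module_name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


def describe(module_name):
    """Human-readable summary of where module_name resolves."""
    origin = locate(module_name)
    if origin is None:
        return f"{module_name}: not found on sys.path"
    return f"{module_name}: found at {origin}"


print(describe("myPackage"))  # package name taken from the question
for entry in sys.path:
    print("  searched:", entry)
```

If the package is reported as not found, the directory holding the shared object is missing from the list of searched entries.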

Max Shifrin answered Oct 23 '22 18:10