 

Memory leak for Optuna trial with multiprocessing

The Background

I have a machine learning pipeline that consists of N boosted models (LGBMRegressor), each with identical hyperparameters. Each of the N LGBMRegressors is trained on a separate chunk of data. My current workstation has a lot of cores, so I train the regressors in parallel, each in its own worker process.

The Problem

I am trying to tune the parameters that go into the LGBMRegressors with Optuna. When I use multiprocessing inside an Optuna trial, memory usage grows with every trial until I run out of memory. Can I use multiprocessing inside an Optuna trial without this memory leak?

Minimal Reproducible Example

import optuna
import pandas as pd
import numpy as np
import multiprocessing
from lightgbm import LGBMRegressor

N = 500
n_cores = 30
rows_per_N = 1000
cols_per_N = 50
data = [[np.random.normal(size=(rows_per_N, cols_per_N)), np.random.normal(size=(rows_per_N,))] for i in range(N)]

def get_metric(args):
    (X, y), params = args
    model = LGBMRegressor(**params)
    model.fit(X, y)
    return np.abs(model.predict(X) - y)


def objective(trial):
    param = {
        "n_jobs": 1,  # n_jobs expects an int, not the string "1"
        "num_leaves": trial.suggest_int("num_leaves", 2, 256)
    }
    lgb_params = [param for _ in range(N)]
    p = multiprocessing.Pool(n_cores)
    results = p.map(get_metric, zip(data, lgb_params))
    return np.mean(results)


study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=100)

Alternative Solutions

I have rewritten the above code as a plain for loop, and it has no memory issues. The drawback is that it is roughly 30x slower than the multiprocessed solution.
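For reference, the serial fallback follows this shape. This is a minimal sketch: a least-squares fit stands in for LGBMRegressor (a hypothetical substitution so the sketch runs without lightgbm installed), and the data sizes are shrunk.

```python
import numpy as np

def get_metric_serial(X, y):
    # Hypothetical stand-in for LGBMRegressor.fit/.predict: an ordinary
    # least-squares fit, used only so this sketch runs without lightgbm.
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.abs(X @ coef - y)

rng = np.random.default_rng(0)
N, rows_per_N, cols_per_N = 5, 100, 10  # small sizes for the sketch
data = [(rng.normal(size=(rows_per_N, cols_per_N)), rng.normal(size=rows_per_N))
        for _ in range(N)]

# The serial version: one chunk after another, no Pool, so no worker
# processes to leak -- but no parallel speedup either.
results = [get_metric_serial(X, y) for X, y in data]
score = float(np.mean(results))
```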

asked Sep 13 '25 by Ottpocket

1 Answer

As suggested by @J. M. Arnold, you should use a context manager for the pool so that it is reliably shut down when the trial finishes (exiting the with block calls terminate(), so worker processes do not linger) and you avoid the potential memory leak. Additionally, per the Optuna documentation, you can reduce out-of-memory errors by running the garbage collector after each trial: set gc_after_trial to True in the study.optimize method.

import optuna
import pandas as pd
import numpy as np
import multiprocessing
from lightgbm import LGBMRegressor

N = 500
n_cores = 30
rows_per_N = 1000
cols_per_N = 50
data = [[np.random.normal(size=(rows_per_N, cols_per_N)), np.random.normal(size=(rows_per_N,))] for i in range(N)]

def get_metric(args):
    (X, y), params = args
    model = LGBMRegressor(**params)
    model.fit(X, y)
    return np.abs(model.predict(X) - y)


def objective(trial):
    param = {
        "n_jobs": 1,  # n_jobs expects an int, not the string "1"
        "num_leaves": trial.suggest_int("num_leaves", 2, 256)
    }
    lgb_params = [param for _ in range(N)]
    with multiprocessing.Pool(n_cores) as p:
        results = p.map(get_metric, zip(data, lgb_params))
        return np.mean(results)


study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=100, gc_after_trial=True)
answered Sep 15 '25 by vht981230