I'm using Ray & RLlib to train RL agents on an Ubuntu system. TensorBoard is used to monitor the training progress by pointing it to ~/ray_results
where all the log files for all runs are stored. Ray Tune is not being used.
For example, on starting a new Ray/RLlib training run, a new directory will be created at
~/ray_results/DQN_ray_custom_env_2020-06-07_05-26-32djwxfdu1
To visualize the training progress, we need to start TensorBoard using
tensorboard --logdir=~/ray_results
Question: Is it possible to configure Ray/RLlib to change the output directory of the log files from ~/ray_results
to another location?
Additionally, instead of logging to a directory named something like DQN_ray_custom_env_2020-06-07_05-26-32djwxfdu1, can this directory name be set by ourselves?
Failed Attempt: Tried setting
os.environ['TUNE_RESULT_DIR'] = '~/another_dir'
before running ray.init(), but the result log files were still being written to ~/ray_results.
Is it possible to configure Ray/RLlib to change the output directory of the log files from ~/ray_results to another location?
There is currently no way to configure this using the RLlib CLI tool (rllib).
If you're okay with the Python API, then, as described in the documentation, the local_dir parameter of tune.run is responsible for specifying the output directory; the default is ~/ray_results.
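A minimal sketch of passing local_dir to tune.run (the trainable, environment, and stopping criterion below are illustrative, not taken from the question):

from ray import tune

# Assumption: DQN on a built-in Gym env, purely for illustration;
# substitute your registered custom env and config.
# Results are written under ~/another_dir instead of ~/ray_results.
tune.run(
    "DQN",
    config={"env": "CartPole-v0"},
    stop={"training_iteration": 10},
    local_dir="~/another_dir",
)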
Additionally, instead of logging to a directory named something like DQN_ray_custom_env_2020-06-07_05-26-32djwxfdu1, can this directory name be set by ourselves?
This is governed by the trial_name_creator parameter of tune.run. It must be a function that accepts a trial object and formats it into a string, like so:
def trial_name_id(trial):
    return f"{trial.trainable_name}_{trial.trial_id}"

tune.run(..., trial_name_creator=trial_name_id)
Without using Tune, you can change the logdir using RLlib's Trainer. The Trainer class takes an optional logger_creator if you want to specify where to save the logs (see here).
A concrete example:
import os
import tempfile
from datetime import datetime
from ray.rllib.agents.ppo import PPOTrainer
from ray.tune.logger import UnifiedLogger

def custom_log_creator(custom_path, custom_str):
    timestr = datetime.today().strftime("%Y-%m-%d_%H-%M-%S")
    logdir_prefix = "{}_{}".format(custom_str, timestr)

    def logger_creator(config):
        # Create the target directory and a uniquely named subdirectory for this run
        if not os.path.exists(custom_path):
            os.makedirs(custom_path)
        logdir = tempfile.mkdtemp(prefix=logdir_prefix, dir=custom_path)
        return UnifiedLogger(config, logdir, loggers=None)

    return logger_creator

trainer = PPOTrainer(config=config, env='CartPole-v0',
                     logger_creator=custom_log_creator(os.path.expanduser("~/another_ray_results/subdir"), 'custom_dir'))

for i in range(ITER_NUM):
    result = trainer.train()
You will find the training results (i.e., TensorBoard events file, params, model, ...) saved under "~/another_ray_results/subdir" with your specified naming convention.
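You can then point TensorBoard at the new location instead of ~/ray_results, e.g. (the path shown matches the example above):

tensorboard --logdir=~/another_ray_results/subdir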