
Purpose of 'num_samples' in Tune of the Ray package for hyperparameter optimization

I am trying to perform a hyperparameter optimization task for an LSTM (pure TensorFlow) with Tune. I followed their example for the HyperOpt algorithm. In the example they used the line below inside the 'config' section.

"num_samples": 10 if args.smoke_test else 1000,

The documentation does not explain what this is. I am unable to determine whether this is a useful piece of code or how I am supposed to alter it for my scenario. It would be great to know the meaning of this line of code.

The example hyperopt code can be found through this link

Suleka_28 asked Jan 01 '19 16:01





2 Answers

You can find the parameter in the documentation of run_experiments.

By default, each random variable and grid search point is sampled once. To take multiple random samples, add num_samples: N to the experiment config. If grid_search is provided as an argument, the grid will be repeated num_samples of times.

Essentially, the parameter is part of the experiment configuration. It controls how many times the hyperparameter space is sampled, that is, how many trial configurations are generated, instead of only one.
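To make the quoted rule concrete, here is a minimal pure-Python sketch of the counting it describes. It is not Ray code: expand_trials, the grid/samplers arguments and the example hyperparameters are hypothetical names invented for illustration, and the sketch only mimics the documented behaviour (random variables are re-drawn for every sample, a grid is repeated num_samples times).

```python
import itertools
import random


def expand_trials(grid, samplers, num_samples=1, seed=0):
    """Toy model of the documented num_samples semantics (hypothetical helper, not Ray's API)."""
    rng = random.Random(seed)
    grid_keys = list(grid)
    # Every combination of the grid-searched values; a single empty point if there is no grid.
    grid_points = list(itertools.product(*(grid[k] for k in grid_keys)))
    trials = []
    for _ in range(num_samples):  # the whole grid is repeated num_samples times
        for point in grid_points:
            config = dict(zip(grid_keys, point))
            for name, sample in samplers.items():
                config[name] = sample(rng)  # fresh random draw for each trial
            trials.append(config)
    return trials


trials = expand_trials(
    grid={"num_layers": [1, 2, 3]},                          # assumed example grid
    samplers={"lr": lambda rng: 10 ** rng.uniform(-4, -1)},  # assumed random variable
    num_samples=10,
)
print(len(trials))  # 3 grid points x 10 samples = 30
```

With no grid at all, the same sketch yields exactly num_samples configurations, which matches the "each random variable is sampled once by default" wording when num_samples is 1.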

The demo code you refer to passes it to run_experiments as part of the experiment config:

config = {
    "my_exp": {
        "run": "exp",
        "num_samples": 10 if args.smoke_test else 1000,
        "config": {
            "iterations": 100,
        },
        "stop": {
            "timesteps_total": 100
        },
    }
}
algo = HyperOptSearch(space, max_concurrent=4, reward_attr="neg_mean_loss")
scheduler = AsyncHyperBandScheduler(reward_attr="neg_mean_loss")
run_experiments(config, search_alg=algo, scheduler=scheduler)  # here the config is passed
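The value itself, 10 if args.smoke_test else 1000, is an ordinary Python conditional expression: a quick run with 10 samples when a smoke-test flag is set, and 1000 samples otherwise. A minimal sketch, assuming the example defines a boolean --smoke-test command-line flag with argparse (the flag definition and the pick_num_samples helper are assumptions for illustration, not copied from the example):

```python
import argparse

# Assumed flag definition; the example script is expected to do something similar.
parser = argparse.ArgumentParser()
parser.add_argument("--smoke-test", action="store_true",
                    help="Finish quickly for testing")


def pick_num_samples(argv):
    """Hypothetical helper: return the num_samples value for the given command line."""
    args = parser.parse_args(argv)
    return 10 if args.smoke_test else 1000


print(pick_num_samples(["--smoke-test"]))  # 10
print(pick_num_samples([]))                # 1000
```

For your own scenario you can replace the whole expression with a plain integer: the number of hyperparameter configurations you can afford to try.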
Patrick Artner answered Oct 19 '22 13:10



As per the documentation:

num_samples (int) – Number of times to sample from the hyperparameter space. Defaults to 1. If grid_search is provided as an argument, the grid will be repeated num_samples of times.

It replaces repeat:

repeat (int) – Deprecated and will be removed in future versions of Ray. Use num_samples instead.

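Usage, as a key in the experiment dict passed to run_experiments:

"num_samples": 10

or as a keyword argument to ray.tune.Experiment:

num_samples=10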

class ray.tune.Experiment(name,run,stop=None,config=None,trial_resources=None,
repeat=1,num_samples=1,local_dir=None,upload_dir=None,checkpoint_freq=0,
checkpoint_at_end=False,max_failures=3,restore=None)
Sonal Borkar answered Oct 19 '22 14:10
