This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Define a sweep configuration

Learn how to create configuration files for sweeps.

1: Sweep configuration options

A W&B Sweep combines a strategy for exploring hyperparameter values with the code that evaluates them. The strategy can be as simple as trying every option or as complex as Bayesian Optimization and Hyperband (BOHB).

Define a sweep configuration either in a Python dictionary or a YAML file. How you define your sweep configuration depends on how you want to manage your sweep.

Define your sweep configuration in a YAML file if you want to initialize a sweep and start a sweep agent from the command line. Define your sweep in a Python dictionary if you initialize a sweep and start a sweep entirely within a Python script or notebook.

The following guide describes how to format your sweep configuration. See Sweep configuration options for a comprehensive list of top-level sweep configuration keys.

Basic structure

Both sweep configuration format options (YAML and Python dictionary) utilize key-value pairs and nested structures.

Use top-level keys within your sweep configuration to define qualities of your sweep search such as the name of the sweep (name key), the parameters to search through (parameters key), the methodology to search the parameter space (method key), and more.

For example, the proceeding code snippets show the same sweep configuration defined within a YAML file and within a Python dictionary. Within the sweep configuration there are five top level keys specified: program, name, method, metric and parameters.

Define a sweep configuration in a YAML file if you want to manage sweeps interactively from the command line (CLI)

program: train.py
name: sweepdemo
method: bayes
metric:
  goal: minimize
  name: validation_loss
parameters:
  learning_rate:
    min: 0.0001
    max: 0.1
  batch_size:
    values: [16, 32, 64]
  epochs:
    values: [5, 10, 15]
  optimizer:
    values: ["adam", "sgd"]

Define a sweep in a Python dictionary data structure if you define training algorithm in a Python script or notebook.

The proceeding code snippet stores a sweep configuration in a variable named sweep_configuration:

sweep_configuration = {
    "name": "sweepdemo",
    "method": "bayes",
    "metric": {"goal": "minimize", "name": "validation_loss"},
    "parameters": {
        "learning_rate": {"min": 0.0001, "max": 0.1},
        "batch_size": {"values": [16, 32, 64]},
        "epochs": {"values": [5, 10, 15]},
        "optimizer": {"values": ["adam", "sgd"]},
    },
}

Within the top level parameters key, the following keys are nested: learning_rate, batch_size, epoch, and optimizer. For each of the nested keys you specify, you can provide one or more values, a distribution, a probability, and more. For more information, see the parameters section in Sweep configuration options.

Double nested parameters

Sweep configurations support nested parameters. To delineate a nested parameter, use an additional parameters key under the top level parameter name. Sweep configs support multi-level nesting.

Specify a probability distribution for your random variables if you use a Bayesian or random hyperparameter search. For each hyperparameter:

Create a top level parameters key in your sweep config.
Within the parameterskey, nest the following:
1. Specify the name of hyperparameter you want to optimize.
2. Specify the distribution you want to use for the distribution key. Nest the distribution key-value pair underneath the hyperparameter name.
3. Specify one or more values to explore. The value (or values) should be inline with the distribution key.
  1. (Optional) Use an additional parameters key under the top level parameter name to delineate a nested parameter.

Nested parameters defined in sweep configuration overwrite keys specified in a W&B run configuration.

For example, suppose you initialize a W&B run with the following configuration in a train.py Python script (see Lines 1-2). Next, you define a sweep configuration in a dictionary called sweep_configuration (see Lines 4-13). You then pass the sweep config dictionary to wandb.sweep to initialize a sweep config (see Line 16).

def main():
    run = wandb.init(config={"nested_param": {"manual_key": 1}})


sweep_configuration = {
    "top_level_param": 0,
    "nested_param": {
        "learning_rate": 0.01,
        "double_nested_param": {"x": 0.9, "y": 0.8},
    },
}

# Initialize sweep by passing in config.
sweep_id = wandb.sweep(sweep=sweep_configuration, project="<project>")

# Start sweep job.
wandb.agent(sweep_id, function=main, count=4)

The nested_param.manual_key that is passed when the W&B run is initialized is not accessible. The wandb.Run.config only possess the key-value pairs that are defined in the sweep configuration dictionary.

Sweep configuration template

The following template shows how you can configure parameters and specify search constraints. Replace hyperparameter_name with the name of your hyperparameter and any values enclosed in <>.

program: <insert>
method: <insert>
parameter:
  hyperparameter_name0:
    value: 0  
  hyperparameter_name1: 
    values: [0, 0, 0]
  hyperparameter_name: 
    distribution: <insert>
    value: <insert>
  hyperparameter_name2:  
    distribution: <insert>
    min: <insert>
    max: <insert>
    q: <insert>
  hyperparameter_name3: 
    distribution: <insert>
    values:
      - <list_of_values>
      - <list_of_values>
      - <list_of_values>
early_terminate:
  type: hyperband
  s: 0
  eta: 0
  max_iter: 0
command:
- ${Command macro}
- ${Command macro}
- ${Command macro}
- ${Command macro}

To express a numeric value using scientific notation, add the YAML !!float operator, which casts the value to a floating point number. For example, min: !!float 1e-5. See Command example.

Sweep configuration examples

program: train.py
method: random
metric:
  goal: minimize
  name: loss
parameters:
  batch_size:
    distribution: q_log_uniform_values
    max: 256 
    min: 32
    q: 8
  dropout: 
    values: [0.3, 0.4, 0.5]
  epochs:
    value: 1
  fc_layer_size: 
    values: [128, 256, 512]
  learning_rate:
    distribution: uniform
    max: 0.1
    min: 0
  optimizer:
    values: ["adam", "sgd"]

sweep_config = {
    "method": "random",
    "metric": {"goal": "minimize", "name": "loss"},
    "parameters": {
        "batch_size": {
            "distribution": "q_log_uniform_values",
            "max": 256,
            "min": 32,
            "q": 8,
        },
        "dropout": {"values": [0.3, 0.4, 0.5]},
        "epochs": {"value": 1},
        "fc_layer_size": {"values": [128, 256, 512]},
        "learning_rate": {"distribution": "uniform", "max": 0.1, "min": 0},
        "optimizer": {"values": ["adam", "sgd"]},
    },
}

Bayes hyperband example

program: train.py
method: bayes
metric:
  goal: minimize
  name: val_loss
parameters:
  dropout:
    values: [0.15, 0.2, 0.25, 0.3, 0.4]
  hidden_layer_size:
    values: [96, 128, 148]
  layer_1_size:
    values: [10, 12, 14, 16, 18, 20]
  layer_2_size:
    values: [24, 28, 32, 36, 40, 44]
  learn_rate:
    values: [0.001, 0.01, 0.003]
  decay:
    values: [1e-5, 1e-6, 1e-7]
  momentum:
    values: [0.8, 0.9, 0.95]
  epochs:
    value: 27
early_terminate:
  type: hyperband
  s: 2
  eta: 3
  max_iter: 27

The proceeding tabs show how to specify either a minimum or maximum number of iterations for early_terminate:

The brackets for this example are: [3, 3*eta, 3*eta*eta, 3*eta*eta*eta], which equals [3, 9, 27, 81].

early_terminate:
  type: hyperband
  min_iter: 3

The brackets for this example are [27/eta, 27/eta/eta], which equals [9, 3].

early_terminate:
  type: hyperband
  max_iter: 27
  s: 2

Macro and custom command arguments example

For more complex command line arguments, you can use macros to pass environment variables, the Python interpreter, and additional arguments. W&B supports pre defined macros and custom command line arguments that you can specify in your sweep configuration.

For example, the following sweep configuration (sweep.yaml) defines a command that runs a Python script (run.py) with the ${env}, ${interpreter}, and ${program} macros replaced with the appropriate values when the sweep runs.

The --batch_size=${batch_size}, --test=True, and --optimizer=${optimizer} arguments use custom macros to pass the values of the batch_size, test, and optimizer parameters defined in the sweep configuration.

program: run.py
method: random
metric:
  name: validation_loss
parameters:
  learning_rate:
    min: 0.0001
    max: 0.1
command:
  - ${env}
  - ${interpreter}
  - ${program}
  - "--batch_size=${batch_size}"
  - "--optimizer=${optimizer}"
  - "--test=True"

The associated Python script (run.py) can then parse these command line arguments using the argparse module.

# run.py  
import wandb
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--batch_size', type=int)
parser.add_argument('--optimizer', type=str, choices=['adam', 'sgd'], required=True)
parser.add_argument('--test', type=str2bool, default=False)
args = parser.parse_args()

# Initialize a W&B Run
with wandb.init('test-project') as run:
    run.log({'validation_loss':1})

See the Command macros section in Sweep configuration options for a list of pre-defined macros you can use in your sweep configuration.

Boolean arguments

The argparse module does not support boolean arguments by default. To define a boolean argument, you can use the action parameter or use a custom function to convert the string representation of the boolean value to a boolean type.

As an example, you can use the following code snippet to define a boolean argument. Pass store_true or store_false as an argument to ArgumentParser.

import wandb
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--test', action='store_true')
args = parser.parse_args()

args.test  # This will be True if --test is passed, otherwise False

You can also define a custom function to convert the string representation of the boolean value to a boolean type. For example, the following code snippet defines the str2bool function, which converts a string to a boolean value.

def str2bool(v: str) -> bool:
  """Convert a string to a boolean. This is required because
  argparse does not support boolean arguments by default.
  """
  if isinstance(v, bool):
      return v
  return v.lower() in ('yes', 'true', 't', '1')

1 - Sweep configuration options

A sweep configuration consists of nested key-value pairs. Use top-level keys within your sweep configuration to define qualities of your sweep search such as the parameters to search through (parameter key), the methodology to search the parameter space (method key), and more.

The proceeding table lists top-level sweep configuration keys and a brief description. See the respective sections for more information about each key.

Top-level keys	Description
`program`	(required) Training script to run
`entity`	The entity for this sweep
`project`	The project for this sweep
`description`	Text description of the sweep
`name`	The name of the sweep, displayed in the W&B UI.
`method`	(required) The search strategy
`metric`	The metric to optimize (only used by certain search strategies and stopping criteria)
`parameters`	(required) Parameter bounds to search
`early_terminate`	Any early stopping criteria
`command`	Command structure for invoking and passing arguments to the training script
`run_cap`	Maximum number of runs for this sweep

See the Sweep configuration structure for more information on how to structure your sweep configuration.

`metric`

Use the metric top-level sweep configuration key to specify the name, the goal, and the target metric to optimize.

Key	Description
`name`	Name of the metric to optimize.
`goal`	Either `minimize` or `maximize` (Default is `minimize`).
`target`	Goal value for the metric you are optimizing. The sweep does not create new runs when if or when a run reaches a target value that you specify. Active agents that have a run executing (when the run reaches the target) wait until the run completes before the agent stops creating new runs.

`parameters`

In your YAML file or Python script, specify parameters as a top level key. Within the parameters key, provide the name of a hyperparameter you want to optimize. Common hyperparameters include: learning rate, batch size, epochs, optimizers, and more. For each hyperparameter you define in your sweep configuration, specify one or more search constraints.

The proceeding table shows supported hyperparameter search constraints. Based on your hyperparameter and use case, use one of the search constraints below to tell your sweep agent where (in the case of a distribution) or what (value, values, and so forth) to search or use.

Search constraint	Description
`values`	Specifies all valid values for this hyperparameter. Compatible with `grid`.
`value`	Specifies the single valid value for this hyperparameter. Compatible with `grid`.
`distribution`	Specify a probability distribution. See the note following this table for information on default values.
`probabilities`	Specify the probability of selecting each element of `values` when using `random`.
`min`, `max`	(`int`or `float`) Maximum and minimum values. If `int`, for `int_uniform` -distributed hyperparameters. If `float`, for `uniform` -distributed hyperparameters.
`mu`	(`float`) Mean parameter for `normal` - or `lognormal` -distributed hyperparameters.
`sigma`	(`float`) Standard deviation parameter for `normal` - or `lognormal` -distributed hyperparameters.
`q`	(`float`) Quantization step size for quantized hyperparameters.
`parameters`	Nest other parameters inside a root level parameter.

W&B sets the following distributions based on the following conditions if a distribution is not specified:

categorical if you specify values
int_uniform if you specify max and min as integers
uniform if you specify max and min as floats
constant if you provide a set to value

`method`

Specify the hyperparameter search strategy with the method key. There are three hyperparameter search strategies to choose from: grid, random, and Bayesian search.

Grid search

Iterate over every combination of hyperparameter values. Grid search makes uninformed decisions on the set of hyperparameter values to use on each iteration. Grid search can be computationally costly.

Grid search executes forever if it is searching within in a continuous search space.

Random search

Choose a random, uninformed, set of hyperparameter values on each iteration based on a distribution. Random search runs forever unless you stop the process from the command line, within your python script, or the W&B App.

Specify the distribution space with the metric key if you choose random (method: random) search.

Bayesian search

In contrast to random and grid search, Bayesian models make informed decisions. Bayesian optimization uses a probabilistic model to decide which values to use through an iterative process of testing values on a surrogate function before evaluating the objective function. Bayesian search works well for small numbers of continuous parameters but scales poorly. For more information about Bayesian search, see the Bayesian Optimization Primer paper.

Bayesian search runs forever unless you stop the process from the command line, within your python script, or the W&B App.

Distribution options for random and Bayesian search

Within the parameter key, nest the name of the hyperparameter. Next, specify the distribution key and specify a distribution for the value.

The proceeding tables lists distributions W&B supports.

Value for `distribution` key	Description
`constant`	Constant distribution. Must specify the constant value (`value`) to use.
`categorical`	Categorical distribution. Must specify all valid values (`values`) for this hyperparameter.
`int_uniform`	Discrete uniform distribution on integers. Must specify `max` and `min` as integers.
`uniform`	Continuous uniform distribution. Must specify `max` and `min` as floats.
`q_uniform`	Quantized uniform distribution. Returns `round(X / q) * q` where X is uniform. `q` defaults to `1`.
`log_uniform`	Log-uniform distribution. Returns a value `X` between `exp(min)` and `exp(max)`such that the natural logarithm is uniformly distributed between `min` and `max`.
`log_uniform_values`	Log-uniform distribution. Returns a value `X` between `min` and `max` such that `log(`X`)` is uniformly distributed between `log(min)` and `log(max)`.
`q_log_uniform`	Quantized log uniform. Returns `round(X / q) * q` where `X` is `log_uniform`. `q` defaults to `1`.
`q_log_uniform_values`	Quantized log uniform. Returns `round(X / q) * q` where `X` is `log_uniform_values`. `q` defaults to `1`.
`inv_log_uniform`	Inverse log uniform distribution. Returns `X`, where `log(1/X)` is uniformly distributed between `min` and `max`.
`inv_log_uniform_values`	Inverse log uniform distribution. Returns `X`, where `log(1/X)` is uniformly distributed between `log(1/max)` and `log(1/min)`.
`normal`	Normal distribution. Return value is normally distributed with mean `mu` (default `0`) and standard deviation `sigma` (default `1`).
`q_normal`	Quantized normal distribution. Returns `round(X / q) * q` where `X` is `normal`. Q defaults to 1.
`log_normal`	Log normal distribution. Returns a value `X` such that the natural logarithm `log(X)` is normally distributed with mean `mu` (default `0`) and standard deviation `sigma` (default `1`).
`q_log_normal`	Quantized log normal distribution. Returns `round(X / q) * q` where `X` is `log_normal`. `q` defaults to `1`.

`early_terminate`

Use early termination (early_terminate) to stop poorly performing runs. If early termination occurs, W&B stops the current run before it creates a new run with a new set of hyperparameter values.

You must specify a stopping algorithm if you use early_terminate. Nest the type key within early_terminate within your sweep configuration.

Stopping algorithm

W&B currently supports Hyperband stopping algorithm.

Hyperband hyperparameter optimization evaluates if a program should stop or if it should to continue at one or more pre-set iteration counts, called brackets.

When a W&B run reaches a bracket, the sweep compares that run’s metric to all previously reported metric values. The sweep terminates the run if the run’s metric value is too high (when the goal is minimization) or if the run’s metric is too low (when the goal is maximization).

Brackets are based on the number of logged iterations. The number of brackets corresponds to the number of times you log the metric you are optimizing. The iterations can correspond to steps, epochs, or something in between. The numerical value of the step counter is not used in bracket calculations.

Specify either min_iter or max_iter to create a bracket schedule.

Key	Description
`min_iter`	Specify the iteration for the first bracket
`max_iter`	Specify the maximum number of iterations.
`s`	Specify the total number of brackets (required for `max_iter`)
`eta`	Specify the bracket multiplier schedule (default: `3`).
`strict`	Enable ‘strict’ mode that prunes runs aggressively, more closely following the original Hyperband paper. Defaults to false.

Hyperband checks which runs to end once every few minutes. The end run timestamp might differ from the specified brackets if your run or iteration are short.

`command`

Modify the format and contents with nested values within the command key. You can directly include fixed components such as filenames.

On Unix systems, /usr/bin/env ensures that the OS chooses the correct Python interpreter based on the environment.

W&B supports the following macros for variable components of the command:

Command macro	Description
`${env}`	`/usr/bin/env` on Unix systems, omitted on Windows.
`${interpreter}`	Expands to `python`.
`${program}`	Training script filename specified by the sweep configuration `program` key.
`${args}`	Hyperparameters and their values in the form `--param1=value1 --param2=value2`.
`${args_no_boolean_flags}`	Hyperparameters and their values in the form `--param1=value1` except boolean parameters are in the form `--boolean_flag_param` when `True` and omitted when `False`.
`${args_no_hyphens}`	Hyperparameters and their values in the form `param1=value1 param2=value2`.
`${args_json}`	Hyperparameters and their values encoded as JSON.
`${args_json_file}`	The path to a file containing the hyperparameters and their values encoded as JSON.
`${envvar}`	A way to pass environment variables. `${envvar:MYENVVAR}` __ expands to the value of MYENVVAR environment variable. __