> Learn how to create configuration files for sweeps.

# Overview

A W\&B Sweep combines a strategy for exploring hyperparameter values with the code that evaluates them. The strategy can be as simple as trying every option or as complex as Bayesian Optimization and Hyperband ([BOHB](https://arxiv.org/abs/1807.01774)).

Define a sweep configuration either in a [Python dictionary](https://docs.python.org/3/tutorial/datastructures.html#dictionaries) or a [YAML](https://yaml.org/) file. How you define your sweep configuration depends on how you want to manage your sweep.

<Note>
  Define your sweep configuration in a YAML file if you want to initialize a sweep and start a sweep agent from the command line. Define your sweep configuration in a Python dictionary if you initialize a sweep and start a sweep agent entirely within a Python script or notebook.
</Note>

The following guide describes how to format your sweep configuration. See [Sweep configuration options](./sweep-config-keys) for a comprehensive list of top-level sweep configuration keys.

## Basic structure

Both sweep configuration formats (YAML and Python dictionary) use key-value pairs and nested structures.

Use top-level keys within your sweep configuration to define qualities of your sweep search such as the name of the sweep ([`name`](./sweep-config-keys) key), the parameters to search through ([`parameters`](./sweep-config-keys#parameters) key), the methodology to search the parameter space ([`method`](./sweep-config-keys#method) key), and more.

For example, the following code snippets show the same sweep configuration defined within a YAML file and within a Python dictionary. The YAML file specifies five top-level keys: `program`, `name`, `method`, `metric`, and `parameters`. The Python dictionary specifies the same keys except `program`.

<Tabs>
  <Tab title="CLI">
    Define a sweep configuration in a YAML file if you want to manage sweeps interactively from the command line (CLI).

    ```yaml title="config.yaml" theme={null}
    program: train.py
    name: sweepdemo
    method: bayes
    metric:
      goal: minimize
      name: validation_loss
    parameters:
      learning_rate:
        min: 0.0001
        max: 0.1
      batch_size:
        values: [16, 32, 64]
      epochs:
        values: [5, 10, 15]
      optimizer:
        values: ["adam", "sgd"]
    ```
  </Tab>

  <Tab title="Python script or notebook">
    Define a sweep in a Python dictionary if you define your training algorithm in a Python script or notebook.

    The following code snippet stores a sweep configuration in a variable named `sweep_configuration`:

    ```python title="train.py" theme={null}
    sweep_configuration = {
        "name": "sweepdemo",
        "method": "bayes",
        "metric": {"goal": "minimize", "name": "validation_loss"},
        "parameters": {
            "learning_rate": {"min": 0.0001, "max": 0.1},
            "batch_size": {"values": [16, 32, 64]},
            "epochs": {"values": [5, 10, 15]},
            "optimizer": {"values": ["adam", "sgd"]},
        },
    }
    ```
  </Tab>
</Tabs>

Within the top-level `parameters` key, the following keys are nested: `learning_rate`, `batch_size`, `epochs`, and `optimizer`. For each of the nested keys you specify, you can provide one or more values, a distribution, a probability, and more. For more information, see the [parameters](./sweep-config-keys#parameters) section in [Sweep configuration options](./sweep-config-keys).
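As a rough illustration of this structure, the following sketch walks the `parameters` key of the Python dictionary shown above and reports how each hyperparameter is specified. The `describe_parameters` helper is hypothetical and is not part of the W\&B API; it only inspects a plain dictionary.

```python
# Hypothetical helper for illustration; not part of the W&B API.
sweep_configuration = {
    "name": "sweepdemo",
    "method": "bayes",
    "metric": {"goal": "minimize", "name": "validation_loss"},
    "parameters": {
        "learning_rate": {"min": 0.0001, "max": 0.1},
        "batch_size": {"values": [16, 32, 64]},
        "epochs": {"values": [5, 10, 15]},
        "optimizer": {"values": ["adam", "sgd"]},
    },
}


def describe_parameters(config: dict) -> dict:
    """Map each parameter name to a short description of its search space."""
    summary = {}
    for name, spec in config["parameters"].items():
        if "values" in spec:
            summary[name] = f"one of {spec['values']}"
        elif "value" in spec:
            summary[name] = f"fixed at {spec['value']}"
        elif "min" in spec and "max" in spec:
            summary[name] = f"range {spec['min']} to {spec['max']}"
        else:
            summary[name] = "other specification"
    return summary


for name, description in describe_parameters(sweep_configuration).items():
    print(f"{name}: {description}")
```

The helper covers only the three kinds of specification used in this example (`values`, `value`, and a `min`/`max` range).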

## Double nested parameters

Sweep configurations support nested parameters. To define a nested parameter, include an additional `parameters` key under the top-level parameter name.

The following example shows a sweep configuration with three nested parameters: `nested_category_1`, `nested_category_2`, and `nested_category_3`. Each nested parameter includes two additional parameters: `momentum` and `weight_decay`.

<Note>
  `nested_category_1`, `nested_category_2`, and `nested_category_3` are placeholders. Replace them with names that fit your use case.
</Note>

The following code snippets show how to define nested parameters in both a YAML file and a Python dictionary.

<Tabs>
  <Tab title="CLI">
    ```yaml theme={null}
    program: sweep_nest.py
    name: nested_sweep
    method: random
    metric:
      name: loss
      goal: minimize
    parameters:
      optimizer:
        values: ['adam', 'sgd']
      fc_layer_size:
        values: [128, 256, 512]
      dropout:
        values: [0.3, 0.4, 0.5]
      epochs:
        value: 1
      learning_rate:
        distribution: uniform
        min: 0
        max: 0.1
      batch_size:
        distribution: q_log_uniform_values
        q: 8
        min: 32
        max: 256
      nested_category_1:
        parameters:
          momentum:
            distribution: uniform
            min: 0.0
            max: 0.9
          weight_decay:
            values: [0.0001, 0.0005, 0.001]
      nested_category_2:
        parameters:
          momentum:
            distribution: uniform
            min: 0.0
            max: 0.9
          weight_decay:
            values: [0.1, 0.2, 0.3]
      nested_category_3:
        parameters:
          momentum:
            distribution: uniform
            min: 0.5
            max: 0.7
          weight_decay:
            values: [0.2, 0.3, 0.4]
    ```
  </Tab>

  <Tab title="Python script or notebook">
    ```python theme={null}
    {
      "program": "sweep_nest.py",
      "name": "nested_sweep",
      "method": "random",
      "metric": {
        "name": "loss",
        "goal": "minimize"
      },
      "parameters": {
        "optimizer": {
          "values": ["adam", "sgd"]
        },
        "fc_layer_size": {
          "values": [128, 256, 512]
        },
        "dropout": {
          "values": [0.3, 0.4, 0.5]
        },
        "epochs": {
          "value": 1
        },
        "learning_rate": {
          "distribution": "uniform",
          "min": 0,
          "max": 0.1
        },
        "batch_size": {
          "distribution": "q_log_uniform_values",
          "q": 8,
          "min": 32,
          "max": 256
        },
        "nested_category_1": {
          "parameters": {
            "momentum": {
              "distribution": "uniform",
              "min": 0.0,
              "max": 0.9
            },
            "weight_decay": {
              "values": [0.0001, 0.0005, 0.001]
            }
          }
        },
        "nested_category_2": {
          "parameters": {
            "momentum": {
              "distribution": "uniform",
              "min": 0.0,
              "max": 0.9
            },
            "weight_decay": {
              "values": [0.1, 0.2, 0.3]
            }
          }
        },
        "nested_category_3": {
          "parameters": {
            "momentum": {
              "distribution": "uniform",
              "min": 0.5,
              "max": 0.7
            },
            "weight_decay": {
              "values": [0.2, 0.3, 0.4]
            }
          }
        }
      }
    }
    ```
  </Tab>
</Tabs>
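To see how the nesting is laid out, the following sketch lists every leaf parameter in a nested `parameters` tree as a dotted path. The `leaf_paths` helper and the trimmed-down configuration are illustrative assumptions, not W\&B functions.

```python
# Hypothetical helper for illustration; not part of the W&B API.
def leaf_paths(parameters: dict, prefix: str = "") -> list:
    """Return dotted paths for every leaf parameter in a nested tree."""
    paths = []
    for name, spec in parameters.items():
        path = f"{prefix}{name}"
        if "parameters" in spec:
            # A nested parameter: recurse into its own `parameters` key.
            paths.extend(leaf_paths(spec["parameters"], prefix=f"{path}."))
        else:
            paths.append(path)
    return paths


# Trimmed-down version of the nested sweep configuration above.
parameters = {
    "optimizer": {"values": ["adam", "sgd"]},
    "nested_category_1": {
        "parameters": {
            "momentum": {"distribution": "uniform", "min": 0.0, "max": 0.9},
            "weight_decay": {"values": [0.0001, 0.0005, 0.001]},
        }
    },
}

print(leaf_paths(parameters))
# ['optimizer', 'nested_category_1.momentum', 'nested_category_1.weight_decay']
```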

{/* For example, the following code snippets show a sweep config both in a YAML config file and a Python script. */}

<Warning>
  Nested parameters defined in sweep configuration overwrite keys specified in a W\&B run configuration.

  As an example, suppose you have a `train.py` script that initializes a run with a nested default:

  ```python theme={null}
  import wandb

  def main():
      with wandb.init(config={"nested_param": {"manual_key": 1}}) as run:
          # Your training code here
          ...
  ```

  Your sweep configuration defines nested parameters under a top-level `"parameters"` key:

  ```python theme={null}
  sweep_configuration = {
      "method": "grid",
      "metric": {"name": "score", "goal": "minimize"},
      "parameters": {
          "top_level_param": {"value": 0},
          "nested_param": {
              "parameters": {
                  "learning_rate": {"value": 0.01},
                  "double_nested_param": {
                      "parameters": {"x": {"value": 0.9}, "y": {"value": 0.8}}
                  },
              }
          },
      },
  }

  sweep_id = wandb.sweep(sweep=sweep_configuration, project="<project>")
  wandb.agent(sweep_id, function=main, count=4)
  ```

  During a sweep run, `run.config["nested_param"]` reflects the subtree defined by the
  sweep configuration (`learning_rate`, `double_nested_param`) and does not include the
  `manual_key` defined in `wandb.init(config=...)`.
</Warning>
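The effect described in the warning can be pictured with plain dictionaries. The sketch below is only a model of the described behavior, assuming the sweep's value for a key replaces the whole value set in `wandb.init(config=...)`. It does not call W\&B and is not its implementation.

```python
# A plain-dictionary model of the behavior; not W&B's implementation.
init_config = {"nested_param": {"manual_key": 1}}

# Values a sweep agent might choose for the sweep configuration above.
sweep_values = {
    "top_level_param": 0,
    "nested_param": {
        "learning_rate": 0.01,
        "double_nested_param": {"x": 0.9, "y": 0.8},
    },
}

# A shallow update replaces the entire `nested_param` subtree.
run_config = {**init_config, **sweep_values}

print(sorted(run_config["nested_param"]))          # ['double_nested_param', 'learning_rate']
print("manual_key" in run_config["nested_param"])  # False
```

If your training code needs a default such as `manual_key`, one option is to read it from a separate, non-nested key that the sweep configuration does not define.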

## Sweep configuration template

The following template shows how you can configure parameters and specify search constraints. Replace `hyperparameter_name` with the name of your hyperparameter, and replace any values enclosed in `<>` with your own.

```yaml title="config.yaml" theme={null}
program: <insert>
method: <insert>
parameters:
  hyperparameter_name0:
    value: 0  
  hyperparameter_name1: 
    values: [0, 0, 0]
  hyperparameter_name: 
    distribution: <insert>
    value: <insert>
  hyperparameter_name2:  
    distribution: <insert>
    min: <insert>
    max: <insert>
    q: <insert>
  hyperparameter_name3: 
    distribution: <insert>
    values:
      - <list_of_values>
      - <list_of_values>
      - <list_of_values>
early_terminate:
  type: hyperband
  s: 0
  eta: 0
  max_iter: 0
command:
- ${Command macro}
- ${Command macro}
- ${Command macro}
- ${Command macro}      
```

To express a numeric value using scientific notation, add the YAML `!!float` tag, which casts the value to a floating point number. For example, `min: !!float 1e-5`. See [Command example](#command-example).

## Sweep configuration examples

<Tabs>
  <Tab title="CLI">
    ```yaml title="config.yaml"  theme={null}
    program: train.py
    method: random
    metric:
      goal: minimize
      name: loss
    parameters:
      batch_size:
        distribution: q_log_uniform_values
        max: 256 
        min: 32
        q: 8
      dropout: 
        values: [0.3, 0.4, 0.5]
      epochs:
        value: 1
      fc_layer_size: 
        values: [128, 256, 512]
      learning_rate:
        distribution: uniform
        max: 0.1
        min: 0
      optimizer:
        values: ["adam", "sgd"]
    ```
  </Tab>

  <Tab title="Python script or notebook">
    ```python title="train.py"  theme={null}
    sweep_config = {
        "method": "random",
        "metric": {"goal": "minimize", "name": "loss"},
        "parameters": {
            "batch_size": {
                "distribution": "q_log_uniform_values",
                "max": 256,
                "min": 32,
                "q": 8,
            },
            "dropout": {"values": [0.3, 0.4, 0.5]},
            "epochs": {"value": 1},
            "fc_layer_size": {"values": [128, 256, 512]},
            "learning_rate": {"distribution": "uniform", "max": 0.1, "min": 0},
            "optimizer": {"values": ["adam", "sgd"]},
        },
    }
    ```
  </Tab>
</Tabs>

### Bayes hyperband example

```yaml theme={null}
program: train.py
method: bayes
metric:
  goal: minimize
  name: val_loss
parameters:
  dropout:
    values: [0.15, 0.2, 0.25, 0.3, 0.4]
  hidden_layer_size:
    values: [96, 128, 148]
  layer_1_size:
    values: [10, 12, 14, 16, 18, 20]
  layer_2_size:
    values: [24, 28, 32, 36, 40, 44]
  learn_rate:
    values: [0.001, 0.01, 0.003]
  decay:
    values: [!!float 1e-5, !!float 1e-6, !!float 1e-7]
  momentum:
    values: [0.8, 0.9, 0.95]
  epochs:
    value: 27
early_terminate:
  type: hyperband
  s: 2
  eta: 3
  max_iter: 27
```

The following tabs show how to specify either a minimum or maximum number of iterations for `early_terminate`:

<Tabs>
  <Tab title="Minimum number of iterations">
    The brackets for this example are `[3, 3*eta, 3*eta*eta, 3*eta*eta*eta]`, which equals `[3, 9, 27, 81]` when `eta` is 3.

    ```yaml theme={null}
    early_terminate:
      type: hyperband
      min_iter: 3
    ```
  </Tab>

  <Tab title="Maximum number of iterations">
    The brackets for this example are `[27/eta, 27/eta/eta]`, which equals `[9, 3]` when `eta` is 3.

    ```yaml theme={null}
    early_terminate:
      type: hyperband
      max_iter: 27
      s: 2
    ```
  </Tab>
</Tabs>
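The bracket arithmetic in both tabs can be reproduced in a few lines. This sketch only follows the formulas quoted above, with `eta` set to 3. The function names are hypothetical, and the number of brackets generated from `min_iter` is passed in explicitly (four here, to match the example) rather than derived from W\&B's scheduler.

```python
# Hypothetical helpers that mirror the bracket formulas quoted above.
def brackets_from_min_iter(min_iter: int, eta: int, count: int) -> list:
    """Return [min_iter, min_iter*eta, min_iter*eta**2, ...] with `count` entries."""
    return [min_iter * eta**i for i in range(count)]


def brackets_from_max_iter(max_iter: int, eta: int, s: int) -> list:
    """Return [max_iter/eta, max_iter/eta**2, ...] with `s` entries (floored)."""
    return [max_iter // eta**i for i in range(1, s + 1)]


print(brackets_from_min_iter(min_iter=3, eta=3, count=4))  # [3, 9, 27, 81]
print(brackets_from_max_iter(max_iter=27, eta=3, s=2))     # [9, 3]
```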

### Macro and custom command arguments example

For more complex command line arguments, you can use macros to pass environment variables, the Python interpreter, and additional arguments. W\&B supports [predefined macros](./sweep-config-keys#command-macros) and custom command line arguments that you can specify in your sweep configuration.

For example, the following sweep configuration (`sweep.yaml`) defines a command that runs a Python script (`run.py`) with the `${env}`, `${interpreter}`, and `${program}` macros replaced with the appropriate values when the sweep runs.

The `--batch_size=${batch_size}` and `--optimizer=${optimizer}` arguments use custom macros to pass the values of the `batch_size` and `optimizer` parameters defined in the sweep configuration. The `--test=True` argument is passed to the script as a literal value.

```yaml title="sweep.yaml" theme={null}
program: run.py
method: random
metric:
  name: validation_loss
parameters:
  learning_rate:
    min: 0.0001
    max: 0.1
  batch_size:
    values: [16, 32, 64]
  optimizer:
    values: ["adam", "sgd"]
command:
  - ${env}
  - ${interpreter}
  - ${program}
  - "--batch_size=${batch_size}"
  - "--optimizer=${optimizer}"
  - "--test=True"
```

The associated Python script (`run.py`) can then parse these command line arguments using the `argparse` module.

```python title="run.py" theme={null}
# run.py
import argparse

import wandb


def str2bool(v: str) -> bool:
    """Convert a command line string such as 'True' to a boolean."""
    return v.lower() in ('yes', 'true', 't', '1')


parser = argparse.ArgumentParser()
parser.add_argument('--batch_size', type=int)
parser.add_argument('--optimizer', type=str, choices=['adam', 'sgd'], required=True)
parser.add_argument('--test', type=str2bool, default=False)
args = parser.parse_args()

# Initialize a W&B Run
with wandb.init(project='test-project') as run:
    run.log({'validation_loss': 1})
```

See the [Command macros](./sweep-config-keys#command-macros) section in [Sweep configuration options](./sweep-config-keys) for a list of pre-defined macros you can use in your sweep configuration.
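Conceptually, each `${...}` placeholder in `command` is replaced by a value before the command runs. The following sketch imitates that substitution with Python's `string.Template`. It is an illustration only, not how the W\&B agent is implemented, and the values chosen for `env`, `interpreter`, `batch_size`, and `optimizer` are assumptions.

```python
from string import Template

command = [
    "${env}",
    "${interpreter}",
    "${program}",
    "--batch_size=${batch_size}",
    "--optimizer=${optimizer}",
    "--test=True",
]

# Assumed values for one run; a real sweep agent supplies its own.
values = {
    "env": "/usr/bin/env",
    "interpreter": "python",
    "program": "run.py",
    "batch_size": 32,
    "optimizer": "adam",
}

# Replace every ${name} placeholder; arguments without placeholders are unchanged.
expanded = [Template(part).substitute(values) for part in command]
print(" ".join(expanded))
# /usr/bin/env python run.py --batch_size=32 --optimizer=adam --test=True
```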

#### Boolean arguments

The `argparse` module does not support boolean arguments by default. To define a boolean argument, you can use the [`action`](https://docs.python.org/3/library/argparse.html#action) parameter or use a custom function to convert the string representation of the boolean value to a boolean type.

As an example, you can use the following code snippet to define a boolean argument. Pass `store_true` or `store_false` as the `action` argument to `add_argument()`.

```python theme={null}
import wandb
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--test', action='store_true')
args = parser.parse_args()

args.test  # This will be True if --test is passed, otherwise False
```

You can also define a custom function to convert the string representation of the boolean value to a boolean type. For example, the following code snippet defines the `str2bool` function, which converts a string to a boolean value.

```python theme={null}
def str2bool(v: str) -> bool:
    """Convert a string to a boolean. This is required because
    argparse does not support boolean arguments by default.
    """
    if isinstance(v, bool):
        return v
    return v.lower() in ('yes', 'true', 't', '1')
```
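The following is a short usage sketch, assuming the flag arrives in the `--test=True` form used in the sweep command above. The argument lists passed to `parse_args` are made up for the demonstration.

```python
import argparse


def str2bool(v: str) -> bool:
    """Convert strings such as 'True' or 'false' to a boolean."""
    if isinstance(v, bool):
        return v
    return v.lower() in ('yes', 'true', 't', '1')


parser = argparse.ArgumentParser()
parser.add_argument('--test', type=str2bool, default=False)

# Parse explicit argument lists instead of sys.argv for the demonstration.
print(parser.parse_args(['--test=True']).test)   # True
print(parser.parse_args(['--test=false']).test)  # False
print(parser.parse_args([]).test)                # False
```

Any string outside the accepted set (`yes`, `true`, `t`, `1`, in any letter case) is treated as `False`, so a typo such as `--test=ture` silently disables the flag.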
