Sweeps

Hyperparameter search and model optimization with W&B Sweeps

Use W&B Sweeps to automate hyperparameter search and visualize rich, interactive experiment tracking. Pick from popular search methods such as Bayesian, grid, and random search to explore the hyperparameter space. Scale and parallelize sweeps across one or more machines.

Draw insights from large hyperparameter tuning experiments with interactive dashboards.

How it works

Create a sweep with two W&B CLI commands:

  1. Initialize a sweep
wandb sweep --project <project-name> <path-to-config-file>
  2. Start the sweep agent
wandb agent <sweep-ID>

How to get started

Depending on your use case, explore the following resources to get started with W&B Sweeps:

For a step-by-step video, see: Tune Hyperparameters Easily with W&B Sweeps.

1 - Tutorial: Define, initialize, and run a sweep

Sweeps quickstart shows how to define, initialize, and run a sweep in four main steps.

This page shows how to define, initialize, and run a sweep. There are four main steps:

  1. Set up your training code
  2. Define the search space with a sweep configuration
  3. Initialize the sweep
  4. Start the sweep agent

Copy and paste the following code into a Jupyter Notebook or Python script:

# Import the W&B Python Library and log into W&B
import wandb

wandb.login()

# 1: Define objective/training function
def objective(config):
    score = config.x**3 + config.y
    return score

def main():
    wandb.init(project="my-first-sweep")
    score = objective(wandb.config)
    wandb.log({"score": score})

# 2: Define the search space
sweep_configuration = {
    "method": "random",
    "metric": {"goal": "minimize", "name": "score"},
    "parameters": {
        "x": {"max": 0.1, "min": 0.01},
        "y": {"values": [1, 3, 7]},
    },
}

# 3: Start the sweep
sweep_id = wandb.sweep(sweep=sweep_configuration, project="my-first-sweep")

wandb.agent(sweep_id, function=main, count=10)

The following sections break down and explain each step in the code sample.

Set up your training code

Define a training function that takes in hyperparameter values from wandb.config and uses them to train a model and return metrics.

Optionally provide the name of the project where you want the output of the W&B Run to be stored (project parameter in wandb.init). If the project is not specified, the run is put in an “Uncategorized” project.

# 1: Define objective/training function
def objective(config):
    score = config.x**3 + config.y
    return score


def main():
    wandb.init(project="my-first-sweep")
    score = objective(wandb.config)
    wandb.log({"score": score})

Define the search space with a sweep configuration

Within a dictionary, specify the hyperparameters you want to sweep over. For more information about configuration options, see Define sweep configuration.

The following example demonstrates a sweep configuration that uses a random search ('method': 'random'). The sweep randomly selects values for x and y from the range and the list of values specified in the configuration.

Throughout the sweep, W&B optimizes the metric specified in the metric key. In this example, W&B minimizes ('goal': 'minimize') the metric named score, which the training code logs with wandb.log.

# 2: Define the search space
sweep_configuration = {
    "method": "random",
    "metric": {"goal": "minimize", "name": "score"},
    "parameters": {
        "x": {"max": 0.1, "min": 0.01},
        "y": {"values": [1, 3, 7]},
    },
}

Initialize the Sweep

W&B uses a Sweep Controller to manage sweeps on the cloud (standard) or locally (local) across one or more machines. For more information about Sweep Controllers, see Search and stop algorithms locally.

A sweep identification number is returned when you initialize a sweep:

sweep_id = wandb.sweep(sweep=sweep_configuration, project="my-first-sweep")

For more information about initializing sweeps, see Initialize sweeps.

Start the Sweep

Use the wandb.agent API call to start a sweep.

wandb.agent(sweep_id, function=main, count=10)

Visualize results (optional)

Open your project to see your live results in the W&B App dashboard. With just a few clicks, construct rich, interactive charts like parallel coordinates plots, parameter importance analyses, and more.

Sweeps Dashboard example

For more information about how to visualize results, see Visualize sweep results. For an example dashboard, see this sample Sweeps Project.

Stop the agent (optional)

From the terminal, press Ctrl+C to stop the run that the sweep agent is currently executing. To kill the agent, press Ctrl+C again after the run stops.

2 - Add W&B (wandb) to your code

Add W&B to your Python script or Jupyter Notebook.

There are numerous ways to add the W&B Python SDK to your script or Jupyter Notebook. Outlined below is a “best practice” example of how to integrate the W&B Python SDK into your own code.

Original training script

Suppose you have the following code in a Jupyter Notebook cell or Python script. We define a function called main that mimics a typical training loop. For each epoch, the accuracy and loss are computed on the training and validation data sets. The values are randomly generated for the purpose of this example.

We also define a dictionary called config where we store hyperparameter values (line 15). At the end of the cell, we call the main function to execute the mock training code.

# train.py
import random
import numpy as np


def train_one_epoch(epoch, lr, bs):
    acc = 0.25 + ((epoch / 30) + (random.random() / 10))
    loss = 0.2 + (1 - ((epoch - 1) / 10 + random.random() / 5))
    return acc, loss


def evaluate_one_epoch(epoch):
    acc = 0.1 + ((epoch / 20) + (random.random() / 10))
    loss = 0.25 + (1 - ((epoch - 1) / 10 + random.random() / 6))
    return acc, loss


config = {"lr": 0.0001, "bs": 16, "epochs": 5}


def main():
    # Note that we define values from the `config` dictionary
    # instead of hard-coding them
    lr = config["lr"]
    bs = config["bs"]
    epochs = config["epochs"]

    for epoch in np.arange(1, epochs):
        train_acc, train_loss = train_one_epoch(epoch, lr, bs)
        val_acc, val_loss = evaluate_one_epoch(epoch)

        print("epoch: ", epoch)
        print("training accuracy:", train_acc, "training loss:", train_loss)
        print("validation accuracy:", val_acc, "training loss:", val_loss)


# Call the main function.
main()

Training script with W&B Python SDK

The following code examples demonstrate how to add the W&B Python SDK into your code. If you start W&B Sweep jobs in the CLI, you will want to explore the CLI tab. If you start W&B Sweep jobs within a Jupyter notebook or Python script, explore the Python SDK tab.

To create a W&B Sweep, we added the following to the code example:

  1. Line 1: Import the Weights & Biases Python SDK.
  2. Line 6: Create a dictionary object where the key-value pairs define the sweep configuration. In the following example, the batch size (batch_size), epochs (epochs), and the learning rate (lr) hyperparameters are varied during each sweep. For more information on how to create a sweep configuration, see Define sweep configuration.
  3. Line 19: Pass the sweep configuration dictionary to wandb.sweep. This initializes the sweep and returns a sweep ID (sweep_id). For more information on how to initialize sweeps, see Initialize sweeps.
  4. Line 33: Use the wandb.init() API to generate a background process to sync and log data as a W&B Run.
  5. Line 37-39: (Optional) Define values from wandb.config instead of hard-coding values.
  6. Line 45: Log the metric we want to optimize with wandb.log. You must log the metric defined in your configuration. Within the configuration dictionary (sweep_configuration in this example) we defined the sweep to maximize the val_acc value.
  7. Line 54: Start the sweep with the wandb.agent API call. Provide the sweep ID (line 19), the name of the function the sweep will execute (function=main), and set the maximum number of runs to try to four (count=4). For more information on how to start a W&B Sweep, see Start sweep agents.
import wandb
import numpy as np
import random

# Define sweep config
sweep_configuration = {
    "method": "random",
    "name": "sweep",
    "metric": {"goal": "maximize", "name": "val_acc"},
    "parameters": {
        "batch_size": {"values": [16, 32, 64]},
        "epochs": {"values": [5, 10, 15]},
        "lr": {"max": 0.1, "min": 0.0001},
    },
}

# Initialize sweep by passing in config.
# (Optional) Provide a name of the project.
sweep_id = wandb.sweep(sweep=sweep_configuration, project="my-first-sweep")


# Define training function that takes in hyperparameter
# values from `wandb.config` and uses them to train a
# model and return metric
def train_one_epoch(epoch, lr, bs):
    acc = 0.25 + ((epoch / 30) + (random.random() / 10))
    loss = 0.2 + (1 - ((epoch - 1) / 10 + random.random() / 5))
    return acc, loss


def evaluate_one_epoch(epoch):
    acc = 0.1 + ((epoch / 20) + (random.random() / 10))
    loss = 0.25 + (1 - ((epoch - 1) / 10 + random.random() / 6))
    return acc, loss


def main():
    run = wandb.init()

    # note that we define values from `wandb.config`
    # instead of defining hard values
    lr = wandb.config.lr
    bs = wandb.config.batch_size
    epochs = wandb.config.epochs

    for epoch in np.arange(1, epochs):
        train_acc, train_loss = train_one_epoch(epoch, lr, bs)
        val_acc, val_loss = evaluate_one_epoch(epoch)

        wandb.log(
            {
                "epoch": epoch,
                "train_acc": train_acc,
                "train_loss": train_loss,
                "val_acc": val_acc,
                "val_loss": val_loss,
            }
        )


# Start sweep job.
wandb.agent(sweep_id, function=main, count=4)

To create a W&B Sweep, we first create a YAML configuration file. The configuration file contains the hyperparameters we want the sweep to explore. In the following example, the batch size (batch_size), epochs (epochs), and the learning rate (lr) hyperparameters are varied during each sweep.

# config.yaml
program: train.py
method: random
name: sweep
metric:
  goal: maximize
  name: val_acc
parameters:
  batch_size: 
    values: [16,32,64]
  lr:
    min: 0.0001
    max: 0.1
  epochs:
    values: [5, 10, 15]

For more information on how to create a W&B Sweep configuration, see Define sweep configuration.

Note that you must provide the name of your Python script for the program key in your YAML file.

Next, we add the following to the code example:

  1. Line 1-2: Import the Weights & Biases Python SDK (wandb) and PyYAML (yaml). PyYAML is used to read in our YAML configuration file.
  2. Line 18: Read in the configuration file.
  3. Line 21: Use the wandb.init() API to generate a background process to sync and log data as a W&B Run. We pass the config object to the config parameter.
  4. Line 25 - 27: Define hyperparameter values from wandb.config instead of using hard coded values.
  5. Line 33-39: Log the metric we want to optimize with wandb.log. You must log the metric defined in your configuration. Within the configuration dictionary (sweep_configuration in this example) we defined the sweep to maximize the val_acc value.
import wandb
import yaml
import random
import numpy as np


def train_one_epoch(epoch, lr, bs):
    acc = 0.25 + ((epoch / 30) + (random.random() / 10))
    loss = 0.2 + (1 - ((epoch - 1) / 10 + random.random() / 5))
    return acc, loss


def evaluate_one_epoch(epoch):
    acc = 0.1 + ((epoch / 20) + (random.random() / 10))
    loss = 0.25 + (1 - ((epoch - 1) / 10 + random.random() / 6))
    return acc, loss


def main():
    # Set up your default hyperparameters
    with open("./config.yaml") as file:
        config = yaml.load(file, Loader=yaml.FullLoader)

    run = wandb.init(config=config)

    # Note that we define values from `wandb.config`
    # instead of  defining hard values
    lr = wandb.config.lr
    bs = wandb.config.batch_size
    epochs = wandb.config.epochs

    for epoch in np.arange(1, epochs):
        train_acc, train_loss = train_one_epoch(epoch, lr, bs)
        val_acc, val_loss = evaluate_one_epoch(epoch)

        wandb.log(
            {
                "epoch": epoch,
                "train_acc": train_acc,
                "train_loss": train_loss,
                "val_acc": val_acc,
                "val_loss": val_loss,
            }
        )


# Call the main function.
main()

Navigate to your CLI. Within your CLI, set the maximum number of runs the sweep agent should try. This step is optional. In the following example, we set the maximum number to five.

NUM=5

Next, initialize the sweep with the wandb sweep command. Provide the name of the YAML file. Optionally provide the name of the project for the project flag (--project):

wandb sweep --project sweep-demo-cli config.yaml

This returns a sweep ID. For more information on how to initialize sweeps, see Initialize sweeps.

Copy the sweep ID and replace sweepID in the following code snippet to start the sweep job with the wandb agent command:

wandb agent --count $NUM your-entity/sweep-demo-cli/sweepID

For more information on how to start sweep jobs, see Start sweep jobs.

Considerations when logging metrics

Be sure to log the metric you specify in your sweep configuration explicitly to W&B. Do not log metrics for your sweep inside of a sub-directory.

For example, consider the following pseudocode. A user wants to log the validation loss ("val_loss": loss). First they pass the values into a dictionary (line 16). However, the dictionary passed to wandb.log does not explicitly access the key-value pair in the dictionary:

# Import the W&B Python Library and log into W&B
import wandb
import random


def train():
    offset = random.random() / 5
    acc = 1 - 2**-epoch - random.random() / epoch - offset
    loss = 2**-epoch + random.random() / epoch + offset

    val_metrics = {"val_loss": loss, "val_acc": acc}
    return val_metrics


def main():
    wandb.init(entity="<entity>", project="my-first-sweep")
    val_metrics = train()
    # highlight-next-line
    wandb.log({"val_loss": val_metrics})


sweep_configuration = {
    "method": "random",
    "metric": {"goal": "minimize", "name": "val_loss"},
    "parameters": {
        "x": {"max": 0.1, "min": 0.01},
        "y": {"values": [1, 3, 7]},
    },
}

sweep_id = wandb.sweep(sweep=sweep_configuration, project="my-first-sweep")

wandb.agent(sweep_id, function=main, count=10)

Instead, explicitly access the key-value pair within the Python dictionary. For example, after you create the dictionary, specify the key-value pair when you pass the dictionary to the wandb.log method:

# Import the W&B Python Library and log into W&B
import wandb
import random


def train():
    offset = random.random() / 5
    acc = 1 - 2**-epoch - random.random() / epoch - offset
    loss = 2**-epoch + random.random() / epoch + offset

    val_metrics = {"val_loss": loss, "val_acc": acc}
    return val_metrics


def main():
    wandb.init(entity="<entity>", project="my-first-sweep")
    val_metrics = train()
    # highlight-next-line
    wandb.log({"val_loss", val_metrics["val_loss"]})


sweep_configuration = {
    "method": "random",
    "metric": {"goal": "minimize", "name": "val_loss"},
    "parameters": {
        "x": {"max": 0.1, "min": 0.01},
        "y": {"values": [1, 3, 7]},
    },
}

sweep_id = wandb.sweep(sweep=sweep_configuration, project="my-first-sweep")

wandb.agent(sweep_id, function=main, count=10)

3 - Define a sweep configuration

Learn how to create configuration files for sweeps.

A W&B Sweep combines a strategy for exploring hyperparameter values with the code that evaluates them. The strategy can be as simple as trying every option or as complex as Bayesian Optimization and Hyperband (BOHB).

Define a sweep configuration either in a Python dictionary or a YAML file. How you define your sweep configuration depends on how you want to manage your sweep.

The following guide describes how to format your sweep configuration. See Sweep configuration options for a comprehensive list of top-level sweep configuration keys.

Basic structure

Both sweep configuration format options (YAML and Python dictionary) utilize key-value pairs and nested structures.

Use top-level keys within your sweep configuration to define qualities of your sweep search such as the name of the sweep (name key), the parameters to search through (parameters key), the methodology to search the parameter space (method key), and more.

For example, the following code snippets show the same sweep configuration defined within a YAML file and within a Python dictionary. Within the sweep configuration there are five top-level keys specified: program, name, method, metric, and parameters. (The program key appears only in the YAML example because it is needed only when you run the sweep from the CLI.)

Define a sweep configuration in a YAML file if you want to manage sweeps interactively from the command line (CLI).

program: train.py
name: sweepdemo
method: bayes
metric:
  goal: minimize
  name: validation_loss
parameters:
  learning_rate:
    min: 0.0001
    max: 0.1
  batch_size:
    values: [16, 32, 64]
  epochs:
    values: [5, 10, 15]
  optimizer:
    values: ["adam", "sgd"]

Define a sweep in a Python dictionary data structure if you define your training algorithm in a Python script or Jupyter notebook.

The following code snippet stores a sweep configuration in a variable named sweep_configuration:

sweep_configuration = {
    "name": "sweepdemo",
    "method": "bayes",
    "metric": {"goal": "minimize", "name": "validation_loss"},
    "parameters": {
        "learning_rate": {"min": 0.0001, "max": 0.1},
        "batch_size": {"values": [16, 32, 64]},
        "epochs": {"values": [5, 10, 15]},
        "optimizer": {"values": ["adam", "sgd"]},
    },
}

Within the top level parameters key, the following keys are nested: learning_rate, batch_size, epochs, and optimizer. For each of the nested keys you specify, you can provide one or more values, a distribution, a probability, and more. For more information, see the parameters section in Sweep configuration options.

Double nested parameters

Sweep configurations support nested parameters. To delineate a nested parameter, use an additional parameters key under the top level parameter name. Sweep configs support multi-level nesting.
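
The following is a minimal sketch of a doubly nested parameter in a Python dictionary sweep configuration; the parameter names are hypothetical, and the inner parameters key holds the hyperparameters that belong to the parent parameter:

# Minimal sketch of a nested parameter (hypothetical names).
sweep_configuration = {
    "method": "random",
    "metric": {"goal": "minimize", "name": "loss"},
    "parameters": {
        "optimizer": {
            # The nested `parameters` key delineates the nested hyperparameters.
            "parameters": {
                "learning_rate": {"values": [0.01, 0.001]},
                "momentum": {"value": 0.9},
            },
        },
    },
}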

Specify a probability distribution for your random variables if you use a Bayesian or random hyperparameter search. For each hyperparameter:

  1. Create a top level parameters key in your sweep config.
  2. Within the parameters key, nest the following:
    1. Specify the name of the hyperparameter you want to optimize.
    2. Specify the distribution you want to use for the distribution key. Nest the distribution key-value pair underneath the hyperparameter name.
    3. Specify one or more values to explore. The value (or values) should be consistent with the distribution key.
      1. (Optional) Use an additional parameters key under the top level parameter name to delineate a nested parameter.

Sweep configuration template

The following template shows how you can configure parameters and specify search constraints. Replace hyperparameter_name with the name of your hyperparameter, and replace any values enclosed in <>.

program: <insert>
method: <insert>
parameters:
  hyperparameter_name0:
    value: 0  
  hyperparameter_name1: 
    values: [0, 0, 0]
  hyperparameter_name: 
    distribution: <insert>
    value: <insert>
  hyperparameter_name2:  
    distribution: <insert>
    min: <insert>
    max: <insert>
    q: <insert>
  hyperparameter_name3: 
    distribution: <insert>
    values:
      - <list_of_values>
      - <list_of_values>
      - <list_of_values>
early_terminate:
  type: hyperband
  s: 0
  eta: 0
  max_iter: 0
command:
- ${Command macro}
- ${Command macro}
- ${Command macro}
- ${Command macro}      

Sweep configuration examples

program: train.py
method: random
metric:
  goal: minimize
  name: loss
parameters:
  batch_size:
    distribution: q_log_uniform_values
    max: 256 
    min: 32
    q: 8
  dropout: 
    values: [0.3, 0.4, 0.5]
  epochs:
    value: 1
  fc_layer_size: 
    values: [128, 256, 512]
  learning_rate:
    distribution: uniform
    max: 0.1
    min: 0
  optimizer:
    values: ["adam", "sgd"]
The same configuration expressed as a Python dictionary (the program key is omitted because it is not needed when you start the sweep from a script or notebook):

sweep_config = {
    "method": "random",
    "metric": {"goal": "minimize", "name": "loss"},
    "parameters": {
        "batch_size": {
            "distribution": "q_log_uniform_values",
            "max": 256,
            "min": 32,
            "q": 8,
        },
        "dropout": {"values": [0.3, 0.4, 0.5]},
        "epochs": {"value": 1},
        "fc_layer_size": {"values": [128, 256, 512]},
        "learning_rate": {"distribution": "uniform", "max": 0.1, "min": 0},
        "optimizer": {"values": ["adam", "sgd"]},
    },
}

Bayes hyperband example

program: train.py
method: bayes
metric:
  goal: minimize
  name: val_loss
parameters:
  dropout:
    values: [0.15, 0.2, 0.25, 0.3, 0.4]
  hidden_layer_size:
    values: [96, 128, 148]
  layer_1_size:
    values: [10, 12, 14, 16, 18, 20]
  layer_2_size:
    values: [24, 28, 32, 36, 40, 44]
  learn_rate:
    values: [0.001, 0.01, 0.003]
  decay:
    values: [1e-5, 1e-6, 1e-7]
  momentum:
    values: [0.8, 0.9, 0.95]
  epochs:
    value: 27
early_terminate:
  type: hyperband
  s: 2
  eta: 3
  max_iter: 27

The following examples show how to specify either a minimum or maximum number of iterations for early_terminate:

early_terminate:
  type: hyperband
  min_iter: 3

The brackets for this example are: [3, 3*eta, 3*eta*eta, 3*eta*eta*eta], which equals [3, 9, 27, 81].

early_terminate:
  type: hyperband
  max_iter: 27
  s: 2

The brackets for this example are [27/eta, 27/eta/eta], which equals [9, 3].


Command example

program: main.py
metric:
  name: val_loss
  goal: minimize

method: bayes
parameters:
  optimizer.config.learning_rate:
    min: !!float 1e-5
    max: 0.1
  experiment:
    values: [expt001, expt002]
  optimizer:
    values: [sgd, adagrad, adam]

command:
- ${env}
- ${interpreter}
- ${program}
- ${args_no_hyphens}
For example, on UNIX systems the expanded command takes a form like:

/usr/bin/env python train.py --param1=value1 --param2=value2

On Windows, where ${env} is omitted, it takes a form like:

python train.py --param1=value1 --param2=value2

The following examples show how to specify common command macros:

Remove the ${interpreter} macro and provide a value explicitly to hard-code the Python interpreter. For example, the following code snippet demonstrates how to do this:

command:
  - ${env}
  - python3
  - ${program}
  - ${args}

The following shows how to add extra command line arguments not specified by sweep configuration parameters:

command:
  - ${env}
  - ${interpreter}
  - ${program}
  - "--config"
  - "your-training-config.json"
  - ${args}

If your program does not use argument parsing, you can avoid passing arguments altogether and take advantage of wandb.init picking up sweep parameters into wandb.config automatically:

command:
  - ${env}
  - ${interpreter}
  - ${program}

You can change the command to pass arguments the way tools like Hydra expect. See Hydra with W&B for more information.

command:
  - ${env}
  - ${interpreter}
  - ${program}
  - ${args_no_hyphens}

3.1 - Sweep configuration options

A sweep configuration consists of nested key-value pairs. Use top-level keys within your sweep configuration to define qualities of your sweep search such as the parameters to search through (parameters key), the methodology to search the parameter space (method key), and more.

The following table lists top-level sweep configuration keys and a brief description. See the respective sections for more information about each key.

Top-level keys Description
program (required) Training script to run
entity The entity for this sweep
project The project for this sweep
description Text description of the sweep
name The name of the sweep, displayed in the W&B UI.
method (required) The search strategy
metric The metric to optimize (only used by certain search strategies and stopping criteria)
parameters (required) Parameter bounds to search
early_terminate Any early stopping criteria
command Command structure for invoking and passing arguments to the training script
run_cap Maximum number of runs for this sweep

See the Sweep configuration structure for more information on how to structure your sweep configuration.

metric

Use the metric top-level sweep configuration key to specify the name, goal, and target value of the metric to optimize.

Key Description
name Name of the metric to optimize.
goal Either minimize or maximize (Default is minimize).
target Goal value for the metric you are optimizing. The sweep does not create new runs once any run reaches the target value that you specify. Active agents that have a run in progress when the target is reached wait for that run to complete before the agent stops creating new runs.
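
As an illustration, the following sketch uses a hypothetical val_loss metric and asks the sweep to stop creating new runs once a run reports a value of 0.05 or lower:

# Minimal sketch: stop launching new runs once any run reaches the target.
# "val_loss" is a hypothetical metric name; your training code must log it with wandb.log.
sweep_configuration = {
    "method": "bayes",
    "metric": {
        "name": "val_loss",
        "goal": "minimize",
        "target": 0.05,  # no new runs are created after a run reaches this value
    },
    "parameters": {
        "learning_rate": {"min": 0.0001, "max": 0.1},
    },
}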

parameters

In your YAML file or Python script, specify parameters as a top level key. Within the parameters key, provide the name of a hyperparameter you want to optimize. Common hyperparameters include: learning rate, batch size, epochs, optimizers, and more. For each hyperparameter you define in your sweep configuration, specify one or more search constraints.

The following table shows supported hyperparameter search constraints. Based on your hyperparameter and use case, use one of the search constraints below to tell your sweep agent where (in the case of a distribution) or what (value, values, and so forth) to search or use.

Search constraint Description
values Specifies all valid values for this hyperparameter. Compatible with grid.
value Specifies the single valid value for this hyperparameter. Compatible with grid.
distribution Specify a probability distribution. See the note following this table for information on default values.
probabilities Specify the probability of selecting each element of values when using random.
min, max (int or float) Maximum and minimum values. If int, for int_uniform-distributed hyperparameters. If float, for uniform-distributed hyperparameters.
mu (float) Mean parameter for normal- or lognormal-distributed hyperparameters.
sigma (float) Standard deviation parameter for normal- or lognormal-distributed hyperparameters.
q (float) Quantization step size for quantized hyperparameters.
parameters Nest other parameters inside a root level parameter.
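
As an illustration, the following sketch (with hypothetical parameter names) combines several of these constraints; note that the probabilities list lines up element-for-element with the values list and is used with random search:

# Sketch combining several search constraints (hypothetical parameter names).
sweep_configuration = {
    "method": "random",
    "parameters": {
        "optimizer": {
            # Weighted random choice: each probability corresponds to a value.
            "values": ["adam", "sgd"],
            "probabilities": [0.8, 0.2],
        },
        "learning_rate": {
            "distribution": "log_uniform_values",
            "min": 0.0001,
            "max": 0.1,
        },
        "epochs": {"value": 10},  # a single fixed value
    },
}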

method

Specify the hyperparameter search strategy with the method key. There are three hyperparameter search strategies to choose from: grid, random, and Bayesian search.

Iterate over every combination of hyperparameter values. Grid search makes uninformed decisions on the set of hyperparameter values to use on each iteration. Grid search can be computationally costly.

Grid search executes forever if it is searching within a continuous search space.

Choose a random, uninformed, set of hyperparameter values on each iteration based on a distribution. Random search runs forever unless you stop the process from the command line, within your Python script, or from the W&B App UI.

Specify the distribution space with the distribution key if you choose random (method: random) search.

In contrast to random and grid search, Bayesian models make informed decisions. Bayesian optimization uses a probabilistic model to decide which values to use through an iterative process of testing values on a surrogate function before evaluating the objective function. Bayesian search works well for small numbers of continuous parameters but scales poorly. For more information about Bayesian search, see the Bayesian Optimization Primer paper.

Bayesian search runs forever unless you stop the process from the command line, within your Python script, or from the W&B App UI.

Within the parameters key, nest the name of the hyperparameter. Next, specify the distribution key and specify a distribution for the value.

The following table lists the distributions W&B supports.

Value for distribution key Description
constant Constant distribution. Must specify the constant value (value) to use.
categorical Categorical distribution. Must specify all valid values (values) for this hyperparameter.
int_uniform Discrete uniform distribution on integers. Must specify max and min as integers.
uniform Continuous uniform distribution. Must specify max and min as floats.
q_uniform Quantized uniform distribution. Returns round(X / q) * q where X is uniform. q defaults to 1.
log_uniform Log-uniform distribution. Returns a value X between exp(min) and exp(max) such that the natural logarithm is uniformly distributed between min and max.
log_uniform_values Log-uniform distribution. Returns a value X between min and max such that log(X) is uniformly distributed between log(min) and log(max).
q_log_uniform Quantized log uniform. Returns round(X / q) * q where X is log_uniform. q defaults to 1.
q_log_uniform_values Quantized log uniform. Returns round(X / q) * q where X is log_uniform_values. q defaults to 1.
inv_log_uniform Inverse log uniform distribution. Returns X, where log(1/X) is uniformly distributed between min and max.
inv_log_uniform_values Inverse log uniform distribution. Returns X, where log(1/X) is uniformly distributed between log(1/max) and log(1/min).
normal Normal distribution. Return value is normally distributed with mean mu (default 0) and standard deviation sigma (default 1).
q_normal Quantized normal distribution. Returns round(X / q) * q where X is normal. q defaults to 1.
log_normal Log normal distribution. Returns a value X such that the natural logarithm log(X) is normally distributed with mean mu (default 0) and standard deviation sigma (default 1).
q_log_normal Quantized log normal distribution. Returns round(X / q) * q where X is log_normal. q defaults to 1.
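
The mu, sigma, and q keys from the tables above do not appear in the earlier examples, so the following sketch (with hypothetical parameter names) shows how they nest under a parameter alongside the distribution key:

# Sketch of distribution-based parameters (hypothetical names).
sweep_configuration = {
    "method": "random",
    "parameters": {
        "weight_decay": {
            "distribution": "log_normal",
            "mu": 0,      # mean of the underlying normal (default 0)
            "sigma": 1,   # standard deviation of the underlying normal (default 1)
        },
        "batch_size": {
            "distribution": "q_uniform",
            "min": 16,
            "max": 256,
            "q": 8,       # quantization step: returns round(X / q) * q
        },
    },
}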

early_terminate

Use early termination (early_terminate) to stop poorly performing runs. If early termination occurs, W&B stops the current run before it creates a new run with a new set of hyperparameter values.

Stopping algorithm

Hyperband hyperparameter optimization evaluates whether a program should stop or continue at one or more pre-set iteration counts, called brackets.

When a W&B run reaches a bracket, the sweep compares that run’s metric to all previously reported metric values. The sweep terminates the run if the run’s metric value is too high (when the goal is minimization) or if the run’s metric is too low (when the goal is maximization).

Brackets are based on the number of logged iterations. The number of brackets corresponds to the number of times you log the metric you are optimizing. The iterations can correspond to steps, epochs, or something in between. The numerical value of the step counter is not used in bracket calculations.

Key Description
min_iter Specify the iteration for the first bracket
max_iter Specify the maximum number of iterations.
s Specify the total number of brackets (required for max_iter)
eta Specify the bracket multiplier schedule (default: 3).
strict Enable ‘strict’ mode that prunes runs aggressively, more closely following the original Hyperband paper. Defaults to false.
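
The bracket schedules described in the examples above can be reproduced with a short sketch like the following (assuming eta keeps its default of 3; these helper functions are illustrative and not part of the W&B API):

# Sketch: compute hyperband brackets from min_iter or max_iter (eta defaults to 3).
def brackets_from_min_iter(min_iter, eta=3, num_brackets=4):
    # Brackets grow geometrically: [min_iter, min_iter*eta, min_iter*eta^2, ...]
    return [min_iter * eta**i for i in range(num_brackets)]


def brackets_from_max_iter(max_iter, s, eta=3):
    # Brackets shrink geometrically from max_iter: [max_iter/eta, max_iter/eta^2, ...]
    return [max_iter // eta**i for i in range(1, s + 1)]


print(brackets_from_min_iter(3))        # [3, 9, 27, 81]
print(brackets_from_max_iter(27, s=2))  # [9, 3]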

command

Use nested values within the command key to modify the format and contents of the command that the sweep agent runs. You can directly include fixed components such as filenames.

W&B supports the following macros for variable components of the command:

Command macro Description
${env} /usr/bin/env on Unix systems, omitted on Windows.
${interpreter} Expands to python.
${program} Training script filename specified by the sweep configuration program key.
${args} Hyperparameters and their values in the form --param1=value1 --param2=value2.
${args_no_boolean_flags} Hyperparameters and their values in the form --param1=value1 except boolean parameters are in the form --boolean_flag_param when True and omitted when False.
${args_no_hyphens} Hyperparameters and their values in the form param1=value1 param2=value2.
${args_json} Hyperparameters and their values encoded as JSON.
${args_json_file} The path to a file containing the hyperparameters and their values encoded as JSON.
${envvar} A way to pass environment variables. ${envvar:MYENVVAR} expands to the value of the MYENVVAR environment variable.

4 - Initialize a sweep

Initialize a W&B Sweep

W&B uses a Sweep Controller to manage sweeps on the cloud (standard) or locally (local) across one or more machines. After a run completes, the sweep controller will issue a new set of instructions describing a new run to execute. These instructions are picked up by agents who actually perform the runs. In a typical W&B Sweep, the controller lives on the W&B server. Agents live on your machines.

The following code snippets demonstrate how to initialize sweeps with the CLI and within a Jupyter Notebook or Python script.

Use the W&B SDK to initialize a sweep. Pass the sweep configuration dictionary to the sweep parameter. Optionally provide the name of the project for the project parameter (project) where you want the output of the W&B Run to be stored. If the project is not specified, the run is put in an “Uncategorized” project.

import wandb

# Example sweep configuration
sweep_configuration = {
    "method": "random",
    "name": "sweep",
    "metric": {"goal": "maximize", "name": "val_acc"},
    "parameters": {
        "batch_size": {"values": [16, 32, 64]},
        "epochs": {"values": [5, 10, 15]},
        "lr": {"max": 0.1, "min": 0.0001},
    },
}

sweep_id = wandb.sweep(sweep=sweep_configuration, project="project-name")

The wandb.sweep function returns the sweep ID. The sweep ID includes the entity name and the project name. Make a note of the sweep ID.

Use the W&B CLI to initialize a sweep. Provide the name of your configuration file. Optionally provide the name of the project for the project flag. If the project is not specified, the W&B Run is put in an “Uncategorized” project.

Use the wandb sweep command to initialize a sweep. The following code example initializes a sweep for a sweeps_demo project and uses a config.yaml file for the configuration.

wandb sweep --project sweeps_demo config.yaml

This command will print out a sweep ID. The sweep ID includes the entity name and the project name. Make a note of the sweep ID.

5 - Start or stop a sweep agent

Start or stop a W&B Sweep Agent on one or more machines.

Start one or more W&B Sweep agents on one or more machines. W&B Sweep agents query the W&B server you launched when you initialized a W&B Sweep (wandb sweep) for hyperparameters and use them to run model training.

To start a W&B Sweep agent, provide the W&B Sweep ID that was returned when you initialized a W&B Sweep. The W&B Sweep ID has the form:

entity/project/sweep_ID

Where:

  • entity: Your W&B username or team name.
  • project: The name of the project where you want the output of the W&B Run to be stored. If the project is not specified, the run is put in an “Uncategorized” project.
  • sweep_ID: The pseudo random, unique ID generated by W&B.

If you start a W&B Sweep agent within a Jupyter Notebook or Python script, provide the name of the function the sweep will execute.

The following code snippets demonstrate how to start an agent with W&B. We assume you already have a configuration file and you have already initialized a W&B Sweep. For more information about how to define a configuration file, see Define sweep configuration.

Use the wandb agent command to start a sweep. Provide the sweep ID that was returned when you initialized the sweep. Copy and paste the code snippet below and replace sweep_id with your sweep ID:

wandb agent sweep_id

Use the W&B Python SDK library to start a sweep. Provide the sweep ID that was returned when you initialized the sweep. In addition, provide the name of the function the sweep will execute.

wandb.agent(sweep_id=sweep_id, function=function_name)

Stop W&B agent

Optionally specify the number of W&B Runs a Sweep agent should try. The following code snippets demonstrate how to set a maximum number of W&B Runs with the CLI and within a Jupyter Notebook or Python script.

First, initialize your sweep. For more information, see Initialize sweeps.

sweep_id = wandb.sweep(sweep_config)

Next, start the sweep job. Provide the sweep ID generated from sweep initiation. Pass an integer value to the count parameter to set the maximum number of runs to try.

sweep_id, count = "dtzl1o7u", 10
wandb.agent(sweep_id, count=count)

First, initialize your sweep with the wandb sweep command. For more information, see Initialize sweeps.

wandb sweep config.yaml

Pass an integer value to the count flag to set the maximum number of runs to try.

NUM=10
SWEEPID="dtzl1o7u"
wandb agent --count $NUM $SWEEPID

6 - Parallelize agents

Parallelize W&B Sweep agents on a multi-core or multi-GPU machine.

Parallelize your W&B Sweep agents on a multi-core or multi-GPU machine. Before you get started, ensure you have initialized your W&B Sweep. For more information on how to initialize a W&B Sweep, see Initialize sweeps.

Parallelize on a multi-CPU machine

Depending on your use case, explore the following tabs to learn how to parallelize W&B Sweep agents using the CLI or within a Jupyter Notebook.

Use the wandb agent command to parallelize your W&B Sweep agent across multiple CPUs with the terminal. Provide the sweep ID that was returned when you initialized the sweep.

  1. Open more than one terminal window on your local machine.
  2. Copy and paste the code snippet below and replace sweep_id with your sweep ID:
wandb agent sweep_id

Use the W&B Python SDK library to parallelize your W&B Sweep agent across multiple CPUs within Jupyter Notebooks. Ensure you have the sweep ID that was returned when you initialized the sweep. In addition, provide the name of the function the sweep will execute for the function parameter:

  1. Open more than one Jupyter Notebook.
  2. Copy and paste the W&B Sweep ID into multiple Jupyter Notebooks to parallelize a W&B Sweep. For example, if the sweep ID is stored in a variable called sweep_id and the function is named function_name, you can paste the following code snippet into multiple notebooks to parallelize your sweep:
wandb.agent(sweep_id=sweep_id, function=function_name)

Parallelize on a multi-GPU machine

Follow the procedure outlined below to parallelize your W&B Sweep agent across multiple GPUs with a terminal using CUDA Toolkit:

  1. Open more than one terminal window on your local machine.
  2. Specify the GPU instance to use with CUDA_VISIBLE_DEVICES when you start a W&B Sweep job (wandb agent). Assign CUDA_VISIBLE_DEVICES an integer value corresponding to the GPU instance to use.

For example, suppose you have two NVIDIA GPUs on your local machine. Open a terminal window and set CUDA_VISIBLE_DEVICES to 0 (CUDA_VISIBLE_DEVICES=0). Replace sweep_ID in the following example with the W&B Sweep ID that is returned when you initialized a W&B Sweep:

Terminal 1

CUDA_VISIBLE_DEVICES=0 wandb agent sweep_ID

Open a second terminal window. Set CUDA_VISIBLE_DEVICES to 1 (CUDA_VISIBLE_DEVICES=1). Paste the same W&B Sweep ID in place of sweep_ID, as in the previous code snippet:

Terminal 2

CUDA_VISIBLE_DEVICES=1 wandb agent sweep_ID

7 - Visualize sweep results

Visualize the results of your W&B Sweeps with the W&B App UI.

Visualize the results of your W&B Sweeps with the W&B App UI. Navigate to the W&B App UI at https://wandb.ai/home. Choose the project that you specified when you initialized a W&B Sweep. You will be redirected to your project workspace. Select the Sweep icon on the left panel (broom icon). From the Sweep UI, select the name of your Sweep from the list.

By default, W&B will automatically create a parallel coordinates plot, a parameter importance plot, and a scatter plot when you start a W&B Sweep job.

Animation that shows how to navigate to the Sweep UI interface and view autogenerated plots.

Parallel coordinates charts summarize the relationship between large numbers of hyperparameters and model metrics at a glance. For more information on parallel coordinates plots, see Parallel coordinates.

Example parallel coordinates plot.

The scatter plot (left) compares the W&B Runs that were generated during the Sweep. For more information about scatter plots, see Scatter Plots.

The parameter importance plot (right) lists the hyperparameters that were the best predictors of, and most highly correlated with, desirable values of your metrics. For more information about parameter importance plots, see Parameter Importance.

Example scatter plot (left) and parameter importance plot (right).

You can alter the dependent and independent values (x and y axes) that are automatically used. Within each panel there is a pencil icon called Edit panel. Choose Edit panel. A modal will appear. Within the modal, you can alter the behavior of the graph.

For more information on all default W&B visualization options, see Panels. See the Data Visualization docs for information on how to create plots from W&B Runs that are not part of a W&B Sweep.

8 - Manage sweeps with the CLI

Pause, resume, and cancel a W&B Sweep with the CLI.

Pause, resume, and cancel a W&B Sweep with the CLI. Pausing a W&B Sweep tells the W&B agent that new W&B Runs should not be executed until the Sweep is resumed. Resuming a Sweep tells the agent to continue executing new W&B Runs. Stopping a W&B Sweep tells the W&B Sweep agent to stop creating or executing new W&B Runs. Cancelling a W&B Sweep tells the Sweep agent to kill currently executing W&B Runs and stop executing new Runs.

In each case, provide the W&B Sweep ID that was generated when you initialized a W&B Sweep. Optionally open a new terminal window to execute the following commands. A new terminal window makes it easier to execute a command if a W&B Sweep is printing output statements to your current terminal window.

Use the following guidance to pause, resume, and cancel sweeps.

Pause sweeps

Pause a W&B Sweep so it temporarily stops executing new W&B Runs. Use the wandb sweep --pause command to pause a W&B Sweep. Provide the W&B Sweep ID that you want to pause.

wandb sweep --pause entity/project/sweep_ID

Resume sweeps

Resume a paused W&B Sweep with the wandb sweep --resume command. Provide the W&B Sweep ID that you want to resume:

wandb sweep --resume entity/project/sweep_ID

Stop sweeps

Finish a W&B Sweep to stop executing new W&B Runs and let currently executing Runs finish.

wandb sweep --stop entity/project/sweep_ID

Cancel sweeps

Cancel a sweep to kill all running runs and stop running new runs. Use the wandb sweep --cancel command to cancel a W&B Sweep. Provide the W&B Sweep ID that you want to cancel.

wandb sweep --cancel entity/project/sweep_ID

For a full list of CLI command options, see the wandb sweep CLI Reference Guide.

Pause, resume, stop, and cancel a sweep across multiple agents

Pause, resume, stop, or cancel a W&B Sweep across multiple agents from a single terminal. For example, suppose you have a multi-core machine. After you initialize a W&B Sweep, you open new terminal windows and copy the Sweep ID to each new terminal.

Within any terminal, use the wandb sweep CLI command to pause, resume, stop, or cancel a W&B Sweep. For example, the following code snippet demonstrates how to pause a W&B Sweep across multiple agents with the CLI:

wandb sweep --pause entity/project/sweep_ID

Specify the --resume flag along with the Sweep ID to resume the Sweep across your agents:

wandb sweep --resume entity/project/sweep_ID

For more information on how to parallelize W&B agents, see Parallelize agents.

9 - Learn more about sweeps

Collection of useful sources for Sweeps.

Academic papers

Li, Lisha, et al. “Hyperband: A novel bandit-based approach to hyperparameter optimization.” The Journal of Machine Learning Research 18.1 (2017): 6765-6816.

Sweep Experiments

The following W&B Reports demonstrate examples of projects that explore hyperparameter optimization with W&B Sweeps.


The following how-to-guide demonstrates how to solve real-world problems with W&B:

Sweep GitHub repository

W&B advocates for open source and welcomes contributions from the community. Find the GitHub repository at https://github.com/wandb/sweeps. For information on how to contribute to the W&B open source repo, see the W&B GitHub Contribution guidelines.

10 - Manage algorithms locally

Search and stop algorithms locally instead of using the W&B cloud-hosted service.

The hyperparameter controller is hosted by Weights & Biases as a cloud service by default. W&B agents communicate with the controller to determine the next set of parameters to use for training. The controller is also responsible for running early stopping algorithms to determine which runs can be stopped.

The local controller feature allows the user to commence search and stop algorithms locally. The local controller gives the user the ability to inspect and instrument the code in order to debug issues as well as develop new features which can be incorporated into the cloud service.

Before you get started, you must install the W&B SDK (wandb). Type the following command into your command line:

pip install wandb sweeps 

The following examples assume you already have a configuration file and a training loop defined in a python script or Jupyter Notebook. For more information about how to define a configuration file, see Define sweep configuration.

Run the local controller from the command line

Initialize a sweep similarly to how you normally would when you use hyper-parameter controllers hosted by W&B as a cloud service. Specify the controller flag (controller) to indicate you want to use the local controller for W&B sweep jobs:

wandb sweep --controller config.yaml

Alternatively, you can separate initializing a sweep and specifying that you want to use a local controller into two steps.

To separate the steps, first add the following key-value to your sweep’s YAML configuration file:

controller:
  type: local

Next, initialize the sweep:

wandb sweep config.yaml

After you initialize the sweep, start a controller with wandb controller:

# wandb sweep command will print a sweep_id
wandb controller {user}/{entity}/{sweep_id}

Once you have specified that you want to use a local controller, start one or more sweep agents to execute the sweep, just as you normally would. See Start sweep agents for more information.

wandb agent sweep_ID

Run a local controller with W&B Python SDK

The following code snippets demonstrate how to specify and use a local controller with the W&B Python SDK.

The simplest way to use a controller with the Python SDK is to pass the sweep ID to the wandb.controller method. Next, use the returned object's run method to start the sweep job:

sweep = wandb.controller(sweep_id)
sweep.run()

If you want more control of the controller loop:

import time

import wandb

sweep = wandb.controller(sweep_id)
while not sweep.done():
    sweep.print_status()
    sweep.step()
    time.sleep(5)

Or even more control over the parameters served:

import wandb

sweep = wandb.controller(sweep_id)
while not sweep.done():
    params = sweep.search()
    sweep.schedule(params)
    sweep.print_status()

If you want to specify your sweep entirely with code you can do something like this:

import wandb

sweep = wandb.controller()
sweep.configure_search("grid")
sweep.configure_program("train-dummy.py")
sweep.configure_controller(type="local")
sweep.configure_parameter("param1", value=3)
sweep.create()
sweep.run()

11 - Sweeps troubleshooting

Troubleshoot common W&B Sweep issues.

Troubleshoot common error messages with the guidance suggested.

CommError, Run does not exist and ERROR Error uploading

You might have defined a W&B Run ID if both of these error messages are returned. For example, you might have a code snippet like the following defined somewhere in your Jupyter Notebook or Python script:

wandb.init(id="some-string")

You cannot set a Run ID for W&B Sweeps because W&B automatically generates random, unique IDs for Runs created by W&B Sweeps.

W&B Run IDs need to be unique within a project.

If you want to set a custom name that appears in tables and graphs, we recommend that you pass it to the name parameter when you initialize W&B. For example:

wandb.init(name="a helpful readable run name")

CUDA out of memory

Refactor your code to use process-based execution if you see this error message. More specifically, rewrite your code as a Python script and call the W&B Sweep agent from the CLI instead of from the W&B Python SDK.

As an example, suppose you rewrite your code as a Python script called train.py. Add the name of the training script (train.py) to your YAML sweep configuration file (config.yaml in this example):

program: train.py
method: bayes
metric:
  name: validation_loss
  goal: maximize
parameters:
  learning_rate:
    min: 0.0001
    max: 0.1
  optimizer:
    values: ["adam", "sgd"]

Next, add the following to your train.py Python script:

if _name_ == "_main_":
    train()

Navigate to your CLI and initialize a W&B Sweep with wandb sweep:

wandb sweep config.yaml

Make a note of the W&B Sweep ID that is returned. Next, start the Sweep job with wandb agent with the CLI instead of the Python SDK (wandb.agent). Replace sweep_ID in the code snippet below with the Sweep ID that was returned in the previous step:

wandb agent sweep_ID

anaconda 400 error

The following error usually occurs when you do not log the metric that you are optimizing:

wandb: ERROR Error while calling W&B API: anaconda 400 error: 
{"code": 400, "message": "TypeError: bad operand type for unary -: 'NoneType'"}

Within your YAML file or Python dictionary, you specify a key named metric to optimize. Ensure that you log (wandb.log) this metric within your Python script or Jupyter Notebook, and use the exact metric name that you defined in the sweep configuration. For more information about configuration files, see Define sweep configuration.
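
For example, in the following minimal sketch the key passed to wandb.log matches the metric name in the sweep configuration (val_loss, a hypothetical name) exactly, which avoids this error:

# Minimal sketch: the logged key must match the metric name in the sweep configuration.
import wandb

sweep_configuration = {
    "method": "bayes",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {"learning_rate": {"min": 0.0001, "max": 0.1}},
}


def main():
    wandb.init()
    loss = wandb.config.learning_rate  # placeholder computation for illustration
    wandb.log({"val_loss": loss})      # the key matches the metric "name" exactly


sweep_id = wandb.sweep(sweep=sweep_configuration, project="my-first-sweep")
wandb.agent(sweep_id, function=main, count=5)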

12 - Sweeps UI

Describes the different components of the Sweeps UI.

The state (State), creation time (Created), the entity that started the sweep (Creator), the number of runs completed (Run count), and the time it took to compute the sweep (Compute time) are displayed in the Sweeps UI. The expected number of runs a sweep will create (Est. Runs) is provided when you do a grid search over a discrete search space. You can also click on a sweep to pause, resume, stop, or kill the sweep from the interface.

13 - Tutorial: Create sweep job from project

Tutorial on how to create sweep jobs from a pre-existing W&B project.

This tutorial explains how to create sweep jobs from a pre-existing W&B project. We will use the Fashion MNIST dataset to train a PyTorch convolutional neural network to classify images. The required code and dataset are located in the W&B repo: https://github.com/wandb/examples/tree/master/examples/pytorch/pytorch-cnn-fashion

Explore the results in this W&B Dashboard.

1. Create a project

First, create a baseline. Download the PyTorch MNIST dataset example model from the W&B examples GitHub repository. Next, train the model. The training script is within the examples/pytorch/pytorch-cnn-fashion directory.

  1. Clone this repo: git clone https://github.com/wandb/examples.git
  2. Open this example: cd examples/pytorch/pytorch-cnn-fashion
  3. Run training manually: python train.py

Optionally, explore how the example appears in the W&B App UI dashboard.

View an example project page →

2. Create a sweep

From your project page, open the Sweep tab in the sidebar and select Create Sweep.

The auto-generated configuration guesses values to sweep over based on the runs you have completed. Edit the configuration to specify the ranges of hyperparameters you want to try. When you launch the sweep, it starts a new process on the hosted W&B sweep server. This centralized service coordinates the agents, the machines that are running the training jobs.

3. Launch agents

Next, launch an agent locally. You can launch up to 20 agents on different machines in parallel if you want to distribute the work and finish the sweep job more quickly. The agent will print out the set of parameters it’s trying next.

Now you’re running a sweep. The following image demonstrates what the dashboard looks like as the example sweep job is running. View an example project page →

Seed a new sweep with existing runs

Launch a new sweep using existing runs that you’ve previously logged.

  1. Open your project table.
  2. Select the runs you want to use with checkboxes on the left side of the table.
  3. Click the dropdown to create a new sweep.

Your sweep will now be set up on our server. All you need to do is launch one or more agents to start running runs.