Pin and compare runs - Weights & Biases Documentation

Use the W&B App to organize, identify, and compare important runs, such as top performers, production models, failed experiments, and reference runs. Two features support this:

Pinned runs: Pin runs, including runs from other projects, to keep them visible at the top of the runs list in your workspace. You can pin up to 20 runs in a workspace.
Baseline run: Specify a baseline run as your reference point for comparisons. The baseline run is always visible in the workspace and at the top of the runs list. In the runs table, summary metric deltas show how each run compares to the baseline. In line plots, the baseline appears with visually distinct styling to help with comparison.

These features are particularly useful for:

Comparing new experiments against your production model.
Comparing runs across projects.
Creating a baseline for your experiments to evaluate how new runs perform against it.
Tracking multiple candidate models during experimentation.
Evaluating whether new runs improve on your best results.

See Limitations.

Pin runs

Pin runs to keep them easily accessible at the top of your workspace. Pinned runs appear at the top of the run selector with a circular pin icon, separated from other runs by a visual divider. To hide a pinned or baseline run, or to show one that is hidden, click its visibility icon. To pin a run:

Navigate to your workspace.
In the run selector or runs table, find the run you want to pin.
Click the action menu, then select Pin run.

You can pin up to 20 runs in a workspace. If you have a baseline run, you can pin up to 19 runs because the baseline is implicitly pinned.
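The slot arithmetic above can be sketched in a few lines. This is an illustration only, not part of the W&B SDK: the constant and the function name `available_pin_slots` are hypothetical, and the limits are taken from the paragraph above.

```python
MAX_PINNED_RUNS = 20  # workspace limit stated in this page


def available_pin_slots(pinned_count: int, has_baseline: bool) -> int:
    """Return how many more runs can be pinned in a workspace."""
    # The baseline is implicitly pinned, so it uses one of the 20 slots.
    limit = MAX_PINNED_RUNS - 1 if has_baseline else MAX_PINNED_RUNS
    if not 0 <= pinned_count <= limit:
        raise ValueError(f"pinned_count must be between 0 and {limit}")
    return limit - pinned_count
```

For example, a workspace with a baseline and 19 pinned runs has no free slots, so you would need to unpin a run before pinning another.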

To unpin a run, click its pin icon, or follow the procedure to pin the run but select Unpin run instead. Pinning a run, including a run from another project, affects only your personal or saved workspace view.

Compare runs across projects

Compare runs from different projects by selecting runs from another project and pinning them to your current workspace.

Navigate to your project’s workspace.
Click on the Select runs from another project button (small square box with rounded corners and a diagonal arrow) at the top of your workspace’s run selector or the runs table.

Runs table with pinned runs from another project

In the modal, select a project from the Source project dropdown.
Search the source project’s runs to find the run you want to compare. You can search by run name or unique ID.
Select the checkbox next to each run you want to compare, then click Pin runs.

View runs that you pin from another project in the run selector or runs table. Pinned runs from other projects have an open circle icon next to their name. To see which project a pinned run comes from, hover over the run name. Artifacts, logs, and other details for pinned runs link back to the original run. W&B does not import this data into the current project or workspace. To view the details for a run pinned from another project, click the run. W&B opens the original run details in a new browser tab. To compare runs from different projects, pin the runs you want to compare, then use line plots to visually compare them.
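If you also want to compare runs from several projects outside the App, a rough programmatic counterpart is sketched below. It assumes the W&B public API (`wandb.Api().runs(...)`) and an API key for the live fetch. The helper names `fetch_runs` and `metric_rows` are hypothetical, and the stand-in runs at the bottom exist only so the sketch can run offline. This does not pin anything in the workspace.

```python
from types import SimpleNamespace


def fetch_runs(entity, project):
    """Fetch runs through the W&B public API (needs `wandb` and an API key)."""
    import wandb  # imported lazily so metric_rows() also works offline

    return list(wandb.Api().runs(f"{entity}/{project}"))


def metric_rows(runs, metric):
    """Return (project, run name, value) rows for runs that logged `metric`."""
    rows = []
    for run in runs:
        value = run.summary.get(metric)
        if value is not None:
            rows.append((run.project, run.name, value))
    # Highest value first; reverse the order for metrics such as loss.
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Offline stand-ins with the same attributes the helper reads.
# With credentials, you could instead use:
#   runs = fetch_runs("<team>", "project-a") + fetch_runs("<team>", "project-b")
runs = [
    SimpleNamespace(project="project-a", name="run-1", summary={"accuracy": 0.91}),
    SimpleNamespace(project="project-b", name="run-2", summary={"accuracy": 0.93}),
    SimpleNamespace(project="project-b", name="run-3", summary={"loss": 0.20}),
]
rows = metric_rows(runs, "accuracy")
for project, name, value in rows:
    print(f"{project}/{name}: accuracy={value}")
```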

Manage and compare a baseline run

You can designate one run as the baseline for the workspace to use it as a reference point for evaluating other runs in your workspace. In the runs selector and runs table, the baseline run appears at the top alongside pinned runs, and has a bookmark icon instead of a pin. In line plots, lines for the baseline run appear bolder than other lines. When hovering over the plot or legend, the baseline run’s line is dashed.

Demo of comparing another run with the baseline

Set a baseline run

To set a baseline run:

Navigate to your workspace.
In the run selector or runs table, find the run you want to use as your baseline.
Click the action menu, then select Set as baseline.

The baseline run appears at the top of the run selector, separated from other runs by a visual divider. The baseline run has a bookmark icon instead of a circle.

Runs table with a baseline run and pinned runs

Change the baseline run

Only one run can be the baseline at a time. To change which run is your baseline:

Navigate to your workspace.
In the run selector or runs table, find the run you want to use as your new baseline.
Click the action menu, then select Replace baseline.
If the menu item is inactive, ensure that you have at least one pinning slot available. If necessary, unpin a pinned run by clicking the circular pin icon next to a pinned run.
The new run becomes the baseline, and the previous baseline is automatically pinned so you can find it easily. Optionally, unpin it by clicking its pin icon.

Remove the baseline designation

To remove the baseline designation:

Navigate to your workspace.
In the run selector or runs table, find the current baseline run.
Click the action menu, then select Remove baseline.
If the menu item is inactive, ensure that you have at least one pinning slot available. If necessary, unpin a pinned run by clicking the circular pin icon next to a pinned run.
The previous baseline is automatically pinned so you can find it easily. Optionally, unpin it by clicking its pin icon.

Compare runs to the baseline

The baseline run is always visible in line plots for metrics the run has logged. In line plots, lines for the baseline run appear bolder than other lines.

Hover over a part of the plot to display a tooltip with values for all visible runs, including the baseline run and pinned runs.
Hover over the baseline run’s legend label to display the line prominently. It appears as a heavy dashed line. Lines for other visible runs appear with reduced saturation.
Hover over another run’s legend label to display that run’s line prominently and compare it with the baseline, which appears as a heavy dashed line. Lines for other visible runs appear with reduced saturation.

Summary metric deltas

When a run is set as the baseline, by default every other run that logs the same summary metric as the baseline run shows the delta (amount of change) of that metric from the baseline. The delta appears to the right of the metric’s value in the run’s row in the runs table. By default, the delta is shown with dark gray text on a light gray background. To turn on semantic coloring for quick visual reference, you can set the Metric directionality for a column. With directionality set:

If the other run outperforms (is directionally better than) the baseline, the delta is shown in dark teal text with a light teal background.
If the other run underperforms (is directionally worse than) the baseline, the delta is shown in dark red text with a light red background.

To set the directionality for a metric:

In the runs table, hover over the column heading for the metric.
Click the action menu that appears.
Set Metric directionality to Higher values are best or Lower values are best.
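The relationship between a delta and the metric's directionality can be sketched as follows. This is an illustration, not W&B's implementation: the function names are hypothetical, and the sketch assumes the delta is the run's value minus the baseline's value and that a zero delta is treated as neutral.

```python
from typing import Optional


def summary_delta(value: float, baseline: float) -> float:
    """Assumed convention: run value minus baseline value."""
    return value - baseline


def classify_delta(delta: float, higher_is_better: Optional[bool]) -> str:
    """Label a delta as 'better', 'worse', or 'neutral'.

    higher_is_better=None models a column with no directionality set.
    """
    if higher_is_better is None or delta == 0:
        return "neutral"
    improved = delta > 0 if higher_is_better else delta < 0
    return "better" if improved else "worse"


# Accuracy: higher values are best, so a positive delta is an improvement.
print(classify_delta(summary_delta(0.92, 0.90), higher_is_better=True))
# Loss: lower values are best, so a positive delta is a regression.
print(classify_delta(summary_delta(0.35, 0.30), higher_is_better=False))
```

The same positive delta can therefore mean an improvement for one column and a regression for another, which is why the App asks you to set directionality per metric.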

The following screenshot shows how the runs nanochat-train-base and nanochat-train-mid compare with the baseline run nanochat-train. Delta metrics are shown for TOTAL_TRAINING_TIME, TRAIN/DT, and TRAIN/GRAD_NORM.

Screenshot comparing summary metric deltas from the baseline run

Hide summary metric deltas in a workspace

By default, a workspace with a baseline run displays summary metric deltas. To hide them for a workspace:

In the workspace, click Settings.
In the drawer that appears, click Runs.
In the Baseline tab, turn off Show value deltas in the runs table.
Close the workspace settings drawer.

Use cases

This section describes some scenarios where pinned and baseline runs can help guide your experiments.

Track production models: Ensure that new models meet your quality bar before deployment.
1. Set your production model as the baseline.
2. Compare all experiments against your deployed model to identify candidates that outperform production.
Compare hyperparameter experiments: Evaluate hyperparameter sweeps or manual experiments against your best-known configuration.
1. Set your best known configuration as the baseline.
2. Pin promising candidates as you discover them.
3. Use the line plots to visually compare runs against the baseline.
4. Continue to update the baseline as you find better configurations.
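The last step of the hyperparameter workflow, deciding when a candidate should replace the baseline, can be sketched as a small helper. Everything here is hypothetical: the function name, the dictionary layout, and the sample values are illustrative assumptions, and you still change the baseline in the App as described in Change the baseline run.

```python
def pick_new_baseline(baseline, candidates, metric, higher_is_better=True):
    """Return the name of the best candidate that beats the baseline, or None.

    `baseline` is a dict of summary metrics; `candidates` maps a run name
    to its dict of summary metrics.
    """
    sign = 1 if higher_is_better else -1
    best_name, best_score = None, sign * baseline[metric]
    for name, summary in candidates.items():
        if metric not in summary:
            continue  # skip runs that did not log this metric
        score = sign * summary[metric]
        if score > best_score:
            best_name, best_score = name, score
    return best_name


candidates = {
    "candidate-a": {"loss": 0.60},
    "candidate-b": {"loss": 0.40},
}
print(pick_new_baseline({"loss": 0.50}, candidates, "loss", higher_is_better=False))
```

If no candidate improves on the baseline, the helper returns `None` and the current baseline stays in place.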

Example workflow

This section illustrates how pinned and baseline runs can help you to compare runs.

Run this example code, which simulates a hyperparameter-tuning scenario with a series of runs. Replace placeholders surrounded with angle brackets (<>) with your own values.

import wandb
import math

def train_model(learning_rate, batch_size, run_name, tags=None):
    """Simulate training a model with given hyperparameters."""
    config = {
        "learning_rate": learning_rate,
        "batch_size": batch_size,
        "optimizer": "adam",
        "architecture": "resnet50"
    }
    
    with wandb.init(
        # Replace with your team and project name
        project="hyperparameter-tuning",
        entity="<team>",
        name=run_name,
        config=config,
        tags=tags or []
    ) as run:
        # Simulate training loop
        for epoch in range(50):
            # Simulated metrics
            accuracy = 0.6 + 0.3 * (1 - math.exp(-learning_rate * epoch / 10))
            loss = 1.0 * math.exp(-learning_rate * epoch / 10)
            
            run.log({
                "epoch": epoch,
                "accuracy": accuracy,
                "loss": loss
            })

# Create baseline run with standard configuration
train_model(
    learning_rate=0.001,
    batch_size=64,
    run_name="baseline-config",
    tags=["baseline", "production"]
)

# Experiment with different learning rates
train_model(
    learning_rate=0.003,
    batch_size=64,
    run_name="lr-experiment-0.003",
    tags=["experiment"]
)

train_model(
    learning_rate=0.0001,
    batch_size=64,
    run_name="lr-experiment-0.0001",
    tags=["experiment"]
)

After running this code, your workspace has three runs.

Set baseline-config as your baseline run.
You do not need to pin baseline-config separately. The baseline is implicitly pinned and stays visible at the top of the runs list.
Compare the experiment runs against the baseline.

In the runs table, review the summary metric deltas next to each run’s values to compare the run to the baseline.
In line plots, compare the performance of one or more runs to the baseline, which is always visible.

Pin promising experiments for further investigation. In this example, after 50 epochs, lr-experiment-0.003 has the highest accuracy (~0.604) and the lowest loss (~0.985).
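To check which run to expect at the top before opening the App, you can evaluate the example's simulated formulas offline. The sketch below reuses the accuracy and loss expressions from the example code and assumes that each summary metric holds the last logged value. The helper name `final_metrics` is hypothetical, and nothing is logged to W&B.

```python
import math


def final_metrics(learning_rate: float, epochs: int = 50) -> dict:
    """Evaluate the example's simulated metrics at the last logged epoch."""
    last_epoch = epochs - 1  # range(50) logs epochs 0 through 49
    decay = math.exp(-learning_rate * last_epoch / 10)
    return {"accuracy": 0.6 + 0.3 * (1 - decay), "loss": 1.0 * decay}


learning_rates = {
    "baseline-config": 0.001,
    "lr-experiment-0.003": 0.003,
    "lr-experiment-0.0001": 0.0001,
}
results = {name: final_metrics(lr) for name, lr in learning_rates.items()}
baseline = results["baseline-config"]

# Print each run's final values with its delta from the baseline.
for name, metrics in results.items():
    acc_delta = metrics["accuracy"] - baseline["accuracy"]
    loss_delta = metrics["loss"] - baseline["loss"]
    print(f"{name}: accuracy={metrics['accuracy']:.4f} ({acc_delta:+.4f}), "
          f"loss={metrics['loss']:.4f} ({loss_delta:+.4f})")
```

Because the simulated curves depend only on the learning rate, the run with the largest learning rate ends with the highest accuracy and the lowest loss.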

Limitations

The following features are not yet supported for pinned and baseline runs:

Grouping: When viewing runs in the run selector or runs table, if runs are grouped by a column, pinned and baseline runs are not visually distinct from other runs.
Reports: In a run set in a W&B Report, pinned and baseline runs are not visually distinct from other runs.
Workspace view only: The baseline does not appear when viewing a single run’s workspace.
Line plots only: Baseline comparison is available only for line plots, and is not yet available for other panels such as bar plots or media panels.
