If you're using fastai to train your models, W&B provides an integration through the WandbCallback.

Sign up and create an API key

An API key authenticates your machine to W&B. You can generate an API key from your user profile.

  1. Click your user profile icon in the upper right corner.
  2. Select User Settings, then scroll to the API Keys section.
  3. Click Reveal. Copy the displayed API key. To hide the API key, reload the page.

Install the wandb library and log in

To install the wandb library locally and log in:

  1. Set the WANDB_API_KEY environment variable to your API key.

    export WANDB_API_KEY=<your_api_key>
  2. Install the wandb library and log in.

    pip install wandb
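    wandb login

To install the library and log in from a Python script instead:

pip install wandb

import wandb
wandb.login()

In a notebook:

!pip install wandb

import wandb
wandb.login()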
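Before logging in from code, it can help to confirm that the key is actually visible to the process. The following is a minimal sketch, assuming the key is supplied through WANDB_API_KEY as in the steps above. The helper names api_key_from_env and login_if_configured are hypothetical; wandb.login(key=...) is the library call that performs the login.

```python
import os


def api_key_from_env(env=None):
    """Return the API key stored in WANDB_API_KEY, or None if it is unset or blank."""
    env = os.environ if env is None else env
    key = env.get("WANDB_API_KEY", "").strip()
    return key or None


def login_if_configured(env=None):
    """Log in to W&B only when a key is available; return True on success."""
    key = api_key_from_env(env)
    if key is None:
        return False
    import wandb  # imported lazily so the check above works without wandb installed

    return bool(wandb.login(key=key))
```

Calling login_if_configured() at the top of a training script makes a missing key show up as a False return value rather than as an interactive prompt partway through a run.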

Add the WandbCallback to the learner or fit method

import wandb
from fastai.callback.wandb import *

# start logging a wandb run
wandb.init()

# To log only during one training phase
learn.fit(..., cbs=WandbCallback())

# To log continuously for all training phases
learn = learner(..., cbs=WandbCallback())

WandbCallback Arguments

WandbCallback accepts the following arguments:

| Args | Description |
|------|-------------|
| log | What to log from the model: gradients, parameters, all, or None (default). Losses and metrics are always logged. |
| log_preds | Whether to log prediction samples (default True). |
| log_preds_every_epoch | Whether to log predictions every epoch or only at the end of training (default False). |
| log_model | Whether to log the model (default False). This also requires SaveModelCallback. |
| model_name | The name of the file to save; overrides SaveModelCallback. |
| log_dataset | False (default); True logs the folder referenced by learn.dls.path; a path can be given explicitly to choose which folder to log. Note: the subfolder "models" is always ignored. |
| dataset_name | Name of the logged dataset (defaults to the folder name). |
| valid_dl | DataLoaders containing the items used for prediction samples (defaults to random items from learn.dls.valid). |
| n_preds | Number of logged predictions (default 36). |
| seed | Seed used for choosing the random samples. |
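As an illustration of how these arguments fit together, the sketch below validates a few options before they reach the callback. The helper wandb_callback_kwargs is hypothetical and not part of fastai or wandb; the option names and defaults are the ones listed in the table above.

```python
VALID_LOG_VALUES = ("gradients", "parameters", "all", None)


def wandb_callback_kwargs(log=None, log_preds=True, log_model=False, n_preds=36):
    """Validate a few WandbCallback options and return them as keyword arguments."""
    if log not in VALID_LOG_VALUES:
        raise ValueError(f"log must be one of {VALID_LOG_VALUES}, got {log!r}")
    if n_preds < 1:
        raise ValueError("n_preds must be a positive integer")
    return {
        "log": log,
        "log_preds": log_preds,
        "log_model": log_model,
        "n_preds": n_preds,
    }


# Example: log gradients and parameters, with fewer prediction samples.
kwargs = wandb_callback_kwargs(log="all", n_preds=16)
# With fastai and wandb installed, these would be passed on as:
#   learn = vision_learner(dls, resnet34, cbs=WandbCallback(**kwargs))
```

Checking the values up front catches a typo such as log="weights" before training starts.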

For custom workflows, you can manually log your datasets and models:

  • log_dataset(path, name=None, metadata={})
  • log_model(path, name=None, metadata={})

Note: any subfolder “models” will be ignored.
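To preview which parts of a dataset folder would be picked up, the "models" rule can be sketched with the standard library alone. This is an illustration of the documented behavior, not the library's implementation. It assumes the rule applies to a top-level subfolder named "models", and entries_to_log is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def entries_to_log(folder):
    """List the top-level entries of `folder`, skipping a subfolder named "models"."""
    kept = []
    for entry in sorted(Path(folder).iterdir()):
        if entry.is_dir() and entry.name == "models":
            continue  # mirrors the documented rule: "models" is ignored
        kept.append(entry.name)
    return kept


# Build a throwaway dataset folder to try the helper on.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "images").mkdir()
    (root / "models").mkdir()
    (root / "labels.csv").touch()
    kept = entries_to_log(root)  # the "models" folder is left out
```

With fastai and wandb installed and a run already started, the actual logging calls are log_dataset(path) and log_model(path), imported from fastai.callback.wandb.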

Distributed Training

fastai supports distributed training through the distrib_ctx context manager. W&B supports this automatically, so you can track your multi-GPU experiments out of the box.

Review this minimal example:

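import wandb
from fastai.vision.all import *
from fastai.distributed import *
from fastai.callback.wandb import WandbCallback

path = rank0_first(lambda: untar_data(URLs.PETS) / "images")

def train():
    # The data-loading arguments other than label_func were missing from the
    # source; path, get_image_files(path), valid_pct and item_tfms are assumed.
    dls = ImageDataLoaders.from_name_func(
        path,
        get_image_files(path),
        valid_pct=0.2,
        label_func=lambda x: x[0].isupper(),
        item_tfms=Resize(224),
    )
    wandb.init(project="fastai_ddp", entity="capecape")  # use your own entity
    cb = WandbCallback()
    learn = vision_learner(dls, resnet34, metrics=error_rate, cbs=cb).to_fp16()
    with learn.distrib_ctx(sync_bn=False):
        learn.fit(1)  # assumed: the training call was missing from the source

if __name__ == "__main__":
    train()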

Then, in your terminal, run the script with torchrun:

$ torchrun --nproc_per_node 2 <your_script>.py

Here --nproc_per_node 2 starts one process per GPU on a machine with 2 GPUs.

You can now run distributed training directly inside a notebook.

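import wandb
from fastai.vision.all import *

from accelerate import notebook_launcher
from fastai.distributed import *
from fastai.callback.wandb import WandbCallback

path = untar_data(URLs.PETS) / "images"

def train():
    # The data-loading arguments other than label_func were missing from the
    # source; path, get_image_files(path), valid_pct and item_tfms are assumed.
    dls = ImageDataLoaders.from_name_func(
        path,
        get_image_files(path),
        valid_pct=0.2,
        label_func=lambda x: x[0].isupper(),
        item_tfms=Resize(224),
    )
    wandb.init(project="fastai_ddp", entity="capecape")  # use your own entity
    cb = WandbCallback()
    learn = vision_learner(dls, resnet34, metrics=error_rate, cbs=cb).to_fp16()
    with learn.distrib_ctx(in_notebook=True, sync_bn=False):
        learn.fit(1)  # assumed: the training call was missing from the source

notebook_launcher(train, num_processes=2)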

Log only on the main process

In the examples above, wandb launches one run per process, so training on two GPUs ends with two runs. This can be confusing, and you may want to log only on the main process. To do so, detect which process you are in and create a run (that is, call wandb.init) only on the main process; skip it in all other processes.

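import wandb
from fastai.vision.all import *
from fastai.distributed import *
from fastai.callback.wandb import WandbCallback

path = rank0_first(lambda: untar_data(URLs.PETS) / "images")

def train():
    cb = []
    # The data-loading arguments other than label_func were missing from the
    # source; path, get_image_files(path), valid_pct and item_tfms are assumed.
    dls = ImageDataLoaders.from_name_func(
        path,
        get_image_files(path),
        valid_pct=0.2,
        label_func=lambda x: x[0].isupper(),
        item_tfms=Resize(224),
    )
    # Only the main process (rank 0) starts a run and attaches the callback.
    if rank_distrib() == 0:
        run = wandb.init(project="fastai_ddp", entity="capecape")
        cb = WandbCallback()
    learn = vision_learner(dls, resnet34, metrics=error_rate, cbs=cb).to_fp16()
    with learn.distrib_ctx(sync_bn=False):
        learn.fit(1)  # assumed: the training call was missing from the source

if __name__ == "__main__":
    train()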

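In your terminal, run:

$ torchrun --nproc_per_node 2 <your_script>.py

To do the same inside a notebook:

import wandb
from fastai.vision.all import *

from accelerate import notebook_launcher
from fastai.distributed import *
from fastai.callback.wandb import WandbCallback

path = untar_data(URLs.PETS) / "images"

def train():
    cb = []
    # The data-loading arguments other than label_func were missing from the
    # source; path, get_image_files(path), valid_pct and item_tfms are assumed.
    dls = ImageDataLoaders.from_name_func(
        path,
        get_image_files(path),
        valid_pct=0.2,
        label_func=lambda x: x[0].isupper(),
        item_tfms=Resize(224),
    )
    # Only the main process (rank 0) starts a run and attaches the callback.
    if rank_distrib() == 0:
        run = wandb.init(project="fastai_ddp", entity="capecape")
        cb = WandbCallback()
    learn = vision_learner(dls, resnet34, metrics=error_rate, cbs=cb).to_fp16()
    with learn.distrib_ctx(in_notebook=True, sync_bn=False):
        learn.fit(1)  # assumed: the training call was missing from the source

notebook_launcher(train, num_processes=2)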
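The rank check in the examples above can also be sketched without fastai, which makes the idea easy to test in isolation. This sketch assumes the launcher exports a RANK environment variable, as torchrun does. The helpers process_rank and callbacks_for_rank are hypothetical; inside fastai code, rank_distrib() remains the natural choice.

```python
import os


def process_rank(env=None):
    """Return the global rank of this process, defaulting to 0 outside a launcher."""
    env = os.environ if env is None else env
    return int(env.get("RANK", 0))


def callbacks_for_rank(make_callback, env=None):
    """Build the logging callback on rank 0 only; other ranks get an empty list."""
    if process_rank(env) == 0:
        return [make_callback()]
    return []
```

Passing a factory (for example, a function that calls wandb.init and returns WandbCallback()) instead of a ready-made instance keeps the other processes from creating a run at all.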


fastai v1

For scripts using fastai v1, we have a callback that can automatically log the model topology, losses, metrics, weights, gradients, sample predictions, and the best trained model.

import wandb
from wandb.fastai import WandbCallback

# start a run before the callback is created
wandb.init()

learn = cnn_learner(data, model, callback_fns=WandbCallback)

Requested logged data is configurable through the callback constructor.

from functools import partial

learn = cnn_learner(
    data, model, callback_fns=partial(WandbCallback, input_type="images")
)

It is also possible to use WandbCallback only when starting training. In this case it must be instantiated.

learn.fit(..., callbacks=WandbCallback(learn))

Custom parameters can also be given at that stage.

learn.fit(..., callbacks=WandbCallback(learn, input_type="images"))
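The two styles differ in when the callback is constructed: callback_fns receives a factory that is called later with the learner, while callbacks receives an instance that was built with the learner already in hand. A minimal sketch of that difference, using a hypothetical stand-in class (DemoCallback) rather than the real WandbCallback:

```python
from functools import partial


class DemoCallback:
    """Hypothetical stand-in for a callback whose first argument is the learner."""

    def __init__(self, learn, input_type=None):
        self.learn = learn
        self.input_type = input_type


# callback_fns style: pass a factory; the learner is supplied later.
factory = partial(DemoCallback, input_type="images")
learn = object()  # placeholder for a real Learner
cb_from_factory = factory(learn)

# callbacks style: the learner already exists, so build the instance directly.
cb_direct = DemoCallback(learn, input_type="images")
```

Both routes end with an equivalent callback; partial only defers the call until a learner is available.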

Example Code

We’ve created a few examples for you to see how the integration works:

Fastai v1


The WandbCallback() class supports a number of options:

| Keyword argument | Default | Description |
|------------------|---------|-------------|
| learn | N/A | The learner to hook. |
| save_model | True | Save the model if it has improved at each step. The best model is also loaded at the end of training. |
| mode | auto | min, max, or auto: how to compare the training metric specified in monitor between steps. |
| monitor | None | Training metric used to measure performance for saving the best model. None defaults to validation loss. |
| log | gradients | gradients, parameters, all, or None. Losses and metrics are always logged. |
| input_type | None | images or None. Used to display sample predictions. |
| validation_data | None | Data used for sample predictions if input_type is set. |
| predictions | 36 | Number of predictions to make if input_type is set and validation_data is None. |
| seed | 12345 | Seed for the random generator used for sample predictions if input_type is set and validation_data is None. |
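To make the mode option concrete, here is a minimal sketch of how a monitored metric might be compared between steps. It is an illustration under stated assumptions, not the callback's actual code: the function is_improvement is hypothetical, the default metric name "valid_loss" is assumed, and the rule used for auto (names containing "loss" or "error" are treated as lower-is-better) is one plausible choice.

```python
def is_improvement(current, best, mode="auto", monitor="valid_loss"):
    """Return True if `current` beats `best` under the given comparison mode."""
    if mode not in ("auto", "min", "max"):
        raise ValueError(f"mode must be 'auto', 'min' or 'max', got {mode!r}")
    if mode == "auto":
        # Assumption: loss- or error-like metric names are lower-is-better.
        mode = "min" if ("loss" in monitor or "error" in monitor) else "max"
    return current < best if mode == "min" else current > best
```

With save_model enabled, a check of this kind decides whether the model from the current step replaces the saved best model.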