Launch Experiments with wandb.init

Call wandb.init() at the top of your script to start a new run

Call wandb.init() once at the beginning of your script to initialize a new job. This creates a new run in W&B and launches a background process to sync data.

  • On Prem: If you need a private cloud or local instance of W&B, see our Self Hosted offerings.

  • Automated Environments: Most of these settings can also be controlled via Environment Variables. This is often useful when you're running jobs on a cluster.
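As a minimal sketch of the environment-variable route, the helper below sets a few W&B variables before a job starts. WANDB_PROJECT, WANDB_MODE, and WANDB_RUN_GROUP are environment variables read by the wandb library; the function name configure_wandb_env and the example values are hypothetical and used only for illustration.

```python
import os


def configure_wandb_env(project, mode="online", run_group=None, environ=None):
    """Set W&B environment variables so wandb.init() needs fewer arguments.

    `configure_wandb_env` is a hypothetical helper, not part of the wandb API.
    """
    env = os.environ if environ is None else environ
    env["WANDB_PROJECT"] = project  # project that new runs are logged to
    env["WANDB_MODE"] = mode        # for example "online" or "offline"
    if run_group is not None:
        env["WANDB_RUN_GROUP"] = run_group  # groups related runs in the UI
    return env


# Example: build the variables for one cluster job in a separate dict
# instead of modifying the current process environment.
job_env = configure_wandb_env(
    "my-project", mode="offline", run_group="sweep-a", environ={}
)
```

Passing a plain dict keeps the sketch free of side effects; omit the environ argument to write to os.environ directly before calling wandb.init().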

Reference Documentation

View the reference docs for this function, generated from the wandb Python library.

Common Questions

How do I launch multiple runs from one script?

If you're trying to start multiple runs from one script, add two things to your code:

  1. run = wandb.init(reinit=True): Use this setting to allow reinitializing runs

  2. run.finish(): Use this at the end of your run to finish logging for that run

import wandb

for x in range(10):
    run = wandb.init(reinit=True)
    for y in range(100):
        wandb.log({"metric": x + y})
    run.finish()

Alternatively, you can use a Python context manager, which automatically finishes logging when the block exits:

import wandb

for x in range(10):
    run = wandb.init(reinit=True)
    with run:
        for y in range(100):
            run.log({"metric": x + y})

InitStartError: Error communicating with wandb process

This error indicates that the library is having difficulty launching the process which synchronizes data to the server.

The following workarounds can help resolve the issue in certain environments:

Linux / OS X:

wandb.init(settings=wandb.Settings(start_method="fork"))

Google Colab:

wandb.init(settings=wandb.Settings(start_method="thread"))

How can I use wandb with multiprocessing, e.g. distributed training?

If your training program uses multiple processes, structure it so that no wandb method is called from a process that did not run wandb.init(). Two common approaches to managing multiprocess training are:

  1. Call wandb.init in all your processes, using the group keyword argument to define a shared group. Each process will have its own wandb run and the UI will group the training processes together.

  2. Call wandb.init from just one process and pass data to be logged over multiprocessing queues.

Check out the Distributed Training Guide for more detail on these two approaches, including code examples with Torch DDP.
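The second approach can be sketched as follows, assuming each worker pushes metric dictionaries onto a shared queue and only the process that called wandb.init() forwards them. The names worker and drain, the None sentinel, and the metric keys are hypothetical; the demo at the end runs in a single process with queue.Queue, which has the same put/get interface as multiprocessing.Queue.

```python
import queue


def worker(rank, steps, metrics_queue):
    """Runs in a child process: never calls wandb, only enqueues metrics."""
    for step in range(steps):
        metrics_queue.put({"rank": rank, "step": step, "loss": 1.0 / (step + 1)})
    metrics_queue.put(None)  # sentinel: this worker has finished


def drain(metrics_queue, n_workers, log_fn):
    """Runs in the process that called wandb.init(); forwards items to log_fn."""
    finished = 0
    logged = 0
    while finished < n_workers:
        item = metrics_queue.get()
        if item is None:
            finished += 1
        else:
            log_fn(item)
            logged += 1
    return logged


# In-process demo with a list standing in for the logger.
demo_queue = queue.Queue()
worker(0, 3, demo_queue)
worker(1, 2, demo_queue)
records = []
n_logged = drain(demo_queue, n_workers=2, log_fn=records.append)

# In a real job, roughly (inside an `if __name__ == "__main__":` guard):
#   run = wandb.init(project="my-project")
#   q = multiprocessing.Queue()
#   procs = [multiprocessing.Process(target=worker, args=(r, 100, q)) for r in range(4)]
#   for p in procs: p.start()
#   drain(q, n_workers=4, log_fn=run.log)
#   for p in procs: p.join()
#   run.finish()
```

Taking the logger as a parameter keeps the workers free of wandb calls and lets the queue logic be exercised without a W&B account.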

How do I programmatically access the human-readable run name?

It's available as the .name attribute of a wandb.Run.

import wandb
wandb.init()
run_name = wandb.run.name

Can I just set the run name to the run ID?

If you'd like to overwrite the run name (like snowy-owl-10) with the run ID (like qvlp96vk), you can use this snippet:

import wandb
wandb.init()
wandb.run.name = wandb.run.id
wandb.run.save()

How can I save the git commit associated with my run?

When wandb.init is called in your script, we automatically look for git information to save, including a link to a remote repo and the SHA of the latest commit. The git information should show up on your run page. If you aren't seeing it appear there, make sure that your shell's current working directory when executing your script is located in a folder managed by git.
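If the git information is missing, a quick way to check the working directory is to look for a .git entry in it or in any parent folder. The sketch below does that with the standard library only; find_git_root is a hypothetical helper, and wandb's own git detection may work differently.

```python
from pathlib import Path


def find_git_root(start="."):
    """Return the nearest ancestor of `start` that contains a .git entry, or None.

    Hypothetical debugging helper; it only approximates how git locates a repo.
    """
    current = Path(start).resolve()
    for candidate in (current, *current.parents):
        if (candidate / ".git").exists():  # .git may be a directory or a file
            return candidate
    return None


# Check the directory the script was launched from.
root = find_git_root()
print(f"git repo root: {root}" if root else "not inside a git repository")
```

If this reports that the script is not inside a repository, launch it from within the project checkout instead.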

The git commit and the command used to run the experiment are visible to you but hidden from external users, so these details remain private even if your project is public.

Is it possible to save metrics offline and sync them to W&B later?

By default, wandb.init starts a process that syncs metrics in real time to our cloud-hosted app. If your machine has no internet access, or you just want to hold off on the upload, here's how to run wandb in offline mode and sync later.

You'll need to set two environment variables.

  1. WANDB_API_KEY=$KEY, where $KEY is the API Key from your settings page

  2. WANDB_MODE="offline"

And here's a sample of what this would look like in your script:

import os

import wandb

# Replace the placeholder with the API key from your settings page
os.environ["WANDB_API_KEY"] = "YOUR_KEY_HERE"
os.environ["WANDB_MODE"] = "offline"

config = {
    "dataset": "CIFAR10",
    "machine": "offline cluster",
    "model": "CNN",
    "learning_rate": 0.01,
    "batch_size": 128,
}

# Pass the config so the hyperparameters are saved with the run
wandb.init(project="offline-demo", config=config)

for i in range(100):
    wandb.log({"accuracy": i})

And once you're ready, just run a sync command to send that folder to the cloud.

wandb sync wandb/dryrun-folder-name

LaunchError: Permission denied

If you're getting the error message LaunchError: Permission denied, you don't have permission to log to the project you're trying to send runs to. This can happen for a few reasons:

  1. You aren't logged in on this machine. Run wandb login on the command line.

  2. You've set an entity that doesn't exist. "Entity" should be your username or the name of an existing team. If you need to create a team, go to our Subscriptions page.

  3. You don't have project permissions. Ask the creator of the project to set the privacy to Open so you can log runs to this project.
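For the first cause, a best-effort check for stored credentials might look like the sketch below. It rests on two assumptions: that an API key is supplied either through the WANDB_API_KEY environment variable or through a netrc entry written by wandb login, and that the entry is stored under the host api.wandb.ai (a self-hosted instance would use a different host). has_wandb_credentials is a hypothetical helper, not part of the wandb library.

```python
import netrc
import os


def has_wandb_credentials(environ=None, netrc_path=None, host="api.wandb.ai"):
    """Best-effort check for a W&B API key on this machine.

    Hypothetical helper. Assumes `wandb login` stored the key in a netrc
    file under `host`; this is an assumption, not documented behavior.
    """
    env = os.environ if environ is None else environ
    if env.get("WANDB_API_KEY"):
        return True
    path = netrc_path or os.path.expanduser("~/.netrc")
    try:
        # authenticators() returns None when the file has no entry for host
        return netrc.netrc(path).authenticators(host) is not None
    except (OSError, netrc.NetrcParseError):
        # Missing or unreadable file: treat as "no credentials found"
        return False
```

Calling has_wandb_credentials() with no arguments inspects the current environment and the default netrc location; a False result suggests running wandb login first. It cannot detect the other two causes, which depend on the entity and project settings on the server.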