spaCy

spaCy is a popular “industrial-strength” NLP library that provides fast, accurate models with a minimum of fuss. As of spaCy v3, you can use W&B with `spacy train` to track your spaCy model’s training metrics and to save and version your models and datasets. All it takes is a few added lines in your configuration.

Sign up and create an API key

An API key authenticates your machine to W&B. You can generate an API key from your user profile:

1. Click your user profile icon in the upper right corner.
2. Select User Settings, then scroll to the API Keys section.
3. Click Reveal. Copy the displayed API key. To hide the API key, reload the page.

For a more streamlined approach, you can generate an API key by going directly to the W&B authorization page. Copy the displayed API key and save it in a secure location such as a password manager.

Install the `wandb` library and log in

To install the `wandb` library locally and log in from the command line:

Set the WANDB_API_KEY environment variable to your API key.
```
export WANDB_API_KEY=<your_api_key>
```
Install the wandb library and log in.
```
pip install wandb

wandb login
```

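To log in from Python instead, install the library from your shell and then call `wandb.login()` in your code:

```
pip install wandb
```
```
import wandb
wandb.login()
```

In a Python notebook:

```
!pip install wandb

import wandb
wandb.login()
```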
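Before starting a long training job, it can help to confirm that the API key is visible to the process. The following is a minimal sketch, not part of the W&B API: `has_wandb_api_key` is a hypothetical helper that only inspects the `WANDB_API_KEY` environment variable and never contacts W&B, so it cannot tell whether the key is valid.

```python
import os


def has_wandb_api_key(env=None):
    """Return True if WANDB_API_KEY is set to a non-empty value.

    Hypothetical helper: it only checks that the variable exists and
    does not verify the key with W&B.
    """
    env = os.environ if env is None else env
    return bool(env.get("WANDB_API_KEY", "").strip())


if __name__ == "__main__":
    if has_wandb_api_key():
        print("WANDB_API_KEY is set.")
    else:
        print("WANDB_API_KEY is not set; export it or run `wandb login` first.")
```

A missing variable does not necessarily mean you are logged out, because `wandb login` can also store credentials on disk. Treat the result as a hint rather than a definitive check.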

Add the `WandbLogger` to your spaCy config file

spaCy config files specify all aspects of training, not just logging: GPU allocation, optimizer choice, dataset paths, and more. At a minimum, under `[training.logger]` you need to provide the key `@loggers` with the value `"spacy.WandbLogger.v3"`, plus a `project_name`.

For more on how spaCy training config files work and on other options you can pass in to customize training, check out spaCy’s documentation.

```
[training.logger]
@loggers = "spacy.WandbLogger.v3"
project_name = "my_spacy_project"
remove_config_values = ["paths.train", "paths.dev", "corpora.train.path", "corpora.dev.path"]
log_dataset_dir = "./corpus"
model_log_interval = 1000
```

| Name | Description |
| --- | --- |
| `project_name` | `str`. The name of the W&B Project. The project will be created automatically if it doesn’t exist yet. |
| `remove_config_values` | `List[str]`. A list of values to exclude from the config before it is uploaded to W&B. `[]` by default. |
| `model_log_interval` | `Optional int`. If set, enables model versioning with Artifacts. Pass in the number of steps to wait between logging model checkpoints. `None` by default. |
| `log_dataset_dir` | `Optional str`. If passed a path, the dataset will be uploaded as an Artifact at the beginning of training. `None` by default. |
| `entity` | `Optional str`. If passed, the run will be created in the specified entity. |
| `run_name` | `Optional str`. If specified, the run will be created with the specified name. |
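As a rough way to catch typos in the logger block before training starts, the sketch below parses the example block with Python's standard `configparser` and checks the two required keys. This is only an illustration: `read_logger_block` and `find_problems` are hypothetical helpers, and the parsing assumes every value in the block is valid JSON, which holds for this example but is not how spaCy itself loads config files.

```python
import configparser
import json

# The logger block from the example config in this guide.
LOGGER_BLOCK = """
[training.logger]
@loggers = "spacy.WandbLogger.v3"
project_name = "my_spacy_project"
remove_config_values = ["paths.train", "paths.dev", "corpora.train.path", "corpora.dev.path"]
log_dataset_dir = "./corpus"
model_log_interval = 1000
"""


def read_logger_block(text):
    """Parse the [training.logger] section into a dict of Python values."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # preserve key case instead of lowercasing
    parser.read_string(text)
    # Assumption: each value is valid JSON, which is true for this block.
    return {key: json.loads(value) for key, value in parser["training.logger"].items()}


def find_problems(block):
    """Return a list of human-readable problems (empty if none are found)."""
    problems = []
    if block.get("@loggers") != "spacy.WandbLogger.v3":
        problems.append('@loggers should be "spacy.WandbLogger.v3"')
    if not isinstance(block.get("project_name"), str) or not block["project_name"]:
        problems.append("project_name must be a non-empty string")
    interval = block.get("model_log_interval")
    if interval is not None and not isinstance(interval, int):
        problems.append("model_log_interval must be an integer or unset")
    return problems


if __name__ == "__main__":
    block = read_logger_block(LOGGER_BLOCK)
    print(find_problems(block) or "logger block looks OK")
```

For real projects, spaCy's own config loading and validation remain the authoritative check; this sketch only covers the keys described in the table.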

Start training

Once you have added the `WandbLogger` to your spaCy training config, you can run `spacy train` as usual.

```
python -m spacy train \
    config.cfg \
    --output ./output \
    --paths.train ./train \
    --paths.dev ./dev
```

In a Python notebook:

```
!python -m spacy train \
    config.cfg \
    --output ./output \
    --paths.train ./train \
    --paths.dev ./dev
```

When training begins, a link to your training run’s W&B page is printed in the output. Follow it to open the run’s experiment tracking dashboard in the W&B web UI.
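If you launch training from a Python script rather than typing the command by hand, one possible approach is to assemble the same `spacy train` invocation as an argument list. The sketch below only builds and prints the command; `build_train_command` is a hypothetical helper, and the paths are the placeholder values used in this guide.

```python
import shlex
import sys


def build_train_command(config_path, output_dir, overrides=None):
    """Build the argument list for `python -m spacy train`.

    `overrides` maps dotted config keys (for example "paths.train") to values.
    """
    command = [sys.executable, "-m", "spacy", "train", config_path, "--output", output_dir]
    for key, value in (overrides or {}).items():
        command += [f"--{key}", str(value)]
    return command


if __name__ == "__main__":
    command = build_train_command(
        "config.cfg",
        "./output",
        {"paths.train": "./train", "paths.dev": "./dev"},
    )
    # Printed for inspection only; nothing is executed here.
    print(shlex.join(command))
```

To actually start training, you could pass the returned list to `subprocess.run(command, check=True)`, assuming spaCy and `wandb` are installed in the same environment as the interpreter.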

Last modified August 7, 2025
