spaCy is a popular "industrial-strength" NLP library: fast, accurate models with a minimum of fuss. As of spaCy v3, Weights and Biases can be used with spacy train to track your spaCy model's training metrics and to save and version your models and datasets. All it takes is a few added lines in your configuration!
In a notebook:

    !pip install wandb
    import wandb
    wandb.login()

From the command line:

    pip install wandb
    wandb login
spaCy config files specify all aspects of training, not just logging: GPU allocation, optimizer choice, dataset paths, and more. At minimum, under [training.logger] you need to provide the key @loggers with the value "spacy.WandbLogger.v2", plus a project_name. You can also turn on dataset and model versioning with one extra line each in the config file: log_dataset_dir for datasets and model_log_interval for models.
    [training.logger]
    @loggers = "spacy.WandbLogger.v2"
    project_name = "my_spacy_project"
    remove_config_values = ["paths.train", "paths.dev", "corpora.train.path", "corpora.dev.path"]
    log_dataset_dir = "./corpus"
    model_log_interval = 1000
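Before starting a long training run, it can help to confirm that the logger block contains the two required keys. The following is a minimal sketch, not spaCy's own config loader: it uses Python's standard configparser and json modules as a stand-in, assuming every value in the block is a JSON-compatible literal (true for the snippet above). The names read_logger_section and missing_keys are hypothetical helpers introduced for illustration.

```python
import configparser
import json

# Logger block copied from the config above; in practice you would read config.cfg.
CONFIG_TEXT = """
[training.logger]
@loggers = "spacy.WandbLogger.v2"
project_name = "my_spacy_project"
remove_config_values = ["paths.train", "paths.dev", "corpora.train.path", "corpora.dev.path"]
log_dataset_dir = "./corpus"
model_log_interval = 1000
"""

# The two keys the text above names as the minimum.
REQUIRED_KEYS = ("@loggers", "project_name")


def read_logger_section(text):
    """Parse the [training.logger] section into a dict of Python values."""
    # Assumption: stdlib configparser is used here as a stand-in for spaCy's loader.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["training.logger"]
    # Assumption: each value is a JSON-compatible literal (string, list, number).
    return {key: json.loads(value) for key, value in section.items()}


def missing_keys(logger_cfg):
    """Return the required keys that are absent from the logger section."""
    return [key for key in REQUIRED_KEYS if key not in logger_cfg]


logger_cfg = read_logger_section(CONFIG_TEXT)
print("missing keys:", missing_keys(logger_cfg))
print("logger:", logger_cfg["@loggers"])
```

An empty list from missing_keys means both @loggers and project_name are present; the optional versioning keys are not checked.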
Once you have added the WandbLogger to your spaCy training config, you can run spacy train as usual.
In a notebook:

    !python -m spacy train \
        config.cfg \
        --output ./output \
        --paths.train ./train \
        --paths.dev ./dev

From the command line:

    python -m spacy train \
        config.cfg \
        --output ./output \
        --paths.train ./train \
        --paths.dev ./dev
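If you launch training from a script rather than a shell, the same invocation can be assembled as an argument list. This is only a sketch under stated assumptions: build_train_command is a hypothetical helper, not part of spaCy or wandb, and it reproduces only the flags shown in the command above.

```python
import shlex
import sys


def build_train_command(config_path, output_dir, overrides):
    """Assemble the argv list for `python -m spacy train` with dotted overrides."""
    cmd = [sys.executable, "-m", "spacy", "train", config_path, "--output", output_dir]
    # Each override becomes a `--dotted.key value` pair, as in the command above.
    for key, value in overrides.items():
        cmd += [f"--{key}", str(value)]
    return cmd


cmd = build_train_command(
    "config.cfg",
    "./output",
    {"paths.train": "./train", "paths.dev": "./dev"},
)
# Print the command in a shell-quoted form for inspection.
print(shlex.join(cmd))
# To actually start training, pass the list to subprocess.run(cmd, check=True).
```

Building a list rather than a single string avoids shell-quoting problems when paths contain spaces.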