Artifacts

An overview of what W&B Artifacts are, how they work, and how to get started using them.

Use W&B Artifacts to track and version data as the inputs and outputs of your W&B Runs. For example, a model training run might take in a dataset as input and produce a trained model as output. You can log hyperparameters, metadata, and metrics to a run. You can also use one artifact to log, track, and version the dataset used to train the model as input, and another artifact for the resulting model checkpoints as output.
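The input and output relationship described above can be sketched as a single training run. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names run_training and train_model, the artifact name example_model, and the checkpoint file model.pt are hypothetical placeholders. The wandb calls used (wandb.init, run.use_artifact, artifact.download, wandb.Artifact, add_file, run.log_artifact, run.finish) are part of the W&B Python SDK.

```python
from pathlib import Path


def train_model(data_dir: str, checkpoint_path: str) -> None:
    """Hypothetical stand-in for real training; writes a placeholder checkpoint."""
    Path(checkpoint_path).write_text(f"trained on data in {data_dir}\n")


def run_training(project: str, dataset_ref: str, checkpoint_path: str = "model.pt"):
    """Consume a dataset artifact as input and log a model artifact as output."""
    import wandb  # imported here so this module loads even without wandb installed

    run = wandb.init(project=project, job_type="train")

    # Input: declare the dataset artifact as an input of this run and fetch it.
    dataset = run.use_artifact(dataset_ref)
    data_dir = dataset.download()

    train_model(data_dir, checkpoint_path)

    # Output: version the resulting checkpoint as a model artifact.
    # "example_model" is a hypothetical artifact name.
    model = wandb.Artifact(name="example_model", type="model")
    model.add_file(local_path=checkpoint_path)
    run.log_artifact(model)

    run.finish()
    return model
```

Calling run_training("artifacts-example", "example_dataset:latest") would, assuming a dataset artifact with that hypothetical name already exists in the project, link the dataset and the model to the same run.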

Use cases

You can use artifacts throughout your entire ML workflow as inputs and outputs of runs. You can use datasets, models, or even other artifacts as inputs for processing.

Use Case                Input                                   Output
Model Training          Dataset (training and validation data)  Trained Model
Dataset Pre-Processing  Dataset (raw data)                      Dataset (pre-processed data)
Model Evaluation        Model + Dataset (test data)             W&B Table
Model Optimization      Model                                   Optimized Model

Create an artifact

Create an artifact with four lines of code:

  1. Create a W&B run.
  2. Create an artifact object with the wandb.Artifact API.
  3. Add one or more files, such as a model file or dataset, to your artifact object.
  4. Log your artifact to W&B.

For example, the following code snippet shows how to log a file called dataset.h5 to an artifact called example_artifact:

import wandb

# Start a run, create the artifact, attach the file, and save it to W&B.
run = wandb.init(project="artifacts-example", job_type="add-dataset")
artifact = wandb.Artifact(name="example_artifact", type="dataset")
artifact.add_file(local_path="./dataset.h5", name="training_dataset")
artifact.save()

# Logs a version of the "example_artifact" dataset artifact with data from dataset.h5,
# stored inside the artifact under the name "training_dataset".
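Step 3 allows more than one file per artifact. The following is a minimal sketch of that case, assuming run is the object returned by wandb.init and artifact is a wandb.Artifact. The helper name log_files_as_artifact and the file names in the usage comment are hypothetical. It logs the artifact with run.log_artifact, which is another SDK call for step 4.

```python
from pathlib import Path


def log_files_as_artifact(run, artifact, paths):
    """Add every file in `paths` to `artifact`, then log the artifact to `run`."""
    files = [Path(p) for p in paths]

    # Check up front so a missing file does not leave a half-built artifact.
    missing = [str(p) for p in files if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing files: {missing}")

    for p in files:
        # Keep each file's base name as its name inside the artifact.
        artifact.add_file(local_path=str(p), name=p.name)

    return run.log_artifact(artifact)


# Usage (hypothetical file names; requires wandb and a W&B login):
#   import wandb
#   run = wandb.init(project="artifacts-example", job_type="add-dataset")
#   artifact = wandb.Artifact(name="example_artifact", type="dataset")
#   log_files_as_artifact(run, artifact, ["./train.h5", "./val.h5"])
#   run.finish()
```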

Download an artifact

Use the use_artifact method to mark an artifact as an input to your run.

Following the preceding code snippet, this next code block shows how to use the example_artifact artifact:

artifact = run.use_artifact("example_artifact:latest")  # marks the latest version of "example_artifact" as an input to the run

This returns an artifact object.

Next, use the returned object to download all contents of the artifact:

datadir = artifact.download()  # downloads the full contents of the artifact to the default directory and returns its path
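The two download steps can be combined into one helper that also reports which files arrived. This is a minimal sketch, assuming run is the object returned by wandb.init. The helper name fetch_artifact_files is hypothetical. The optional root argument of download is part of the SDK and selects the target directory instead of the default one.

```python
import os


def fetch_artifact_files(run, name, alias="latest", root=None):
    """Mark `name:alias` as an input of `run`, download it, and list its files."""
    artifact = run.use_artifact(f"{name}:{alias}")

    # download() returns the local directory that holds the artifact contents.
    local_dir = artifact.download(root=root)

    files = []
    for dirpath, _dirnames, filenames in os.walk(local_dir):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            # Report paths relative to the download directory.
            files.append(os.path.relpath(full_path, local_dir))
    return local_dir, sorted(files)


# Usage (requires wandb, a W&B login, and an existing artifact):
#   local_dir, files = fetch_artifact_files(run, "example_artifact")
```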

Next steps


Create an artifact

Create and construct a W&B Artifact. Learn how to add one or more files or a URI reference to an Artifact.

Download and use artifacts

Download and use Artifacts from multiple projects.

Update an artifact

Update an existing Artifact inside and outside of a W&B Run.

Create an artifact alias

Create custom aliases for W&B Artifacts.

Create an artifact version

Create a new artifact version from a single run or from a distributed process.

Track external files

Track files saved outside of W&B, such as in an Amazon S3 bucket, GCS bucket, HTTP file server, or even an NFS share.

Manage data

Explore artifact graphs

Traverse automatically created directed acyclic W&B Artifact graphs.

Artifact data privacy and compliance

Learn where W&B files are stored by default. Explore how to save and store sensitive information.

Tutorial: Create, track, and use a dataset artifact

The Artifacts quickstart shows how to create, track, and use a dataset artifact with W&B.


Last modified January 20, 2025: Add svg logos to front page (#1002) (e1444f4)