Artifacts overview - Weights & Biases Documentation


Use W&B Artifacts to track and version data as the inputs and outputs of your W&B Runs. For example, a model training run might take a dataset as input and produce a trained model as output. Alongside the hyperparameters, metadata, and metrics that you log to the run, you can use one artifact to log, track, and version the training dataset as input, and another artifact to do the same for the resulting model checkpoints as output.

Use cases

You can use artifacts throughout your entire ML workflow as inputs and outputs of runs. You can use datasets, models, or even other artifacts as inputs for processing.

Artifacts workflow diagram with inputs and outputs for model training, data processing, and model evaluation

Use Case	Input	Output
Model Training	Dataset (training and validation data)	Trained Model
Dataset Pre-Processing	Dataset (raw data)	Dataset (pre-processed data)
Model Evaluation	Model + Dataset (test data)	W&B Table
Model Optimization	Model	Optimized Model
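
As a minimal sketch of the Model Training row above, the following function marks a dataset artifact as the input of a run and logs a saved model file as its output. The function name train_with_lineage, the artifact name trained_model, and the model_path argument are hypothetical. The sketch assumes that run is an active run returned by wandb.init() and that your own training code writes the model file.

```python
def train_with_lineage(run, dataset_name, model_path):
    """Record a dataset artifact as input and a model file as output of `run`."""
    # Input: declaring the dataset with use_artifact() links it to this run.
    dataset = run.use_artifact(dataset_name)
    data_dir = dataset.download()  # local directory holding the dataset files

    # Placeholder: train on the files in data_dir and save the model to model_path.

    # Output: log_artifact() also accepts a file path plus a name and type.
    model = run.log_artifact(model_path, name="trained_model", type="model")
    return data_dir, model
```

Because the same run both uses one artifact and logs another, W&B can record the dataset as the input and the model as the output of that run.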

The following code snippets are meant to be run in order.

Create an artifact

Create an artifact with four lines of code:

Create a W&B run.
Create an artifact object with wandb.Artifact.
Add one or more files, such as a model file or dataset, to the artifact object with wandb.Artifact.add_file().
Log your artifact to W&B with wandb.Run.log_artifact().

For example, the following code snippet shows how to log a file called dataset.h5 to an artifact called example_artifact:

import wandb

with wandb.init(project="artifacts-example", job_type="add-dataset") as run:
    artifact = wandb.Artifact(name="example_artifact", type="dataset")
    artifact.add_file(local_path="./dataset.h5", name="training_dataset")
    run.log_artifact(artifact)

The type of the artifact affects how it appears in the W&B platform. If you do not specify a type, it defaults to unspecified.
In the W&B App, each label of the artifact type dropdown represents a different type parameter value. In the above code snippet, the artifact’s type is dataset.

See the track external files page for information on how to add references to files or directories stored in external object storage, like an Amazon S3 bucket.

Download an artifact

Indicate the artifact you want to mark as input to your run with the wandb.Run.use_artifact() method. Continuing from the previous code snippet, the following code example shows how to use the artifact called example_artifact that was created earlier:

with wandb.init(project="artifacts-example", job_type="add-dataset") as run:
    artifact = run.use_artifact("example_artifact:latest")  # returns the artifact object for "example_artifact"

This returns an artifact object. Next, use the returned object to download all contents of the artifact:

datadir = artifact.download()  # downloads the full `example_artifact` artifact to the default directory

You can pass a custom path into the root parameter to download an artifact to a specific directory. For alternate ways to download artifacts and to see additional parameters, see the guide on downloading and using artifacts.
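
A minimal sketch of the root parameter, assuming that run is an active run returned by wandb.init(). The helper name download_artifact_to is hypothetical, not part of the W&B SDK.

```python
from pathlib import Path


def download_artifact_to(run, artifact_name, target_dir):
    """Mark an artifact as a run input and download it into target_dir."""
    artifact = run.use_artifact(artifact_name)
    # root overrides the default download directory.
    local_dir = artifact.download(root=target_dir)
    return Path(local_dir)
```

Inside a `with wandb.init(...) as run:` block, calling download_artifact_to(run, "example_artifact:latest", "./data") should place the file that was added under the name training_dataset at ./data/training_dataset.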

Next steps

Learn how to version and update artifacts.
Learn how to trigger downstream workflows or notify a Slack channel in response to changes to your artifacts with automations.
Learn about the registry, a space that houses trained models.
Explore the Python SDK and CLI reference guides.