Create an artifact
Create, construct a W&B Artifact. Learn how to add one or more files or a URI reference to an Artifact.
3 minute read
Use W&B Artifacts to track and version data as the inputs and outputs of your W&B Runs. For example, a model training run might take in a dataset as input and produce a trained model as output. You can log hyperparameters, metadatra, and metrics to a run, and you can use an artifact to log, track, and version the dataset used to train the model as input and another artifact for the resulting model checkpoints as output.
You can use artifacts throughout your entire ML workflow as inputs and outputs of runs. You can use datasets, models, or even other artifacts as inputs for processing.
Use Case | Input | Output |
---|---|---|
Model Training | Dataset (training and validation data) | Trained Model |
Dataset Pre-Processing | Dataset (raw data) | Dataset (pre-processed data) |
Model Evaluation | Model + Dataset (test data) | W&B Table |
Model Optimization | Model | Optimized Model |
Create an artifact with four lines of code:
wandb.Artifact
API.For example, the proceeding code snippet shows how to log a file called dataset.h5
to an artifact called example_artifact
:
import wandb
run = wandb.init(project = "artifacts-example", job_type = "add-dataset")
artifact = wandb.Artifact(name = "example_artifact", type = "dataset")
artifact.add_file(local_path = "./dataset.h5", name = "training_dataset")
artifact.save()
# Logs the artifact version "my_data" as a dataset with data from dataset.h5
Indicate the artifact you want to mark as input to your run with the use_artifact
method.
Following the preceding code snippet, this next code block shows how to use the training_dataset
artifact:
artifact = run.use_artifact("training_dataset:latest") #returns a run object using the "my_data" artifact
This returns an artifact object.
Next, use the returned object to download all contents of the artifact:
datadir = artifact.download() #downloads the full "my_data" artifact to the default directory.
root
parameter to download an artifact to a specific directory. For alternate ways to download artifacts and to see additional parameters, see the guide on downloading and using artifacts.Create, construct a W&B Artifact. Learn how to add one or more files or a URI reference to an Artifact.
Download and use Artifacts from multiple projects.
Update an existing Artifact inside and outside of a W&B Run.
Create custom aliases for W&B Artifacts.
Create a new artifact version from a single run or from a distributed process.
Track files saved outside the W&B such as in an Amazon S3 bucket, GCS bucket, HTTP file server, or even an NFS share.
Traverse automatically created direct acyclic W&B Artifact graphs.
Learn where W&B files are stored by default. Explore how to save, store sensitive information.
Artifacts quickstart shows how to create, track, and use a dataset artifact with W&B.
Was this page helpful?
Glad to hear it! Please tell us how we can improve.
Sorry to hear that. Please tell us how we can improve.