> Create and log a W&B Artifact. Learn how to add one or more files or a URI reference to an Artifact.

# Create an artifact

Use the W\&B Python SDK to construct artifacts from [W\&B Runs](/models/ref/python/experiments/run). You can add [files, directories, URIs, and files from parallel runs to artifacts](#add-files-to-an-artifact). After you add a file to an artifact, save the artifact to the W\&B Server or [your own private server](/platform/hosting/hosting-options/self-managed). Each artifact is associated with a run.

For information on how to track external files, such as files stored in Amazon S3, see the [Track external files](./track-external-files) page.

## Construct an artifact

Construct a [W\&B Artifact](/models/ref/python/experiments/artifact) in three steps:

1. [Create an artifact Python object with `wandb.Artifact()`](/models/artifacts/construct-an-artifact#1-create-an-artifact-python-object-with-wandb-artifact)
2. [Add one or more files to the artifact](/models/artifacts/construct-an-artifact#2-add-one-more-files-to-the-artifact)
3. [Save your artifact to the W\&B server](/models/artifacts/construct-an-artifact#3-save-your-artifact-to-the-w\&b-server)

### 1. Create an artifact Python object with `wandb.Artifact()`

Initialize the [`wandb.Artifact()`](/models/ref/python/experiments/artifact) class to create an artifact object. Specify the following parameters:

* **Name**: The name of your artifact. The name should be unique, descriptive, and easy to remember.
* **Type**: The type of artifact. The type should be simple, descriptive, and correspond to a single step of your machine learning pipeline. Common artifact types include `'dataset'` or `'model'`.

<Note>
  W\&B uses the "name" and "type" you provide to create a directed acyclic graph in the W\&B App. See [Explore and traverse artifact graphs](./explore-and-traverse-an-artifact-graph) for more information.
</Note>

<Warning>
  Artifacts cannot have the same name, regardless of type. In other words, you cannot create an artifact named `cats` of type `dataset` and another artifact with the same name of type `model`.
</Warning>

You can optionally provide a description and metadata when you initialize an artifact object. For more information on available attributes and parameters, see the [`wandb.Artifact`](/models/ref/python/experiments/artifact) Class definition in the Python SDK Reference Guide.

Copy and paste the following code snippet to create an artifact object. Replace the `<name>` and `<type>` placeholders with your own values:

```python theme={null}
import wandb

# Create an artifact object
artifact = wandb.Artifact(name="<name>", type="<type>")
```

### 2. Add one more files to the artifact

[Add files, directories, external URI references (such as Amazon S3) and more](/models/artifacts/construct-an-artifact#add-files-to-an-artifact) to your artifact object.

To add a single file, use the artifact object's [`Artifact.add_file()`](/models/ref/python/experiments/artifact#add_file) method:

```python theme={null}
artifact.add_file(local_path="path/to/file.txt", name="<name>")
```

To add a directory, use the [`Artifact.add_dir()`](/models/ref/python/experiments/artifact#add_dir) method:

```python theme={null}
artifact.add_dir(local_path="path/to/directory", name="<name>")
```

See the next section, [Add files to an artifact](/models/artifacts/construct-an-artifact#add-files-to-an-artifact), for more information on how to add different file types to an artifact.

### 3. Save your artifact to the W\&B server

Use the run object's [`wandb.Run.log_artifact()`](/models/ref/python/experiments/run#log_artifact) method to save your artifact to the W\&B server.

```python theme={null}
with wandb.init(project="<project>", job_type="<job-type>") as run:
    run.log_artifact(artifact)
```

<Tip>
  **When to use `wandb.Run.log_artifact()` or `Artifact.save()`**

  * Use `wandb.Run.log_artifact()` to create a new artifact and associate it with a specific run.
  * Use `Artifact.save()` to update an existing artifact without creating a new run.
</Tip>

Putting this all together, the following code snippet demonstrates how to create an artifact, add a file and a directory to the artifact, and save the artifact to W\&B:

```python theme={null}
import wandb

artifact = wandb.Artifact(name="<name>", type="<type>")
artifact.add_file(local_path="path/to/file.txt", name="<name>")
artifact.add_dir(local_path="path/to/directory", name="<name>")

with wandb.init(project="<project>", job_type="<job-type>") as run:
    run.log_artifact(artifact)
```

Each time you log an artifact with the same name and type, W\&B creates a new version of that artifact. For more information, see [Create a new artifact version](/models/artifacts/create-a-new-artifact-version).

<Warning>
  W\&B performs calls to `wandb.Run.log_artifact()` asynchronously so that uploads stay fast. This can cause surprising behavior when logging artifacts in a loop. For example:

  ```python theme={null}
  with wandb.init() as run:
      for i in range(10):
          a = wandb.Artifact(
              name="race",
              type="dataset",
              metadata={"index": i},
          )
          # ... add files to artifact a ...
          run.log_artifact(a)
  ```

  The artifact version **v0** is NOT guaranteed to have an index of 0 in its metadata because artifacts may be logged in an arbitrary order.
</Warning>
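If the order of versions matters, one option is to wait for each artifact to finish logging before logging the next one. The following is a minimal sketch, not an official recipe: the helper name `log_artifacts_in_order` is hypothetical, and it assumes that `run.log_artifact()` returns the artifact and that calling `wait()` on it blocks until logging completes.

```python
def log_artifacts_in_order(run, artifacts):
    """Log artifacts one at a time, waiting for each to finish.

    `run` is expected to behave like a W&B run (for example, the object
    yielded by `with wandb.init() as run:`), and each item in `artifacts`
    like a `wandb.Artifact`. The helper itself is hypothetical.
    """
    logged = []
    for artifact in artifacts:
        # Assumption: log_artifact() returns the artifact being logged.
        logged_artifact = run.log_artifact(artifact)
        # Assumption: wait() blocks until this artifact has finished
        # logging, so the next artifact cannot overtake it.
        logged_artifact.wait()
        logged.append(logged_artifact)
    return logged
```

Waiting after every call gives up the parallelism that makes asynchronous uploads fast, so this approach fits best when you log only a few artifacts in a loop.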

## Add files to an artifact

The following sections demonstrate how to add different types of objects to an artifact. Assume you have a directory with the following structure as you read through the examples:

```
root-directory
| - hello.txt
| - images/
| -- | cat.png
| -- | dog.png
| - checkpoints/
| -- | model.h5
| - models/
| -- | model.h5
```

### Add a single file

Use [`wandb.Artifact.add_file()`](/models/ref/python/experiments/artifact#method-artifact-add-file) to add a single local file to an artifact. Provide the local path to the file as the `local_path` parameter:

```python theme={null}
import wandb

# Initialize an artifact object
artifact = wandb.Artifact(name="<name>", type="<type>")

# Add a single file
artifact.add_file(local_path="path/file.format")
```

For example, suppose you have a file called `'hello.txt'` in your local working directory:

```python theme={null}
artifact.add_file("hello.txt")
```

The artifact now has the following content:

```
hello.txt
```

Optionally, pass a different name to the `name` parameter to rename the file within the artifact object itself. Continuing the previous example:

```python theme={null}
artifact.add_file(
    local_path="hello.txt",
    name="new/path/hello_world.txt",
)

The file is stored in the artifact as:

```
new/path/hello_world.txt
```

The following table shows how different API calls produce different artifact contents:

| API Call                                                  | Resulting artifact  |
| --------------------------------------------------------- | ------------------- |
| `artifact.new_file('hello.txt')`                          | `hello.txt`         |
| `artifact.add_file('model.h5')`                           | `model.h5`          |
| `artifact.add_file('checkpoints/model.h5')`               | `model.h5`          |
| `artifact.add_file('model.h5', name='models/mymodel.h5')` | `models/mymodel.h5` |
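The `add_file` rows in the table follow a simple rule: without `name`, only the file's base name is kept; with `name`, that value becomes the path inside the artifact. The sketch below restates that rule in plain Python so you can check paths before adding files. `expected_artifact_path` is a hypothetical helper for illustration, not part of the W\&B SDK.

```python
import os


def expected_artifact_path(local_path, name=None):
    """Return the path a file is expected to have inside the artifact.

    Hypothetical helper that mirrors the table above; it does not call W&B.
    """
    # An explicit `name` wins; otherwise only the base name is kept,
    # so any leading directories in `local_path` are dropped.
    return name if name is not None else os.path.basename(local_path)


for local_path, name in [
    ("model.h5", None),
    ("checkpoints/model.h5", None),
    ("model.h5", "models/mymodel.h5"),
]:
    print(local_path, name, "->", expected_artifact_path(local_path, name))
```

Note that `model.h5` and `checkpoints/model.h5` both map to `model.h5`. If you add both files to the same artifact, pass a distinct `name` for at least one of them.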

### Add multiple files

Use the [`wandb.Artifact.add_dir()`](/models/ref/python/experiments/artifact#method-artifact-add-dir) method to add multiple files from a local directory to an artifact. Provide the local path to the directory as the `local_path` parameter.

```python theme={null}
import wandb

# Initialize an artifact object
artifact = wandb.Artifact(name="<name>", type="<type>")

# Add a local directory to the artifact
artifact.add_dir(local_path="path/to/directory", name="optional-prefix")
```

The following table shows how different API calls produce different artifact contents:

| API Call                                    | Resulting artifact                                                   |
| ------------------------------------------- | -------------------------------------------------------------------- |
| `artifact.add_dir('images')`                | <p><code>cat.png</code></p><p><code>dog.png</code></p>               |
| `artifact.add_dir('images', name='images')` | <p><code>images/cat.png</code></p><p><code>images/dog.png</code></p> |
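As the table suggests, files keep their path relative to the directory you add, and `name` acts as a prefix. The following sketch reproduces that behavior with the standard library only, using a temporary copy of the `images/` directory from the example structure. `expected_dir_paths` is a hypothetical helper, and the mapping is an assumption drawn from the table above rather than from the SDK itself.

```python
import os
import tempfile


def expected_dir_paths(local_path, name=None):
    """List the paths that files under `local_path` are expected to get.

    Hypothetical helper that mirrors the table above; it does not call W&B.
    """
    paths = []
    for dirpath, _dirnames, filenames in os.walk(local_path):
        for filename in filenames:
            # Path of the file relative to the directory being added.
            rel = os.path.relpath(os.path.join(dirpath, filename), start=local_path)
            # An optional `name` is prepended as a prefix.
            if name is not None:
                rel = os.path.join(name, rel)
            # Normalize to forward slashes for display.
            paths.append(rel.replace(os.sep, "/"))
    return sorted(paths)


with tempfile.TemporaryDirectory() as root:
    images = os.path.join(root, "images")
    os.makedirs(images)
    for filename in ("cat.png", "dog.png"):
        # Create empty placeholder files.
        open(os.path.join(images, filename), "wb").close()

    print(expected_dir_paths(images))
    print(expected_dir_paths(images, name="images"))
```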

### Add a URI reference

Artifacts track checksums and other information for reproducibility if the URI has a scheme that the W\&B library knows how to handle.

Add an external URI reference to an artifact with the [`wandb.Artifact.add_reference()`](/models/ref/python/experiments/artifact#method-artifact-add-reference) method. Replace the `'uri'` string with your own URI. Optionally, pass the desired path within the artifact as the `name` parameter.

```python theme={null}
# Add a URI reference
artifact.add_reference(uri="uri", name="optional-name")
```

Artifacts currently support the following URI schemes:

* `http(s)://`: A path to a file accessible over HTTP. The artifact will track checksums in the form of etags and size metadata if the HTTP server supports the `ETag` and `Content-Length` response headers.
* `s3://`: A path to an object or object prefix in S3. The artifact will track checksums and versioning information (if the bucket has object versioning enabled) for the referenced objects. Object prefixes are expanded to include the objects under the prefix, up to a maximum of 10,000 objects.
* `gs://`: A path to an object or object prefix in GCS. The artifact will track checksums and versioning information (if the bucket has object versioning enabled) for the referenced objects. Object prefixes are expanded to include the objects under the prefix, up to a maximum of 10,000 objects.
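Before adding a reference, it can help to check whether a URI uses one of the schemes listed above. The sketch below does this with the standard library. Both `LISTED_SCHEMES` and `has_listed_scheme` are hypothetical names, and the set contains only the schemes named on this page, so treat it as an assumption rather than the SDK's definitive list.

```python
from urllib.parse import urlparse

# Assumption: only the schemes listed on this page.
LISTED_SCHEMES = {"http", "https", "s3", "gs"}


def has_listed_scheme(uri):
    """Return True if `uri` uses one of the schemes listed above."""
    return urlparse(uri).scheme in LISTED_SCHEMES


for uri in [
    "s3://my-bucket/model.h5",
    "gs://my-bucket/images",
    "https://example.com/data.csv",
    "path/to/local/file.txt",
]:
    print(uri, "->", has_listed_scheme(uri))
```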

The following table shows how different API calls produce different artifact contents:

| API call                                                                      | Resulting artifact contents                                          |
| ----------------------------------------------------------------------------- | -------------------------------------------------------------------- |
| `artifact.add_reference('s3://my-bucket/model.h5')`                           | `model.h5`                                                           |
| `artifact.add_reference('s3://my-bucket/checkpoints/model.h5')`               | `model.h5`                                                           |
| `artifact.add_reference('s3://my-bucket/model.h5', name='models/mymodel.h5')` | `models/mymodel.h5`                                                  |
| `artifact.add_reference('s3://my-bucket/images')`                             | <p><code>cat.png</code></p><p><code>dog.png</code></p>               |
| `artifact.add_reference('s3://my-bucket/images', name='images')`              | <p><code>images/cat.png</code></p><p><code>images/dog.png</code></p> |

### Add files to artifacts from parallel runs

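For large datasets or distributed training, multiple parallel runs might need to contribute to a single artifact. In the following example, each parallel run adds its part with `wandb.Run.upsert_artifact()`, and a final run calls `wandb.Run.finish_artifact()` to finalize the artifact version: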

```python theme={null}
import wandb
import time

# This example uses Ray to launch runs in parallel
# for demonstration purposes.
import ray

ray.init()

artifact_type = "dataset"
artifact_name = "parallel-artifact"
table_name = "distributed_table"
parts_path = "parts"
num_parallel = 5

# Each batch of parallel writers should have its own
# unique group name.
group_name = "writer-group-{}".format(round(time.time()))


@ray.remote
def train(i):
    """
    Our writer job. Each writer adds one table to the artifact.
    """
    with wandb.init(group=group_name) as run:
        artifact = wandb.Artifact(name=artifact_name, type=artifact_type)

        # Add data to a wandb table.
        table = wandb.Table(columns=["a", "b", "c"], data=[[i, i * 2, 2**i]])

        # Add the table to a folder in the artifact
        artifact.add(table, "{}/table_{}".format(parts_path, i))

        # Upserting the artifact creates or appends data to the artifact
        run.upsert_artifact(artifact)


# Launch your runs in parallel
result_ids = [train.remote(i) for i in range(num_parallel)]

# Join on all the writers to make sure their files have
# been added before finishing the artifact.
ray.get(result_ids)

# Once all the writers are finished, finish the artifact
# to mark it ready.
with wandb.init(group=group_name) as run:
    artifact = wandb.Artifact(artifact_name, type=artifact_type)

    # Create a "PartitionedTable" pointing to the folder of tables
    # and add it to the artifact.
    artifact.add(wandb.data_types.PartitionedTable(parts_path), table_name)

    # Finish artifact finalizes the artifact, disallowing future "upserts"
    # to this version.
    run.finish_artifact(artifact)
```

## Find path for logged artifacts and other metadata

The following code snippet shows how to use the [W\&B Public API](/models/ref/python/public-api/) to list the files in a run, including their names and URLs. Replace the `<entity/project/run-id>` placeholder with your own values:

```python theme={null}
from wandb.apis.public.files import Files
from wandb.apis.public.api import Api

# Create an API client and fetch an example run
api = Api()
run = api.run("<entity/project/run-id>")

# Create a Files object to iterate over files in the run
files = Files(api.client, run)

# Iterate over files
for file in files:
    print(f"File Name: {file.name}")
    print(f"File URL: {file.url}")
    print(f"Path to file in the bucket: {file.direct_url}")
```

See the [File](/models/ref/python/public-api/file) Class for more information on available attributes and methods.
