Data security
- 1: Bring your own bucket (BYOB)
- 2: Access BYOB using pre-signed URLs
- 3: Configure IP allowlisting for Dedicated Cloud
- 4: Configure private connectivity to Dedicated Cloud
- 5: Data encryption in Dedicated cloud
1 - Bring your own bucket (BYOB)
Bring your own bucket (BYOB) allows you to store W&B artifacts and other related sensitive data in your own cloud or on-prem infrastructure. With Dedicated cloud or SaaS Cloud, data that you store in your bucket is not copied to the W&B-managed infrastructure.
- Communication between W&B SDK / CLI / UI and your buckets occurs using pre-signed URLs.
- W&B uses a garbage collection process to delete W&B Artifacts. For more information, see Deleting Artifacts.
- You can specify a sub-path when configuring a bucket to ensure that W&B does not store any files in a folder at the root of the bucket. This can help you better conform to your organization's bucket governance policy.
Configuration options
You can configure your storage bucket at one of two scopes: the instance level or the team level.
- Instance level: Any user that has relevant permissions within your organization can access files stored in your instance level storage bucket.
- Team level: Members of a W&B Team can access files stored in the bucket configured at the Team level. Team level storage buckets allow greater data access control and data isolation for teams with highly sensitive data or strict compliance requirements.
You can configure your bucket at both the instance level and separately for one or more teams within your organization.
For example, suppose you have a team called Kappa in your organization. Your organization (and Team Kappa) use the Instance level storage bucket by default. Next, you create a team called Omega. When you create Team Omega, you configure a Team level storage bucket for that team. Files generated by Team Omega are not accessible by Team Kappa. However, files created by Team Kappa are accessible by Team Omega. If you want to isolate data for Team Kappa, you must configure a Team level storage bucket for them as well.
Availability matrix
The following table shows the availability of BYOB across different W&B Server deployment types. An `X` means the feature is available on the specific deployment type.
| W&B Server deployment type | Instance level | Team level | Additional information |
|---|---|---|---|
| Dedicated cloud | X | X | Both instance and team level BYOB are available for Amazon Web Services, Google Cloud Platform, and Microsoft Azure. For team-level BYOB, you can connect to a cloud-native storage bucket in the same or another cloud, or even an S3-compatible secure storage like MinIO hosted in your cloud or on-prem infrastructure. |
| SaaS Cloud | Not applicable | X | Team-level BYOB is available only for Amazon Web Services and Google Cloud Platform. W&B fully manages the default and only storage bucket for Microsoft Azure. |
| Self-managed | X | X | Instance-level BYOB is the default since the instance is fully managed by you. If your self-managed instance is in the cloud, you can connect to a cloud-native storage bucket in the same or another cloud for team-level BYOB. You can also use S3-compatible secure storage like MinIO for either instance- or team-level BYOB. |
Cross-cloud or S3-compatible storage for team-level BYOB
You can connect to a cloud-native storage bucket in another cloud or to an S3-compatible storage bucket like MinIO for team-level BYOB in your Dedicated cloud or Self-managed instance.
To enable the use of cross-cloud or S3-compatible storage, specify the storage bucket, including the relevant access key, in one of the following formats using the `GORILLA_SUPPORTED_FILE_STORES` environment variable for your W&B instance.
Configure an S3-compatible storage for team-level BYOB in Dedicated cloud or Self-managed instance
Specify the path using the following format:
s3://<accessKey>:<secretAccessKey>@<url_endpoint>/<bucketName>?region=<region>&tls=true
The `region` parameter is mandatory, except when your W&B instance is in AWS and the `AWS_REGION` configured on the W&B instance nodes matches the region configured for the S3-compatible storage.
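As a sketch of how this path might be assembled programmatically, the snippet below builds the documented `s3://` format. The function name and the percent-encoding of the keys are illustrative assumptions, not part of the documented format; verify the exact encoding rules against your deployment.

```python
from urllib.parse import quote

def s3_compatible_path(access_key: str, secret_key: str, endpoint: str,
                       bucket: str, region: str, tls: bool = True) -> str:
    """Assemble the s3:// path described above for GORILLA_SUPPORTED_FILE_STORES."""
    # Percent-encode the keys so characters like '/' or '@' don't break the URI.
    ak, sk = quote(access_key, safe=""), quote(secret_key, safe="")
    return (f"s3://{ak}:{sk}@{endpoint}/{bucket}"
            f"?region={region}&tls={'true' if tls else 'false'}")
```

For example, for a hypothetical MinIO endpoint, `s3_compatible_path("minio-key", "minio-secret", "minio.internal:9000", "wandb-files", "us-east-1")` yields a path you can place in the environment variable.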
Configure cross-cloud native storage for team-level BYOB in Dedicated cloud or Self-managed instance
Specify the path in a format specific to the locations of your W&B instance and storage bucket:
From W&B instance in GCP or Azure to a bucket in AWS:
s3://<accessKey>:<secretAccessKey>@<s3_regional_url_endpoint>/<bucketName>
From W&B instance in GCP or AWS to a bucket in Azure:
az://:<urlEncodedAccessKey>@<storageAccountName>/<containerName>
From W&B instance in AWS or Azure to a bucket in GCP:
gs://<serviceAccountEmail>:<urlEncodedPrivateKey>@<bucketName>
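The `<urlEncodedAccessKey>` and `<urlEncodedPrivateKey>` placeholders above must be percent-encoded before they are embedded in the path. A minimal sketch using the Python standard library follows; the function names are illustrative, and the encoding of the service account email is an assumption (it contains `@`, which would otherwise be ambiguous in the URI), so verify against your deployment.

```python
from urllib.parse import quote

def azure_bucket_path(access_key: str, storage_account: str,
                      container: str) -> str:
    # az:// format from the docs above: the access key is URL-encoded,
    # and the user part before ':' is empty.
    return f"az://:{quote(access_key, safe='')}@{storage_account}/{container}"

def gcp_bucket_path(service_account_email: str, private_key: str,
                    bucket: str) -> str:
    # gs:// format from the docs above: the private key is URL-encoded.
    # The email is encoded here as well since it contains '@'.
    return (f"gs://{quote(service_account_email, safe='')}:"
            f"{quote(private_key, safe='')}@{bucket}")
```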
Reach out to W&B Support at support@wandb.com for more information.
Cloud storage in same cloud as W&B platform
Based on your use case, configure a storage bucket at the team or instance level. How a storage bucket is provisioned or configured is the same irrespective of the level it’s configured at, except for the access mechanism in Azure.
W&B recommends that you use a Terraform module managed by W&B to provision a storage bucket along with the necessary access mechanism and related IAM permissions:
- AWS
- GCP
- Azure - Instance level BYOB or Team level BYOB
- Provision the KMS Key

W&B requires you to provision a KMS key to encrypt and decrypt the data on the S3 bucket. The key usage type must be `ENCRYPT_DECRYPT`. Assign the following policy to the key:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Internal",
      "Effect": "Allow",
      "Principal": { "AWS": "<Your_Account_Id>" },
      "Action": "kms:*",
      "Resource": "<aws_kms_key.key.arn>"
    },
    {
      "Sid": "External",
      "Effect": "Allow",
      "Principal": { "AWS": "<aws_principal_and_role_arn>" },
      "Action": [
        "kms:Decrypt",
        "kms:Describe*",
        "kms:Encrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*"
      ],
      "Resource": "<aws_kms_key.key.arn>"
    }
  ]
}
```
Replace `<Your_Account_Id>` and `<aws_kms_key.key.arn>` accordingly.

If you are using SaaS Cloud or Dedicated cloud, replace `<aws_principal_and_role_arn>` with the corresponding value:

- For SaaS Cloud: `arn:aws:iam::725579432336:role/WandbIntegration`
- For Dedicated cloud: `arn:aws:iam::830241207209:root`

This policy grants your AWS account full access to the key and also assigns the required permissions to the AWS account hosting the W&B Platform. Keep a record of the KMS key ARN.
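As a sketch, the key policy can also be rendered programmatically before provisioning, for example when driving key creation from a script or Terraform template. The function name here is illustrative:

```python
import json

def kms_key_policy(account_id: str, key_arn: str,
                   wandb_principal: str) -> str:
    """Render the KMS key policy shown above with concrete values filled in."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Internal",
                "Effect": "Allow",
                "Principal": {"AWS": account_id},
                "Action": "kms:*",
                "Resource": key_arn,
            },
            {
                "Sid": "External",
                "Effect": "Allow",
                "Principal": {"AWS": wandb_principal},
                "Action": [
                    "kms:Decrypt", "kms:Describe*", "kms:Encrypt",
                    "kms:ReEncrypt*", "kms:GenerateDataKey*",
                ],
                "Resource": key_arn,
            },
        ],
    }, indent=2)
```

For a Dedicated cloud instance, `wandb_principal` would be `arn:aws:iam::830241207209:root`, per the list above.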
- Provision the S3 Bucket

Follow these steps to provision the S3 bucket in your AWS account:
- Create the S3 bucket with a name of your choice. Optionally, create a folder which you can configure as a sub-path to store all W&B files.
- Enable bucket versioning.
- Enable server-side encryption using the KMS key from the previous step.
- Configure CORS with the following policy:

```json
[
  {
    "AllowedHeaders": ["*"],
    "AllowedMethods": ["GET", "HEAD", "PUT"],
    "AllowedOrigins": ["*"],
    "ExposeHeaders": ["ETag"],
    "MaxAgeSeconds": 3600
  }
]
```
- Grant the required S3 permissions to the AWS account hosting the W&B Platform. These permissions are needed to generate pre-signed URLs that AI workloads in your cloud infrastructure or user browsers use to access the bucket.

```json
{
  "Version": "2012-10-17",
  "Id": "WandBAccess",
  "Statement": [
    {
      "Sid": "WAndBAccountAccess",
      "Effect": "Allow",
      "Principal": { "AWS": "<aws_principal_and_role_arn>" },
      "Action": [
        "s3:GetObject*",
        "s3:GetEncryptionConfiguration",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads",
        "s3:ListBucketVersions",
        "s3:AbortMultipartUpload",
        "s3:DeleteObject",
        "s3:PutObject",
        "s3:GetBucketCORS",
        "s3:GetBucketLocation",
        "s3:GetBucketVersioning"
      ],
      "Resource": [
        "arn:aws:s3:::<wandb_bucket>",
        "arn:aws:s3:::<wandb_bucket>/*"
      ]
    }
  ]
}
```
Replace `<wandb_bucket>` accordingly and keep a record of the bucket name. If you are using Dedicated cloud, share the bucket name with your W&B team in case of instance-level BYOB. In case of team-level BYOB on any deployment type, configure the bucket while creating the team.

If you are using SaaS Cloud or Dedicated cloud, replace `<aws_principal_and_role_arn>` with the corresponding value:

- For SaaS Cloud: `arn:aws:iam::725579432336:role/WandbIntegration`
- For Dedicated cloud: `arn:aws:iam::830241207209:root`
For more details, see the AWS self-managed hosting guide.
- Provision the GCS Bucket

Follow these steps to provision the GCS bucket in your GCP project:
- Create the GCS bucket with a name of your choice. Optionally, create a folder which you can configure as a sub-path to store all W&B files.
- Enable soft deletion.
- Enable object versioning.
- Set the encryption type to `Google-managed`.
- Set the CORS policy with `gsutil`. This is not possible in the UI.
  - Create a file called `cors-policy.json` locally.
  - Copy the following CORS policy into the file and save it.

```json
[
  {
    "origin": ["*"],
    "responseHeader": ["Content-Type"],
    "exposeHeaders": ["ETag"],
    "method": ["GET", "HEAD", "PUT"],
    "maxAgeSeconds": 3600
  }
]
```
  - Replace `<bucket_name>` with the correct bucket name and run `gsutil`:

```bash
gsutil cors set cors-policy.json gs://<bucket_name>
```
  - Verify the bucket's policy. Replace `<bucket_name>` with the correct bucket name.

```bash
gsutil cors get gs://<bucket_name>
```
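If you would rather generate `cors-policy.json` from code than write it by hand, a minimal sketch follows; the file is then applied with `gsutil cors set` as shown above.

```python
import json

# The same CORS policy shown above, expressed as a Python structure.
cors_policy = [{
    "origin": ["*"],
    "responseHeader": ["Content-Type"],
    "exposeHeaders": ["ETag"],
    "method": ["GET", "HEAD", "PUT"],
    "maxAgeSeconds": 3600,
}]

# Write the policy to the file that gsutil reads.
with open("cors-policy.json", "w") as f:
    json.dump(cors_policy, f, indent=2)
```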
If you are using SaaS Cloud or Dedicated cloud, grant the
Storage Admin
role to the GCP service account linked to the W&B Platform:- For SaaS Cloud, the account is:
wandb-integration@wandb-production.iam.gserviceaccount.com
- For Dedicated cloud the account is:
deploy@wandb-production.iam.gserviceaccount.com
Keep a record of the bucket name. If you are using Dedicated cloud, share the bucket name with your W&B team in case of instance level BYOB. In case of team level BYOB on any deployment type, configure the bucket while creating the team.
- For SaaS Cloud, the account is:
- Provision the Azure Blob Storage

For instance-level BYOB, if you're not using this Terraform module, follow the steps below to provision an Azure Blob Storage bucket in your Azure subscription:
- Create a bucket with a name of your choice. Optionally, create a folder which you can configure as a sub-path to store all W&B files.
- Enable blob and container soft deletion.
- Enable versioning.
- Configure the CORS policy on the bucket.
To set the CORS policy through the UI, go to the blob storage, scroll down to `Settings/Resource Sharing (CORS)`, and then set the following:

| Parameter | Value |
|---|---|
| Allowed Origins | `*` |
| Allowed Methods | `GET`, `HEAD`, `PUT` |
| Allowed Headers | `*` |
| Exposed Headers | `*` |
| Max Age | `3600` |
- Generate a storage account access key, and keep a record of it along with the storage account name. If you are using Dedicated cloud, share the storage account name and access key with your W&B team using a secure sharing mechanism.
For team-level BYOB, W&B recommends that you use Terraform to provision the Azure Blob Storage bucket along with the necessary access mechanism and permissions. If you use Dedicated cloud, provide the OIDC issuer URL for your instance. Make a note of the details that you need to configure the bucket while creating the team:
- Storage account name
- Storage container name
- Managed identity client ID
- Azure tenant ID
Configure BYOB in W&B
If you are connecting an S3-compatible storage bucket or a cross-cloud bucket, specify it using the `GORILLA_SUPPORTED_FILE_STORES` environment variable for your W&B instance before you configure it for a team using the instructions below.

To configure a storage bucket at the team level when you create a W&B Team:
1. Provide a name for your team in the Team Name field.
2. Select External storage for the Storage type option.
3. Choose either New bucket from the dropdown or select an existing bucket.

   Multiple W&B Teams can use the same cloud storage bucket. To enable this, select an existing cloud storage bucket from the dropdown.

4. From the Cloud provider dropdown, select your cloud provider.
5. Provide the name of your storage bucket in the Name field. If you have a Dedicated cloud or Self-managed instance on Azure, provide the values for the Account name and Container name fields.
6. (Optional) Provide the bucket sub-path in the optional Path field. Do this if you would not like W&B to store any files in a folder at the root of the bucket.
7. (Optional if using an AWS bucket) Provide the ARN of your KMS encryption key in the KMS key ARN field.
8. (Optional if using an Azure bucket) Provide the values for the Tenant ID and the Managed Identity Client ID fields.
9. (Optional on SaaS Cloud) Invite team members while creating the team.
10. Press the Create Team button.
An error or warning appears at the bottom of the page if there are issues accessing the bucket or the bucket has invalid settings.
Reach out to W&B Support at support@wandb.com to configure instance level BYOB for your Dedicated cloud or Self-managed instance.
2 - Access BYOB using pre-signed URLs
W&B uses pre-signed URLs to simplify access to blob storage from your AI workloads or user browsers. For basic information on pre-signed URLs, refer to Pre-signed URLs for AWS S3, Signed URLs for Google Cloud Storage and Shared Access Signature for Azure Blob Storage.
When needed, AI workloads or user browser clients within your network request pre-signed URLs from the W&B Platform. The W&B Platform then accesses the relevant blob storage to generate a pre-signed URL with the required permissions, and returns it to the client. The client then uses the pre-signed URL to access the blob storage for object upload or retrieval operations. URL expiry time is 1 hour for object downloads and 24 hours for object uploads, as some large objects may need more time to upload in chunks.
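Conceptually, a pre-signed URL embeds an expiry timestamp and a signature that the storage service can verify without consulting a database. The sketch below is a toy illustration of that idea only; real providers use schemes such as AWS Signature Version 4, and the hostname `storage.example.com` is a placeholder.

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

def toy_presign(secret_key: str, bucket: str, object_key: str,
                expires_in: int) -> str:
    """Toy pre-signed URL: NOT a real signing scheme, illustration only."""
    expires = int(time.time()) + expires_in
    # Sign the method, object path, and expiry so none can be tampered with.
    message = f"GET\n{bucket}/{object_key}\n{expires}".encode()
    signature = hmac.new(secret_key.encode(), message,
                         hashlib.sha256).hexdigest()
    query = urlencode({"Expires": expires, "Signature": signature})
    return f"https://{bucket}.storage.example.com/{object_key}?{query}"

# Downloads get a 1-hour expiry; uploads would get 24 hours (86400 seconds).
url = toy_presign("secret", "wandb-bucket", "artifacts/model.bin", 3600)
```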
Team-level access control
Each pre-signed URL is restricted to specific buckets based on team-level access control in the W&B platform. If a user is part of only one team, and that team is mapped to a blob storage bucket using the secure storage connector, then pre-signed URLs generated for that user's requests do not have permission to access blob storage buckets mapped to other teams.
Network restriction
W&B recommends restricting the networks that can use pre-signed URLs to access the blob storage by using IAM policy-based restrictions on the buckets.

In the case of AWS, you can use VPC or IP address-based network restrictions. These ensure that your W&B-specific buckets are accessed only from networks where your AI workloads run, or from gateway IP addresses that map to your user machines if your users access artifacts using the W&B UI.
Audit logs
W&B also recommends using W&B audit logs in addition to blob storage-specific audit logs. For the latter, refer to AWS S3 access logs, Google Cloud Storage audit logs, and Monitor Azure blob storage. Admin and security teams can use audit logs to track which user is doing what in the W&B product and take necessary action if they determine that some operations need to be limited for certain users.
3 - Configure IP allowlisting for Dedicated Cloud
You can restrict access to your Dedicated Cloud instance from only an authorized list of IP addresses. This applies to the access from your AI workloads to the W&B APIs and from your user browsers to the W&B app UI as well. Once IP allowlisting has been set up for your Dedicated Cloud instance, W&B denies any requests from other unauthorized locations. Reach out to your W&B team to configure IP allowlisting for your Dedicated Cloud instance.
IP allowlisting is available on Dedicated Cloud instances on AWS, GCP and Azure.
You can use IP allowlisting with secure private connectivity. If you use both, W&B recommends using secure private connectivity for all traffic from your AI workloads and the majority of the traffic from your user browsers if possible, while using IP allowlisting for instance administration from privileged locations.
W&B recommends using CIDR blocks rather than individual `/32` IP addresses. Using individual IP addresses is not scalable and has strict limits per cloud.

4 - Configure private connectivity to Dedicated Cloud
You can connect to your Dedicated Cloud instance over the cloud provider’s secure private network. This applies to the access from your AI workloads to the W&B APIs and optionally from your user browsers to the W&B app UI as well. When using private connectivity, the relevant requests and responses do not transit through the public network or internet.
Secure private connectivity is available on Dedicated Cloud instances on AWS, GCP and Azure:
- Using AWS Privatelink on AWS
- Using GCP Private Service Connect on GCP
- Using Azure Private Link on Azure
Once enabled, W&B creates a private endpoint service for your instance and provides you with the relevant DNS URI to connect to. With that, you can create private endpoints in your cloud accounts that route the relevant traffic to the private endpoint service. Private endpoints are easier to set up for your AI training workloads running within your cloud VPC or VNet. To use the same mechanism for traffic from your user browsers to the W&B app UI, you must configure appropriate DNS-based routing from your corporate network to the private endpoints in your cloud accounts.
You can use secure private connectivity with IP allowlisting. If you use both, W&B recommends using secure private connectivity for all traffic from your AI workloads and the majority of the traffic from your user browsers if possible, while using IP allowlisting for instance administration from privileged locations.
5 - Data encryption in Dedicated cloud
W&B uses a W&B-managed cloud-native key to encrypt the W&B-managed database and object storage in every Dedicated cloud, using the customer-managed encryption key (CMEK) capability in each cloud. In this case, W&B acts as a customer of the cloud provider, while providing the W&B platform as a service to you. Using a W&B-managed key means that W&B has control over the keys it uses to encrypt the data in each cloud, thus doubling down on its promise to provide a highly safe and secure platform to all of its customers.
W&B uses a unique key to encrypt the data in each customer instance, providing another layer of isolation between Dedicated cloud tenants. The capability is available on AWS, Azure, and GCP.
Dedicated cloud instances on GCP and Azure that W&B provisioned before August 2024 use the default cloud provider-managed key for encrypting the W&B-managed database and object storage. Only new instances created starting in August 2024 use the W&B-managed cloud-native key for the relevant encryption.
Dedicated cloud instances on AWS have been using the W&B-managed cloud-native key for encryption from before August 2024.
W&B doesn't generally allow customers to bring their own cloud-native key to encrypt the W&B-managed database and object storage in their Dedicated cloud instance, because multiple teams and personas in an organization could have access to its cloud infrastructure for various reasons. Some of those teams or personas may not have context on W&B as a critical component in the organization's technology stack, and thus may remove the cloud-native key completely or revoke W&B's access to it. Such an action could corrupt all data in the organization's W&B instance and leave it in an irrecoverable state.
If your organization needs to use its own cloud-native key to encrypt the W&B-managed database and object storage in order to approve the use of Dedicated cloud for your AI workflows, W&B can review it on an exception basis. If approved, use of your cloud-native key for encryption would conform to the shared responsibility model of W&B Dedicated cloud. If any user in your organization removes your key or revokes W&B's access to it at any point while your Dedicated cloud instance is live, W&B would not be liable for any resulting data loss or corruption, and would not be responsible for recovery of such data.