On Prem / Baremetal
Hosting W&B Server on baremetal servers on-premises
You can run W&B Server on your own bare-metal infrastructure, connected to scalable external data stores. The sections below explain how to provision a new instance and give guidance on provisioning the external data stores.
W&B application performance depends on scalable data stores that your operations team must configure and manage. The team must provide a MySQL 5.7 or MySQL 8 database server and an S3 compatible object store for the application to scale properly.
Talk to our sales team by reaching out to [email protected].

MySQL 5.7

W&B currently supports MySQL 5.7 and MySQL 8.
There are a number of enterprise services that make operating a scalable MySQL database simpler. We suggest looking into one of the following solutions:
The most important things to consider when running your own MySQL database are:
  1. Backups. Back up the database periodically to a separate facility. We suggest daily backups with at least 1 week of retention.
  2. Performance. The disk the server runs on should be fast. We suggest running the database on an SSD or accelerated NAS.
  3. Monitoring. Monitor the database for load. If CPU usage stays above 40% of the system for more than 5 minutes, the server is likely resource starved.
  4. Availability. Depending on your availability and durability requirements, you may want to configure a hot standby on a separate machine that streams all updates in real time from the primary server and can be failed over to in case the primary server crashes or becomes corrupted.
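The monitoring rule above (CPU above 40% for more than 5 minutes) can be sketched as a small check over timestamped samples. This is only an illustration: the function name `sustained_high_cpu` and the sample format are assumptions, and how you collect the samples depends on your own monitoring stack.

```python
def sustained_high_cpu(samples, threshold=40.0, window_seconds=300):
    """Return True if CPU stayed above `threshold` for more than `window_seconds`.

    `samples` is an iterable of (timestamp_seconds, cpu_percent) pairs in
    chronological order. The sample format is a hypothetical convention.
    """
    run_start = None  # timestamp at which the current high-CPU run began
    for timestamp, cpu_percent in samples:
        if cpu_percent > threshold:
            if run_start is None:
                run_start = timestamp
            if timestamp - run_start > window_seconds:
                return True
        else:
            # Any sample at or below the threshold resets the run.
            run_start = None
    return False


if __name__ == "__main__":
    # One sample per minute for seven minutes, all at 55% CPU.
    busy = [(minute * 60, 55.0) for minute in range(8)]
    print(sustained_high_cpu(busy))
```

A single sample at or below the threshold resets the window, so short spikes do not trigger the alert.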
Once you've provisioned a compatible MySQL database you can create a database and user using the following SQL (replacing SOME_PASSWORD).
CREATE USER 'wandb_local'@'%' IDENTIFIED BY 'SOME_PASSWORD';
CREATE DATABASE wandb_local CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
GRANT ALL ON wandb_local.* TO 'wandb_local'@'%' WITH GRANT OPTION;
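If you would rather not edit SOME_PASSWORD by hand, a short script can generate a random password and print the same three statements. This is a minimal sketch: the helper name `render_setup_sql` is hypothetical, and it only reproduces the SQL shown above with the password substituted.

```python
import secrets


def render_setup_sql(password, user="wandb_local", database="wandb_local"):
    """Return the setup statements from this page with `password` filled in."""
    # Refuse characters that would break out of the single-quoted SQL literal.
    if "'" in password or "\\" in password:
        raise ValueError("password must not contain quotes or backslashes")
    return [
        f"CREATE USER '{user}'@'%' IDENTIFIED BY '{password}';",
        f"CREATE DATABASE {database} CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;",
        f"GRANT ALL ON {database}.* TO '{user}'@'%' WITH GRANT OPTION;",
    ]


if __name__ == "__main__":
    # token_urlsafe only emits letters, digits, '-' and '_', so the result is
    # safe to embed both in the SQL above and in a mysql:// connection string.
    generated = secrets.token_urlsafe(24)
    for statement in render_setup_sql(generated):
        print(statement)
```

Store the generated password somewhere safe, since you will need it again for the MYSQL connection string.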

Object Store

The object store can be an externally hosted Minio cluster, or any other S3-compatible object store that supports signed URLs. To see if your object store supports signed URLs, you can run the following script. When connecting to an S3-compatible object store, you can specify your credentials in the connection string, for example:
s3://$ACCESS_KEY:$SECRET_KEY@$HOST/$BUCKET_NAME
By default we assume third-party object stores are not running over HTTPS. If you've configured a trusted SSL certificate for your object store, you can tell W&B to connect only over TLS by adding the tls query parameter to the URL, for example:
s3://$ACCESS_KEY:$SECRET_KEY@$HOST/$BUCKET_NAME?tls=true
This will only work if the SSL certificate is trusted. We do not support self-signed certificates.
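Because the credentials sit inside a URL, a secret key containing characters such as / or # can silently change how the string is parsed. The sketch below assembles the connection string in the format shown above and checks that it splits back into the same parts. The helper name `bucket_url` is hypothetical, and the check only uses Python's generic URL parser; it makes no claim about how W&B itself parses the value.

```python
from urllib.parse import urlsplit


def bucket_url(access_key, secret_key, host, bucket, tls=False):
    """Build an s3:// connection string and verify it parses back cleanly."""
    url = f"s3://{access_key}:{secret_key}@{host}/{bucket}"
    if tls:
        url += "?tls=true"

    # Round-trip through a generic URL parser. A mismatch means one of the
    # values contains a character that is reserved inside URLs.
    parts = urlsplit(url)
    if (
        parts.username != access_key
        or parts.password != secret_key
        or parts.path != f"/{bucket}"
    ):
        raise ValueError("credentials or bucket name contain URL-reserved characters")
    return url


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(bucket_url("ACCESS", "SECRET", "minio.example.com:9000", "wandb", tls=True))
```

If the check fails, generating a new key pair without reserved characters is usually the simplest fix.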
When using 3rd party object stores, you'll want to set BUCKET_QUEUE to internal://. This tells the W&B server to manage all object notifications internally instead of depending on SQS.
The most important things to consider when running your own object store are:
  1. Storage capacity and performance. It's fine to use magnetic disks, but you should monitor the capacity of these disks. Average W&B usage results in tens to hundreds of gigabytes. Heavy usage could result in petabytes of storage consumption.
  2. Fault tolerance. At a minimum, the physical disk storing the objects should be on a RAID array. Consider running Minio in distributed mode.
  3. Availability. Monitoring should be configured to ensure the storage is available.
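As a rough illustration of the capacity monitoring mentioned above, the following sketch reports how full the volume behind a path is. The function name `capacity_report`, the 80% warning threshold, and the example path are assumptions; a production setup would normally feed this into your existing alerting system.

```python
import shutil


def capacity_report(path, warn_fraction=0.8):
    """Report disk usage for the filesystem containing `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    used_fraction = usage.used / usage.total
    return {
        "path": path,
        "total_gb": usage.total / 1e9,
        "free_gb": usage.free / 1e9,
        "used_fraction": used_fraction,
        # Hypothetical policy: warn once the disk is 80% full.
        "warn": used_fraction >= warn_fraction,
    }


if __name__ == "__main__":
    # Replace "." with the mount point that backs your object store.
    print(capacity_report("."))
```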
There are many enterprise alternatives to running your own object storage service such as:

Minio setup

If you're using Minio, you can run the following commands to create a bucket.
mc config host add local http://$MINIO_HOST:$MINIO_PORT "$MINIO_ACCESS_KEY" "$MINIO_SECRET_KEY" --api s3v4
mc mb --region=us-east1 local/local-files

Kubernetes Deployment

The following Kubernetes YAML can be customized, but should serve as a basic foundation for running wandb/local in Kubernetes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wandb
  labels:
    app: wandb
spec:
  strategy:
    type: RollingUpdate
  replicas: 1
  selector:
    matchLabels:
      app: wandb
  template:
    metadata:
      labels:
        app: wandb
    spec:
      containers:
        - name: wandb
          env:
            - name: HOST
              value: https://YOUR_DNS_NAME
            - name: LICENSE
              value: XXXXXXXXXXXXXXX
            - name: BUCKET
              value: s3://$ACCESS_KEY:$SECRET_KEY@$HOST/$BUCKET_NAME
            - name: BUCKET_QUEUE
              value: internal://
            - name: AWS_REGION
              value: us-east-1
            - name: MYSQL
              value: mysql://$USERNAME:$PASSWORD@$HOSTNAME/$DATABASE
          imagePullPolicy: IfNotPresent
          image: wandb/local:latest
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          volumeMounts:
            - name: wandb
              mountPath: /vol
          livenessProbe:
            httpGet:
              path: /healthz
              port: http
          readinessProbe:
            httpGet:
              path: /ready
              port: http
          startupProbe:
            httpGet:
              path: /ready
              port: http
            failureThreshold: 60 # allow 10 minutes for migrations
          resources:
            requests:
              cpu: "2000m"
              memory: 4G
            limits:
              cpu: "4000m"
              memory: 8G
---
apiVersion: v1
kind: Service
metadata:
  name: wandb-service
spec:
  type: NodePort
  selector:
    app: wandb
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: wandb-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  defaultBackend:
    service:
      name: wandb-service
      port:
        number: 80
The Kubernetes YAML above should work in most on-premises installations. However, the details of your Ingress and optional SSL termination will vary. See the Networking section below.

Openshift

W&B supports operating from within an OpenShift Kubernetes cluster. Follow the instructions in the Kubernetes Deployment section above.

Running the container as an un-privileged user

By default the container runs with a $UID of 999. If your orchestrator requires the container to run as a non-root user, you can specify a $UID >= 100000 and a $GID of 0. The container must be started with the root group ($GID=0) for file system permissions to function properly. This is the default behavior when running containers in OpenShift. An example security context for Kubernetes looks like:
spec:
  securityContext:
    runAsUser: 100000
    runAsGroup: 0

Docker

You can run wandb/local on any instance that also has Docker installed. We suggest at least 8GB of RAM and 4 vCPUs. Run the following command to launch the container:
docker run --rm -d \
  -e HOST=https://YOUR_DNS_NAME \
  -e LICENSE=XXXXX \
  -e BUCKET=s3://$ACCESS_KEY:$SECRET_KEY@$HOST/$BUCKET_NAME \
  -e BUCKET_QUEUE=internal:// \
  -e AWS_REGION=us-east1 \
  -e MYSQL=mysql://$USERNAME:$PASSWORD@$HOSTNAME/$DATABASE \
  -p 8080:8080 --name wandb-local wandb/local
You'll want to configure a process manager to ensure this process is restarted if it crashes. A good overview of using systemd to do this can be found here.

Networking

Load Balancer

You'll want to run a load balancer that terminates network requests at the appropriate network boundary. Some customers expose their wandb service on the internet; others expose it only on an internal VPN/VPC. It's important that both the machines used to run machine learning workloads and the devices from which users access the service through web browsers can communicate with this endpoint. Common load balancers include:
  1. Istio
  2. Caddy
  3. Apache
  4. HAProxy

SSL / TLS

The W&B server does not terminate SSL. If your security policies require SSL communication within your trusted networks, consider using a tool like Istio and sidecar containers. The load balancer itself should terminate SSL with a valid certificate. Self-signed certificates are not supported and will cause a number of challenges for users. If possible, a service like Let's Encrypt is a great way to provide trusted certificates to your load balancer. Services like Caddy and Cloudflare manage SSL for you.
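To check whether the certificate presented by your load balancer is trusted by a typical client, you can attempt a TLS handshake with certificate verification enabled. The sketch below uses only the Python standard library. The function names are hypothetical, and "trusted" here means trusted by the CA bundle of the machine running the script, which may differ from your users' machines.

```python
import socket
import ssl


def strict_context():
    """Return a TLS context that verifies the certificate chain and hostname."""
    return ssl.create_default_context()


def certificate_is_trusted(host, port=443, timeout=5.0):
    """Return True if `host` completes a verified TLS handshake.

    Returns False when verification fails (for example, a self-signed
    certificate). Connection errors are not caught and propagate.
    """
    context = strict_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except ssl.SSLCertVerificationError:
        return False
```

For example, calling `certificate_is_trusted("YOUR_DNS_NAME")` with your own host name should return True once the load balancer serves a certificate from a recognized authority.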

Example Nginx Configuration

The following is an example configuration using nginx as a reverse proxy.
events {}
http {
    # If we receive X-Forwarded-Proto, pass it through; otherwise, pass along the
    # scheme used to connect to this server
    map $http_x_forwarded_proto $proxy_x_forwarded_proto {
        default $http_x_forwarded_proto;
        ''      $scheme;
    }

    # Also, in the above case, force HTTPS
    map $http_x_forwarded_proto $sts {
        default '';
        "https" "max-age=31536000; includeSubDomains";
    }

    # If we receive X-Forwarded-Host, pass it through; otherwise, pass along $http_host
    map $http_x_forwarded_host $proxy_x_forwarded_host {
        default $http_x_forwarded_host;
        ''      $http_host;
    }

    # If we receive X-Forwarded-Port, pass it through; otherwise, pass along the
    # server port the client connected to
    map $http_x_forwarded_port $proxy_x_forwarded_port {
        default $http_x_forwarded_port;
        ''      $server_port;
    }

    # If we receive Upgrade, set Connection to "upgrade"; otherwise, delete any
    # Connection header that may have been passed to this server
    map $http_upgrade $proxy_connection {
        default upgrade;
        ''      close;
    }

    server {
        listen 443 ssl;
        server_name         www.example.com;
        ssl_certificate     www.example.com.crt;
        ssl_certificate_key www.example.com.key;

        proxy_http_version 1.1;
        proxy_buffering off;
        proxy_set_header Host $http_host;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $proxy_connection;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $proxy_x_forwarded_proto;
        proxy_set_header X-Forwarded-Host $proxy_x_forwarded_host;

        location / {
            proxy_pass http://$YOUR_UPSTREAM_SERVER_IP:8080/;
        }

        keepalive_timeout 10;
    }
}

Verifying your installation

Regardless of how your server was installed, it's a good idea to check that everything is configured properly. The W&B CLI makes this easy:
pip install wandb
wandb login --host=https://YOUR_DNS_DOMAIN
wandb verify
If you see any errors contact W&B support staff. You can also see any errors the application hit at startup by checking the logs.

Docker

docker logs wandb-local

Kubernetes

kubectl get pods
kubectl logs wandb-XXXXX-XXXXX
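In addition to the CLI check and the logs, you can query the /healthz and /ready paths that the Kubernetes probes in the deployment example use. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and it assumes the container port (8080 in the examples on this page) is reachable from where you run it; whether these paths are also reachable through your load balancer depends on your setup.

```python
import urllib.error
import urllib.request

# Paths taken from the liveness and readiness probes in the deployment example.
PROBE_PATHS = ("/healthz", "/ready")


def probe_urls(base_url):
    """Return the full probe URLs for a server base URL."""
    return [base_url.rstrip("/") + path for path in PROBE_PATHS]


def check_server(base_url, timeout=5.0):
    """Map each probe URL to its HTTP status code or an error description."""
    results = {}
    for url in probe_urls(base_url):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                results[url] = response.status
        except urllib.error.HTTPError as err:
            # The server answered, but with an error status.
            results[url] = err.code
        except OSError as err:
            # Covers connection failures, DNS errors and timeouts.
            results[url] = f"unreachable: {err}"
    return results
```

Calling `check_server("http://localhost:8080")` on the Docker host, for instance, returns a dictionary with one entry per probe path.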