Serverless Training - Weights & Biases Documentation

Use Serverless Training to post-train and fine-tune LLMs on managed, serverless infrastructure. W&B provisions the training infrastructure (on CoreWeave) for you while allowing full flexibility in your environment’s setup. You get instant access to a managed training cluster that elastically auto-scales to dozens of GPUs.

Serverless Training is now in public preview.

Serverless Training offers two complementary methods:

Serverless RL: Post-train models with reinforcement learning so they learn new behaviors and become more reliable, faster, and cheaper to run on multi-turn agentic tasks. Serverless RL splits RL workflows into inference and training phases and multiplexes them across jobs to increase GPU utilization and reduce your training time and costs.
Serverless SFT: Fine-tune models with supervised learning on curated datasets. Use SFT for distillation, teaching output style and format, or warming up a model before applying RL.

Serverless Training is ideal for tasks like:

Voice agents
Deep research assistants
On-prem models
Content marketing analysis agents

Serverless Training trains low-rank adapters (LoRAs) to specialize a model for your specific task. This extends the original model’s capabilities with on-the-job experience. W&B automatically stores the LoRAs you train as artifacts in your account. You can also save them locally or to a third party for backup. Serverless Inference also automatically hosts models that you train through Serverless Training.

See the ART quickstart or Google Colab notebook to get started.

Explore a public demo workspace that demonstrates the following:

Train a Qwen model with OpenPipe RULER and Weave Scorers.
Track training progress and create custom plots with W&B Models.
Evaluate the final results on a Weave leaderboard.
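
As background on the low-rank adapters (LoRAs) that Serverless Training produces, the sketch below illustrates the general LoRA idea: the base weight matrix W stays frozen, and training learns two small factors B and A whose product is added to W. This is a generic illustration in plain Python, not the Serverless Training implementation. The helper names (matmul, apply_lora, param_counts), the matrix sizes, and the rank are all assumptions chosen for the example, and this page does not document the adapter configuration that W&B actually uses.

```python
# Minimal pure-Python sketch of a low-rank adapter (LoRA) update.
# All names, sizes, and values are illustrative assumptions,
# not Serverless Training defaults.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def apply_lora(w, b, a, scale=1.0):
    """Return W + scale * (B @ A) as a new matrix; W itself is not modified."""
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def param_counts(d_out, d_in, rank):
    """Parameter count of a full weight matrix vs. its two adapter factors."""
    return d_out * d_in, rank * (d_out + d_in)

# Frozen base weight W (3 x 4).
w = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]

# Rank-1 adapter: B is 3 x 1 and A is 1 x 4.
b = [[1.0], [0.0], [2.0]]
a = [[0.5, 0.0, 0.0, 1.0]]

adapted = apply_lora(w, b, a)

# Compare sizes for a hypothetical 4096 x 4096 layer with rank 8.
full_params, adapter_params = param_counts(4096, 4096, 8)
```

In this hypothetical 4096 x 4096 layer, the rank-8 adapter has 65,536 parameters against 16,777,216 in the full matrix. That size gap is the general reason adapters are practical to store, version, and back up separately from the base model.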

Why Serverless Training?

Serverless Training can provide the following advantages in your post-training:

Lower training costs: By multiplexing shared infrastructure across many users, skipping the setup process for each job, and scaling your GPU costs down to 0 when you’re not actively training, Serverless Training reduces training costs significantly.
Faster training time: By splitting inference requests across many GPUs and immediately provisioning training infrastructure when you need it, Serverless Training speeds up your training jobs and lets you iterate faster.
Automatic deployment: Serverless Training automatically deploys every checkpoint you train, so you do not need to manually set up hosting infrastructure. You can access and test trained models immediately in local, staging, or production environments.

How Serverless Training uses W&B services

Serverless Training uses a combination of the following W&B components to operate:

Inference: To run your models
Models: To track performance metrics during the LoRA adapter’s training
Artifacts: To store and version the LoRA adapters
Weave (optional): To gain observability into how the model responds at each step of the training loop

Serverless Training is in public preview. During the preview, W&B charges you only for inference usage and artifact storage. W&B does not charge for adapter training during the preview period.
