The guides in this section go beyond the core Weights & Biases experiment-tracking features (logging data and media, building dashboards, and integrating with popular frameworks and tools) to cover advanced use cases and power-user features.
Need to track large-scale ML experiments distributed across multiple GPUs and multiple nodes? Then check out our guide to Distributed Training. For some approaches to distributed training and cross-validation, you also need to combine multiple runs into a single experiment, as described in our guide on how to Group Runs.
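As a rough illustration of grouping, the sketch below builds the arguments for one run per cross-validation fold so that all folds share a `group`. The helper names, the `job_type` value, and the run-naming scheme are our own assumptions for this example; `project`, `group`, `job_type`, `name`, and `config` are arguments of `wandb.init`.

```python
def fold_run_kwargs(project: str, experiment: str, fold: int) -> dict:
    """Build wandb.init keyword arguments for one cross-validation fold.

    Hypothetical helper: the naming scheme is an assumption for this sketch.
    """
    return {
        "project": project,
        "group": experiment,  # runs that share a group are combined in the UI
        "job_type": "cross-validation",
        "name": f"{experiment}-fold-{fold}",
        "config": {"fold": fold},
    }


def start_fold_run(project: str, experiment: str, fold: int):
    """Start the run for one fold. Requires the wandb package and a login."""
    import wandb  # imported lazily so fold_run_kwargs works without wandb

    return wandb.init(**fold_run_kwargs(project, experiment, fold))
```

Calling `start_fold_run("my-project", "cv-exp", fold)` once per fold (or once per process in a distributed job) should place every run under the same group, provided the group name is identical across calls.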
At Weights & Biases, we're all about preventing you from losing any of your work. If you're using pre-emptible compute or your machine crashes, we'll help you Resume Runs where you left off. If you're in danger of losing valuable data, wandb can even Save & Restore Files.
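One way to make a run resumable is to keep its ID somewhere that survives a restart and pass it back to `wandb.init`. The sketch below is a minimal version of that idea: the ID file, the helper names, and the use of `uuid` to generate the ID are assumptions for illustration, while `id` and `resume="allow"` are `wandb.init` arguments.

```python
import uuid
from pathlib import Path


def load_or_create_run_id(id_file: Path) -> str:
    """Return the run ID stored in id_file, creating one on first use.

    Hypothetical helper: where the ID is persisted is up to you.
    """
    if id_file.exists():
        return id_file.read_text().strip()
    run_id = uuid.uuid4().hex[:8]
    id_file.write_text(run_id)
    return run_id


def init_resumable_run(project: str, id_file: Path):
    """Start a new run, or resume the one whose ID is stored in id_file."""
    import wandb  # imported lazily so the helper above works without wandb

    run_id = load_or_create_run_id(id_file)
    # resume="allow" resumes the run if the ID already exists, else starts it.
    return wandb.init(project=project, id=run_id, resume="allow")
```

On pre-emptible machines the ID file needs to live on storage that outlasts the instance, otherwise each restart creates a fresh run.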
Tired of wondering whether training has finished or, worse, crashed? Set up Alerts to Slack or your e-mail, with configurable triggers right in your Python code.
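A trigger in your own code can be as simple as a threshold check. In the sketch below the function name, the threshold logic, and the message wording are assumptions; the `send` parameter is a stand-in for `wandb.alert`, which accepts `title` and `text` and needs an active run.

```python
from typing import Callable


def alert_on_low_accuracy(
    accuracy: float, threshold: float, send: Callable[..., None]
) -> bool:
    """Call send(title=..., text=...) when accuracy is below threshold.

    Returns True if an alert was sent. Hypothetical helper for this sketch.
    """
    if accuracy < threshold:
        send(
            title="Low accuracy",
            text=f"Accuracy {accuracy:.3f} is below the threshold {threshold:.3f}",
        )
        return True
    return False
```

In a training script you would call `alert_on_low_accuracy(acc, 0.8, wandb.alert)` after `wandb.init()`. Passing the sender in as an argument keeps the trigger easy to test without a live run.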
You can also control wandb's behavior from outside your code, for example from the shell that launches your script, as described in our guide to Environment Variables.
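The same variables can be set from Python, as long as that happens before `wandb.init` is called. The sketch below assumes a hypothetical helper, `configure_wandb_env`; the variable names `WANDB_PROJECT`, `WANDB_RUN_GROUP`, and `WANDB_MODE` are ones wandb reads, and the guide lists the full set.

```python
import os
from typing import MutableMapping, Optional


def configure_wandb_env(
    project: str,
    group: Optional[str] = None,
    offline: bool = False,
    env: Optional[MutableMapping[str, str]] = None,
) -> MutableMapping[str, str]:
    """Set wandb environment variables on env (os.environ by default).

    Hypothetical helper: call it before wandb.init so the values are seen.
    """
    target = os.environ if env is None else env
    target["WANDB_PROJECT"] = project
    if group is not None:
        target["WANDB_RUN_GROUP"] = group
    if offline:
        # Offline mode stores run data locally instead of uploading it.
        target["WANDB_MODE"] = "offline"
    return target
```

Setting the variables in the launching shell has the same effect and keeps the training script free of environment-specific values.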