W&B Skills are reusable instruction sets that teach coding agents how to use W&B effectively. Instead of manually guiding your agent through W&B APIs and best practices, install Skills so that the agent can work with experiment tracking, tracing, evaluations, and monitoring on its own. Skills work with several major coding agents, including:
  • Claude Code
  • Codex
  • Cursor
  • GitHub Copilot
  • Gemini CLI
For a full list of supported agents, see the W&B Skills CLI documentation.

W&B Skills capabilities

Skills covers both the W&B Models SDK (training runs, metrics, artifacts, sweeps) and the Weave SDK (traces, evaluations, scorers). It also includes helper libraries, reference docs, and data analysis patterns.
Capabilities by workflow:
Model training
  • Log metrics and rich media during training and fine-tuning.
  • Track and compare experiments.
  • Analyze runs and results, such as loss curves and accuracy scores.
  • Tune hyperparameters.
Agent building
  • Trace agentic AI applications.
  • Analyze traces and classify failure modes.
  • Evaluate models and agents with labeled datasets.
  • Run online evaluations for production monitoring.

Prerequisites

Skills requires the following:
  • Node.js (for the npx command).
  • A W&B API key. Create one at wandb.ai/authorize and then set it as an environment variable:
    export WANDB_API_KEY=<your-api-key>
    
  • (Optional) Set your W&B project name as a WANDB_PROJECT environment variable. This allows your agent to target the correct W&B project without you specifying it each time.
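Before installing, it can help to confirm that the prerequisites above are in place. The following is a minimal sketch, not part of W&B Skills: the function name check_skills_prereqs is hypothetical, and it only checks that npx is on your PATH and that the two environment variables are non-empty. It does not validate the API key itself.

```shell
# Hypothetical helper (not part of W&B Skills): report missing prerequisites.
# Returns 0 when the required items are present, 1 otherwise.
check_skills_prereqs() {
  prereq_missing=0

  # Node.js provides the npx command used by the install step.
  if ! command -v npx >/dev/null 2>&1; then
    echo "npx not found: install Node.js"
    prereq_missing=1
  fi

  # The API key is required.
  if [ -z "${WANDB_API_KEY:-}" ]; then
    echo "WANDB_API_KEY is not set"
    prereq_missing=1
  fi

  # The project name is optional, so this is only a notice.
  if [ -z "${WANDB_PROJECT:-}" ]; then
    echo "WANDB_PROJECT is not set (optional)"
  fi

  return "$prereq_missing"
}

# To set the optional project variable, uncomment and adjust
# (the project name here is a placeholder):
# export WANDB_PROJECT="my-project"

check_skills_prereqs || echo "Resolve the required items above before installing Skills."
```

The check treats WANDB_PROJECT as informational only, matching its optional status in the list above.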

Install W&B Skills

To install W&B Skills globally, run the following command with the --global flag:
npx skills add wandb/skills --skill '*' --yes --global
To install Skills for a specific project, run the following command from your project directory:
npx skills add wandb/skills --skill '*' --yes
You can also install Skills for specific agents using the --agent flag:
npx skills add wandb/skills --agent claude-code --skill '*' --yes --global
For a list of --agent and --skill options, see the skills CLI documentation.
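The install variants above differ only in whether --agent and --global are present. The following sketch makes that explicit by assembling the command as a string and printing it, without running it. The function name build_skills_install_cmd and its two arguments are hypothetical. It uses only the flags shown in this section, and assumes they can be combined freely, which the skills CLI documentation should confirm.

```shell
# Hypothetical helper: print (do not run) an install command built from
# the flags shown in this section.
#   $1: agent name for --agent, or "" to omit the flag
#   $2: "global" to add --global; anything else gives a project-level install
build_skills_install_cmd() {
  agent="$1"
  scope="$2"
  cmd="npx skills add wandb/skills"

  if [ -n "$agent" ]; then
    cmd="$cmd --agent $agent"
  fi

  # The quotes around * stop the shell from expanding it into file names.
  cmd="$cmd --skill '*' --yes"

  if [ "$scope" = "global" ]; then
    cmd="$cmd --global"
  fi

  echo "$cmd"
}

build_skills_install_cmd "claude-code" "global"
build_skills_install_cmd "" "project"
```

Printing the command first lets you review it before pasting it into a terminal, or into a setup script that you run from your project directory.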

Use W&B Skills

Once installed, you can ask the agent to perform W&B-related tasks for your project. The following example prompts demonstrate some of the tasks your agent can do with W&B Skills:
  • “Log training metrics for my PyTorch model to W&B.”
  • “Analyze the loss curves for my last 10 runs and identify the best performing configuration.”
  • “Trace my LangChain agent and log the results to Weave.”
  • “Run an evaluation on my agent using the test dataset and summarize the results.”
  • “Find the failure modes in my last evaluation and classify them.”
  • “Compare the configs of run A and run B and show me the differences.”

Usage tips

Skills performs better with specific queries than with broad, open-ended questions. The following table contrasts recommended prompts with prompts that are too vague.
Recommended | Not recommended
“What is the final validation loss for my last 5 runs?” | “How is my model doing?”
“Summarize the token usage across my last 10 traces.” | “Show me all my traces.”
“Compare the configs of run A and run B.” | “What are my best runs?”
“What eval had the highest F1 score?” | “How are my evaluations going?”