Model Context Protocol (MCP) is an open standard that lets AI agents call external tools. The W&B MCP Server gives your IDE, coding assistant, or chat agent direct access to your W&B data and documentation, so it can answer questions about your runs, traces, evaluations, and artifacts without you copying and pasting context. For a full list of what you can do with the server, see the W&B MCP Server capabilities section. The server works with most IDEs, coding clients, and chat agents, including:
  • Cursor
  • Visual Studio Code (VS Code)
  • Claude Code
  • Codex
  • Gemini CLI
  • Mistral LeChat
  • Claude Desktop

Deployment types

You can use either the hosted MCP server or set up a local version if you need more isolation and flexibility. Using the local version requires your client to use a different URL to access the server.

Hosted server (recommended)

A W&B-managed MCP server that your client connects to over HTTP with your W&B API key. No installation, no local process to maintain. See Use the hosted server.

Local install

Run the MCP server on your own machine over STDIO or HTTP. Use a local install when you need air-gapped operation, pinning to a specific release, custom server behavior, active server development, or support for a client that only speaks STDIO. See Run the MCP server locally.
If you run W&B on Dedicated Cloud or Self-Managed and the hosted MCP server isn’t yet enabled on your instance, contact W&B support or your W&B account team to request it.

Prerequisites

Before you configure any client:
  • Create a W&B API key at wandb.ai/authorize.
  • Set the key as the WANDB_API_KEY environment variable, or pass it to your client as a bearer token.
  • For Dedicated Cloud, Self-Managed, and local installs against a non-default instance, set the WANDB_BASE_URL environment variable to your instance URL.
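
As a quick check of the prerequisites above, the following Python sketch reads WANDB_API_KEY from the environment and builds the bearer header that HTTP clients send. The function name wandb_auth_header is illustrative and is not part of any W&B package.

```python
import os


def wandb_auth_header(env=None):
    """Build the Authorization header from WANDB_API_KEY.

    `env` defaults to the process environment; pass a dict when testing.
    """
    env = os.environ if env is None else env
    api_key = env.get("WANDB_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "WANDB_API_KEY is not set; create a key at wandb.ai/authorize"
        )
    return {"Authorization": f"Bearer {api_key}"}
```

Failing early with a clear message is usually easier to debug than a 401 response from the server.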

Use the hosted server

W&B runs a managed MCP server for every deployment type. You don’t need to install anything. Configure your client to connect over HTTP with a W&B API key in the Authorization header.

Connection URL

The URL depends on your type of W&B deployment:
| Deployment | Server URL |
| --- | --- |
| Multi-tenant Cloud | https://mcp.withwandb.com/mcp |
| Dedicated Cloud | https://<your-instance>/mcp |
| Self-Managed | https://<your-instance>/mcp |

For Dedicated Cloud or Self-Managed, replace https://mcp.withwandb.com/mcp with https://<your-instance>/mcp and keep everything else the same. The client configurations below use the Multi-tenant URL.
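
If you generate client configurations from a script, the URL rule above can be expressed as a small helper. This is a minimal Python sketch: the deployment labels ("multi-tenant", "dedicated", "self-managed") and the function name are assumptions chosen for illustration.

```python
MULTI_TENANT_URL = "https://mcp.withwandb.com/mcp"


def hosted_mcp_url(deployment, instance_host=None):
    """Return the hosted MCP server URL for a W&B deployment type."""
    if deployment == "multi-tenant":
        return MULTI_TENANT_URL
    if deployment in ("dedicated", "self-managed"):
        if not instance_host:
            raise ValueError("instance_host is required for this deployment")
        # Dedicated Cloud and Self-Managed serve MCP from the instance itself.
        return f"https://{instance_host}/mcp"
    raise ValueError(f"unknown deployment type: {deployment!r}")
```
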
For example, to add the hosted server to Claude Code, run the following command in your terminal, replacing <your-wandb-api-key> with your W&B API key:
claude mcp add --transport http wandb https://mcp.withwandb.com/mcp \
  --header "Authorization: Bearer <your-wandb-api-key>"
Add --scope user to configure Claude Code globally. Omit it to configure only the current project.

Verify the connection by asking “List my W&B entities.” The agent should call list_entities_tool and return your username and any teams. If the connection fails, see Troubleshooting. For more information, see Claude Code’s MCP documentation.

Run the MCP server locally

A local install is an alternative to the hosted server, not the default for any deployment type. Use it when the hosted server doesn’t fit your setup. Common reasons to run locally:
  • Air-gapped or offline environments where your client can’t reach a hosted W&B endpoint.
  • Pinned version. The hosted server follows the main branch. A local install can pin to a specific release tag.
  • Custom server behavior such as changing tool descriptions, adding tools, or setting a non-default response token budget.
  • Active development on the server itself.
  • STDIO-only clients or clients that require a local process.
For Dedicated Cloud or Self-Managed users, the hosted path is preferred. Only use a local install from wandb/wandb-mcp-server if the hosted server isn’t yet enabled on your instance or one of the reasons above applies. Set a WANDB_BASE_URL environment variable to your instance URL.

Local prerequisites

To run the server locally, make sure you have the following:
  • Python 3.11 or higher.
  • uv or pip.
  • A W&B API key, set as WANDB_API_KEY.
  • WANDB_BASE_URL set to your instance URL if you use Dedicated Cloud or Self-Managed.

Install the server

The following command uses uvx to install and run the MCP server from the GitHub repository:
uvx --from git+https://github.com/wandb/wandb-mcp-server wandb_mcp_server

Configure your client

Configure your MCP client to launch the local server, replacing <your-wandb-api-key> with your W&B API key. For Claude Code, run the following command. Add --scope user for a global configuration. The WANDB_BASE_URL line is only needed for Dedicated Cloud or Self-Managed instances.
claude mcp add wandb \
  -e WANDB_API_KEY=<your-wandb-api-key> \
  -e WANDB_BASE_URL=https://your-wandb-instance.example.com \
  -- uvx --from git+https://github.com/wandb/wandb-mcp-server wandb_mcp_server
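
Some clients read a JSON file instead of a CLI command. The following Python sketch generates an equivalent entry. The mcpServers layout with command, args, and env fields follows the convention used by files such as claude_desktop_config.json; treat it as an assumption and confirm the exact schema in your client's documentation. The function name local_server_entry is illustrative.

```python
import json


def local_server_entry(api_key, base_url=None, uvx_command="uvx"):
    """Build one `mcpServers` entry that launches the server over STDIO."""
    env = {"WANDB_API_KEY": api_key}
    if base_url:
        # Only needed for Dedicated Cloud or Self-Managed instances.
        env["WANDB_BASE_URL"] = base_url
    return {
        "command": uvx_command,
        "args": [
            "--from",
            "git+https://github.com/wandb/wandb-mcp-server",
            "wandb_mcp_server",
        ],
        "env": env,
    }


config = {"mcpServers": {"wandb": local_server_entry("<your-wandb-api-key>")}}
print(json.dumps(config, indent=2))
```

Pass the full path to uvx as uvx_command if your client cannot find it on PATH.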

Run the server with HTTP transport

For web-based clients and for testing, run the server with HTTP transport:
uvx wandb_mcp_server --transport http --host 0.0.0.0 --port 8080
To expose a local server to external clients, such as the OpenAI Responses API, use a tunnel:
uvx wandb_mcp_server --transport http --port 8080

# In another terminal
ngrok http 8080
Update your MCP client configuration to use the tunnel URL.

Environment variables

The following environment variables control authentication, instance routing, and server behavior for local installs. Set them in your client’s env block or export them in your shell.
| Variable | Description |
| --- | --- |
| WANDB_API_KEY | W&B API key for authentication. Required. |
| WANDB_BASE_URL | Custom W&B instance URL for Dedicated Cloud or Self-Managed. Defaults to https://api.wandb.ai. |
| WANDB_MCP_PROXY_DOCS | Enable the search_wandb_docs_tool documentation search proxy. Default: true. |
| WANDBOT_BASE_URL | Custom endpoint for the docs search proxy. |
| MAX_RESPONSE_TOKENS | Token budget for tool-response truncation. Default: 30000. |
| MCP_SERVER_LOG_LEVEL | Logging verbosity. One of DEBUG, INFO, WARNING, ERROR. |

For the complete command-line reference and advanced options, see the wandb-mcp-server README.
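
To make the defaults in the table concrete, the following Python sketch validates an environment before you launch the server. It is not the server's own parsing code: the dataclass, the function name, and the rule that only the string "true" enables the docs proxy are assumptions for illustration.

```python
import os
from dataclasses import dataclass
from typing import Optional

LOG_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR")


@dataclass
class McpServerSettings:
    api_key: str
    base_url: str
    proxy_docs: bool
    max_response_tokens: int
    log_level: Optional[str]  # None means "use the server's own default"


def read_settings(env=None):
    """Read the documented variables, applying the documented defaults."""
    env = os.environ if env is None else env
    api_key = env.get("WANDB_API_KEY")
    if not api_key:
        raise ValueError("WANDB_API_KEY is required")
    log_level = env.get("MCP_SERVER_LOG_LEVEL")
    if log_level is not None and log_level not in LOG_LEVELS:
        raise ValueError(f"MCP_SERVER_LOG_LEVEL must be one of {LOG_LEVELS}")
    return McpServerSettings(
        api_key=api_key,
        base_url=env.get("WANDB_BASE_URL", "https://api.wandb.ai"),
        # Assumption: any value other than "true" disables the proxy.
        proxy_docs=env.get("WANDB_MCP_PROXY_DOCS", "true").lower() == "true",
        max_response_tokens=int(env.get("MAX_RESPONSE_TOKENS", "30000")),
        log_level=log_level,
    )
```
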

W&B MCP Server capabilities

Use the MCP server to analyze experiments, debug traces, create reports, manage registry and artifacts, and answer questions from the W&B docs. The following example prompts demonstrate some of the tasks you can ask your agent to perform when it is connected to the W&B MCP Server:
  • “Show me the top 5 runs by eval/accuracy in your-team/your-project.”
  • “How did the latency of my hiring agent’s predict traces evolve over the last month?”
  • “Generate a W&B report comparing decisions made by the hiring agent last week.”
  • “What versions of the production-model artifact exist, and what changed between v2 and v3?”
  • “How do I create a leaderboard in Weave?”

Available tools

The server offers tools for several purposes. The following table lists the tools that help you discover entity and project names and inspect schemas. For each tool, it shows when the agent should use it and an example prompt that invokes it.

| Tool | Use when | Example prompt |
| --- | --- | --- |
| list_entities_tool | No entity is specified, or to enumerate the teams and accounts the API key can reach. | “What W&B teams do I have access to?” |
| query_wandb_entity_projects | The entity is known but the project name is not, or an earlier query failed with “project not found”. | “List all projects under your-team.” |
| probe_project_tool | To discover available metrics, config keys, and tags in an unfamiliar run-based project. | “Probe your-team/your-project and tell me what metrics are logged.” |
| infer_trace_schema_tool | To discover field names, types, and sample values in an unfamiliar Weave traces project before querying. | “What fields are on the Weave traces in your-team/your-project?” |

Schema-first trace queries

For Weave trace queries, call infer_trace_schema_tool first to discover available fields, then call query_weave_traces_tool with a precise column list and detail_level:

| detail_level | Returns | Use when |
| --- | --- | --- |
| schema | Structural fields only. Fastest. | Browsing or counting. |
| summary | Truncated inputs and outputs. The default. | Most analysis tasks. |
| full | Everything untruncated. | Drilling into a small number of specific traces. |

This pattern keeps token usage low for broad questions and lets the agent escalate to full only for the traces that matter.
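
The escalation rule above can be written as a lookup. In this Python sketch the task labels ("browse", "count", "analyze", "drill_down") are hypothetical names for the situations in the table; only the detail_level values come from the server.

```python
# Hypothetical task labels mapped to the documented detail_level values.
DETAIL_LEVEL_BY_TASK = {
    "browse": "schema",
    "count": "schema",
    "analyze": "summary",
    "drill_down": "full",
}


def pick_detail_level(task):
    """Return the detail_level for a task, falling back to the default."""
    return DETAIL_LEVEL_BY_TASK.get(task, "summary")
```

Falling back to summary mirrors the server default, so an unrecognized task never escalates to full by accident.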

Usage tips

The following practices and workflows help you get better results from the W&B MCP Server. Start with the general practices, then read the subsection that matches your workload for more specific advice and multi-step tool chains.

General best practices

Follow these practices regardless of your use case:
  • Specify the entity and project. MCP tools need an explicit entity (your team or personal account) and project name. Include both in every question, for example “in your-team/your-project”.
  • Ask focused questions. Prefer “Which eval had the highest F1 score?” to “What is my best evaluation?”. Specific metrics and time ranges produce better tool calls.
  • Verify full retrieval. For broad questions such as “What are my best performing runs?”, ask the agent to confirm it retrieved all available runs rather than only the most recent ones.
  • Combine with W&B Skills. W&B Skills teach coding agents how to structure W&B workflows. Skills provide patterns and MCP provides data access, and the two work well together.

For trace-heavy workflows

Follow these practices when working with Weave traces:
  • Start with the schema. Call infer_trace_schema_tool before query_weave_traces_tool so the agent knows which fields and filter values are valid.
  • Pick the right detail_level. Use schema to browse, summary (the default) for analysis, and full only when drilling into a small number of specific traces.
  • Chain resolve_trace_roots_tool. After a child-trace query, pass the resulting trace_id list to resolve_trace_roots_tool to map each trace to its root session in one batched call.
  • Prefer summarize_evaluation_tool for evals. It aggregates the Evaluation.evaluate and predict_and_score hierarchy automatically. Only fall back to query_weave_traces_tool for raw trace data.
For an end-to-end workflow, see Triage failing LLM calls.

For run-heavy workflows

Follow these practices when working with W&B Models runs:
  • Probe before you query. Call probe_project_tool on an unfamiliar run-based project to discover metric keys, config keys, and tags before constructing GraphQL.
  • Use get_run_history_tool for time series. GraphQL doesn’t sample, so for loss curves and other time-series data get_run_history_tool is both faster and cheaper.
  • Let compare_runs_tool do the diff. It returns config and metric deltas with aligned history in a single call, avoiding manual comparison.
  • Run a health check first. When a training run looks wrong, call diagnose_run_tool before digging into history manually.
For end-to-end workflows, see Diagnose a bad training run and Summarize evals and compare model versions.
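
To illustrate what config and metric deltas mean, the following Python sketch computes them for two runs represented as plain dictionaries. It is not the implementation of compare_runs_tool, and the function names and input shapes are assumptions.

```python
def config_diff(baseline, candidate):
    """Return config keys whose values differ, as {key: (baseline, candidate)}.

    A key missing from one side is reported with None for that side.
    """
    keys = sorted(set(baseline) | set(candidate))
    return {
        key: (baseline.get(key), candidate.get(key))
        for key in keys
        if baseline.get(key) != candidate.get(key)
    }


def metric_deltas(baseline, candidate):
    """Return candidate minus baseline for metrics present in both runs."""
    shared = sorted(set(baseline) & set(candidate))
    return {key: candidate[key] - baseline[key] for key in shared}
```
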

For Dedicated Cloud and Self-Managed

Follow these practices for non-multi-tenant deployments:
  • Prefer the hosted server on your instance at https://<your-instance>/mcp. It exposes the same tools as the Multi-tenant server with no client-side WANDB_BASE_URL needed. Only fall back to a local install if the hosted server isn’t yet enabled.
  • When you do run locally against your instance, set WANDB_BASE_URL to your instance URL in the client’s env block. Without it, the server targets api.wandb.ai and returns no data from your instance.
  • Rate limits on Dedicated Cloud are separate from Multi-tenant. See Dedicated Cloud rate limits for defaults and how to request changes.

For local installs

Follow these practices when running the server on your own machine:
  • Prefer STDIO transport for desktop clients (Cursor, VS Code, Claude Code, Claude Desktop). Only switch to HTTP transport when a client explicitly requires it (for example, the OpenAI Responses API).
  • When tool calls fail silently, set MCP_SERVER_LOG_LEVEL=DEBUG in the client’s env block and recheck the client’s MCP logs.
  • If you install from GitHub (uvx --from git+https://github.com/wandb/wandb-mcp-server wandb_mcp_server), uvx builds from the repository’s default branch rather than a fixed release. When you need a stable version, pin an explicit tag by appending it to the Git URL, for example @v0.3.2.
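
As a sketch of the tag-pinning tip above, the following Python helper builds the uvx argument list with or without a tag. The function name is illustrative, and the tag you pass must exist in the repository.

```python
REPO_URL = "git+https://github.com/wandb/wandb-mcp-server"


def uvx_args(tag=None):
    """Return the uvx command as an argument list, optionally pinned to a tag."""
    source = f"{REPO_URL}@{tag}" if tag else REPO_URL
    return ["uvx", "--from", source, "wandb_mcp_server"]
```

For example, uvx_args("v0.3.2") yields a --from value that ends in wandb-mcp-server@v0.3.2.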

Most real questions need more than one tool. Ask your agent to follow one of the following chains.

Explore an unfamiliar project

To explore what has been logged to a project, chain these tools:
  1. list_entities_tool to find an entity or team.
  2. query_wandb_entity_projects to find the project.
  3. probe_project_tool for run-based projects, or infer_trace_schema_tool for Weave trace projects.
  4. A targeted query_wandb_tool or query_weave_traces_tool call using the discovered keys.

Triage failing LLM calls

To find bad traces and the sessions that produced them, chain these tools:
  1. query_weave_traces_tool with a filter on error or exception fields, and detail_level="summary".
  2. resolve_trace_roots_tool on the resulting trace_id list to map each failure to its root session.
  3. query_weave_traces_tool with detail_level="full" on a small number of specific roots to drill in.
  4. create_wandb_report_tool to document the findings.
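
Between steps 2 and 3, you need to decide which roots deserve a full-detail query. The following Python sketch ranks root sessions by the number of failed traces they contain. The input shapes (a list of trace IDs and a dictionary from trace ID to root ID) are assumptions for illustration, not the actual output format of resolve_trace_roots_tool.

```python
from collections import Counter


def roots_to_drill_into(failed_trace_ids, root_by_trace, limit=3):
    """Return up to `limit` root IDs, most failures first.

    Trace IDs with no known root are ignored.
    """
    counts = Counter(
        root_by_trace[trace_id]
        for trace_id in failed_trace_ids
        if trace_id in root_by_trace
    )
    return [root for root, _ in counts.most_common(limit)]
```

Keeping the limit small matches the advice to reserve detail_level="full" for a few specific traces.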

Diagnose a bad training run

To run a health check on a suspicious training run, chain these tools:
  1. get_run_history_tool to pull the loss and validation curves.
  2. diagnose_run_tool for automated convergence, overfitting, and NaN checks.
  3. compare_runs_tool against a known-good baseline run.
  4. create_wandb_report_tool with line-plot panels to share the diagnosis.
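
For intuition about what convergence, overfitting, and NaN checks look for, the following Python sketch applies three simple heuristics to a pair of loss curves. It is not how diagnose_run_tool works internally. The thresholds and messages are assumptions, and the overfitting check assumes positive loss values.

```python
import math


def check_history(train_loss, val_loss, tolerance=0.05):
    """Return a list of warnings for training and validation loss curves."""
    values = list(train_loss) + list(val_loss)
    if any(not math.isfinite(v) for v in values):
        # Later checks are meaningless once NaN or inf appears.
        return ["non-finite loss value (NaN or inf)"]
    warnings = []
    if len(train_loss) >= 2 and train_loss[-1] >= train_loss[0]:
        warnings.append("training loss did not decrease")
    if len(val_loss) >= 2 and val_loss[-1] > min(val_loss) * (1 + tolerance):
        warnings.append(
            "validation loss rose after its minimum (possible overfitting)"
        )
    return warnings
```
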

Summarize evals and compare model versions

To find which model version performed best on an evaluation, chain these tools:
  1. summarize_evaluation_tool for per-scorer pass rates and error counts.
  2. list_artifact_versions_tool on the relevant model collection.
  3. compare_artifact_versions_tool between the candidate and current production version.
  4. log_analysis_to_wandb and create_wandb_report_tool to publish the comparison.

Troubleshooting

Use the following table to diagnose and resolve issues with the W&B MCP Server:

| Symptom | Cause and fix |
| --- | --- |
| 401 Unauthorized or Invalid API key | Your W&B API key is missing, malformed, or not authorized for the target entity or team. Regenerate a key at wandb.ai/authorize and confirm it is passed as a bearer token or set in WANDB_API_KEY. |
| Empty results for queries you expect to succeed | The team/entity or project name is incorrect, or the API key does not have access. Confirm both with the agent and retry. |
| 404 Not Found or connection refused on https://<your-instance>/mcp | The hosted MCP server is not yet enabled on your Dedicated Cloud or Self-Managed instance, or the client is pointed at the wrong URL. Contact W&B support to request enablement, then confirm the URL in Connection URL. |
| 429 Too Many Requests on Dedicated Cloud | You have hit your instance’s rate limits. See Dedicated Cloud rate limits for how to request higher limits. |
| Local server cannot find uvx in Claude Desktop | Use the full path to uvx in the command field of claude_desktop_config.json. |
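
For the last row, one way to find the full path to uvx is to look it up on PATH. The following Python sketch does that with the standard library; the function name is illustrative.

```python
import os
import shutil


def absolute_command(name="uvx", path=None):
    """Return the absolute path of an executable, searching PATH by default."""
    found = shutil.which(name, path=path)
    if found is None:
        raise FileNotFoundError(f"{name!r} was not found on PATH")
    return os.path.abspath(found)
```

Run it from the same shell where uvx works, then copy the result into the command field of your client configuration.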