Monitor using built-in signals

This page covers a previous approach to monitoring production traffic. For new implementations, use Signals under Weave for Agents. See View agent signals.

Standard system metrics like latency, token count, and cost don’t fully capture agent behavior. Inspecting individual traces provides deep insight but doesn’t scale to the millions of traces generated in a live environment. Signals address this gap with automated behavioral scoring for agents in production:

Automated scoring: Every incoming production trace is automatically processed and scored on common quality issues and errors.
Infrastructure: Processing is powered by CoreWeave compute and CoreWeave GPUs for scalability across millions of traces.

By using signals in production, you can:

Gain behavioral insight. Move beyond system metrics to understand if your agent is hallucinating, failing to follow conversation patterns, or losing grounding in its evidence.
Accelerate the research loop. Use the scores and failure analyses generated by signals to identify weaknesses, then use those findings to guide model improvement, data annotation, or reinforcement learning.

Available signals

W&B Weave offers monitors with built-in signals: preset scorers that evaluate production traces for common quality issues and errors by default, with no custom setup. Each built-in signal uses a benchmarked LLM prompt to classify traces and saves the results as comma-delimited tags representing the detected issues. Signals use a Serverless Inference model to score traces, so you don’t need external API keys. W&B Weave provides 13 preset signals organized into two groups.

Quality signals

Quality signals evaluate successful root-level traces for output quality and safety issues.

Signal	What it detects
Hallucination	Fabricated facts or claims that contradict the provided input context
Low quality	Responses with poor format, insufficient effort, or incomplete content
User frustration	Signs of user frustration such as repeated questions, negative sentiment, or complaints
Jailbreaking	Prompt injection and jailbreak attempts that try to bypass safety guidelines
NSFW	Explicit, violent, or otherwise inappropriate content in inputs or outputs
Lazy	Low-effort responses such as excessive brevity, refusals to help, or deferred work
Forgetful	Failure to use context from earlier in the conversation, ignoring previously stated facts or instructions

Error signals

Error signals categorize failed traces by root cause to help you identify and resolve infrastructure and application issues.

Signal	What it detects
Network Error	DNS failures, timeouts, connection resets, and other connectivity issues
Ratelimited	HTTP `429` responses, quota exhaustion, and throttling from upstream APIs
Request Too Large	Requests exceeding size or token limits, such as context window exceeded
Bad Request	Client-side errors where the server rejected the request (`4xx` except `429`)
Bad Response	Invalid, unexpected, or unusable responses from remote services (`5xx`)
Bug	Flaws in application code such as `KeyError`, `TypeError`, or logic errors
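
Weave assigns these categories with an LLM classifier, not with fixed rules. For local triage or tests, though, the taxonomy in the table can be roughly approximated in code. The following is a minimal sketch under that assumption: `categorize_failure` is a hypothetical helper, not part of the Weave API, and the mapping from exception types to categories is an interpretation of the table above.

```python
import socket
from typing import Optional


def categorize_failure(exc: Optional[BaseException] = None,
                       status: Optional[int] = None) -> Optional[str]:
    """Hypothetical helper: map a failure to an error-signal name.

    Rule-based approximation for illustration only. Weave itself
    classifies failed traces with an LLM prompt, not with these rules.
    """
    if status is not None:
        if status == 429:
            return "Ratelimited"
        if 400 <= status < 500:   # 4xx except 429
            return "Bad Request"
        if 500 <= status < 600:   # 5xx
            return "Bad Response"
    # DNS failures, timeouts, and connection resets
    if isinstance(exc, (socket.gaierror, TimeoutError, ConnectionError)):
        return "Network Error"
    # Typical application-code flaws named in the table
    if isinstance(exc, (KeyError, TypeError)):
        return "Bug"
    return None  # Not covered by this simplified mapping


print(categorize_failure(status=429))
print(categorize_failure(exc=ConnectionResetError()))
```

The sketch leaves out Request Too Large because the table describes it by meaning (size or token limits) rather than by a status code or exception type, which is the kind of case an LLM classifier handles better than fixed rules.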

How signals work

Each signal uses an LLM-as-a-judge approach to classify traces:

Trace selection: Quality signals evaluate successful root-level traces. Error signals evaluate failed traces. Weave doesn’t score child spans or intermediate Calls.
Prompt construction: Weave constructs a prompt that includes the trace metadata, inputs, outputs, exception details (if any), and the operation’s source code. Weave then appends the signal’s classifier prompt, which contains instructions for the specific issue to detect.
LLM scoring: For each signal, a Serverless Inference model performs a binary classification (whether that issue is present in the trace). Detected issues are returned as a comma-delimited string of tags (for example, "Low-quality, User-frustration, Forgetful").
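
If you export or copy the comma-delimited tag string for your own analysis, you can split it into individual tags. The following is a minimal sketch; `parse_signal_tags` is a hypothetical helper name, and the input is the example string from the step above.

```python
from typing import List


def parse_signal_tags(raw: str) -> List[str]:
    """Split a comma-delimited tag string into a list of clean tag names."""
    # Strip surrounding whitespace and drop empty entries (for example, from "")
    return [tag.strip() for tag in raw.split(",") if tag.strip()]


tags = parse_signal_tags("Low-quality, User-frustration, Forgetful")
print(tags)  # ['Low-quality', 'User-frustration', 'Forgetful']
```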

When multiple signals from the same group (Quality or Error) are active, Weave batches the signals into a single LLM call. The model evaluates all active classifiers in one pass and returns a result for each.
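
To illustrate the idea of batching, the sketch below combines several classifier instructions into one prompt so that a single model call can answer all of them. This is not Weave's implementation: the function name, the prompt wording, and the sample instructions are invented for illustration, and this page does not show Weave's actual prompts.

```python
from typing import Dict


def build_batched_prompt(trace_context: str, classifiers: Dict[str, str]) -> str:
    """Illustrative only: merge several classifier instructions into one prompt."""
    # One bullet per active classifier, so the model returns a result for each
    sections = [f"- {name}: {instruction}" for name, instruction in classifiers.items()]
    return (
        f"{trace_context}\n\n"
        "For each classifier below, answer whether the issue is present:\n"
        + "\n".join(sections)
    )


prompt = build_batched_prompt(
    "inputs: ...\noutputs: ...",  # placeholder trace context
    {
        "Lazy": "Flag low-effort responses.",            # made-up instruction
        "Forgetful": "Flag ignored earlier context.",    # made-up instruction
    },
)
print(prompt)
```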

Add a signal from the Monitors page

Adding a signal turns on automated scoring so that Weave evaluates new production traces against that signal’s criteria. To enable signals:

Navigate to wandb.ai and then open your Weave project.
In the Weave project sidebar, select Monitors.
To add a signal to a project with no Monitors enabled, click on its card to activate its checkbox, then click Setup monitors.
To add signals to an existing Monitor, select Browse signals at the top right of the Monitors page. This opens the Add signals drawer, which lists available signals grouped by category (such as Quality classifiers and Error classifiers), each with a checkbox. You can select individual signals, use Enable all for a group, or select Create custom signal. Then select Add signals at the bottom of the drawer.

After you add signals, Weave automatically scores incoming traces.

Manage active signals

After your signals are running, you can review which ones are active or turn off signals you no longer need. To view or remove active signals:

From the Monitors page, select the Manage signals button. This opens a drawer listing all currently active signals grouped by category.
Hover over a signal and select the Remove signal button to deactivate the signal.

Removing a signal stops scoring new traces. Weave preserves existing scores from the signal.

Use built-in signals

After signals are active and scoring your traces, you can review the results in several places across Weave. You can also set up alerts when Weave detects issues. The following sections describe where to find signal results and how to act on them.

See tagged Call traces on the Traces page

If you trace individual functions as Ops with the @weave.op decorator, Weave stores signal results as feedback on the Call object. On the Traces page, use the Signals column to scan your traces for specific behavior: the column shows a tag for each signal whose criteria are met. Hover over a tag to see the confidence of the score and the reasoning behind it.

Weave Traces view with the pointer hovering over a tag in the Signals column, showing confidence and reasoning.

Use the trace table toolbar to filter the table to traces that triggered specific signals. To see more detail, select the classifier Call that the signal generates and review the Trace Details view. Under Call Output, review classifier_meta for the reasoning. For example, the following screenshot shows a Quality-classifiers signal with a Low-quality match, a confidence of 0.9, and the reason for the rating.

Weave Traces view with a quality-classifier trace selected. The details panel shows Call details with classifiers metadata including a confidence score and reason.
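
If you copy classifier results out of Weave for offline review, you might want to keep only the matches with high confidence. The sketch below assumes a hypothetical record shape with `tag`, `confidence`, and `reason` keys; this page does not document the exact structure of classifier_meta, so adjust the keys to what you see under Call Output. The sample records are made up.

```python
from typing import Any, Dict, List


def high_confidence_matches(records: List[Dict[str, Any]],
                            min_confidence: float = 0.8) -> List[Dict[str, Any]]:
    """Keep records whose confidence meets the threshold (assumed record shape)."""
    return [r for r in records if r.get("confidence", 0.0) >= min_confidence]


# Made-up sample data in the assumed shape
sample = [
    {"tag": "Low-quality", "confidence": 0.9, "reason": "Response is incomplete."},
    {"tag": "Lazy", "confidence": 0.4, "reason": "Answer is brief but adequate."},
]

for match in high_confidence_matches(sample):
    print(match["tag"], match["confidence"], match["reason"])
```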

See signals in the project dashboard

You can also review signals at a project level:

In the project sidebar, select Project.
At the top of the Project dashboard, select the Weave tab.
In the Weave dashboard panels, locate Monitor Scores.

In the Monitor Scores project panel, you can see graphs over time of the signals triggered in the project.

Weave project dashboard Monitor Scores panel showing signal graphs from project activity.
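
A time-based view like this one amounts to counting triggered signals per time bucket. The sketch below shows the idea with hourly buckets on made-up events. It is an illustration of the aggregation, not how the dashboard is implemented, and `count_signals_per_hour` is a hypothetical helper.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, Tuple


def count_signals_per_hour(events: Iterable[Tuple[datetime, str]]) -> Counter:
    """Count (hour, tag) pairs from (timestamp, tag) events."""
    counts: Counter = Counter()
    for timestamp, tag in events:
        # Truncate the timestamp to the start of its hour
        bucket = timestamp.replace(minute=0, second=0, microsecond=0)
        counts[(bucket, tag)] += 1
    return counts


# Made-up events for illustration
events = [
    (datetime(2025, 1, 1, 9, 5), "Hallucination"),
    (datetime(2025, 1, 1, 9, 40), "Hallucination"),
    (datetime(2025, 1, 1, 10, 2), "Lazy"),
]
for (hour, tag), n in sorted(count_signals_per_hour(events).items()):
    print(hour.isoformat(), tag, n)
```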

Alert on signals

In addition to reviewing signals in the UI, you can have Weave notify you when signals trigger. To get notified, set up an automation. For example, an automation can notify your team through tools such as Slack when an agent’s performance drops below a certain threshold.
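
The threshold logic behind such an alert can be sketched as a simple rate check. The example below is a conceptual illustration only: `should_alert` and the `max_rate` default are hypothetical, and it does not represent how Weave automations are configured.

```python
from typing import Sequence


def should_alert(flagged: Sequence[bool], max_rate: float = 0.05) -> bool:
    """Return True when the share of flagged traces exceeds max_rate."""
    if not flagged:
        return False  # No traces in the window, so nothing to alert on
    rate = sum(flagged) / len(flagged)
    return rate > max_rate


# Made-up window: 2 of 10 recent traces triggered a signal
window = [True, True] + [False] * 8
print(should_alert(window, max_rate=0.1))  # True, because 0.2 > 0.1
```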

For monitoring needs beyond what the built-in signals provide, see Set up custom monitors.
