# What is Weave?

> Learn about W&B Weave and how it helps you build, evaluate, and improve LLM applications

W\&B Weave is an observability and evaluation platform for building reliable LLM applications. Weave helps you understand what your AI application is doing, measure how well it performs, and systematically improve it over time.

Building LLM applications is fundamentally different from traditional software development. LLM outputs are non-deterministic, making debugging harder. Quality is subjective and context-dependent. Small prompt changes can cause unexpected behavior changes. Traditional testing approaches fall short.

## The main threads of Weave

Weave provides the following core functionality:

* **Visibility** into agent sessions and multi-turn conversations, or into individual function calls and outputs in application code.
* **Systematic evaluation** to measure performance against curated test cases.
* **Version tracking** for prompts, models, and data so you can understand what changed.
* **Experimentation** to compare different prompts and models.
* **Feedback collection** to capture human judgments and annotations.
* **Monitoring** in production using guardrails and scorers for LLM safety and quality.

### Agentic tracing

Weave provides agentic observability for the full lifecycle of agent conversations, including sessions, LLM calls, and tool executions.

If you're building an agent, follow the [agent tracing quickstart](weave/agents-quickstart.mdx) or learn to use the Weave SDK to [trace your agents](/weave/guides/tracking/trace-agents).

If you're using a supported third-party agent harness, such as Claude Code or the OpenAI Agents SDK, Weave instruments it automatically with no additional code. See [Integrations](/weave/guides/integrations) for all supported frameworks.

### Application tracing

If you want to trace individual function calls, application code, or custom logic, use Weave Ops and Calls. Add one line to any function to track inputs, outputs, cost, token count, and latency.

* Track end-to-end how data flows through your LLM application.
* See the source documents used to produce the LLM response.
* Drill down into specific prompts and how answers are produced.

To trace individual functions, follow the Weave [Op tracing quickstart](/weave/quickstart) or learn to use the Weave [Ops and Calls](/weave/guides/tracking/tracing).
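As a minimal sketch of the one-line instrumentation described above, the following decorates a plain Python function with `@weave.op`. The `extract_keywords` helper and its sample text are hypothetical, and the `try`/`except` fallback exists only so the sketch still runs where Weave isn't installed. Traces are recorded only after `weave.init` has been called; the assumption here is that without it the decorated function behaves like ordinary Python.

```python
try:
    import weave  # requires `pip install weave`
    op = weave.op
except ImportError:
    # Fallback so this sketch runs without Weave installed:
    # a no-op decorator that returns the function unchanged.
    def op(fn):
        return fn


@op  # the one added line: marks the function for tracing
def extract_keywords(text: str, min_length: int = 5) -> list[str]:
    """Hypothetical helper: return the distinct long words in `text`, sorted."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return sorted(w for w in words if len(w) >= min_length)


# To send traces to W&B, call weave.init('[YOUR-TEAM]/[YOUR-PROJECT]') first.
# Assumption: without an initialized project, the call below just runs locally.
keywords = extract_keywords("Weave traces inputs, outputs, and latency.")
print(keywords)
```

Once a project is initialized, each call to the decorated function should appear as a trace with its inputs and outputs.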

If you're using a supported third-party agent framework, such as Claude Code, Weave instruments it automatically with no additional code. See [Integrations](/weave/guides/integrations) for all supported frameworks.

### Evaluations

Systematically benchmark and monitor your LLM application's performance with evaluations to iteratively improve quality and reliability.

* Track which model and prompt versions produced which results.
* Define metrics to evaluate responses using one or more scoring functions.
* Compare two or more evaluations across multiple metrics, and contrast their performance on specific samples.
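To make the scoring and comparison ideas above concrete, here is a plain-Python sketch. It deliberately does not use the Weave SDK: the dataset, the two stand-in application versions (`app_v1`, `app_v2`), the scorer names, and the `evaluate` helper are all hypothetical and only illustrate the shape of an evaluation, not how Weave implements one.

```python
# Hypothetical curated test cases: each row pairs an input with its expected answer.
dataset = [
    {"question": "2 + 2", "expected": "4"},
    {"question": "3 * 3", "expected": "9"},
    {"question": "10 / 4", "expected": "2.5"},
]


# Stand-ins for two versions of an LLM application (canned answers, no model calls).
def app_v1(question: str) -> str:
    return {"2 + 2": "4", "3 * 3": "9", "10 / 4": "2"}[question]


def app_v2(question: str) -> str:
    return {"2 + 2": "4", "3 * 3": "9", "10 / 4": "2.5"}[question]


# Scoring functions: each receives the expected answer and the actual output.
def exact_match(expected: str, output: str) -> bool:
    return expected.strip() == output.strip()


def is_concise(expected: str, output: str) -> bool:
    return len(output) <= 5


def evaluate(app, rows, scorers):
    """Run `app` on every row and return the mean score for each scorer."""
    totals = {scorer.__name__: 0.0 for scorer in scorers}
    for row in rows:
        output = app(row["question"])
        for scorer in scorers:
            totals[scorer.__name__] += float(scorer(row["expected"], output))
    return {name: total / len(rows) for name, total in totals.items()}


# Evaluate both versions on the same data with the same scorers, then compare.
results = {
    "v1": evaluate(app_v1, dataset, [exact_match, is_concise]),
    "v2": evaluate(app_v2, dataset, [exact_match, is_concise]),
}
for version, metrics in results.items():
    print(version, metrics)
```

Running both versions against the same rows and scorers is what makes the per-metric numbers comparable; in this toy data the two versions differ only on the third sample.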

[Build an evaluation pipeline](/weave/tutorial-eval)

### Version everything

Weave tracks versions of your prompts, datasets, and model configurations. When something breaks, you can see exactly what changed. When something works, you can reproduce it.

[Learn about versioning](/weave/guides/tracking/objects)

### Experiment with prompts and models

Bring your own API keys to the Playground to quickly test prompts and compare responses across commercial models.

[Experiment in the Weave Playground](/weave/guides/tools/playground)

### Collect feedback

Capture human feedback, annotations, and corrections from production use. Use this data to build better test cases and improve your application.

[Collect feedback](/weave/guides/tracking/feedback)

### Monitor production

Score production traffic with the same scorers you use in evaluation. Set up guardrails to catch issues before they reach users.

[Set up guardrails and monitors](/weave/guides/evaluation/monitors)

## Get started using Weave

Weave provides SDKs for Python and TypeScript. Both SDKs support tracing, evaluation, datasets, and other core Weave features. Some advanced features, such as class-based Models and Scorers, are not currently available in the Weave TypeScript SDK.

To get started using Weave:

1. Create a Weights & Biases account at [https://wandb.ai/site](https://wandb.ai/site/?utm_source=course\&utm_medium=course\&utm_campaign=weave) and get your API key from [https://wandb.ai/authorize](https://wandb.ai/authorize?utm_source=course\&utm_medium=course\&utm_campaign=weave).
2. Install Weave:

<CodeGroup>
  ```Python Python theme={null}
  pip install weave
  ```

  ```Typescript Typescript theme={null}
  npm install weave
  ```
</CodeGroup>

3. In your script, import Weave and initialize a project.

Replace `[YOUR-TEAM]` with your W\&B team name and `[YOUR-PROJECT]` with your W\&B project name.

<CodeGroup>
  ```Python Python theme={null}
  import weave
  client = weave.init('[YOUR-TEAM]/[YOUR-PROJECT]')
  ```

  ```TypeScript Typescript theme={null}
  import * as weave from 'weave';
  const client = await weave.init('[YOUR-TEAM]/[YOUR-PROJECT]');
  ```
</CodeGroup>

You're now ready to use Weave.