Integrate with Weave: production dashboard
This notebook demonstrates how to use Weave’s APIs and functions to create a custom dashboard for production monitoring as an extension to the Traces view in Weave. This guide is intended for developers and ML engineers who run LLM applications in production and want tailored visibility into performance, cost, and user feedback beyond what the default Traces view provides. This notebook focuses on:
- Fetching traces, costs, feedback, and other metrics from Weave.
- Creating aggregate views for user feedback and cost distribution.
- Creating visualizations for token usage and latency over time.
To view the resulting dashboard, install streamlit and run the production dashboard script.
Setup
To begin, install the packages used in this notebook (`weave`, `pandas`, `plotly`, and `streamlit`).
Implementation
The following sections walk through initializing the Weave client, fetching call data, and generating visualizations for the dashboard.
Initialize the Weave client and define costs
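As a minimal sketch of the setup this section covers, the helpers below initialize a client with `weave.init` and register custom per-token prices with `client.add_cost`. The model names, the prices, and the helper names `init_weave_client` and `add_custom_costs` are hypothetical and only illustrate the shape of the step.

```python
from typing import Any, Dict, Iterable

# Hypothetical per-token prices in USD; replace them with your vendor's real rates.
CUSTOM_COSTS = [
    {"llm_id": "my-custom-model", "prompt_token_cost": 0.000001, "completion_token_cost": 0.000002},
    {"llm_id": "my-finetuned-model", "prompt_token_cost": 0.000003, "completion_token_cost": 0.000006},
]


def init_weave_client(project_name: str) -> Any:
    """Initialize Weave and return the client (needs W&B credentials and network access)."""
    import weave  # imported lazily so the cost helper can be exercised offline

    return weave.init(project_name)


def add_custom_costs(client: Any, costs: Iterable[Dict[str, Any]]) -> int:
    """Register each custom cost on the client and return how many were added."""
    added = 0
    for cost in costs:
        client.add_cost(
            llm_id=cost["llm_id"],
            prompt_token_cost=cost["prompt_token_cost"],
            completion_token_cost=cost["completion_token_cost"],
        )
        added += 1
    return added


# Usage (the project name is a placeholder):
# client = init_weave_client("my-team/my-project")
# add_custom_costs(client, CUSTOM_COSTS)
```

Models that are not listed in `CUSTOM_COSTS` keep using the standard costs that W&B already provides.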
First, set up a function to initialize the Weave client and add costs for each model. This step is required so that downstream cost queries can attribute per-token pricing to each call. W&B includes the standard costs for many standard models, and also lets you add your own custom costs and custom models. The following example shows how to add custom costs for a few models and use the standard costs for the rest. The costs are calculated based on the tracked tokens for each call in Weave. For many LLM vendor libraries, Weave automatically tracks the token usage, but you can also return custom token counts for any call. For more information about defining the token count and cost calculation for a custom model, see the custom cost cookbook.
Fetch calls data from Weave
With the client initialized and costs configured, the next step is to pull call data from Weave. There are two options for fetching call data:
- Fetching data call-by-call.
- Using high-level APIs.
Fetch data call-by-call
The first option to access data from Weave is to retrieve a list of filtered calls and extract the wanted data call-by-call. To do this, use the `calls_query_stream` API to fetch the calls data from Weave:
- `calls_query_stream` API: This API fetches the calls data from Weave.
- `filter` dictionary: This dictionary contains the filter parameters to fetch the calls data. See the CallSchema reference for more details.
- `expand_columns` list: This list contains the columns to expand in the calls data.
- `sort_by` list: This list contains the sorting parameters for the calls data.
- `include_costs` boolean: This boolean indicates whether to include the costs in the calls data.
- `include_feedback` boolean: This boolean indicates whether to include the feedback in the calls data.
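The parameters above can be combined into a single request, sketched below. This assumes the client exposes the API as `client.server.calls_query_stream` and accepts a plain dictionary; depending on the Weave version, a typed request object may be required instead. The expanded column, the sort field, and the helper names are hypothetical.

```python
from itertools import islice
from typing import Any, Dict, List


def build_calls_query(project_id: str, trace_roots_only: bool = True) -> Dict[str, Any]:
    """Assemble the request from the parameters described above."""
    return {
        "project_id": project_id,
        # Filter parameters; trace_roots_only keeps only top-level calls.
        "filter": {"trace_roots_only": trace_roots_only},
        # Hypothetical column to expand; adjust to your own call inputs.
        "expand_columns": ["inputs.model"],
        # Newest calls first (assumes calls carry a started_at field).
        "sort_by": [{"field": "started_at", "direction": "desc"}],
        "include_costs": True,
        "include_feedback": True,
    }


def fetch_calls(client: Any, project_id: str, limit: int = 100) -> List[Any]:
    """Stream calls from Weave and keep at most `limit` of them."""
    request = build_calls_query(project_id)
    stream = client.server.calls_query_stream(request)
    # islice stops reading the stream once `limit` calls have been collected.
    return list(islice(stream, limit))
```

Capping the result on the client side with `islice` keeps the request limited to the fields listed above; the returned calls can then be flattened into rows for a DataFrame.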
Use high-level APIs
Instead of going through every call, Weave also provides high-level APIs to directly access model costs, feedback, and other metrics. For example, for the cost, use the `query_costs` API to fetch the costs of all LLMs used in the project.
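A minimal sketch of this approach follows. It assumes `client.query_costs()` returns objects that expose `llm_id`, `prompt_token_cost`, and `completion_token_cost`; the helper names and the derived `combined_token_cost` field are hypothetical.

```python
from typing import Any, Dict, List


def cost_to_dict(cost: Any) -> Dict[str, Any]:
    """Convert a cost record (dict or pydantic-style object) into a plain dict."""
    if isinstance(cost, dict):
        return dict(cost)
    if hasattr(cost, "model_dump"):  # pydantic v2 style
        return cost.model_dump()
    return cost.dict()  # pydantic v1 style


def fetch_costs(client: Any) -> List[Dict[str, Any]]:
    """Fetch per-model costs and add a combined per-token price to each row."""
    rows = []
    for cost in client.query_costs():
        row = cost_to_dict(cost)
        # Sum of prompt and completion prices per token, useful for quick comparisons.
        row["combined_token_cost"] = (
            row.get("prompt_token_cost", 0.0) + row.get("completion_token_cost", 0.0)
        )
        rows.append(row)
    return rows
```

The resulting list of dictionaries can be passed directly to `pandas.DataFrame` if you prefer to work with a DataFrame.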
Gather inputs and generate visualizations
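As a starting point for this section, the sketch below aggregates call rows by day and plots average latency over time with plotly. It works on plain dictionaries rather than DataFrames to stay self-contained, and the row fields `started_at`, `latency_s`, and `total_tokens` are assumed names that you would map from your own call data.

```python
from collections import defaultdict
from datetime import datetime
from typing import Any, Dict, List


def daily_summary(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Group rows by calendar day and compute average latency and total tokens."""
    buckets = defaultdict(list)
    for row in rows:
        day = datetime.fromisoformat(row["started_at"]).date().isoformat()
        buckets[day].append(row)

    summary = []
    for day in sorted(buckets):
        group = buckets[day]
        summary.append(
            {
                "day": day,
                "avg_latency_s": sum(r["latency_s"] for r in group) / len(group),
                "total_tokens": sum(r["total_tokens"] for r in group),
            }
        )
    return summary


def plot_daily_latency(summary: List[Dict[str, Any]]) -> Any:
    """Build a plotly line chart of average latency per day."""
    import plotly.graph_objects as go  # imported lazily; requires plotly

    fig = go.Figure(
        go.Scatter(
            x=[s["day"] for s in summary],
            y=[s["avg_latency_s"] for s in summary],
            mode="lines+markers",
        )
    )
    fig.update_layout(
        title="Average latency per day",
        xaxis_title="Day",
        yaxis_title="Latency (s)",
    )
    return fig
```

The same grouping pattern extends to cost distribution or feedback counts by swapping the aggregated field.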
With call data and costs available as DataFrames, you can generate the visualizations using plotly. This is a starter dashboard that you can customize as you like. For a more advanced example, check out the Streamlit example in the knowledge-worker-weave repo.
Conclusion
This cookbook demonstrated how to create a custom production monitoring dashboard using Weave’s APIs and functions. Weave focuses on fast integrations, both for streamlined data input and for extracting data into custom processes.
- Data input:
  - Framework-agnostic tracing with the `@weave.op()` decorator and the option to import calls from CSV (see the related import cookbook).
  - Service API endpoints to log to Weave from various programming frameworks and languages. See the Service API reference for more details.
- Data output:
  - Download the data in CSV, TSV, JSONL, or JSON formats. See the Service API reference for more details.
  - Export using programmatic access to the data. See the “Use Python” section in the export panel as described in this cookbook. See Querying and exporting calls for more details.