OpenAI - Weights & Biases Documentation

This guide shows you how to integrate the OpenAI Python and TypeScript libraries with Weave so that you can trace, evaluate, and monitor your LLM application. It’s intended for developers who already use OpenAI’s SDKs and want visibility into their calls during development and in production.

Experiment with OpenAI models on Weave without any setup using the LLM Playground.

Tracing

Storing traces of LLM applications in a central database is valuable both during development and in production. Use these traces for debugging and to help build a dataset of tricky examples to evaluate against while you improve your application.

Weave can automatically capture traces for the openai Python library. To start capturing, call weave.init("[PROJECT_NAME]") with a project name of your choice. Weave automatically patches OpenAI regardless of when you import it, so all subsequent OpenAI calls are traced.

If you don’t specify a W&B team when you call weave.init(), Weave uses your default entity. To find or update your default entity, refer to User Settings in the W&B Models documentation.

Automatic patching

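In Python, Weave automatically patches OpenAI whether you import it before or after weave.init(). In TypeScript, wrap the client with wrapOpenAI instead. The following examples show the minimal setup you need to start tracing calls, first in Python and then in TypeScript: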

from openai import OpenAI
import weave

weave.init('emoji-bot')  # OpenAI is automatically patched!

client = OpenAI()
response = client.chat.completions.create(
  model="gpt-4",
  messages=[
    {
      "role": "system",
      "content": "You are AGI. You will be provided with a message, and your task is to respond using emojis only."
    },
    {
      "role": "user",
      "content": "How are you?"
    }
  ]
)

import { OpenAI } from 'openai';
import * as weave from 'weave';

// Initialize the Weave project before making any calls
await weave.init('emoji-bot');

const openai = weave.wrapOpenAI(new OpenAI());

// Calls made through the wrapped client are now traced
const response = await openai.chat.completions.create(
  {
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content: "You are AGI. You will be provided with a message, and your task is to respond using emojis only."
      },
      {
        role: "user",
        content: "How are you?"
      }
    ]
  }
);
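
To make "patching" concrete, here is a toy sketch of the general technique: replace a method with a wrapper that records each call before returning the result. This is not Weave's implementation. FakeClient, patch_method, and recorded_calls are hypothetical names used only for illustration, and nothing here touches the network.

```python
import functools

recorded_calls = []  # hypothetical in-memory "trace store"


class FakeClient:
    """Hypothetical stand-in for an SDK client; makes no network calls."""

    def create(self, model, messages):
        return {"model": model, "reply": f"echo: {messages[-1]['content']}"}


def patch_method(cls, name):
    """Replace cls.<name> with a wrapper that records inputs and outputs."""
    original = getattr(cls, name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        result = original(self, *args, **kwargs)
        recorded_calls.append({"op": name, "inputs": kwargs, "output": result})
        return result

    setattr(cls, name, wrapper)


client = FakeClient()  # created before the patch is applied
patch_method(FakeClient, "create")
client.create(model="demo", messages=[{"role": "user", "content": "hi"}])
print(len(recorded_calls))
```

Because this toy wrapper is installed on the class rather than on one instance, it also covers clients that were created before the patch was applied.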

Optional: Explicit patching

For fine-grained control over when patching takes effect, patch OpenAI explicitly instead of relying on the automatic behavior:

import weave

weave.init('emoji-bot')
weave.integrations.patch_openai()  # Enable OpenAI tracing

from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
  model="gpt-4",
  messages=[
    {"role": "user", "content": "Make me an emoji"}
  ]
)

View a live trace

Weave also captures tool (function) calls made through OpenAI Functions and OpenAI Assistants.

Structured outputs

Weave supports tracing OpenAI structured outputs, which are useful when you need to ensure your LLM responses follow a specific format. The following example traces a call that extracts a typed UserDetail object from a user message:

from openai import OpenAI
from pydantic import BaseModel
import weave

class UserDetail(BaseModel):
    name: str
    age: int

client = OpenAI()
weave.init('extract-user-details')

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the user details from the message."},
        {"role": "user", "content": "My name is David and I am 30 years old."},
    ],
    response_format=UserDetail,
)

user_detail = completion.choices[0].message.parsed
print(user_detail)

Async support

Weave supports tracing async OpenAI calls, so applications that use AsyncOpenAI get the same visibility as synchronous applications.

import asyncio

from openai import AsyncOpenAI
import weave

client = AsyncOpenAI()
weave.init('async-emoji-bot')

async def call_openai():
    response = await client.chat.completions.create(
        model="gpt-4",
        messages=[
            {
                "role": "system",
                "content": "You are AGI. You will be provided with a message, and your task is to respond using emojis only."
            },
            {
                "role": "user",
                "content": "How are you?"
            }
        ]
    )
    return response

# Run the async function from a script.
# In a notebook, which already runs an event loop, use `result = await call_openai()` instead.
result = asyncio.run(call_openai())
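
A common reason to use an async client is to issue several requests concurrently. The following is a minimal sketch of that pattern with asyncio.gather. To keep it runnable without an API key, fake_completion is a hypothetical stand-in for the awaited client.chat.completions.create(...) call, and the prompts are made up.

```python
import asyncio


async def fake_completion(prompt: str) -> str:
    """Hypothetical stand-in for `await client.chat.completions.create(...)`."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"reply to: {prompt}"


async def main(prompts: list[str]) -> list[str]:
    # gather() runs the coroutines concurrently and preserves input order
    return await asyncio.gather(*(fake_completion(p) for p in prompts))


replies = asyncio.run(main(["How are you?", "What's new?"]))
print(replies)
```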

Streaming support

Weave supports tracing streaming responses from OpenAI. The captured trace reflects the full streamed completion, so you can review the final output alongside the request parameters.

from openai import OpenAI
import weave

client = OpenAI()
weave.init('streaming-emoji-bot')

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "system", 
            "content": "You are AGI. You will be provided with a message, and your task is to respond using emojis only."
        },
        {
            "role": "user",
            "content": "How are you?"
        }
    ],
    stream=True
)

for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
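
If your own code also needs the full text rather than individual chunks, you can concatenate the deltas as they arrive. The sketch below shows one way to do that. It uses make_chunk, a hypothetical helper that builds objects shaped like streamed chunks, so it runs without calling the API. With a real stream you would pass the response object to collect_stream instead.

```python
from types import SimpleNamespace


def make_chunk(text):
    """Build a hypothetical object shaped like a chat completion chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def collect_stream(chunks) -> str:
    """Concatenate the text deltas from a stream of chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # skip chunks that carry no choices
            continue
        content = chunk.choices[0].delta.content
        if content:  # content can be None, for example on the final chunk
            parts.append(content)
    return "".join(parts)


fake_stream = [make_chunk(None), make_chunk("😀"), make_chunk("👍"), make_chunk(None)]
full_text = collect_stream(fake_stream)
print(full_text)
```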

Tracing function calls

Weave traces function calls made by OpenAI when you use tools, which helps you understand how the model invoked each tool and with what arguments.

from openai import OpenAI
import weave

client = OpenAI()
weave.init('function-calling-bot')

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The location to get the weather for"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The unit to return the temperature in"
                    }
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "user",
            "content": "What's the weather like in New York?"
        }
    ],
    tools=tools
)

print(response.choices[0].message.tool_calls)
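
The model only requests a tool call; your application still has to run it. The following sketch shows one way to dispatch a requested call to a local function, assuming each tool call exposes function.name and function.arguments (a JSON string). The get_weather implementation, the AVAILABLE_TOOLS registry, and the fake_tool_call object are hypothetical and exist only so the example runs without the API.

```python
import json
from types import SimpleNamespace


def get_weather(location: str, unit: str = "celsius") -> str:
    """Hypothetical local implementation; returns canned data."""
    return f"22 degrees {unit} in {location}"


# Hypothetical registry mapping tool names to local functions
AVAILABLE_TOOLS = {"get_weather": get_weather}


def run_tool_call(tool_call) -> str:
    """Look up the requested function and call it with the decoded arguments."""
    func = AVAILABLE_TOOLS[tool_call.function.name]
    kwargs = json.loads(tool_call.function.arguments)  # arguments arrive as a JSON string
    return func(**kwargs)


# Stand-in shaped like one entry of response.choices[0].message.tool_calls
fake_tool_call = SimpleNamespace(
    id="call_123",
    function=SimpleNamespace(name="get_weather", arguments='{"location": "New York"}'),
)
print(run_tool_call(fake_tool_call))
```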

Batch API

Weave supports the OpenAI Batch API, which lets you process multiple requests asynchronously while Weave still captures each request in your traces.

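import json

from openai import OpenAI
import weave

client = OpenAI()
weave.init('batch-processing')

# Define the batch requests
batch_input = [
    {
        "custom_id": "request-1",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4",
            "messages": [{"role": "user", "content": "Hello, how are you?"}]
        }
    },
    {
        "custom_id": "request-2",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4",
            "messages": [{"role": "user", "content": "What's the weather like?"}]
        }
    }
]

# Write the requests to a JSONL file (one JSON object per line) and upload it
with open("batch_input.jsonl", "w") as f:
    for request in batch_input:
        f.write(json.dumps(request) + "\n")

with open("batch_input.jsonl", "rb") as f:
    batch_file = client.files.create(file=f, purpose="batch")

# Submit the batch
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h"
)

# Check the batch status; results become available once the batch completes
completed_batch = client.batches.retrieve(batch.id)
print(completed_batch.status)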
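
Once a batch finishes, its results come back as a JSONL file, and the output lines are not guaranteed to follow the input order, so matching on custom_id is the safer approach. The sketch below parses such output, assuming each line carries custom_id, error, and response (with status_code and body) fields. The sample lines and reply text are made up for illustration. With a real batch you would download the text of the file referenced by the batch's output_file_id.

```python
import json

# Illustrative output lines; the reply text is made up.
sample_output = "\n".join([
    json.dumps({"custom_id": "request-2", "error": None, "response": {
        "status_code": 200,
        "body": {"choices": [{"message": {"role": "assistant", "content": "Sunny."}}]}}}),
    json.dumps({"custom_id": "request-1", "error": None, "response": {
        "status_code": 200,
        "body": {"choices": [{"message": {"role": "assistant", "content": "I'm well!"}}]}}}),
])


def parse_batch_output(jsonl_text: str) -> dict:
    """Map each custom_id to its reply text, or to None if the request failed."""
    results = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        response = record.get("response")
        if record.get("error") or not response or response.get("status_code") != 200:
            results[record["custom_id"]] = None
            continue
        results[record["custom_id"]] = response["body"]["choices"][0]["message"]["content"]
    return results


replies = parse_batch_output(sample_output)
print(replies["request-1"])
```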

Assistants API

Weave supports the OpenAI Assistants API, so you can trace conversational AI applications built around assistants, threads, and runs.

from openai import OpenAI
import weave

client = OpenAI()
weave.init('assistant-bot')

# Create an assistant
assistant = client.beta.assistants.create(
    name="Math Assistant",
    instructions="You are a personal math tutor. Answer questions about math.",
    model="gpt-4"
)

# Create a thread
thread = client.beta.threads.create()

# Add a message to the thread
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What is 2+2?"
)

# Run the assistant and wait for the run to finish
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id
)

# Get the assistant's response (messages are listed newest first)
messages = client.beta.threads.messages.list(thread_id=thread.id)

Cost tracking

Weave automatically tracks the cost of your OpenAI API calls so that you can monitor spend alongside performance. You can view the cost breakdown in the Weave UI.

Cost tracking is available for all OpenAI models, and Weave calculates costs based on OpenAI’s published pricing.
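
As a rough illustration of how a per-call cost can be derived from token usage, the sketch below multiplies prompt and completion token counts by per-token prices. It is not Weave's implementation. The Usage class, the model name, and the prices are placeholders, not real OpenAI pricing.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Token counts, mirroring the usage field of a chat completion."""
    prompt_tokens: int
    completion_tokens: int


# Placeholder prices in USD per 1M tokens; NOT real OpenAI pricing.
PRICES_PER_MILLION = {"example-model": {"input": 2.00, "output": 8.00}}


def estimate_cost(model: str, usage: Usage) -> float:
    """Return the estimated cost in USD for one call."""
    price = PRICES_PER_MILLION[model]
    input_cost = usage.prompt_tokens * price["input"] / 1_000_000
    output_cost = usage.completion_tokens * price["output"] / 1_000_000
    return input_cost + output_cost


cost = estimate_cost("example-model", Usage(prompt_tokens=1_000, completion_tokens=500))
print(f"${cost:.4f}")
```

Input and output tokens are priced separately here because providers commonly charge different rates for each.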

Tracing custom functions

To group OpenAI calls under your own application logic, trace custom functions that use OpenAI by applying the @weave.op decorator. This produces a parent trace for the function with the underlying OpenAI calls nested inside it.

from openai import OpenAI
import weave

client = OpenAI()
weave.init('custom-function-bot')

@weave.op
def generate_response(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {
                "role": "user",
                "content": prompt
            }
        ]
    )
    return response.choices[0].message.content

# This function call will be traced
result = generate_response("Hello, how are you?")

Next steps

With tracing set up for OpenAI, your application’s calls are now visible in Weave. From here, you can:

View traces in the Weave UI: Go to your Weave project to see traces of your OpenAI calls.
Create evaluations: Use your traces to build evaluation datasets.
Monitor performance: Track latency, costs, and other metrics.
Debug issues: Use traces to understand what’s happening in your LLM application.

For more information about these topics, see the evaluation guide and monitoring guide.
