API overview

This reference describes the Serverless Inference REST API, which lets you call foundation models programmatically from your own applications. Use it to integrate hosted inference into services, scripts, or notebooks without managing model infrastructure.

Base URL

Access the Inference service at:

https://api.inference.wandb.ai/v1

Prerequisites

To call the Inference API, you need:

A W&B account with Inference credits.
A valid W&B API key.

If you belong to more than one team, or want to attribute your usage to a project, you’ll also need team and project IDs. In code samples, these appear as [YOUR-TEAM]/[YOUR-PROJECT]. If you don’t specify these, W&B uses your default entity and the project name inference.

Available methods

The Inference API provides OpenAI-compatible endpoints for interacting with foundation models. The following methods are available:

Chat Completions: Create chat completions using foundation models.
List Models: Get all available models and their IDs.

Authentication

All API requests require authentication using your W&B API key. Create an API key at wandb.ai/settings. Include your API key in the request headers:

For the OpenAI SDK, set the api_key parameter.
For direct API calls, use Authorization: Bearer [YOUR-API-KEY].

Error handling

For a complete list of error codes and how to resolve them, see API errors.

Next steps

After you have your API key, continue with one of the following:

Try the usage examples to see how the API works.
Explore models in the Inference UI.
Check usage limits for your account.

W&B Models

W&B Weave

Serverless Inference

Serverless Training

Base URL

Prerequisites

Available methods

Authentication

Error handling

Next steps

​Base URL

​Prerequisites

​Available methods

​Authentication

​Error handling

​Next steps

Base URL

Prerequisites

Available methods

Authentication

Error handling

Next steps