This reference describes the Serverless Inference REST API, which lets you call foundation models programmatically from your own applications. Use it to integrate hosted inference into services, scripts, or notebooks without managing model infrastructure.Documentation Index
Fetch the complete documentation index at: https://docs.wandb.ai/llms.txt
Use this file to discover all available pages before exploring further.
Base URL
Access the Inference service at:Prerequisites
To call the Inference API, you need:- A W&B account with Inference credits.
- A valid W&B API key.
[YOUR-TEAM]/[YOUR-PROJECT]. If you don’t specify these, W&B uses your default entity and the project name inference.
Available methods
The Inference API provides OpenAI-compatible endpoints for interacting with foundation models. The following methods are available:- Chat Completions: Create chat completions using foundation models.
- List Models: Get all available models and their IDs.
Authentication
All API requests require authentication using your W&B API key. Create an API key at wandb.ai/settings. Include your API key in the request headers:- For the OpenAI SDK, set the
api_keyparameter. - For direct API calls, use
Authorization: Bearer [YOUR-API-KEY].
Error handling
For a complete list of error codes and how to resolve them, see API errors.Next steps
After you have your API key, continue with one of the following:- Try the usage examples to see how the API works.
- Explore models in the Inference UI.
- Check usage limits for your account.