# View reasoning information

> How to return and view reasoning in your Serverless Inference responses

Reasoning models, such as [Google's Gemma 4](https://huggingface.co/google/gemma-4-31B-it), return information about their reasoning steps alongside the final answer. This page explains how to identify reasoning-capable models on Serverless Inference, where to find reasoning output in a response, and how to turn reasoning on or off for models that support toggling it. Use this guide to inspect a model's intermediate reasoning or to control whether reasoning appears in a response.

To determine whether a model supports reasoning, check the [Supported models with reasoning](#supported-models-with-reasoning) table on this page or the **Supported Features** section of the model's catalog page in the UI.

Reasoning information appears in the `reasoning` field of responses. The value of this field is `null` in the responses of non-reasoning models.
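
As a minimal sketch of reading this field, the following helper separates the reasoning from the final answer. It assumes the response is an OpenAI-style chat completion parsed into a dict, and that `reasoning` sits next to `content` on the first choice's `message`. That location, the `split_reasoning` name, and the sample payload are illustrative assumptions, not captured API output.

```python
from typing import Optional, Tuple


def split_reasoning(response: dict) -> Tuple[Optional[str], Optional[str]]:
    """Return (reasoning, answer) from the first choice of a chat completion dict.

    Assumes `reasoning` sits beside `content` on the message object.
    """
    message = response["choices"][0]["message"]
    # `reasoning` is None when the field is null or absent.
    return message.get("reasoning"), message.get("content")


# Hand-written payload for illustration only; not taken from a real response.
sample = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "3.8 is greater.",
                "reasoning": None,
            }
        }
    ]
}

reasoning, answer = split_reasoning(sample)
if reasoning is None:
    print("No reasoning returned.")
else:
    print(f"Reasoning: {reasoning}")
print(f"Answer: {answer}")
```

If you use the `openai` Python client, calling `response.model_dump()` should give you a dict of this general shape to pass to the helper.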

## Supported models with reasoning

The following table lists the models on Serverless Inference that return reasoning output. For each model, reasoning is always on, enabled by default, or disabled by default:

| Model ID (for API usage)                       | Reasoning support   |
| ---------------------------------------------- | ------------------- |
| `deepseek-ai/DeepSeek-V4-Flash`                | Disabled by default |
| `deepseek-ai/DeepSeek-V4-Pro`                  | Disabled by default |
| `google/gemma-4-31B-it`                        | Disabled by default |
| `MiniMaxAI/MiniMax-M2.5`                       | Always on           |
| `moonshotai/Kimi-K2.6`                         | Always on           |
| `moonshotai/Kimi-K2.5`                         | Always on           |
| `nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8` | Enabled by default  |
| `nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B`     | Enabled by default  |
| `openai/gpt-oss-120b`                          | Always on           |
| `openai/gpt-oss-20b`                           | Always on           |
| `Qwen/Qwen3.6-35B-A3B`                         | Enabled by default  |
| `Qwen/Qwen3.6-27B`                             | Enabled by default  |
| `Qwen/Qwen3.5-35B-A3B`                         | Enabled by default  |
| `Qwen/Qwen3.5-27B`                             | Enabled by default  |
| `Qwen/Qwen3-235B-A22B-Thinking-2507`           | Always on           |
| `zai-org/GLM-5.1`                              | Enabled by default  |

### Models with `Always on` reasoning

If a model is listed as `Always on` in the preceding [Supported models with reasoning](#supported-models-with-reasoning) table, every response includes reasoning, and you can't disable it.
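
One way to check a model's support level before you build a request is a small lookup, sketched below. The dictionary copies the table on this page by hand, so it can go stale if the catalog changes. The `REASONING_SUPPORT` and `can_toggle_reasoning` names are hypothetical and not part of any W&B library.

```python
# Support levels copied by hand from the table on this page.
REASONING_SUPPORT = {
    "deepseek-ai/DeepSeek-V4-Flash": "Disabled by default",
    "deepseek-ai/DeepSeek-V4-Pro": "Disabled by default",
    "google/gemma-4-31B-it": "Disabled by default",
    "MiniMaxAI/MiniMax-M2.5": "Always on",
    "moonshotai/Kimi-K2.6": "Always on",
    "moonshotai/Kimi-K2.5": "Always on",
    "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8": "Enabled by default",
    "nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B": "Enabled by default",
    "openai/gpt-oss-120b": "Always on",
    "openai/gpt-oss-20b": "Always on",
    "Qwen/Qwen3.6-35B-A3B": "Enabled by default",
    "Qwen/Qwen3.6-27B": "Enabled by default",
    "Qwen/Qwen3.5-35B-A3B": "Enabled by default",
    "Qwen/Qwen3.5-27B": "Enabled by default",
    "Qwen/Qwen3-235B-A22B-Thinking-2507": "Always on",
    "zai-org/GLM-5.1": "Enabled by default",
}

TOGGLEABLE = {"Enabled by default", "Disabled by default"}


def can_toggle_reasoning(model_id: str) -> bool:
    """Return True if the table lists the model with a default that can be changed."""
    # Models missing from the table, and `Always on` models, return False.
    return REASONING_SUPPORT.get(model_id) in TOGGLEABLE


print(can_toggle_reasoning("openai/gpt-oss-120b"))    # False
print(can_toggle_reasoning("google/gemma-4-31B-it"))  # True
```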

### Disable reasoning

If a model is listed as `Enabled by default` in the preceding [Supported models with reasoning](#supported-models-with-reasoning) table, you can disable reasoning to reduce token usage or simplify the response. To opt out of reasoning for a request, set the `enable_thinking` flag in `chat_template_kwargs` to `False` (Python) or `false` (Bash). The response then omits the reasoning content:

<Tabs>
  <Tab title="Python">
    ```python lines highlight={13-17} theme={null}
    import openai

    client = openai.OpenAI(
        base_url="https://api.inference.wandb.ai/v1",
        api_key="[YOUR-API-KEY]",  # Create an API key at https://wandb.ai/settings
    )

    response = client.chat.completions.create(
        model="google/gemma-4-31B-it",
        messages=[
            {"role": "user", "content": "3.11 and 3.8, which is greater?"}
        ],
        extra_body={
            "chat_template_kwargs": {
                "enable_thinking": False
            }
        },
    )
    ```
  </Tab>

  <Tab title="Bash">
    ```bash lines highlight={9} theme={null}
    curl https://api.inference.wandb.ai/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer [YOUR-API-KEY]" \
      -d '{
        "model": "google/gemma-4-31B-it",
        "messages": [
          { "role": "user", "content": "3.11 and 3.8, which is greater?" }
        ],
        "chat_template_kwargs": {"enable_thinking": false}
      }'
    ```
  </Tab>
</Tabs>

### Enable reasoning

If a model is listed as `Disabled by default` in the preceding [Supported models with reasoning](#supported-models-with-reasoning) table, you can enable reasoning for a request. Use the same request as in the [Disable reasoning](#disable-reasoning) section, but set the `enable_thinking` flag to `True` (Python) or `true` (Bash).
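
The Python and Bash spellings of the flag differ only because the request body is JSON. The sketch below builds a body with the same fields as the curl example and sets `enable_thinking` to `True`. The `build_chat_payload` helper is a hypothetical name used for illustration, and the code only serializes the body without sending a request.

```python
import json


def build_chat_payload(model: str, prompt: str, enable_thinking: bool) -> str:
    """Serialize a chat completion request body with the reasoning flag set."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "chat_template_kwargs": {"enable_thinking": enable_thinking},
    }
    # json.dumps writes Python True/False as JSON true/false.
    return json.dumps(body, indent=2)


payload = build_chat_payload(
    "google/gemma-4-31B-it",
    "3.11 and 3.8, which is greater?",
    enable_thinking=True,
)
print(payload)
```

You can pass the resulting string to `curl` with `-d`, or pass the same `chat_template_kwargs` dictionary through `extra_body` when you use the `openai` client.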
