> ## Documentation Index
> Fetch the complete documentation index at: https://docs.wandb.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Serverless Inference 오류 처리 모범 사례는 무엇인가요?

W\&B Serverless Inference 오류를 적절히 처리하고 애플리케이션의 안정성을 유지하려면 다음 모범 사례를 따르세요.

<div id="always-implement-error-handling">
  ## 항상 오류 처리를 구현하세요
</div>

API 호출을 `try-except` 블록으로 감싸세요:

```python theme={null}
import openai

try:
    response = client.chat.completions.create(
        model="meta-llama/Llama-3.1-8B-Instruct",
        messages=messages
    )
except Exception as e:
    print(f"Error: {e}")
    # 적절하게 오류를 처리하세요
```

<div id="use-retry-logic-with-exponential-backoff">
  ## 지수 백오프를 적용한 재시도 로직 사용하기
</div>

시도 사이의 지연 시간을 점차 늘리면서 일시적인 실패가 발생하면 다시 시도하세요:

```python theme={null}
import time
from typing import Optional

def call_inference_with_retry(
    client, 
    messages, 
    model: str,
    max_retries: int = 3,
    base_delay: float = 1.0
) -> Optional[str]:
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model=model,
                messages=messages
            )
            return response.choices[0].message.content
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            
            # 지수 백오프로 지연 시간 계산
            delay = base_delay * (2 ** attempt)
            print(f"Attempt {attempt + 1} failed, retrying in {delay}s...")
            time.sleep(delay)
    
    return None
```

<div id="monitor-your-usage">
  ## 사용량을 모니터링하세요
</div>

* W\&B **Billing** 페이지에서 크레딧 사용량을 추적하세요.
* 한도에 도달하기 전에 알림을 설정하세요.
* 애플리케이션에서 API 사용량을 로깅하세요.

<div id="handle-specific-error-codes">
  ## 특정 오류 코드 처리하기
</div>

```python theme={null}
def handle_inference_error(error):
    error_str = str(error)
    
    if "401" in error_str:
        # 인증 실패
        raise ValueError("Check your API key and project configuration")
    elif "402" in error_str:
        # 크레딧 부족
        raise ValueError("Insufficient credits")
    elif "429" in error_str:
        # 요청 속도 제한 초과
        return "retry"
    elif "500" in error_str or "503" in error_str:
        # 서버 오류
        return "retry"
    else:
        # 알 수 없는 오류
        raise
```

<div id="set-appropriate-timeouts">
  ## 적절한 타임아웃 설정
</div>

사용 사례에 맞게 적절한 타임아웃을 설정하세요:

```python theme={null}
# 더 긴 응답을 위한 설정
client = openai.OpenAI(
    base_url='https://api.inference.wandb.ai/v1',
    api_key="your-api-key",
    timeout=60.0  # 60초 타임아웃
)
```

<div id="additional-tips">
  ## 추가 팁
</div>

* 디버깅할 수 있도록 오류를 타임스탬프와 함께 로깅합니다.
* 더 나은 동시성 처리를 위해 비동기 오퍼레이션을 사용합니다.
* 프로덕션 시스템에 서킷 브레이커를 구현합니다.
* 필요한 경우 응답을 캐시해 API 호출 수를 줄입니다.

***

<Badge stroke shape="pill" color="orange" size="md">[Inference](/ko/support/models/tags/inference)</Badge>