Model Circuit Breaker¶

Overview¶

The ModelCircuitBreaker module implements a fault-tolerance pattern for handling LLM provider failures. It tracks model failures in real time and temporarily "trips" (disables) models that fail consistently, so the bot does not waste API quota or add latency through retry attempts that are bound to fail.

Core Concepts¶

Error Categorization¶

The circuit breaker classifies exceptions into different categories, each with a specific cooldown strategy:

Category	Typical Cause	Cooldown Period
`QUOTA_EXHAUSTED`	429 Errors / Resource Exhausted	12 Hours
`MODEL_NOT_FOUND`	Incorrect model names / Deprecated models	1 Hour
`RATE_LIMITED`	Transient high-frequency usage	30 Seconds
`AUTHENTICATION`	Invalid API Keys / Permissions	2 Hours
`TRANSIENT`	Network timeouts / Connection issues	10 Seconds
`UNKNOWN`	Unexpected server-side errors	1 Minute
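
The table above can be read as a lookup from category to base cooldown. The following is a minimal sketch, not the module's actual code: the `ErrorCategory` enum, the `BASE_COOLDOWN_SECONDS` mapping, and the `classify_error` function with its keyword heuristics are hypothetical names chosen for illustration. Only the category names and durations come from the table.

```python
from enum import Enum


class ErrorCategory(Enum):
    QUOTA_EXHAUSTED = "quota_exhausted"
    MODEL_NOT_FOUND = "model_not_found"
    RATE_LIMITED = "rate_limited"
    AUTHENTICATION = "authentication"
    TRANSIENT = "transient"
    UNKNOWN = "unknown"


# Base cooldowns from the table, expressed in seconds.
BASE_COOLDOWN_SECONDS = {
    ErrorCategory.QUOTA_EXHAUSTED: 12 * 60 * 60,
    ErrorCategory.MODEL_NOT_FOUND: 60 * 60,
    ErrorCategory.RATE_LIMITED: 30,
    ErrorCategory.AUTHENTICATION: 2 * 60 * 60,
    ErrorCategory.TRANSIENT: 10,
    ErrorCategory.UNKNOWN: 60,
}


def classify_error(error: Exception) -> ErrorCategory:
    """Assumed heuristic: map an exception to a category by type and message."""
    # Built-in network-style exceptions are treated as transient.
    if isinstance(error, (TimeoutError, ConnectionError)):
        return ErrorCategory.TRANSIENT
    message = str(error).lower()
    if "quota" in message or "resource exhausted" in message:
        return ErrorCategory.QUOTA_EXHAUSTED
    if "rate limit" in message:
        return ErrorCategory.RATE_LIMITED
    if "not found" in message:
        return ErrorCategory.MODEL_NOT_FOUND
    if "api key" in message or "permission" in message:
        return ErrorCategory.AUTHENTICATION
    # Anything unrecognized falls back to the 1-minute cooldown.
    return ErrorCategory.UNKNOWN
```

A real classifier would more likely inspect provider-specific exception types or status codes than message text; the string matching here only keeps the sketch self-contained.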

How it Works¶

Pre-Check: Before calling an LLM, the ModelManager calls is_available(model_name) and skips the model if it returns False.
Failure Recording: If a model call fails, the caller passes the exception to record_failure(model_name, error), and the circuit breaker classifies it into one of the categories above.
Cooldown: The model's circuit is marked as "open" (unavailable) for the cooldown period of that category. Consecutive failures lengthen the cooldown exponentially, capped at 4x the base cooldown.
Reset: Once the cooldown period expires, the circuit resets and the model can be tried again.
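
The steps above can be sketched as a small class. This is an illustration under stated assumptions, not the module's implementation: `SimpleCircuitBreaker`, its injectable `clock`, the single `base_cooldown` (the real module picks one per error category), the doubling factor, and the `record_success` method are all hypothetical. The source only states that backoff is exponential and capped at 4x.

```python
import time

MAX_BACKOFF_MULTIPLIER = 4  # cap from the text: at most 4x the base cooldown


class SimpleCircuitBreaker:
    def __init__(self, base_cooldown: float, clock=time.monotonic):
        self._base_cooldown = base_cooldown
        self._clock = clock
        self._failures = {}    # model_name -> consecutive failure count
        self._open_until = {}  # model_name -> clock value when the cooldown ends

    def is_available(self, model_name: str) -> bool:
        open_until = self._open_until.get(model_name)
        if open_until is None:
            return True
        if self._clock() >= open_until:
            # Cooldown expired: close the circuit so the model can be retried.
            del self._open_until[model_name]
            return True
        return False

    def record_failure(self, model_name: str, error: Exception) -> float:
        # `error` is unused here; the real module would classify it
        # to choose the base cooldown.
        count = self._failures.get(model_name, 0) + 1
        self._failures[model_name] = count
        # Assumed doubling per consecutive failure: 1x, 2x, 4x, then capped.
        multiplier = min(2 ** (count - 1), MAX_BACKOFF_MULTIPLIER)
        cooldown = self._base_cooldown * multiplier
        self._open_until[model_name] = self._clock() + cooldown
        return cooldown

    def record_success(self, model_name: str) -> None:
        # Hypothetical: a successful call clears the consecutive-failure count.
        self._failures.pop(model_name, None)
```

The sketch keeps the failure count after a cooldown expires, so a model that fails again on its first retry backs off longer. How the real module resets that count is not stated in the source.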

Benefits¶

Quota Preservation: Stops calling models that have already reported quota exhaustion.
Improved UX: Automatically skips "dead" models and proceeds to fallbacks instantly, rather than waiting for multiple timeouts.
Self-Healing: Models are automatically reintroduced once their recovery period (or rate-limit window) is likely to have passed.
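
Skipping "dead" models in favor of fallbacks might look like the following sketch. The `pick_available_model` helper and the `Breaker` protocol are hypothetical names for illustration; the only thing assumed about the circuit breaker is the is_available(model_name) method described above.

```python
from typing import Iterable, Optional, Protocol


class Breaker(Protocol):
    def is_available(self, model_name: str) -> bool: ...


def pick_available_model(breaker: Breaker, candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate whose circuit is closed, or None if all are open."""
    for name in candidates:
        # No network call is made here, so open circuits are skipped immediately.
        if breaker.is_available(name):
            return name
    return None
```

Because the check is a local lookup, a model in cooldown costs nothing to skip, which is where the latency benefit comes from.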

Usage Example¶

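from llm.model_circuit_breaker import get_model_circuit_breaker

cb = get_model_circuit_breaker()
MODEL_NAME = "gemini-1.5-pro"

async def invoke_with_breaker(model, prompt):
    # Skip the call entirely while the circuit is open (model in cooldown).
    if not cb.is_available(MODEL_NAME):
        return None
    try:
        # `await` is only valid inside an async function.
        return await model.invoke(prompt)
    except Exception as e:
        # Record the failure so the breaker can classify it and start a cooldown.
        cb.record_failure(MODEL_NAME, e)
        return None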

get_model_circuit_breaker() returns a singleton instance, so failure tracking stays consistent across the entire bot process.