Orchestrator¶
Overview¶
The Orchestrator class serves as the central coordinator for LLM-powered Discord interactions. It implements a sophisticated Two-Phase Agent Architecture to ensure high-quality, information-rich responses. It manages the entire lifecycle: capturing incoming Discord messages, gathering multi-layered context, running information analysis, and generating the final conversational reply.
Architecture¶
Core Components¶
- ContextManager: Aggregates context from multiple memory providers (Short-term, Procedural, Episodic, Knowledge).
- ModelManager: Handles model selection priorities for both analysis and generation phases.
- ToolsFactory: Dynamically discovers and filters tools based on user permissions and agent mode.
- Circuit Breaker: Tracks model health and automatically skips failing providers to ensure system resilience.
- ProtectedPromptManager: Enforces immutable system rules (like output formatting) while allowing personality customization.
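The five components above suggest a compositional design. The sketch below shows one way the Orchestrator might hold them; the component names come from this list, but the constructor shape and `dataclass` layout are illustrative assumptions, not the project's actual code.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for the components listed above; the names match
# the docs, but the empty bodies and signatures are illustrative only.
class ContextManager: ...
class ModelManager: ...
class ToolsFactory: ...
class CircuitBreaker: ...
class ProtectedPromptManager: ...

@dataclass
class OrchestratorSketch:
    """Illustrates how the Orchestrator might compose its core components."""
    context_manager: ContextManager = field(default_factory=ContextManager)
    model_manager: ModelManager = field(default_factory=ModelManager)
    tools_factory: ToolsFactory = field(default_factory=ToolsFactory)
    circuit_breaker: CircuitBreaker = field(default_factory=CircuitBreaker)
    prompts: ProtectedPromptManager = field(default_factory=ProtectedPromptManager)
```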
Two-Phase Agent System¶
```mermaid
graph TD
    A[Discord Message] --> B[ContextManager]
    B --> C[Info Agent]
    C --> D[Message Agent]
    D --> E[Response Generation]
    E --> F[Discord Response]
    B --> G[Tool Selection]
    G --> H[Tool Execution]
    H --> C
    C --> I[Analysis Output]
    I --> D
```
Two-Phase Processing Flow¶
Phase 1: Information Agent (Info Agent)¶
Purpose: Analyze user intent and extract required information using specialized tools.
- Context Injection: Injects procedural context (user bio, server rules) and short-term memory directly into the message list.
- Tool Access: Accesses "Info" mode tools (Search, Memory Retrieval, Activity Stats).
- Sanitization: Specifically handles Gemini 3.x and Ollama requirements (e.g., converting past tool calls to text to prevent 400 errors).
- Circuit Breaker: Attempts to run the preferred model; if it fails, the circuit breaker triggers an immediate fallback to the next available provider.
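The sanitization step above (converting past tool calls to plain text for providers that reject structured tool history) can be sketched as follows. The dict-based message shape and field names (`role`, `tool_calls`, `name`) are assumptions for illustration, not the project's real message types.

```python
def sanitize_history(messages: list[dict]) -> list[dict]:
    """Convert past tool-call messages into plain text so providers that
    reject structured tool history don't return 400 errors.
    Message field names here are illustrative."""
    sanitized = []
    for msg in messages:
        if msg.get("role") == "tool":
            # Replace a structured tool result with a plain-text summary.
            sanitized.append({
                "role": "user",
                "content": f"[tool:{msg.get('name', 'unknown')}] {msg.get('content', '')}",
            })
        elif msg.get("tool_calls"):
            # Flatten an assistant message that requested tool calls.
            calls = ", ".join(c["name"] for c in msg["tool_calls"])
            text = f"(called tools: {calls}) {msg.get('content') or ''}".strip()
            sanitized.append({"role": "assistant", "content": text})
        else:
            sanitized.append(msg)
    return sanitized
```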
Phase 2: Message Agent (Generation Agent)¶
Purpose: Formulate the final conversational response based on the Info Agent's analysis.
- Analysis Input: Receives the raw output and tool results from Phase 1.
- Protected Prompts: Uses `ProtectedPromptManager` to ensure the bot follows formatting rules (like using `<som>` and `<eom>` tags) regardless of personality settings.
- Streaming Fallback: Since LangChain's standard middleware doesn't support streaming fallback, the Orchestrator implements a manual fallback loop to ensure the user always receives a response.
- Reasoning Optimization: Automatically injects thought-budget prompts for reasoning-capable models (e.g., DeepSeek R1, Gemma, Ollama reasoning models).
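The manual streaming fallback described above can be sketched as a simple loop over models in priority order. The `model.stream(prompt)` async-generator interface is an assumption for illustration, not the project's real model API.

```python
import asyncio

async def stream_with_fallback(models, prompt, send_chunk):
    """Manual fallback loop: try each model in priority order; if streaming
    fails (at the start or mid-stream), move on to the next provider so the
    user always receives a response."""
    last_error = None
    for model in models:
        try:
            async for chunk in model.stream(prompt):
                await send_chunk(chunk)
            return True  # one model completed the stream
        except Exception as err:  # e.g. rate limit, timeout, 400 error
            last_error = err
            continue  # skip to the next provider
    raise RuntimeError("all providers failed") from last_error
```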
Class Reference¶
Orchestrator¶
Constructor¶
```python
def __init__(self, bot: Any)
```
Initializes model manager, context manager, and sets up memory providers (Short-term, Procedural, Episodic, Knowledge) using the bot's resources.
Main Entry Point¶
```python
async def handle_message(self, bot: Any, message_edit: Message, message: Message, logger: Any) -> OrchestratorResponse
```

Processes a Discord message through the two-phase pipeline. Supports streaming updates to the `message_edit` target.
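Stripped of streaming and tooling, the two-phase pipeline inside `handle_message` reduces to "analyze, then generate". The sketch below is a simplified illustration under that assumption; the agent callables are hypothetical stand-ins.

```python
import asyncio

async def handle_message_sketch(message_text: str, info_agent, message_agent) -> str:
    """Simplified two-phase flow: Phase 1 (Info Agent) analyzes intent,
    Phase 2 (Message Agent) turns that analysis into the final reply.
    Both agents are illustrative async callables, not the real interfaces."""
    analysis = await info_agent(message_text)            # Phase 1
    reply = await message_agent(message_text, analysis)  # Phase 2
    return reply
```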
Key Features¶
🖼️ Image Caching¶
The orchestrator maintains an `image_cache` during a single message cycle. If multiple fallback models are tried, it avoids redundant downloads of the same image attachments, improving speed and reducing bandwidth.
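The caching behaviour amounts to "each attachment URL is downloaded at most once per cycle". A minimal sketch, with the downloader injected so the example stays self-contained (the real cache's interface is not documented here):

```python
class ImageCache:
    """Per-message-cycle cache: each attachment URL is fetched at most once,
    even when several fallback models process the same message."""
    def __init__(self, fetch):
        self._fetch = fetch          # injected downloader, e.g. an HTTP GET
        self._cache: dict[str, bytes] = {}

    def get(self, url: str) -> bytes:
        if url not in self._cache:
            self._cache[url] = self._fetch(url)  # download only on first use
        return self._cache[url]
```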
🛡️ Fault Tolerance¶
- Circuit Breaker: Automatically "opens" (skips) models that have recently reached rate limits or returned errors.
- Resilient Context: If memory providers fail, the bot continues with an empty context rather than crashing.
- Manual Fallback: Guarantees a response even during provider outages.
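The "open on recent failure" behaviour of the circuit breaker can be sketched with a per-model failure timestamp. The cooldown value and method names below are assumptions for illustration, not the project's real thresholds or API.

```python
import time

class SimpleCircuitBreaker:
    """Minimal sketch: a model that failed within `cooldown` seconds is
    considered 'open' and skipped; after the cooldown it may be retried."""
    def __init__(self, cooldown: float = 60.0):
        self.cooldown = cooldown
        self._last_failure: dict[str, float] = {}

    def record_failure(self, model: str) -> None:
        # e.g. called when a provider rate-limits or returns an error
        self._last_failure[model] = time.monotonic()

    def is_open(self, model: str) -> bool:
        last = self._last_failure.get(model)
        return last is not None and (time.monotonic() - last) < self.cooldown
```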
🧠 Model Optimization¶
- KV Cache Reuse: Prompts are ordered (Static System Prompt -> Dynamic User Context) to maximize Key-Value cache efficiency on inference providers.
- Thought Control: Injects `reasoning_optimization_prompt` for models known to support chain-of-thought processing.
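The KV-cache point above hinges on keeping the unchanging prefix first: providers can reuse cached attention states for an identical leading span and recompute only the dynamic tail. A sketch of that ordering, using a generic chat-message shape (an assumption, not the project's real prompt builder):

```python
def build_prompt(static_system: str, dynamic_context: str, user_message: str) -> list[dict]:
    """Order messages so the static system prompt leads: inference providers
    can then reuse the KV cache for that unchanging prefix across requests,
    paying recomputation cost only for the per-user tail."""
    return [
        {"role": "system", "content": static_system},    # stable -> cacheable prefix
        {"role": "system", "content": dynamic_context},  # changes per user/request
        {"role": "user", "content": user_message},
    ]
```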
Integration Points¶
- UserDataCog: Source of user preferences and memory management.
- LanguageManager: Handles dynamic translation of system status messages ("Analyzing...", "Thinking...").
- DirectToolOutputMiddleware: Forces the Info Agent to return results immediately after a tool call, preventing infinite loops.
The Orchestrator is designed to be the "brain" of the bot, abstracting away the complexity of model selection and context assembly from the UI layers.