Overview
The Core Engine is the single runtime that every LuMay agent runs on - whether it's a voice call, a CRM automation, a legal review, or a custom workflow. It handles the reasoning loop (receive input → understand intent → call tools → generate response) and ensures every agent inherits the platform's governance, connectors, and observability without any per-agent wiring.
The engine is provider-agnostic: you can swap between OpenAI, Anthropic, or any compatible API without changing the agent configuration, the conversation flow, or any downstream connector. This matters because model capabilities and pricing change rapidly - your investment in agent design should survive the model upgrade cycle.
Action branch (when tool call is emitted)
The green (core) nodes are the platform's protected value - every request routes through the same boundary.
How It Works - Request Flow
When an agent receives a request (a spoken sentence, a chat message, or an API call), the Core Engine executes this sequence:
- Context retrieval - relevant context is retrieved from the knowledge base via RAG (vector search against uploaded documents).
- LLM reasoning - the input, retrieved context, conversation history, and system prompt are sent to the configured LLM (OpenAI or Anthropic). The LLM produces either a final response or a tool call.
- Tool dispatch - if the LLM emits a tool call (e.g.
create_ticket), the MCPRegistry routes it to the correct connector instance by tool name and provider. - Connector execution - the connector authenticates (OAuth token or API key from the vault), calls the external API, and returns a normalised result.
- Response generation - the LLM receives the connector result and generates the final response.
- Output delivery - the response is returned as speech synthesis, chat text, or a structured API response depending on the channel.
- Trace persistence - the entire exchange is recorded as an OpenTelemetry span with a correlation ID, alongside the transcript, sentiment score, and outcome.
Key Features
| Feature | What it does |
|---|---|
| Provider abstraction | Swap LLM providers (OpenAI ↔ Anthropic) without changing the agent, flow, or connectors. |
| RAG retrieval | Vector search against documents uploaded to the knowledge-base vault. Retrieves relevant context before each LLM call. |
| Structured outputs | Forces the LLM to return typed JSON, enums, or field values - not free text - for reliable downstream processing. |
| Stateful memory | Conversation history persists across turns within a session; long-term context is managed via the call session store. |
| Tool registry (MCPRegistry) | Dynamic tool loading; LLM tool calls are routed to the correct connector instance by tool name and provider ID. |
| Cost controls | Token-budget limits and latency caps per agent prevent runaway costs in production. |
| Provider fallback | Roadmap: automatic failover to a secondary LLM provider if the primary is unavailable. |
Use Cases
- Voice agent - the Core Engine reasons about the caller's intent, looks up their order record in Salesforce via tool dispatch, and reads the status back to the caller in real time.
- CRM automation - the engine classifies an inbound support email, creates a Freshdesk ticket with the correct priority and category, and sends a confirmation email - all from a single reasoning loop.
- Legal review - the engine searches uploaded contracts via RAG, extracts key clauses using structured outputs, and flags compliance issues against a configurable ruleset.
Related
- Voice Engine - the voice-specific pipeline that sits on top of the Core Engine
- Connectors & Adapters - the systems the Core Engine dispatches tool calls to
- Architecture - where the Core Engine sits in the six-layer stack (Layer 02 · Intelligence)
- Platform overview - all four engines and four trust pillars