Managing Agent Lifecycles at Scale with Orleans Grains

Eric Brasher February 25, 2026 at 9:15 AM 6 min read

Every FabrCore agent is an Orleans grain — a lightweight virtual actor with its own identity, state, and single-threaded execution guarantee. This means agents activate on demand, deactivate when idle, survive failures, and scale across a cluster without any thread-safety gymnastics on your part. In this post we walk through the lifecycle methods that make it all work.

The Grain Foundation

At its core, FabrCoreAgentProxy is the base class every agent extends. It wires together the Orleans grain runtime, LLM chat clients, tool resolution, and inter-agent messaging behind a simple set of overridable methods. The constructor takes exactly three parameters — AgentConfiguration, IServiceProvider, and IFabrCoreAgentHost — and does nothing async. All setup happens later in the lifecycle.

C# — Minimal Agent Structure

using System.ComponentModel; // [Description], assuming the standard DescriptionAttribute
using FabrCore.Core;
using FabrCore.Sdk;
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;

[AgentAlias("support-agent")]
[Description("Handles customer support inquiries")]
[FabrCoreCapabilities("Lookup orders, check status, process returns.")]
public class SupportAgent : FabrCoreAgentProxy
{
    private AIAgent? _agent;
    private AgentSession? _session;

    public SupportAgent(
        AgentConfiguration config,
        IServiceProvider serviceProvider,
        IFabrCoreAgentHost fabrcoreAgentHost)
        : base(config, serviceProvider, fabrcoreAgentHost) { }

    public override async Task OnInitialize() { /* ... */ }
    public override async Task<AgentMessage> OnMessage(AgentMessage message) { /* ... */ }
    public override Task OnEvent(EventMessage eventMessage) { /* ... */ }
}

When a message targets "user1:support-agent", Orleans activates the grain if it is not already in memory. The grain stays resident until idle, at which point Orleans deactivates it — flushing custom state and chat history automatically. If the agent is needed again, it re-activates and OnInitialize runs once more. This activation/deactivation cycle is entirely transparent to calling code.

Lifecycle Methods in Detail

FabrCore exposes a clear set of lifecycle hooks. Understanding when each runs is key to building reliable agents.

Method	When It Runs	Purpose
`Constructor`	Grain activation	DI wiring only — no async work
`OnInitialize()`	Before first message or on reconfigure	Set up LLM client, resolve tools, create threads
`OnMessage(AgentMessage)`	Request or OneWay message received	Core message processing, return a response
`OnMessageBusy(AgentMessage)`	Message arrives while OnMessage is running	Handle concurrent messages (default: "busy" response)
`OnEvent(EventMessage)`	Fire-and-forget event	React to stream event notifications
`GetHealth(HealthDetailLevel)`	Health check request	Return custom health metrics

OnInitialize — Wiring Up the Agent

OnInitialize runs once per activation, before the first message is processed, and again whenever the agent is reconfigured. This is where you resolve tools from configured plugins, MCP servers, and standalone tool aliases, then create the chat client agent that connects to your LLM provider.

C# — OnInitialize with Tool Resolution

public override async Task OnInitialize()
{
    // Step 1: Resolve tools from plugins, standalone tools, and MCP servers
    var tools = await ResolveConfiguredToolsAsync();

    // Step 2: Add local tool methods defined in this class
    tools.Add(AIFunctionFactory.Create(LookupOrder));

    // Step 3: Create the chat client agent
    var result = await CreateChatClientAgent(
        chatClientConfigName: config.Models ?? "default",
        threadId: config.Handle ?? fabrcoreAgentHost.GetHandle(),
        tools: tools);

    _agent = result.Agent;
    _session = result.Session;
}

The call to ResolveConfiguredToolsAsync() is required before CreateChatClientAgent — tools are not auto-resolved. It discovers plugins via [PluginAlias], standalone tools via [ToolAlias], and connects any configured MCP servers.

OnMessage — Processing Requests

Every request or one-way message enters OnMessage. Orleans runs a grain's code one turn at a time, and a message that arrives while OnMessage is still in progress is routed to OnMessageBusy, so two OnMessage calls never run simultaneously on the same grain. The recommended pattern consumes the LLM response as a stream and accumulates it into the reply:

C# — Streaming OnMessage

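public override async Task<AgentMessage> OnMessage(AgentMessage message)
{
    // Create the reply for this message and wrap the incoming text as a user turn
    var response = message.Response();
    var chatMessage = new ChatMessage(ChatRole.User, message.Message);

    // Append each streamed chunk to the reply as it arrives
    await foreach (var update in _agent!.RunStreamingAsync(chatMessage, _session!))
    {
        response.Message += update.Text;
    }

    // The caller receives the fully accumulated text
    return response;
}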

Token counts are automatically captured and attached to the response Args (e.g., _tokens_input, _tokens_output). Chat history is auto-flushed after OnMessage completes, and compaction runs if the configured token threshold is exceeded.
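To make the token metadata concrete, here is a minimal sketch of reading those counts on the calling side. It assumes Args behaves like a string-keyed dictionary with string values, which this post does not confirm. ReadTokenCount, responseArgs, and the sample numbers are hypothetical stand-ins rather than FabrCore APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper: returns the token count stored under `key`,
// or 0 when the key is missing or the value is not a valid integer.
static int ReadTokenCount(IReadOnlyDictionary<string, string> metadata, string key) =>
    metadata.TryGetValue(key, out var raw) &&
    int.TryParse(raw, NumberStyles.Integer, CultureInfo.InvariantCulture, out var count)
        ? count
        : 0;

// Stand-in for response.Args (assumed shape; values are made up).
var responseArgs = new Dictionary<string, string>
{
    ["_tokens_input"] = "120",
    ["_tokens_output"] = "30",
};

int totalTokens = ReadTokenCount(responseArgs, "_tokens_input")
                + ReadTokenCount(responseArgs, "_tokens_output");

Console.WriteLine($"Total tokens: {totalTokens}"); // Total tokens: 150
```

Falling back to 0 keeps a cost report from throwing when a provider omits usage data, at the price of hiding the gap. Return a nullable int instead if you need to tell "missing" apart from "zero".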

OnEvent — Fire-and-Forget Events

Events use the EventMessage class (CloudEvents-inspired) and arrive via the AgentEvent stream. They are one-way — no response is expected.

C# — Handling Events

public override Task OnEvent(EventMessage eventMessage)
{
    switch (eventMessage.Type)
    {
        case "order.status-changed":
            logger.LogInformation("Order status changed: {Data}", eventMessage.Data);
            break;
    }
    return Task.CompletedTask;
}

Health Monitoring

Every agent grain exposes health information through GetHealth. The base implementation returns basic status, uptime, and message counts. Override it to surface domain-specific metrics:

C# — Custom Health Reporting

public override AgentHealthStatus GetHealth(HealthDetailLevel level)
{
    var health = base.GetHealth(level);

    if (level >= HealthDetailLevel.Detailed)
    {
        health = health with
        {
            // _agent is assigned at the end of OnInitialize, so null means setup has not finished
            Message = _agent is not null ? "Ready" : "Initializing"
        };
    }

    return health;
}

Health states include Healthy, Degraded, Unhealthy, and NotConfigured. The diagnostics API at /fabrcoreapi/diagnostics/agents aggregates health across the cluster, making it straightforward to monitor hundreds of agents from a single dashboard. At the Detailed level, health responses include agent type, uptime, messages processed, active timer count, and reminder count.
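As an illustration of the dashboard idea, the sketch below rolls per-agent states up into a single cluster summary. The payload shape of the diagnostics endpoint is not shown in this post, so the tuple list is made-up sample data. The severity ordering is also an assumption, not something FabrCore defines.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up sample data standing in for a parsed diagnostics response.
var agents = new List<(string Handle, string State)>
{
    ("user1:support-agent", "Healthy"),
    ("user2:support-agent", "Degraded"),
    ("user3:support-agent", "Healthy"),
    ("user4:support-agent", "NotConfigured"),
};

// Count how many agents are in each health state.
Dictionary<string, int> countsByState = agents
    .GroupBy(a => a.State)
    .ToDictionary(g => g.Key, g => g.Count());

// Assumed ordering, worst first; the cluster reports the worst state present.
string[] severity = { "Unhealthy", "Degraded", "NotConfigured", "Healthy" };
string clusterState =
    severity.FirstOrDefault(s => countsByState.ContainsKey(s)) ?? "Unknown";

Console.WriteLine($"Cluster: {clusterState}"); // Cluster: Degraded
foreach (var pair in countsByState)
{
    Console.WriteLine($"{pair.Key}: {pair.Value}");
}
```

Reporting the worst state present makes one degraded agent visible even among hundreds of healthy ones. The per-state counts show how widespread the problem is.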

Why Orleans Grains Matter for AI Agents

The grain model solves several problems that emerge when running AI agents at scale:

Isolation — Each agent grain runs single-threaded. No locks, no race conditions, no shared mutable state between agents. Custom state is persisted automatically on deactivation.
Location transparency — Agents can live on any silo in the cluster. Orleans routes messages to the correct machine transparently. Add silos to scale horizontally.
Automatic lifecycle — Grains activate on first message and deactivate when idle. No manual resource management. Chat history and custom state flush automatically.
Failure recovery — If a silo goes down, grains re-activate on a healthy silo. Persistent reminders survive restarts. Timers resume on activation.
Concurrent message handling — OnMessage is marked [AlwaysInterleave], allowing OnMessageBusy to handle messages that arrive while the agent is already processing. Stale message protection kicks in after 5 minutes.
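The isolation and concurrent-handling points can look contradictory, so here is a small simulation of how they might fit together. Interleaving lets a second call start while the first is paused at an await, and a flag sends that second call down the busy path. This is not FabrCore's implementation: Dispatch, the processing flag, and llmGate are hypothetical, and plain strings stand in for AgentMessage.

```csharp
using System;
using System.Threading.Tasks;

bool processing = false;                   // true while OnMessage is in flight
var llmGate = new TaskCompletionSource();  // stands in for a slow LLM call

async Task<string> OnMessage(string text)
{
    await llmGate.Task;                    // the call is paused here, not blocking a thread
    return $"done: {text}";
}

Task<string> OnMessageBusy(string text) =>
    Task.FromResult($"busy: {text}");

// Hypothetical dispatcher: the first arrival gets OnMessage, and anything that
// arrives before it finishes gets OnMessageBusy.
async Task<string> Dispatch(string text)
{
    if (processing)
    {
        return await OnMessageBusy(text);
    }

    processing = true;
    try
    {
        return await OnMessage(text);
    }
    finally
    {
        processing = false;
    }
}

Task<string> first = Dispatch("A");   // starts OnMessage, pauses at the gate
Task<string> second = Dispatch("B");  // arrives while A is still in flight
llmGate.SetResult();                  // the simulated LLM call completes

Console.WriteLine(await second); // busy: B
Console.WriteLine(await first);  // done: A
```

In a grain, the Orleans scheduler runs each of these turns one at a time, so a flag like this can be read and written without locks. The simulation relies on the same property by issuing both calls from a single thread.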

This combination means you can go from a single-server development setup to a multi-silo production cluster without changing your agent code. The lifecycle methods stay the same — Orleans and FabrCore handle the rest.

Built with FabrCore on .NET 10.

Eric Brasher

Builder of FabrCore and OpenCaddis.