
Persistence

FabrCore provides built-in persistence through Orleans grain state, enabling agents to maintain conversation history and custom data across activations, cluster restarts, and silo migrations.

FabrCoreChatHistoryProvider

FabrCoreChatHistoryProvider extends Microsoft's ChatMessageStore with Orleans-backed persistence. It's automatically wired when using CreateChatClientAgent.

Buffered Writes

Messages are held in memory until flush — fast, no I/O per message.

Lazy Loading

History is loaded from Orleans state on first access, not on initialization.

Automatic Persistence (Recommended)

PersistentAgent.cs
[AgentAlias("PersistentAgent")]
public class PersistentAgent : FabrCoreAgentProxy
{
    private AIAgent? agent;
    private AgentThread? thread;

    public override async Task OnInitialize()
    {
        (agent, thread) = await CreateChatClientAgent(
            "default",
            threadId: config.Handle);
    }

    public override async Task<AgentMessage> OnMessage(AgentMessage message)
    {
        var response = message.Response();
        var result = await agent!.RunAsync(message.Message, thread);
        response.Message = result.Text;
        // No manual FlushAsync() needed: the grain auto-flushes after OnMessage
        return response;
    }
}
Auto-Flush

After each OnMessage completes, the Orleans grain automatically calls FlushAsync() on all tracked stores. On grain deactivation, any remaining pending messages are flushed to ensure no data loss.

Thread ID Strategies

| Strategy | Thread ID | Use Case |
|---|---|---|
| One per agent | config.Handle | Personal assistant, single-user agents |
| Per-user | message.FromHandle | Multi-user agents |
| Per-channel | message.Channel ?? "default" | Channel-based chat |
| Combined | $"{fromHandle}:{channel}" | User-specific within channels |
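For example, a per-user strategy passes the sender's handle as the thread ID so each user gets an isolated conversation history. This is a sketch only; it assumes CreateChatClientAgent can be called from OnMessage and accepts an arbitrary string as threadId, mirroring the overload shown above:

```csharp
// Per-user threads: each sender gets its own persisted history.
// Sketch - assumes the CreateChatClientAgent overload shown earlier.
public override async Task<AgentMessage> OnMessage(AgentMessage message)
{
    var (agent, thread) = await CreateChatClientAgent(
        "default",
        threadId: message.FromHandle);

    var response = message.Response();
    var result = await agent.RunAsync(message.Message, thread);
    response.Message = result.Text;
    return response;
}
```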

API Reference

| Method | Description |
|---|---|
| AddMessagesAsync(messages) | Add messages to the in-memory buffer (fast, no I/O) |
| GetMessagesAsync() | Get all messages (persisted + pending); lazy-loads from Orleans |
| FlushAsync() | Persist pending messages to Orleans grain state |
| HasPendingMessages | Returns true if there are unsaved messages |
| ThreadId | The unique thread identifier |
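When you need direct control over the store, the methods above combine as in this sketch. How you obtain the provider instance depends on your setup; the ChatMessage and ChatRole types here are from Microsoft.Extensions.AI, which the store builds on:

```csharp
// Sketch: inspecting and flushing a store directly.
public async Task InspectHistoryAsync(FabrCoreChatHistoryProvider store)
{
    // Lazy load: the first call pulls persisted messages from Orleans state.
    var messages = await store.GetMessagesAsync();
    logger.LogInformation("Thread {ThreadId} has {Count} messages",
        store.ThreadId, messages.Count());

    // Buffered write: no I/O happens here...
    await store.AddMessagesAsync(
        new[] { new ChatMessage(ChatRole.System, "note") });

    // ...until the pending buffer is flushed to grain state.
    if (store.HasPendingMessages)
        await store.FlushAsync();
}
```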

Custom State

Persist arbitrary typed data (user info, counters, preferences) that survives grain deactivation:

| Method | Description |
|---|---|
| GetStateAsync<T>(key) | Get a strongly-typed value by key |
| GetStateOrCreateAsync<T>(key, factory) | Get existing value or create with factory |
| HasStateAsync(key) | Check if a key exists |
| SetState<T>(key, value) | Set a value (buffered until flush) |
| RemoveState(key) | Remove a key (buffered until flush) |
| FlushStateAsync() | Persist all pending changes to Orleans |
Custom State Example
public override async Task<AgentMessage> OnMessage(AgentMessage message)
{
    var response = message.Response();

    // Get or create pattern
    var stats = await GetStateOrCreateAsync(
        "stats", () => new ConversationStats());
    stats.MessageCount++;
    stats.LastMessage = DateTime.UtcNow;
    SetState("stats", stats);

    // Process message...
    var result = await agent!.RunAsync(message.Message, thread);
    response.Message = result.Text;

    await FlushStateAsync();
    return response;
}
Best Practice

Group related state updates and call FlushStateAsync() once at the end of OnMessage. On grain deactivation, pending changes are auto-flushed to prevent data loss.

Message Compaction

Long conversations fill the model's context window. Compaction automatically summarizes older messages when the conversation grows too large.

How It Works

Message History → Token Estimate → Threshold Check → LLM Summarization → Compacted History
  1. Estimate — Token count of all stored messages is estimated before each message
  2. Threshold — If the estimate exceeds the configured threshold (default 75%), compaction triggers
  3. Summarize — Older messages are sent to the LLM for summarization, preserving key decisions, facts, and context
  4. Replace — Older messages are replaced with a single [Compacted History] summary. The most recent messages (default 20) are always kept intact

Configuration

| Setting | Default | Description |
|---|---|---|
| CompactionEnabled | true | Enable or disable compaction |
| CompactionKeepLastN | 20 | Recent messages to always preserve |
| CompactionThreshold | 0.75 | Trigger at this ratio of the context window |
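In fabrcore.json, these settings might look as follows. The setting names come from the table above, but the section layout is illustrative; check your version's schema for the exact nesting:

```json
{
  "Agents": {
    "PersistentAgent": {
      "CompactionEnabled": true,
      "CompactionKeepLastN": 20,
      "CompactionThreshold": 0.75
    }
  }
}
```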

Using TryCompactAsync

Call TryCompactAsync() before model invocations to ensure the history fits within the context window:

Compaction Before Model Invocation
public override async Task<AgentMessage> OnMessage(AgentMessage message)
{
    // Compact before invoking the model
    var compaction = await TryCompactAsync();
    if (compaction?.WasCompacted == true)
    {
        logger.LogInformation(
            "Compacted: {Original} -> {Compacted} messages",
            compaction.OriginalMessageCount,
            compaction.CompactedMessageCount);
    }

    // Now invoke the model with compacted history
    var response = await agent!.RunAsync(message.Message, thread);
    // ...
}
Why Explicit Compaction?

Compaction invokes an LLM call for summarization, which adds latency. Explicit TryCompactAsync() calls give you control over when this happens — typically before model invocations, not during message addition.

Requires ContextWindowTokens

Compaction requires ContextWindowTokens set on your model in fabrcore.json. Without it, the agent can't determine when the context window is filling up.
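A model entry with ContextWindowTokens set might look like this. ContextWindowTokens is the property named above; the surrounding section layout and the 128000 value are illustrative assumptions:

```json
{
  "Models": {
    "default": {
      "ContextWindowTokens": 128000
    }
  }
}
```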

Orleans Storage Providers

Message threads and custom state are stored in Orleans grain state. Supported providers:

  • In-memory — Development only, data lost on restart
  • SQL Server — Production, durable
  • Azure Table Storage — Cloud-native
  • ADO.NET providers — PostgreSQL, MySQL, etc.
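Selecting a provider is standard Orleans silo configuration. The calls below are real Orleans extension methods, but the storage provider name "Default" that FabrCore expects is an assumption; substitute whatever name your deployment uses:

```csharp
// Sketch: configuring Orleans grain storage for the silo.
var host = Host.CreateDefaultBuilder(args)
    .UseOrleans(silo =>
    {
        // Development only: volatile in-memory storage, lost on restart.
        silo.AddMemoryGrainStorage("Default");

        // Production alternatives (pick one):
        // silo.AddAdoNetGrainStorage("Default", options =>
        // {
        //     options.Invariant = "Npgsql";        // PostgreSQL via ADO.NET
        //     options.ConnectionString = "...";
        // });
        // silo.AddAzureTableGrainStorage("Default", options =>
        //     options.ConfigureTableServiceClient("..."));
    })
    .Build();
```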