@alyx-learbott

Auto-Recall

Automatically injects relevant memory context before each prompt

Current version
v1.0.0
code-plugin · Community · source-linked

openclaw-auto-recall

openclaw-auto-recall automatically searches the agent's memory before every prompt and injects relevant context without the use of a second LLM, a separate database, or any external service. The agent doesn't miss important information just because it didn't think to call memory_search. The primary benefits are:

  • Consistency — relevant context is always considered, not just when the agent remembers to search
  • Speed — ~100ms per turn via direct search, no LLM round-trip
  • Token efficiency — only the injected context tokens are added (capped at 768), not a full second model call for extraction
  • Simplicity — no new infrastructure, no configuration beyond installing the plugin
  • Transparency — injected context is tagged with [auto-recall] so it's visible in the prompt
  • Safety — if search fails or times out, the model proceeds normally with no interruption
  • Non-destructive — purely additive; install or remove without changing any underlying configuration, and disabling returns the system to its exact baseline state

Why

AI agents sometimes forget to search their own memory. When they don't call memory_search before answering, they can miss context that would have improved the accuracy or relevance of the response. Memory systems that work well at scale typically inject context before the prompt instead of relying on the agent to ask for it. openclaw-auto-recall closes that gap for OpenClaw by reusing the existing memory search functionality, with no additional infrastructure, no second LLM call, and no changes to the agent's behavior.

How it works

  1. Hooks into before_prompt_build — fires on every user message, before the model sees it
  2. Calls the memory search engine directly via getActiveMemorySearchManager — no subagent, no model round-trip. This searches the same local vector index that the memory_search tool uses: all indexed files (MEMORY.md, daily notes, wiki pages, session transcripts, and any other paths configured in memorySearch.paths).
  3. Formats relevant results and returns them as prependContext, wrapped in [auto-recall]...[/auto-recall] tags
  4. If search fails, times out, or finds nothing relevant, the model proceeds exactly as it would without the plugin

User message → before_prompt_build → memory_search(query) → prependContext → model sees enriched prompt

Latency: ~100ms per turn. Token cost: only the injected context (capped at 768 tokens).

User message → before_prompt_build → shouldSkip? → yes → normal prompt
                                           ↓ no
                                   search memory with user message
                                           ↓
                                   penalize draft paths
                                           ↓
                                   format within token budget
                                           ↓
                                   prependContext → enriched prompt

If search fails, times out, or finds nothing above the minimum score, the model receives the original prompt unchanged — openclaw-auto-recall never blocks a turn.
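The fail-open behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the plugin's actual source: `autoRecall`, `withTimeout`, and the `timeoutMs` option are hypothetical names, the `{ path, score, snippet }` result shape is assumed, and `manager` stands in for whatever getActiveMemorySearchManager returns. Skip checks and the draft penalty are left out to keep the sketch short.

```javascript
// Rejects if the wrapped promise does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("search timed out")), ms);
  });
  // Always clear the timer so a fast search does not leave it pending.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// `manager` is any object exposing an async search(query, options) method.
// ASSUMPTION: the real manager's signature and result shape may differ.
async function autoRecall(message, manager, { minScore = 0.5, timeoutMs = 500 } = {}) {
  try {
    const results = await withTimeout(manager.search(message, {}), timeoutMs);
    const relevant = results.filter((r) => r.score >= minScore);
    if (relevant.length === 0) return {}; // nothing relevant: prompt unchanged
    const body = relevant.map((r) => `- ${r.path}: ${r.snippet}`).join("\n");
    return { prependContext: `[auto-recall]\n${body}\n[/auto-recall]` };
  } catch {
    return {}; // fail open: a broken or slow search never blocks the turn
  }
}
```

An empty object here stands for "inject nothing", so the three failure paths (error, timeout, no relevant hits) all collapse to the same outcome as not having the plugin installed.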

Architecture decisions

Why not spawn a subagent?

Spawning a subagent adds latency, tokens, and cost. For example, OpenClaw's built-in active-memory plugin spawns a full subagent model call on each turn, which adds a complete LLM round-trip. openclaw-auto-recall instead calls manager.search() directly. That call is a local embedding similarity search against the SQLite index, and it returns the same results in ~100ms instead of seconds, with fewer tokens and lower cost.

Why not registerMemoryPromptSupplement?

That API exists for adding static content to the memory section of the system prompt (like a permanent note). openclaw-auto-recall needs dynamic, per-turn context based on the current message. prependContext from before_prompt_build is the right tool — it's injected per-turn, not baked into the system prompt.

Why the skip conditions?

Short messages, heartbeats, slash commands, and NO_REPLY signals don't benefit from memory search. Running search on them wastes ~100ms and potentially injects noise. The skip list is conservative and easy to extend.

Why the draft path penalty?

Session drafts (memory/drafts/) contain raw tool output — shell commands, API responses, debug logs. These rank well on keyword match but are noisy and unhelpful as injected context. openclaw-auto-recall fetches twice the requested results, applies a 0.15 relevance penalty to draft paths, then takes the top results by adjusted score. This lets curated content (wiki, daily notes, MEMORY.md) rank above raw transcripts without completely excluding drafts when they're genuinely the best match.
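A minimal sketch of that re-ranking step, assuming the penalty is subtracted from the raw score and that results look like `{ path, score }` (both are assumptions; only the constant names and the 0.15 value come from this page). `isDraftPath` and `rerankWithDraftPenalty` are hypothetical helper names.

```javascript
const DRAFT_PATH_PENALTY = 0.15;
const DRAFT_PATH_PATTERNS = [/memory\/drafts\//, /memory\/draft-.*\.md$/];

function isDraftPath(path) {
  return DRAFT_PATH_PATTERNS.some((pattern) => pattern.test(path));
}

// `candidates` is the over-fetched list (the caller would request
// maxResults * 2 hits); the top `maxResults` by adjusted score are returned.
function rerankWithDraftPenalty(candidates, maxResults) {
  return candidates
    .map((r) => ({
      ...r,
      // ASSUMPTION: the penalty is a flat subtraction from the raw score.
      adjustedScore: isDraftPath(r.path) ? r.score - DRAFT_PATH_PENALTY : r.score,
    }))
    .sort((a, b) => b.adjustedScore - a.adjustedScore)
    .slice(0, maxResults);
}

const ranked = rerankWithDraftPenalty(
  [
    { path: "memory/drafts/session-1.md", score: 0.8 },
    { path: "wiki/deploy.md", score: 0.72 },
    { path: "memory/2026-06-01.md", score: 0.6 },
    { path: "memory/drafts/other.md", score: 0.9 },
  ],
  2
);
console.log(ranked.map((r) => r.path));
```

In this example the 0.80 draft drops below the 0.72 wiki page once penalized, while the 0.90 draft still makes the cut, which matches the stated goal of downranking drafts without excluding them.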

Why [auto-recall] tags?

Two reasons: visibility and feedback-loop prevention. The tags make it obvious which context was injected and which text the user actually wrote. In the future, they could also be used to strip injected context before re-indexing, so that recalled text is not indexed again and surfaced as if it were new memory (the echo chamber effect).
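To make the tagging idea concrete, here is a small sketch of wrapping context and of the possible future stripping step. `wrapAutoRecall` and `stripAutoRecall` are hypothetical helpers written for illustration; the plugin may format its blocks differently, and stripping is described above only as a future possibility.

```javascript
const OPEN_TAG = "[auto-recall]";
const CLOSE_TAG = "[/auto-recall]";

// Wraps injected context so it is visibly distinct from user text.
function wrapAutoRecall(body) {
  return `${OPEN_TAG}\n${body}\n${CLOSE_TAG}`;
}

// Removes every tagged block. The non-greedy match keeps separate blocks
// from being merged into one deletion that swallows text between them.
function stripAutoRecall(text) {
  return text.replace(/\[auto-recall\][\s\S]*?\[\/auto-recall\]\n?/g, "");
}

const prompt = wrapAutoRecall("- wiki/deploy.md: use blue/green") + "\nHow do we deploy?";
console.log(stripAutoRecall(prompt));
```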

Installation

openclaw plugins install --link /path/to/openclaw-auto-recall
openclaw gateway restart

Or from ClawHub (when published):

openclaw plugins install clawhub:openclaw-auto-recall
openclaw gateway restart

Configuration

{
  "plugins": {
    "entries": {
      "openclaw-auto-recall": {
        "enabled": true,
        "config": {
          "maxResults": 5,
          "minScore": 0.5,
          "maxTokens": 768
        }
      }
    }
  }
}

| Setting | Default | Description |
| --- | --- | --- |
| maxResults | 5 | Maximum search results to inject |
| minScore | 0.5 | Minimum relevance score (0 to 1) to include. Lower values return more results with more noise; higher values return fewer but more relevant results. |
| maxTokens | 768 | Maximum estimated tokens for injected context |
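The sketch below shows one way the three settings could interact when formatting results. It is an illustration only: `resolveConfig`, `estimateTokens`, and `formatWithinBudget` are hypothetical names, the 4-characters-per-token estimate is an assumed heuristic, and the `- path: snippet` line format is invented for the example.

```javascript
const DEFAULTS = { maxResults: 5, minScore: 0.5, maxTokens: 768 };

// User-supplied values override the documented defaults.
function resolveConfig(userConfig = {}) {
  return { ...DEFAULTS, ...userConfig };
}

// ASSUMPTION: roughly 4 characters per token; the plugin's estimator may differ.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// `results` are assumed to be sorted by score, shaped { path, score, snippet }.
function formatWithinBudget(results, userConfig) {
  const { maxResults, minScore, maxTokens } = resolveConfig(userConfig);
  const lines = [];
  let used = 0;
  for (const r of results) {
    if (lines.length >= maxResults) break;  // result cap reached
    if (r.score < minScore) continue;       // below the relevance floor
    const line = `- ${r.path}: ${r.snippet}`;
    const cost = estimateTokens(line);
    if (used + cost > maxTokens) break;     // would exceed the token budget
    lines.push(line);
    used += cost;
  }
  return lines.join("\n");
}
```

Stopping at the first result that would overflow the budget, instead of truncating it, is a design choice made for this sketch; it keeps every injected entry whole.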

Skip conditions

openclaw-auto-recall skips injection for:

  • Messages shorter than 10 characters
  • HEARTBEAT_OK responses
  • NO_REPLY signals
  • Slash commands (/status, /model, /reset, /new, /approve, /stop)
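The list above could be expressed as a short predicate like the one below. This is a sketch under assumptions: the plugin's real SKIP_PATTERNS in src/index.js may be structured differently, `MIN_MESSAGE_LENGTH` and `shouldSkip` are names chosen for the example, and exact-match handling of HEARTBEAT_OK and NO_REPLY is a guess.

```javascript
const MIN_MESSAGE_LENGTH = 10;

// Hypothetical patterns modeled on the documented skip conditions.
const SKIP_PATTERNS = [
  /^HEARTBEAT_OK$/,                             // heartbeat responses
  /^NO_REPLY$/,                                 // explicit no-reply signals
  /^\/(status|model|reset|new|approve|stop)\b/, // slash commands
];

function shouldSkip(message) {
  if (typeof message !== "string") return true;      // nothing searchable
  const text = message.trim();
  if (text.length < MIN_MESSAGE_LENGTH) return true; // too short to search on
  return SKIP_PATTERNS.some((pattern) => pattern.test(text));
}

console.log(shouldSkip("/model some-model-name"));           // slash command
console.log(shouldSkip("What did we decide about deploys?")); // normal message
```

Deployment-specific bot commands or signals can be covered by appending more regular expressions to the array.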

Comparison

| Dimension | Honcho | Hindsight/Vectorize | active-memory (built-in) | openclaw-auto-recall |
| --- | --- | --- | --- | --- |
| Infrastructure | Separate service | PostgreSQL + PyTorch + CUDA | None (built-in) | None (plugin) |
| Second LLM per turn | Yes | Yes | Yes (subagent) | No |
| Replaces memory-core | Yes | Yes | No (extends) | No (extends) |
| Data location | Cloud or separate DB | Separate PostgreSQL | Local SQLite | Local SQLite |
| Latency per turn | Extra API call | Extra LLM call | Seconds (subagent) | ~100ms (direct search) |
| Token cost | LLM extraction | LLM extraction | Subagent call | Injected context only |
| Setup complexity | Deploy service | Deploy stack | Enabled by default | Install plugin |

Search quality depends on your content

openclaw-auto-recall is only as good as what's in your memory index. If you maintain a knowledge base (wiki, Obsidian vault, daily notes, project documentation), it will surface rich, relevant context. If your memory is sparse, there's less to find — start by building good notes and the plugin gets more useful over time.

Customizing for your deployment

openclaw-auto-recall searches whatever OpenClaw's memory_search indexes. If your instance indexes wiki pages, project docs, or custom paths, those are automatically included — no configuration needed beyond your existing memorySearch.paths setup.

The draft path penalty (DRAFT_PATH_PENALTY and DRAFT_PATH_PATTERNS in src/index.js) downranks raw session transcripts and noisy content. If you don't have drafts, it does nothing. If you have your own noisy content directories, edit the patterns to match your structure:

const DRAFT_PATH_PATTERNS = [
  /memory\/drafts\//,
  /memory\/draft-.*\.md$/,
  // Add your own: /logs\//, /raw-output\//, etc.
];

The skip patterns filter out messages that don't benefit from memory search. Add your own patterns to SKIP_PATTERNS for any bot commands or signals in your deployment.

Removing

# Disable without removing
openclaw config set plugins.entries.openclaw-auto-recall.enabled false
openclaw gateway restart

# Or fully remove
openclaw plugins uninstall openclaw-auto-recall
openclaw gateway restart

No data is created or modified. Disabling returns you to exactly the baseline state.

Requirements

  • memory_search configured and working
  • before_prompt_build hook support
  • getActiveMemorySearchManager SDK export
  • sources: ["memory", "sessions"] recommended for session transcript search

(Developed and tested on OpenClaw 2026.6.6)

License

MIT

Source and release

Source repository

Alyx-Learbott/openclaw-auto-recall

Source commit

9310da83056a7140efe1d7326ddc503a6dd98e7d

Install command

openclaw plugins install clawhub:openclaw-auto-recall

Metadata

  • Package: openclaw-auto-recall
  • Created: 2026/06/14
  • Updated: 2026/06/14
  • Executes code: Yes
  • Source tag: main

Compatibility

  • Built with OpenClaw: 2026.6.6
  • Plugin API range: >=2026.6.6
  • Tags: latest
  • Files: 5