How I Taught My OpenClaw Agent to Never Repeat a Mistake

OpenClaw gives your AI agent a workspace with memory files. I took that foundation and built a continual learning system on top — an auto-capture error log, local search with QMD, and heartbeat-driven memory maintenance. Here's how it compounds.

By Adhish Thite · 9 min read

If you're running OpenClaw, you already have the bones of an agent memory system. SOUL.md defines who your agent is. MEMORY.md stores long-term context. memory/YYYY-MM-DD.md captures daily notes. The memory_search tool lets your agent semantically search its own files.

That's a solid foundation. But after running my agent for a week straight — across dozens of sessions, multiple projects, and hundreds of tool calls — I found three gaps:

  1. Failures weren't being captured systematically. The agent would hit a bug, I'd correct it, and by next session… same mistake.
  2. Search was limited to workspace files. I have project docs, skill references, and multi-agent workspaces that the built-in memory search couldn't reach.
  3. Memory maintenance was manual. Daily logs piled up, but nobody was distilling them into lasting knowledge.

So I built three things on top of OpenClaw's memory system: an auto-capture error log, local-first search with QMD, and heartbeat-driven memory maintenance. Together, they turn a forgetful assistant into something that genuinely compounds over time.


The Gap: Why Built-In Memory Isn't Enough

OpenClaw's default workspace gives you this:

workspace/
├── SOUL.md          # Agent identity and behavior rules
├── MEMORY.md        # Curated long-term memory
├── AGENTS.md        # Workspace conventions
├── HEARTBEAT.md     # Periodic check tasks
├── TOOLS.md         # Local setup notes
└── memory/
    ├── 2026-02-05.md  # Daily log
    ├── 2026-02-06.md
    └── 2026-02-07.md

The agent reads SOUL.md and MEMORY.md every session. It reads today's and yesterday's daily log. And memory_search can semantically search across these files.

This works well for the first few days. But here's what happens in practice:

  • Day 3: You discover that the cron tool has a bug — CLI-created jobs work, tool-created ones don't. You tell the agent. It notes it in the daily log.
  • Day 5: The agent tries to create a cron job with the tool again. The daily log from Day 3 isn't loaded anymore (only today + yesterday). Same mistake.
  • Day 7: You're working across 5 project directories, 3 agent workspaces, and a dozen skill files. memory_search only covers MEMORY.md + memory/*.md. The context you need is in a project README.

The foundation is right. The coverage and capture mechanisms needed work.


Layer 1: The Error Log (Auto-Capture)

This is the single most impactful thing I added.

memory/error-log.md is an append-only file where the agent logs every failure, correction, and gotcha — immediately, in real time, mid-conversation.

# Error Log — Auto-Captured Learnings

## 2026-02-08
- 🔧 **Cron tool bug** — Tool-created cron jobs silently fail. Always use CLI: `openclaw cron add`.
- 🧠 **Wrong assumption: API pagination** — Assumed offset-based, actually cursor-based. Cost 30 min.
- 🔄 **User correction: commit messages** — Don't commit until builds pass. Verify first.
- ⚠️ **ast-grep pattern matching fails on generics** — Use YAML rules instead of pattern strings.
- 💡 **Discovery: bun test** — No separate config needed. Just works with .test.ts files.

The categories make scanning fast:

  • 🔧 tool-failure — something broke
  • 🧠 wrong-assumption — the agent assumed wrong
  • 🔄 user-correction — the human said "no, do it this way"
  • 💡 discovery — learned something useful
  • ⚠️ gotcha — undocumented behavior or subtle trap
  • 🏗️ architecture — structural decisions worth remembering

The instruction in the agent's workspace is simple:

### Auto-Capture Loop

When ANY of these happen, immediately append to memory/error-log.md:
- A tool call fails or returns unexpected results
- User corrects you ("no, do it this way")
- You discover a gotcha or undocumented behavior
- An assumption you made turns out wrong
- Something takes way longer than expected

Format: - 🏷️ **Short title** — What happened. What to do instead.

That's it. No pipeline. No database. Append a line to a markdown file.
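
To make that concrete, here is what a single capture looks like from the shell; the agent does the same thing through its normal file-write tool, and the entry below is just the cron one from earlier:

cat >> ~/.openclaw/workspace/memory/error-log.md <<'EOF'
- 🔧 **Cron tool bug** — Tool-created cron jobs silently fail. Always use CLI: `openclaw cron add`.
EOF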

Why this matters: By session 5, the agent reads its error log at startup and avoids mistakes before being told. It checks the cron CLI instead of the tool. It uses cursor pagination without asking. It strips generic type params before regex matching. Every correction compounds.
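
That startup read comes from one more instruction in the workspace. The wording below is a sketch, not OpenClaw built-in behavior; anything to this effect works:

### Session Start
1. Read memory/error-log.md in full before doing any work
2. If the current task touches anything listed there, apply the "what to do instead" note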

This is the file that turns an agent from "helpful but forgetful" to "actually learns from experience."


Layer 2: Local Search with QMD

OpenClaw ships with built-in memory_search — semantic vector search over your workspace memory files. It even has QMD as an experimental backend (check the docs under memory.backend = "qmd").

I went further and set up QMD as a standalone search layer across everything — not just memory files, but all agent workspaces, all project docs, all installed skills.

# Install QMD (by Tobi Lütke)
bun install -g https://github.com/tobi/qmd

# Index everything that matters
qmd collection add ~/.openclaw/workspace --name workspace
qmd collection add ~/.openclaw/agents --name agents
qmd collection add ~/Projects --name projects
qmd collection add ~/.openclaw/skills --name skills

# Generate embeddings (one-time, ~36 seconds for 300 chunks)
qmd embed

QMD runs entirely on your machine. Two local GGUF models (an embedding model at 328MB and a query-expansion model at 1.28GB) handle everything. Zero API cost. Zero data leaving your laptop.

Three search modes, all local:

| Mode | Speed | How it works |
|------|-------|--------------|
| `qmd search "keyword"` | ~240ms | BM25 full-text (SQLite FTS5) |
| `qmd vsearch "concept"` | ~2s | Vector similarity (local embeddings) |
| `qmd query "question"` | ~5s | Hybrid: query expansion + BM25 + vector + reranking |

The agent can now search across everything before acting. "Have I seen this API before? Did I hit issues last time? Is there a skill that handles this?" BM25 alone covers 90% of lookups — and it's 240 milliseconds.
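
That pre-flight check is nothing special, just the commands from the table above with the question typed out. These particular queries are only examples:

qmd search "cron tool"
qmd query "did we hit problems creating cron jobs from the tool?"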

I run qmd update && qmd embed on an hourly cron job. It only processes new or changed files. As the workspace grows — more daily logs, more projects, more learnings — the index grows with it. No manual maintenance.
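
For reference, the hourly refresh is a single crontab entry; the only assumption is that qmd is on the PATH cron uses:

# Re-index changed files and refresh embeddings every hour
0 * * * * qmd update && qmd embed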


Layer 3: Heartbeat-Driven Memory Maintenance

OpenClaw supports heartbeats — periodic agent turns where your agent wakes up and checks on things. Most people use them for email checks or calendar reminders.

I use them for memory hygiene.

Every few days, during a heartbeat cycle, the agent:

  1. Reads the last 7 days of daily logs
  2. Scans error-log.md for recurring patterns
  3. Distills anything significant into MEMORY.md (permanent)
  4. Updates learnings.md with new technical patterns
  5. Removes stale entries from MEMORY.md that no longer apply

This is the bridge between raw daily notes and curated long-term memory. Think of it like a human reviewing their journal on Sunday — the daily entries are raw notes, MEMORY.md is curated wisdom, and the heartbeat is the review process.

The heartbeat config is just a markdown file:

# HEARTBEAT.md

## Memory Maintenance (every few days)
1. Read recent memory/YYYY-MM-DD.md files
2. Identify significant events, lessons, insights
3. Update MEMORY.md with distilled learnings
4. Remove outdated info from MEMORY.md

Without this, daily logs pile up as noise. With it, the agent's long-term memory stays relevant and lean.
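
To make "distilled" concrete, an entry that graduates from the error log into MEMORY.md might look like this (illustrative, not a prescribed format):

## Tooling
- Cron jobs: always create them with `openclaw cron add`; jobs created through the cron tool silently fail.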


The Compounding Effect

Here's what the timeline looks like in practice:

Day 1: Normal. Agent is helpful, makes some mistakes. You correct it. It logs corrections to error-log.md. Daily log captures everything.

Day 3: Agent boots up, reads the error log. Avoids two mistakes it made on Day 1 without being told. Feels slightly different — more precise.

Day 7: Heartbeat maintenance distills a week of daily logs into MEMORY.md. Error log has 20+ entries. Agent has a working understanding of your tools, preferences, and project quirks.

Day 14: The agent knows your codebase conventions, your communication style, which APIs have gotchas, and which tools to avoid. Not because someone wrote a 50-page prompt — because it accumulated context through working and failing and writing things down.

Day 30: It's a different tool. Not better autocomplete. Something that remembers where it failed and comes back sharper.

The math: if the agent avoids one 15-minute re-explanation per session, and you run 3 sessions a day, that's 45 minutes a day, or about 22 hours saved per month. The real savings are bigger, because the avoided mistakes are usually the expensive ones.


Getting Started (15 Minutes)

If you're already on OpenClaw, you have most of this. Here's what to add:

1. Create the error log (2 minutes)

touch ~/.openclaw/workspace/memory/error-log.md

Add to your AGENTS.md:

When a tool call fails, user corrects you, or you discover a gotcha,
immediately append to memory/error-log.md:
- 🏷️ **Short title** — What happened. What to do instead.

2. Set up QMD (10 minutes)

bun install -g https://github.com/tobi/qmd
qmd collection add ~/.openclaw/workspace --name workspace
qmd collection add ~/.openclaw/agents --name agents
qmd embed

Or just set memory.backend = "qmd" in your OpenClaw config and let the gateway handle it.
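
If you take the built-in route, the setting the docs reference is memory.backend = "qmd". In a TOML-style config that maps to something like the following; check your install for the actual config format:

[memory]
backend = "qmd"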

3. Add memory maintenance to your heartbeat (3 minutes)

Add to HEARTBEAT.md:

## Memory Maintenance (every few days)
- Review recent daily logs → update MEMORY.md
- Scan error-log.md for patterns → update learnings.md
- Remove stale entries from MEMORY.md

That's it. Give it a week. The compounding is hard to describe until you feel it.


Why This Matters Beyond One Agent

This isn't just about making one assistant better. It's a pattern that applies to any AI agent:

  • Coding agents that remember which build flags break on CI
  • Research agents that know which sources were unreliable last time
  • Support agents that log edge cases and handle them next time without escalation

The architecture is the same everywhere: layered memory + auto-capture failures + periodic distillation + local search. Markdown files, version controlled, human-readable.
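
"Version controlled" is meant literally: the memory layer is plain files, so an ordinary git repo over the workspace is all it takes:

cd ~/.openclaw/workspace
git init
git add SOUL.md MEMORY.md AGENTS.md HEARTBEAT.md TOOLS.md memory/
git commit -m "Snapshot agent memory"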

We keep waiting for AGI like it's going to be a press conference. Some lab coat walks out and says "we did it." It's not going to be that. It's going to be this — tools that remember where they failed and come back sharper. Over and over.

The ground is already moving. You just have to look down.


Built on OpenClaw. Memory search powered by QMD. Everything runs locally.
