February 8, 2026

How I Taught My OpenClaw Agent to Never Repeat a Mistake

OpenClaw gives your AI agent a workspace with memory files. I took that foundation and built a continual learning system on top — an auto-capture error log, local search with QMD, and heartbeat-driven memory maintenance. Here's how it compounds.

By Adhish Thite

If you're running OpenClaw, you already have the bones of an agent memory system. SOUL.md defines who your agent is. MEMORY.md stores long-term context. memory/YYYY-MM-DD.md captures daily notes. The memory_search tool lets your agent semantically search its own files.

That's a solid foundation. But after running my agent for a week straight — across dozens of sessions, multiple projects, and hundreds of tool calls — I found three gaps:

  1. Failures weren't being captured systematically. The agent would hit a bug, I'd correct it, and by next session… same mistake.
  2. Search was limited to workspace files. I have project docs, skill references, and multi-agent workspaces that the built-in memory search couldn't reach.
  3. Memory maintenance was manual. Daily logs piled up, but nobody was distilling them into lasting knowledge.

So I built three things on top of OpenClaw's memory system: an auto-capture error log, local-first search with QMD, and heartbeat-driven memory maintenance. Together they turn a forgetful assistant into one that genuinely gets better the longer you use it.


The Gap: Why Built-In Memory Isn't Enough

OpenClaw's default workspace gives you this:

workspace/
├── SOUL.md          # Agent identity and behavior rules
├── MEMORY.md        # Curated long-term memory
├── AGENTS.md        # Workspace conventions
├── HEARTBEAT.md     # Periodic check tasks
├── TOOLS.md         # Local setup notes
└── memory/
    ├── 2026-02-05.md  # Daily log
    ├── 2026-02-06.md
    └── 2026-02-07.md

The agent reads SOUL.md and MEMORY.md every session, plus today's and yesterday's daily log, and memory_search can semantically search MEMORY.md and the files under memory/.

For the first few days this works well. Then practice catches up with you:

  • Day 3: You discover that the cron tool has a bug — CLI-created jobs work, tool-created ones don't. You tell the agent. It notes it in the daily log.
  • Day 5: The agent tries to create a cron job with the tool again. The daily log from Day 3 isn't loaded anymore (only today + yesterday). Same mistake.
  • Day 7: You're working across 5 project directories, 3 agent workspaces, and a dozen skill files. memory_search only covers MEMORY.md + memory/*.md. The context you need is in a project README.

The foundation is solid; what it lacked was coverage and a way to capture failures as they happened.
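To make the loading window concrete, here is a small shell sketch that prints the files a session starts with, following the description above. It is an illustration only, not OpenClaw's actual loader; the function names and the default workspace path are assumptions.

```shell
# Illustration only: mirrors the "SOUL.md + MEMORY.md + today + yesterday" rule.
# yesterday and session_context_files are hypothetical names.

yesterday() {
  # GNU date first, BSD/macOS date as a fallback
  date -d yesterday +%F 2>/dev/null || date -v-1d +%F
}

session_context_files() {
  ws="${1:-$HOME/.openclaw/workspace}"
  printf '%s\n' \
    "$ws/SOUL.md" \
    "$ws/MEMORY.md" \
    "$ws/memory/$(date +%F).md" \
    "$ws/memory/$(yesterday).md"
}
```

Calling session_context_files on Day 5 lists the Day 5 and Day 4 logs only, which is why a note written on Day 3 never makes it back into context.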


Layer 1: The Error Log (Auto-Capture)

Of everything I added, this one made the biggest difference.

memory/error-log.md is an append-only file where the agent logs every failure, correction, and gotcha the moment it happens, mid-conversation.

# Error Log — Auto-Captured Learnings
 
## 2026-02-08
- 🔧 **Cron tool bug** — Tool-created cron jobs silently fail. Always use CLI: `openclaw cron add`.
- 🧠 **Wrong assumption: API pagination** — Assumed offset-based, actually cursor-based. Cost 30 min.
- 🔄 **User correction: commit messages** — Don't commit until builds pass. Verify first.
- ⚠️ **ast-grep pattern matching fails on generics** — Use YAML rules instead of pattern strings.
- 💡 **Discovery: bun test** — No separate config needed. Just works with .test.ts files.

The categories make scanning fast:

  • 🔧 tool-failure — something broke
  • 🧠 wrong-assumption — the agent assumed wrong
  • 🔄 user-correction — the human said "no, do it this way"
  • 💡 discovery — learned something useful
  • ⚠️ gotcha — undocumented behavior or subtle trap
  • 🏗️ architecture — structural decisions worth remembering

The instruction in the agent's workspace is simple:

### Auto-Capture Loop
 
When ANY of these happen, immediately append to memory/error-log.md:
- A tool call fails or returns unexpected results
- User corrects you ("no, do it this way")
- You discover a gotcha or undocumented behavior
- An assumption you made turns out wrong
- Something takes way longer than expected
 
Format: - 🏷️ **Short title** — What happened. What to do instead.

There's no pipeline behind this and no database. The agent appends a line to a markdown file.
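The agent does this append itself, but the operation is simple enough to sketch in shell. The helper below is a hypothetical equivalent, assuming the log lives at the workspace path used in this post; the names log_learning and ERROR_LOG are made up for illustration.

```shell
# Hypothetical helper that appends one entry in the format shown above.
# ERROR_LOG can be overridden; the default path is an assumption.
ERROR_LOG="${ERROR_LOG:-$HOME/.openclaw/workspace/memory/error-log.md}"

log_learning() {
  # usage: log_learning <emoji> <short title> <what happened. what to do instead.>
  emoji="$1"; title="$2"; note="$3"
  today="$(date +%F)"
  mkdir -p "$(dirname "$ERROR_LOG")"
  # Seed the file header on first use
  [ -s "$ERROR_LOG" ] || printf '%s\n' '# Error Log — Auto-Captured Learnings' >> "$ERROR_LOG"
  # Start a new dated section at most once per day
  grep -qxF "## $today" "$ERROR_LOG" || printf '\n## %s\n' "$today" >> "$ERROR_LOG"
  # Append only; existing entries are never rewritten
  printf '%s %s **%s** — %s\n' "-" "$emoji" "$title" "$note" >> "$ERROR_LOG"
}
```

For example, log_learning "🔧" "Cron tool bug" "Tool-created cron jobs silently fail. Always use CLI." adds one line under today's heading and leaves everything else in the file untouched.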

By session 5, the payoff shows up. The agent reads its error log at startup and avoids mistakes before being told. It reaches for the cron CLI instead of the tool, uses cursor pagination without asking, and writes YAML rules instead of pattern strings when ast-grep has to match generics. Every correction it logged earlier is now working for it.

This is the file that takes an agent from helpful but forgetful to one that actually learns from experience.


Layer 2: Local Search with QMD

OpenClaw ships with built-in memory_search — semantic vector search over your workspace memory files. It even has QMD as an experimental backend (check the docs under memory.backend = "qmd").

I went further and set up QMD as a standalone search layer across everything — not just memory files, but all agent workspaces, all project docs, all installed skills.

# Install QMD (by Tobi Lütke)
bun install -g https://github.com/tobi/qmd
 
# Index everything that matters
qmd collection add ~/.openclaw/workspace --name workspace
qmd collection add ~/.openclaw/agents --name agents
qmd collection add ~/Projects --name projects
qmd collection add ~/.openclaw/skills --name skills
 
# Generate embeddings (one-time, ~36 seconds for 300 chunks)
qmd embed

QMD runs entirely on your machine. Two local GGUF models (an embedding model at 328MB and a query-expansion model at 1.28GB) handle everything. Zero API cost. Zero data leaving your laptop.

Three search modes, all local:

Mode                    Speed    How it works
qmd search "keyword"    ~240ms   BM25 full-text (SQLite FTS5)
qmd vsearch "concept"   ~2s      Vector similarity (local embeddings)
qmd query "question"    ~5s      Hybrid: query expansion + BM25 + vector + reranking

The agent can now search across everything before it acts. Have I seen this API before? Did I hit issues last time? Is there a skill that handles this? BM25 alone covers about 90% of those lookups, and at 240 milliseconds it's effectively instant.
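Since BM25 handles most lookups, one way to use the modes together is to try the fast search first and pay for the hybrid query only when nothing comes back. The sketch below is hypothetical: recall is an invented name, and it assumes qmd search prints nothing to stdout when there are no hits, which may not match your QMD version.

```shell
# Hypothetical tiered lookup: cheap BM25 first, hybrid query as a fallback.
# Assumption: `qmd search` produces empty stdout when nothing matches.
recall() {
  question="$1"
  hits="$(qmd search "$question")"
  if [ -n "$hits" ]; then
    printf '%s\n' "$hits"
  else
    qmd query "$question"
  fi
}
```

A call like recall "cursor pagination" returns the BM25 hits when there are any and falls through to the slower hybrid mode otherwise.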

An hourly cron job runs qmd update && qmd embed, processing only new or changed files. As the workspace grows with more daily logs, more projects, and more learnings, the index keeps pace on its own.
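A minimal sketch of that refresh job, assuming plain system cron rather than OpenClaw's own scheduler. The function name, script path, and log path are illustrative, and cron's PATH may need to include wherever bun installed qmd.

```shell
# Hypothetical refresh wrapper, e.g. saved as ~/bin/qmd-refresh.sh with a
# final line that calls refresh_index.
refresh_index() {
  # Re-scan the collections, then embed only if the update succeeded
  qmd update && qmd embed
}

# Example crontab entry (assumption: system cron; paths are illustrative):
#   0 * * * * $HOME/bin/qmd-refresh.sh >> $HOME/.qmd-refresh.log 2>&1
```

Chaining with && keeps a failed update from triggering an embed pass over a stale index.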


Layer 3: Heartbeat-Driven Memory Maintenance

OpenClaw supports heartbeats — periodic agent turns where your agent wakes up and checks on things. Most people use them for email checks or calendar reminders.

I use them for memory hygiene.

Every few days, during a heartbeat cycle, the agent:

  1. Reads the last 7 days of daily logs
  2. Scans error-log.md for recurring patterns
  3. Distills anything significant into MEMORY.md (permanent)
  4. Updates learnings.md with new technical patterns
  5. Removes stale entries from MEMORY.md that no longer apply

This is the bridge between raw daily notes and curated long-term memory. Think of it like a human reviewing their journal on Sunday — the daily entries are raw notes, MEMORY.md is curated wisdom, and the heartbeat is the review process.

The heartbeat config is just a markdown file:

# HEARTBEAT.md
 
## Memory Maintenance (every few days)
1. Read recent memory/YYYY-MM-DD.md files
2. Identify significant events, lessons, insights
3. Update MEMORY.md with distilled learnings
4. Remove outdated info from MEMORY.md

Skip this step and the daily logs just pile up as noise. Run it on a cadence and the agent's long-term memory stays relevant and lean.
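The first two maintenance steps are mechanical enough to sketch in shell, even though the agent does the actual reading and distilling. Both helpers below are hypothetical. The first assumes one log per day, so the seven newest files stand in for "the last 7 days"; the second treats a repeated bold title as a crude proxy for a recurring pattern.

```shell
# Hypothetical helpers for gathering maintenance inputs.

recent_daily_logs() {
  # The 7 most recent YYYY-MM-DD.md files; the filenames sort by date
  dir="${1:-$HOME/.openclaw/workspace/memory}"
  ls "$dir" | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}\.md$' | sort | tail -n 7
}

recurring_titles() {
  # Bold entry titles that appear more than once, most frequent first
  log="${1:-$HOME/.openclaw/workspace/memory/error-log.md}"
  grep -o '\*\*[^*]*\*\*' "$log" | sort | uniq -c | sort -rn | awk '$1 > 1'
}
```

The date-shaped filename filter keeps error-log.md and any other notes in memory/ out of the daily-log list.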


The Compounding Effect

Here's what the timeline looks like in practice:

Day 1: Normal. Agent is helpful, makes some mistakes. You correct it. It logs corrections to error-log.md. Daily log captures everything.

Day 3: Agent boots up, reads the error log. Avoids two mistakes it made on Day 1 without being told. Feels slightly different — more precise.

Day 7: Heartbeat maintenance distills a week of daily logs into MEMORY.md. Error log has 20+ entries. Agent has a working understanding of your tools, preferences, and project quirks.

Day 14: The agent knows your codebase conventions, your communication style, which APIs have gotchas, and which tools to avoid. Nobody wrote it a 50-page prompt to get there. It accumulated that context the slow way, by working and failing and writing things down.

Day 30: It feels like a different tool. This isn't fancier autocomplete; it's something that remembers where it failed and comes back sharper.

The math: if the agent avoids one 15-minute re-explanation per session, and you run 3 sessions a day, that's 22+ hours saved per month. The real savings are bigger — the avoided mistakes are usually the expensive ones.
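Spelled out, with a 30-day month as the one added assumption:

```shell
# Assumptions: 15 minutes saved per session, 3 sessions a day, 30-day month
minutes_per_session=15
sessions_per_day=3
days_per_month=30

total_minutes=$(( minutes_per_session * sessions_per_day * days_per_month ))
echo "minutes saved per month: $total_minutes"
# prints: minutes saved per month: 1350
echo "hours saved per month: $(( total_minutes / 60 ))h $(( total_minutes % 60 ))m"
# prints: hours saved per month: 22h 30m
```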


Getting Started (15 Minutes)

If you're already on OpenClaw, you have most of this. Here's what to add:

1. Create the error log (2 minutes)

touch ~/.openclaw/workspace/memory/error-log.md

Add to your AGENTS.md:

When a tool call fails, user corrects you, or you discover a gotcha,
immediately append to memory/error-log.md:
- 🏷️ **Short title** — What happened. What to do instead.

2. Set up QMD (10 minutes)

bun install -g https://github.com/tobi/qmd
qmd collection add ~/.openclaw/workspace --name workspace
qmd collection add ~/.openclaw/agents --name agents
qmd embed

Or just set memory.backend = "qmd" in your OpenClaw config and let the gateway handle it.

3. Add memory maintenance to your heartbeat (3 minutes)

Add to HEARTBEAT.md:

## Memory Maintenance (every few days)
- Review recent daily logs → update MEMORY.md
- Scan error-log.md for patterns → update learnings.md
- Remove stale entries from MEMORY.md

Then give it a week. The compounding is hard to describe until you've felt it.


Why This Matters Beyond One Agent

This isn't really about one assistant. The same pattern applies to any AI agent:

  • Coding agents that remember which build flags break on CI
  • Research agents that know which sources were unreliable last time
  • Support agents that log edge cases and handle them next time without escalation

The architecture is the same everywhere: layered memory, auto-captured failures, periodic distillation, and local search, all in markdown files that are version controlled and human-readable.

A lot of us are waiting for AGI to arrive as some kind of announcement, a lab coat walking out to say "we did it." I think it shows up more quietly than that, as tools that remember where they failed and come back a little sharper each time. That part is already happening, if you bother to notice it.


Built on OpenClaw. Memory search powered by QMD. Everything runs locally.
