Your AI coding agent has amnesia. Here's how to fix it.
Every Claude Code or Cursor session starts from zero. The agent re-greps the same files, re-runs the same lint checks, re-debates the same finding you already reviewed. Persistent findings — a local fact store the agent reads at startup — kill the loop.
You opened Claude Code Tuesday afternoon. After 20 minutes of exploration, the agent finally surfaced a real bug: your auth callback was missing a tenant scope check on resolveManagerScope at lib/auth/dal.ts:142. You filed it. You shipped the fix.
Wednesday morning you start a new session. You ask the agent about something completely unrelated. Twenty minutes in, it's wandering through the auth code again — and 15 turns later, it surfaces the exact same finding. You already fixed that one. The agent has no idea.
This is the amnesia problem. Every agent session starts from zero. The work the agent did yesterday — the files it scanned, the lint findings it produced, the audits it ran — vanishes the moment the chat closes. You pay for that work twice. Three times. Forever.
This post is about why that happens, why "just remember in CLAUDE.md" doesn't scale, and what a real persistence layer looks like — including the compounding payoffs you only get once findings outlive sessions.
The cost of amnesia, in turns
Every agent session has a discovery phase before it can do useful work. On a real codebase, that phase is roughly:
- Read repo structure (ls, tree, or a repo_map tool).
- Identify which files are relevant to the current task.
- Run any diagnostics that haven't been run yet (lint, type-check, audit).
- Read the active findings on those files.
- Start working.
Without persistence, steps 2 through 4 are redundant every single session. You ran diagnostics last session. You identified the relevant files last session. You read the findings last session. The agent doesn't know any of that, so it re-does it.
The cost compounds across sessions:
Session 1: 20 turns of exploration → 5 turns of work
Session 2: 20 turns of exploration → 5 turns of work
Session 3: 20 turns of exploration → 5 turns of work
Total exploration cost: 60 turns
Total productive work: 15 turns
If exploration carried over: 20 + 5 + 5 + 5 = 35 turns total. You pay for exploration once if your agent has memory. You pay for it every time if it doesn't.
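The arithmetic above is easy to sketch. The function and numbers here are illustrative, not measurements:

```typescript
// Toy model of the turn arithmetic above. With persistence, the
// exploration phase is paid once; without it, every session pays again.
function totalTurns(
  sessions: number,
  exploreTurns: number,
  workTurns: number,
  persistent: boolean
): number {
  return persistent
    ? exploreTurns + sessions * workTurns // explore once, then just work
    : sessions * (exploreTurns + workTurns); // explore every session
}

console.log(totalTurns(3, 20, 5, false)); // 75 turns without memory
console.log(totalTurns(3, 20, 5, true)); // 35 turns with memory
```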
And it's not just the dollar cost. Re-discovery is slow. The "open laptop, ask question, wait three minutes for the agent to find its bearings" pattern is one of the worst feelings in modern development. It's why a lot of teams give up on agent workflows after two weeks — the friction front-loads every session.
Three things that get lost without persistence
1. Diagnostic results
You ran typescript_diagnostics on the app/dashboard/ tree yesterday. It produced 8 findings, 6 of which you fixed. The other 2 are real; you just haven't gotten to them.
Today's session: the agent has no idea those findings exist. To know that app/dashboard/manager/layout.tsx has an open issue, it has to re-run TSC. Same compute, same output, same 30 seconds of waiting.
2. Audit results
Worse: audits with side effects (or just slow audits). A tenant_leak_audit across a 47-table Postgres schema takes real time. Running it yesterday produced a list of flagged tables. Without persistence, you either run it again every session (slow, expensive) or forget about it (dangerous).
3. Intentional decisions
This is the worst loss. You looked at a finding yesterday and decided "this is intentional — the unused export is part of our public SDK surface, leave it." You moved on.
Today the agent re-runs the lint, sees the same finding, and helpfully suggests removing the export. You explain again. You write the same justification again. Every session, the same conversation.
The agent isn't dumb. It just has no record of what humans (or it, in a previous turn) already decided was intentional. The signal is real; the agent has no way to keep it.
Why CLAUDE.md doesn't solve it
The natural first instinct is "put it in CLAUDE.md." That works for project rules ("we use TypeScript strict mode, follow this auth pattern") but breaks down for findings:
- Findings are too specific to fit. "On lib/auth/dal.ts:142, the missing tenant scope is intentional because…" is one finding. There are dozens. CLAUDE.md becomes unreadable.
- Findings drift with the code. Line 142 today isn't line 142 after a refactor. CLAUDE.md doesn't follow code motion.
- The agent has to read CLAUDE.md every turn. Token cost. The longer it gets, the more you pay every prompt.
- It's append-only by convention. Stale entries build up. There's no expiration mechanism, no finding-was-fixed-and-now-it-shouldn't-be-suppressed-anymore logic.
CLAUDE.md is great for "how to think about this project." It's bad for "what we already discovered about this project."
What persistence actually requires
A real persistence layer for agent findings needs five things, each of which CLAUDE.md misses:
- Stable identity. A finding has to survive code edits without becoming a "new" finding. Identity should be based on the rule + file + a structural anchor (function name, AST node), not line numbers.
- Lifecycle states. A finding starts open. It can be resolved, suppressed (with a reason), or stale. The store knows the difference.
- Ack semantics. "We reviewed this and decided it's intentional" is a real action with provenance. Who acked, why, when. Different from "we ignored it."
- Auto-revoke on structural change. If the code the ack was attached to changes structurally, the ack should auto-revoke — forcing a re-review instead of silently hiding a new bug that snuck in during a refactor.
- Freshness. Every fact should know whether it's current, stale (file changed since), contradicted (a newer fact disagrees), or unknown.
None of this is exotic. It's the same model your issue tracker uses to keep tickets alive across edits — except scoped to programmatic findings the agent can read in 50ms instead of paginated through a web UI.
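A minimal sketch of those five requirements as types. Field names and state names here are illustrative assumptions, not agentmako's actual schema:

```typescript
// Illustrative lifecycle and freshness states for a persisted finding.
type FindingStatus = "open" | "resolved" | "suppressed" | "stale";
type Freshness = "fresh" | "stale" | "contradicted" | "unknown";

interface Ack {
  by: string; // who decided it's intentional (provenance)
  reason: string; // why
  at: string; // when, as an ISO timestamp
}

interface Finding {
  matchBasedId: string; // rule + file + structural anchor, no line numbers
  status: FindingStatus;
  ack: Ack | null; // null until a human (or agent) reviews it
  freshness: Freshness;
}

// A finding needs attention if it's open and nobody has acked it.
function needsAttention(f: Finding): boolean {
  return f.status === "open" && f.ack === null;
}
```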
The shape of a fact store
Concretely, what a stored fact looks like:
{
factId: "fact_01H8K2…",
subjectKind: "file",
subjectId: "lib/auth/dal.ts",
kind: "exports",
value: { name: "verifySession", kind: "function" },
sourceTool: "project_index_refresh",
indexedAt: "2026-04-26T15:14:02Z",
fileHash: "a1c2…",
freshness: "fresh_indexed"
}

And a stored finding:
{
findingId: "find_01H8K2…",
code: "auth/missing-tenant-scope",
severity: "warn",
subjectKind: "file",
subjectId: "lib/auth/dal.ts",
line: 142,
identity: {
matchBasedId: "auth/missing-tenant-scope:lib/auth/dal.ts:resolveManagerScope",
ruleId: "auth/missing-tenant-scope"
},
status: "open",
ack: null,
createdAt: "2026-04-26T15:14:02Z",
lastSeenAt: "2026-04-26T15:14:02Z"
}

The interesting field is identity.matchBasedId. It's not a hash of the line content. It's a structural anchor — the rule ID + file + the function the finding lives inside. Reformat the file, the line moves, the anchor stays. Same finding, same id, same ack attached.
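As a sketch, the id is just a composition of those three parts. The real implementation presumably resolves the enclosing function from the AST; this only shows why line numbers stay out of the identity:

```typescript
// Match-based identity: rule id + file path + structural anchor.
// Deliberately excludes line numbers and line content, so reformatting
// the file doesn't mint a "new" finding.
function matchBasedId(ruleId: string, filePath: string, anchor: string): string {
  return `${ruleId}:${filePath}:${anchor}`;
}

const beforeReformat = matchBasedId(
  "auth/missing-tenant-scope",
  "lib/auth/dal.ts",
  "resolveManagerScope"
);
// After a reformat the finding moved to a different line, but the
// enclosing function is unchanged, so the id (and any ack) survives.
const afterReformat = matchBasedId(
  "auth/missing-tenant-scope",
  "lib/auth/dal.ts",
  "resolveManagerScope"
);
console.log(beforeReformat === afterReformat); // true
```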
Rename resolveManagerScope to requireManagerScope — that's a structural change. The match-based id no longer matches. The ack auto-revokes. The next agent session sees an open finding again and has to re-decide whether the intentional pattern still applies. That's not a bug; that's the design. Stale acks are how you smuggle a real bug past your own audit.
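A hedged sketch of that revoke rule, assuming the store can list the structural anchors currently present in a file. The shapes are illustrative:

```typescript
// If the anchor an ack was attached to no longer exists in the file,
// drop the ack so the finding reopens for human review.
interface AckedFinding {
  anchor: string; // e.g. the enclosing function name
  ack: { reason: string } | null;
}

function autoRevoke(
  finding: AckedFinding,
  currentAnchors: Set<string>
): AckedFinding {
  if (finding.ack !== null && !currentAnchors.has(finding.anchor)) {
    return { ...finding, ack: null }; // structural change: re-review
  }
  return finding;
}

const acked: AckedFinding = {
  anchor: "resolveManagerScope",
  ack: { reason: "intentional: SDK boundary" },
};
// After the rename, resolveManagerScope is gone from the file:
const revoked = autoRevoke(acked, new Set(["requireManagerScope"]));
console.log(revoked.ack); // null: the next session must re-decide
```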
What this unlocks for the agent
Once findings persist, the agent's startup looks completely different. Instead of:
turn 1: "Let me look at the repo..."
turn 2: rg -n "manager onboarding"
turn 3: ls app/api/manager
turn 4: cat app/api/manager/onboarding/route.ts
turn 5: cat lib/auth/dal.ts
turn 6: rg -n "verifySession"
turn 7: ...

It's:
turn 1: project_open_loops()
→ 3 unresolved findings, including:
"auth/missing-tenant-scope on lib/auth/dal.ts:resolveManagerScope"
turn 2: reef_inspect({ subjectId: "lib/auth/dal.ts" })
→ Last reviewed yesterday. 1 finding open. 1 ack: "module exports any
types intentionally — SDK boundary."
turn 3: ...starts work...

Three turns instead of seven. And turn 1 carries information no amount of grep would surface — the auto-revoke ledger telling you which intentional patterns are still trusted vs. which need a re-review after recent changes.
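A toy version of that first call over an in-memory array. The real tool reads the per-project store; the field names are assumptions:

```typescript
interface StoredFinding {
  matchBasedId: string;
  status: "open" | "resolved" | "suppressed";
  ackAutoRevoked: boolean; // a refactor invalidated a prior ack
}

// Session-start summary: what's still open, and which intentional
// patterns lost their ack and need a fresh look.
function projectOpenLoops(findings: StoredFinding[]) {
  return {
    open: findings
      .filter((f) => f.status === "open")
      .map((f) => f.matchBasedId),
    needsReview: findings
      .filter((f) => f.ackAutoRevoked)
      .map((f) => f.matchBasedId),
  };
}

const loops = projectOpenLoops([
  {
    matchBasedId: "auth/missing-tenant-scope:lib/auth/dal.ts:resolveManagerScope",
    status: "open",
    ackAutoRevoked: false,
  },
  {
    matchBasedId: "lint/unused-export:lib/sdk/index.ts:publicApi",
    status: "suppressed",
    ackAutoRevoked: true,
  },
]);
console.log(loops.open.length, loops.needsReview.length); // 1 1
```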
The compounding payoff
The first session with persistence costs the same as a session without. You're indexing the codebase, running diagnostics, building findings — that work has to happen.
The second session is where it pays off. The agent reads project_open_loops and gets yesterday's unresolved work plus today's freshness deltas. Total cost: one tool call, ~600 tokens.
By the tenth session, the value is enormous. The store has:
- Findings spanning weeks of agent + human review.
- Acks from the team on intentional patterns, with reasons.
- A history of which diagnostic runs caught what, and when.
- Auto-revoked acks pointing at recent refactors that need fresh attention.
That's a real institutional memory the agent contributes to and draws from. Onboarding a new teammate? Their first agent session inherits everyone's prior reviews. Switching projects? Reef state persists per-project; the new project's agent session starts with that project's history, not the wrong one.
Where persistence hurts (be honest about it)
There are three places where persistent findings can mislead, and they're worth knowing:
Stale acks during big refactors
The auto-revoke heuristic is good but not perfect. A heavy refactor that changes a lot of structural anchors at once can revoke acks the team intended to keep. Run finding_acks_report after big refactors and re-ack what should still be acked.
Drift between findings and reality
If a diagnostic source's rules change (you upgraded ESLint), old findings might not match new ones one-to-one. Same code, slightly different finding shape. The store doesn't auto-resolve this — you need to re-run the diagnostic and let the new findings replace the old.
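One way to make that replacement explicit, sketched with match-based ids. The real reconciliation may be richer; this only shows the bookkeeping:

```typescript
// Re-running a diagnostic after a rule upgrade: keep ids that still
// match, mark vanished ids stale (don't silently auto-resolve them),
// and surface genuinely new findings.
function reconcile(oldIds: string[], newIds: string[]) {
  const current = new Set(newIds);
  const previous = new Set(oldIds);
  return {
    kept: oldIds.filter((id) => current.has(id)),
    stale: oldIds.filter((id) => !current.has(id)),
    added: newIds.filter((id) => !previous.has(id)),
  };
}

const result = reconcile(
  ["rule-v1:a.ts:fn1", "rule-v1:a.ts:fn2"],
  ["rule-v1:a.ts:fn1", "rule-v2:a.ts:fn2"] // the upgrade renamed one rule
);
// one kept, one stale, one added
```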
Acking to silence noise
The biggest failure mode. Acks are meant to be reviewed intentional decisions. If the team starts acking findings just to make the dashboard quieter, you've inverted the value: now the store is hiding signal, not preserving it. Use finding_acks_report periodically to audit who acked what and why. If the reasons are thin, treat that as a process problem.
What you actually need to wire up
For agentmako specifically — and this is the shape any persistent findings layer should take:
- A local SQLite store, per-project, under .mako-ai/. No hosted service.
- An MCP server that exposes read tools (file_findings, project_findings, project_open_loops, verification_state, reef_inspect) and ack tools (finding_ack, finding_ack_batch).
- A CLAUDE.md / AGENTS.md instruction telling the agent to call project_open_loops at the start of every session and file_findings before editing any file.
- Optionally, commit .mako-ai/ to git so the team shares findings — or .gitignore it if Reef state is per-developer.
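The instruction in the third bullet can be as small as this. The wording is illustrative; adjust it to your setup:

```markdown
## Persistent findings (Reef)

- At session start, call `project_open_loops` before any other exploration.
- Before editing a file, call `file_findings` for that path.
- Don't re-flag a finding that has an active ack; read the ack reason instead.
```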
That's it. Once the store has a few sessions of history, the compounding kicks in. The agent stops starting from zero. The team stops re-explaining the same intentional patterns. The cost per session drops, and the institutional memory grows.
The shift
Most agent tooling discussions focus on the model: which Claude version, which prompt, which system message. Those matter. But the larger lever — the one that compounds across sessions instead of saving 5% per turn — is whether the work the agent does today is still available tomorrow.
An agent without persistence is a fresh hire every morning. They re-learn the codebase, re-discover the bugs, re-debate the intentional patterns. Productive eventually, but starting from zero every day.
An agent with persistence is the same teammate every morning. They remember what they (and you) decided yesterday. They flag what's new without re-flagging what's old. They get faster, not slower, the longer they're around.
That's the difference Reef makes in agentmako — and it's the difference any agent tool needs to make the leap from "interesting demo" to "actually useful on a real codebase." Apache-2.0, local-first, no hosted service. Drop it in, run it for a week, watch your second session feel different from your first.
Want this for your codebase?
agentmako is local-first, Apache-2.0, and works with every MCP-compatible coding agent.