Memory for Agents Is a Systems Problem, Not a Context Window Problem | Matthew Gribben
Memory failures in agents are rarely solved by larger context windows. The real problem is systems design: how information is stored, selected, retrieved, compacted, and promoted into durable forms.
March 26, 2026 · 11 min read
An agent that forgets why it opened a ticket, repeats a dead-end plan from yesterday, and drags six thousand tokens of irrelevant chat into every prompt does not have a memory shortage. It has a memory design problem.
A lot of current agent work still treats memory as one of two things: keep more transcript, or retrieve more text. Both help. Neither is a real memory architecture. Long context can delay forgetting, and retrieval can patch over it, but neither tells a system what should be remembered, how it should be represented, when it should be updated, what should decay, or which memories are safe to trust.
That is the shift in the recent memory literature for agents. The important question is not how many tokens a model can ingest. It is how an agent should maintain different kinds of memory over time: what belongs in the active state for the current task, what should persist as experience, what should become durable world knowledge, and what should be compiled into reusable know-how. Memory is not one bucket. It is a stack with different roles, write paths, retrieval rules, and failure modes.
The papers are converging on that point from different angles. Memory in the Age of AI Agents separates memory by form, function, and temporal dynamics. AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents gives a lifecycle view: encode, consolidate, retrieve, update, forget, secure. Evaluating Long-Term Memory for Long-Context Question Answering adds an empirical reminder that bigger context windows are a wasteful substitute for good memory design. MemGPT frames the problem as tiered memory management. Reflexion shows how compact post hoc self-critique can serve as a useful episodic trace. Memp makes the procedural point explicit: durable performance gains often come from storing and reusing successful methods, not just facts or transcripts. CoALA helps tie these into a modular agent architecture rather than a single prompt trick.
If you want a long-running agent that gets better instead of noisier, you need at least four distinct memory types. The useful way to think about them is as a layered system rather than one giant context bucket.
Working memory is the short-horizon active state. Episodic memory stores notable experiences. Semantic memory stores durable facts. Procedural memory stores reusable ways of doing things. Those layers should interact, but they should not collapse into one undifferentiated store.
Working memory: small, typed, and aggressively curated
Working memory is not “whatever fits in the prompt.” It is the minimal active state needed to do the current step well. Plans, subgoals, the current user objective, a few constraints, and a short scratchpad belong here. Everything else is a liability.
Many agent systems go wrong here first. They treat the prompt as an evidence dump and hope the model will sort it out. That is cheap to build and expensive to run. More importantly, it degrades behavior. Irrelevant context raises the chance of distraction, contradiction, and instruction bleed. The agent looks forgetful because it cannot keep the right things salient.
Working memory should therefore be typed and bounded. A production agent should know the difference between:
current objective
current plan
blocking issues
decisions already made in this session
tools and resources currently in play
transient observations worth using for the next step only
That state should be rewritten continuously, not appended forever. In human terms, this is not autobiography. It is the whiteboard. Wipe it often.
ReAct and Tree of Thoughts are useful here mostly as contrast. They help structure reasoning and action inside a task, but they do not by themselves solve long-term memory. They are control patterns for deliberation, not memory systems. If you keep confusing the two, you get elegant traces and bad continuity.
Episodic memory: experiences, outcomes, and reflections
Episodic memory is where the agent stores what happened: a task it attempted, the context it saw, what actions it took, what outcome followed, and what lesson was extracted. Not every turn deserves an episode. Most should vanish. The point is to preserve events that are likely to matter again.
This is where Reflexion remains useful. Its core contribution is not mystical self-improvement. It is the mundane idea that a compact reflection after success or failure can become a much better future cue than a raw transcript. “Last time the API returned 401 because the token belonged to staging” is more valuable than three pages of trial and error.
Good episodic memory is selective and outcome-linked. Useful fields include:
task or goal
relevant entities and environment
action sequence summary
result
confidence or quality signal
extracted lesson
timestamp and scope
That last part matters. Episodes age. A workaround for a broken dependency last week may be wrong after a deploy. Episodic memory should therefore support decay, archival, and explicit invalidation. If you only ever append, you are not building memory. You are building sediment.
A useful way to picture the write path is that raw trajectories do not become long-term memory directly. They get filtered, compressed, and split by purpose.
That is why transcript retention is not the same as memory. The valuable artifact is usually the compact event summary, the lesson, or the promoted fact or procedure, not the verbatim interaction trace.
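The filter-and-split write path can be sketched as a single routing function. The dictionary keys (`outcome`, `lesson`, `observed_facts`, `repeat_count`) are hypothetical field names for the sketch, and the thresholds are arbitrary placeholders.

```python
def route_trajectory(trajectory: dict) -> dict:
    """Filter and split a raw trajectory by purpose instead of storing it verbatim."""
    out = {"episode": None, "fact_candidates": [], "procedure_candidate": None}

    # Only notable outcomes become episodes, and only as compact summaries.
    if trajectory.get("outcome") in ("failure", "surprising_success"):
        out["episode"] = {
            "task": trajectory["task"],
            "summary": f"{len(trajectory['steps'])} steps -> {trajectory['outcome']}",
            "lesson": trajectory.get("lesson", ""),
        }

    # Observed facts are candidates only; promotion to semantic memory happens downstream.
    out["fact_candidates"] = trajectory.get("observed_facts", [])

    # Repeated clean successes can nominate a procedure for distillation.
    if trajectory.get("outcome") == "success" and trajectory.get("repeat_count", 0) >= 3:
        out["procedure_candidate"] = {"task": trajectory["task"],
                                      "steps": trajectory["steps"]}

    return out
```

Everything the function does not route is simply dropped: most turns should vanish.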
Semantic memory: durable facts with provenance
Semantic memory is the agent’s store of durable facts: user preferences, environment details, system configurations, stable business rules, product facts, and other world knowledge that should outlive a single session.
This is where generic RAG tends to overpromise. Retrieval over unstructured notes can surface relevant facts, but semantic memory needs stronger guarantees than “the chunk looked similar.” A usable fact memory should carry provenance, timestamps, scope, confidence, and conflict handling. “Matthew prefers short answers in WhatsApp” is not the same kind of item as “the staging database host is db-stg.internal,” and neither should be stored with the same write policy as a one-off observation from a failed run.
Semantic memory needs explicit promotion rules. Facts should usually graduate into it only after repeated observation, trusted source confirmation, or human instruction. This is one place where the neuroscience framing is genuinely helpful: encoding and consolidation should be separate. Do not let every passing claim become part of the agent’s long-term model of the world.
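A promotion gate that keeps encoding and consolidation separate might look like this. The source labels and the three-observation threshold are assumptions for illustration; real policies will differ.

```python
def should_promote(fact: dict) -> bool:
    """Gate a candidate fact before it enters durable semantic memory.

    Field names ('source', 'observations') are illustrative, not a standard schema.
    """
    if fact.get("source") in ("human_instruction", "trusted_document"):
        return True                        # trusted sources graduate immediately
    if fact.get("observations", 0) >= 3:   # repeated independent observation
        return True
    return False                           # a single model inference stays a candidate
```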
The long-context QA work reinforces the point from the other side. If relevant information can be stored in a structured long-term memory and retrieved precisely, you do not need to keep reloading giant histories just to preserve continuity. Token budgets fall, latency improves, and recall can actually get better because retrieval is sharper.
Procedural memory: the missing piece
Procedural memory is reusable know-how: how to perform a task, not just what happened before or what facts are true. This is where a lot of agent systems are still weak.
An agent that has solved a task ten times should not merely remember ten episodes. It should distill a procedure. That is the contribution of Memp: procedural memory deserves first-class treatment. If an agent learns that the reliable way to debug a flaky CI failure is to check the failed job, isolate the earliest deterministic error, verify whether the workspace changed, and only then rerun, that pattern should become reusable policy.
This matters because facts alone do not compound into competence. Experience only turns into better future behavior when the system can abstract successful strategies, attach them to contexts, and decide when to apply them. CoALA is useful here because it pushes toward a modular architecture in which procedural memory sits alongside semantic and episodic memory rather than being smuggled into a giant prompt.
Procedural memory also has the sharpest failure mode. Bad procedures can fossilize. If the agent learns an overfit workaround, or internalizes a pattern that “usually passes,” it may lock in brittle or unsafe behavior. Procedural updates therefore need a higher bar than episodic writes: repeated success, environment checks, provenance, and often human review.
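The higher bar for procedural writes can be made concrete: a procedure becomes a usable default only after repeated success, and a falling success rate suspends it. The thresholds below are illustrative, not recommendations.

```python
class ProcedureStore:
    """Procedures graduate only after repeated success; failures demote them."""

    PROMOTE_AFTER = 5    # successful runs required before reuse as a default
    DEMOTE_BELOW = 0.8   # success rate under which a procedure is suspended

    def __init__(self) -> None:
        self.candidates: dict[str, dict] = {}

    def record_run(self, name: str, steps: list[str], success: bool) -> None:
        entry = self.candidates.setdefault(
            name, {"steps": steps, "successes": 0, "runs": 0}
        )
        entry["runs"] += 1
        entry["successes"] += int(success)

    def usable(self, name: str) -> bool:
        entry = self.candidates.get(name)
        if not entry or entry["successes"] < self.PROMOTE_AFTER:
            return False
        return entry["successes"] / entry["runs"] >= self.DEMOTE_BELOW
```

In practice the `usable` check is also where environment verification and human review would hook in.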
Where graph knowledge bases help, and where they do not
Graph knowledge bases are useful and often misapplied.
They help when the world has stable entities and relations that need consistent querying over time. Users, projects, services, repositories, ownership, dependencies, permissions, and canonical document links fit well. Graphs are strong when you care about identity resolution, provenance, constraints, and multi-hop retrieval.
They also help when facts need to be merged rather than duplicated. A graph can represent one user preference with update history instead of fifty similar notes.
But graph KBs are overkill for noisy or narrative memory. Raw session traces, speculative reflections, one-off troubleshooting notes, and ephemeral working state do not naturally want to become triples. Forcing them into a graph often creates brittle schemas and false precision.
Use graphs for stable entities and durable relations; documents or event logs for episodes; compact typed state for working memory; and separate policy stores or skill artifacts for procedures. Do not make the graph pretend to be all four.
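The merge-instead-of-duplicate idea is simple to show: one canonical fact per subject-predicate pair, with its update history attached, rather than fifty near-duplicate notes. Field names here are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class FactNode:
    """One canonical fact per (subject, predicate), with update history."""
    subject: str
    predicate: str
    value: str
    history: list[tuple[str, float]] = field(default_factory=list)

    def update(self, new_value: str, at: float) -> None:
        # Supersede the old value but keep it as provenance, not as a duplicate.
        self.history.append((self.value, at))
        self.value = new_value
```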
What breaks in production
The common failures are not subtle.
Stale memory: a fact was once true and is now wrong, but still outranks newer evidence because it looks authoritative.
Prompt pollution: too much retrieved memory, poorly filtered, drowns the current task.
Retrieval mismatch: the right memory exists but is indexed or scoped badly, so the system fetches something adjacent instead.
Privacy and security leakage: sensitive memories are stored too broadly, retrieved into the wrong context, or retained longer than they should be.
Procedural drift: the agent starts reusing a strategy that worked locally but should never have become a default.
These failures get worse when the system has no clear write criteria. If every turn can update long-term memory, the agent becomes gullible. If nothing is ever removed, the agent becomes superstitious. If memory items have no provenance, operators cannot tell whether a fact came from a trusted document, user instruction, model inference, or another hallucinated memory.
Production guidance
A workable memory system needs explicit policy, not just storage.
Start with separate stores, or at least separate schemas, for working, episodic, semantic, and procedural memory. Give each one a different write path and review threshold. Make writes sparse by default.
Require metadata on durable memories: source, timestamp, scope, confidence, sensitivity, and last verification time. Support conflict states instead of forcing a premature merge. Build retrieval as a ranking problem with type-aware filters, not a blind similarity search.
The retrieval path should also be explicit. The agent should not dump all candidate memory into the prompt. It should select, compact, and stage only what the current task can actually use.
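Type-aware ranked retrieval, as opposed to a blind similarity search, can be sketched as follows. The task-to-memory-type eligibility map, the scoring weights, and the field names are all assumptions made for the example.

```python
def rank_memories(query_terms: set[str], task_type: str,
                  memories: list[dict], budget: int = 3) -> list[dict]:
    """Select, rank, and cap candidate memories with type-aware filters."""
    # Which memory types are even eligible for this kind of task.
    eligible = {
        "debugging": {"episodic", "procedural"},
        "question_answering": {"semantic"},
    }.get(task_type, {"semantic", "episodic", "procedural"})

    def score(m: dict) -> float:
        if m["kind"] not in eligible:
            return -1.0
        overlap = len(query_terms & set(m.get("keywords", [])))
        recency = m.get("freshness", 0.5)      # 0..1, higher is fresher
        confidence = m.get("confidence", 0.5)
        return overlap + 0.5 * recency + 0.5 * confidence

    ranked = sorted((m for m in memories if score(m) > 0), key=score, reverse=True)
    return ranked[:budget]  # stage only what the prompt can actually use
```

The budget cap is doing real work here: it forces the selection step to happen before the prompt, not inside it.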
Add lifecycle controls. Memories should be refreshable, suppressible, and deletable. Security rules should be part of retrieval, not an afterthought. Sensitive memory should be scoped to the right user, tenant, or environment before the model ever sees it.
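Scoping as part of retrieval, rather than an afterthought, amounts to a filter that runs before any ranking or prompting. The `tenant`, `user`, and `suppressed` fields are hypothetical names for the sketch.

```python
def visible_memories(memories: list[dict], tenant: str, user: str) -> list[dict]:
    """Apply scope and suppression rules before the model ever sees a memory."""
    return [
        m for m in memories
        if not m.get("suppressed")            # suppressible
        and m.get("tenant") == tenant          # tenant isolation
        and m.get("user") in (user, "shared")  # user scope or explicitly shared
    ]
```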
Finally, evaluate memory as a system. Measure whether the agent retrieves the right memory type for the task, whether memory helps more than it distracts, whether procedures improve success rates over time, and how quickly stale information is corrected. The point is not to maximize remembrance. It is to maximize useful continuity.
The full lifecycle is broader than retrieval. Memory systems also need gates around what gets written, where it lands, and when it should disappear.
That is the underlying lesson across the recent work. Agent memory is not a bigger context window, and it is not a vector database with nicer branding. It is the machinery by which a system decides what to keep, abstract, trust, and forget. Teams that get this right will build agents that become more reliable with time. Teams that do not will keep shipping agents with excellent recall for the wrong things.
Chief Technology Officer writing about AI systems, software architecture, cyber security, cryptography, and the practical realities of technology leadership.