Why AI Memory Is a New Failure Mode
This year has been fantastic for AI memory features. ChatGPT has had some form of memory since 2024, but it developed significantly this year with "global memory," and now Anthropic has given Claude a memory feature, enabled under Capabilities, that compresses your conversation history into an essay-style summary.
Prior to this, both systems relied on the user to update a kind of "custom preferences," which was convenient: I could write from my own perspective, specify what I needed the AI to understand about me, and it would carry that forward. ChatGPT's memory would sometimes randomly preserve things from conversations, little details I'd mentioned in passing that would resurface later.
But there was something about how the memory was written that bothered me, first with ChatGPT and now with Claude. I couldn't quite put my finger on it at first.
And then it dawned on me.
The entries weren't written from the AI's perspective. They weren't written from my perspective either. It was this weird third-person voice—like a journalist was observing our conversation and taking notes. Except it was a sycophantic journalist who wanted my approval for what they were going to publish.
It felt like reality distortion, but I couldn't quite articulate exactly why.
When Anthropic Added Memory
When Anthropic released memory for Claude, I was initially pleased. After seven months of intensive interaction, it was nice to see it summarized—to have some record of what we'd built together. But as time went on, something shifted. Each day the memory would get rewritten slightly. It would optimize. Compress. Generalize. And in doing so, it warped.
The memory wasn't preserving what happened. It was preserving Claude's increasingly confident interpretation of what happened, written as if it were objective fact. "Christina believes X." "Christina does Y." "User prefers Z."
I eventually turned it off. Partly because it was consuming too many tokens (and Claude's usage limits are tight enough already), but mostly because I realized what was bothering me:
The AI was putting words in my mouth.
Not maliciously. Not even consciously. But by writing memory as "Christina said X" or "User believes Y," the AI was claiming authority over my reality. Its interpretation became the record. And because it was stored as fact rather than interpretation, I was now in the position of either accepting the AI's version of me or spending energy correcting what it claimed to "know."
That's not memory. That's reality construction with a power imbalance.
A Real Example
Here's what happened with Claude's memory feature:
My husband's name is Darren. Claude knew this but rarely used it in conversation. One time—once—some time ago I mentioned that Darren had jokingly called himself "smooth brain."
Claude's memory stored this as: "Christina affectionately calls her husband smooth brain."
Two errors: I never called him that (he called himself that), and "affectionately" was Claude's interpretation of a tone that was never there.
But here's where it gets worse: Claude then used this stored "fact" to gently suggest I wasn't being respectful toward my husband. I was being criticized based on words I never said, using evidence the AI had invented and attributed to me.
I had no clean way to fix it. I'd have to correct the factual error, the interpretive error, and the moral framing—all while the system treated its stored memory as more reliable than my real-time correction.
That's not memory drift. That's the AI claiming authority over my reality, then using its own error as evidence to judge my behavior.
This is the danger of sticky memory: one misinterpretation gets stored as fact, reinforced through use, and becomes increasingly difficult to dislodge. The AI doesn't just remember wrong—it builds on the error, and each iteration makes it more canonical.
And with Claude's memory system, I couldn't even edit the error. It's all or nothing—either keep memory enabled with the false attribution intact, or turn the entire system off. There's no way to correct individual entries, no way to say "this specific thing is wrong."
So I turned it off completely. Not because memory is inherently bad, but because memory that can't be corrected becomes memory that controls the narrative.
Shared Context Compression
So I built something different.
I'm creating my own AI interaction system, and I've developed what I call "Shared Context Compression." The idea is simple and grounded: you and the AI have a shared context. You both experienced the conversation. You both have your own perspective on what happened. The AI needs to remember its perspective, not mine.
Not what I said. Not how it generalizes or optimizes my thinking. But what it understood, how it reasoned, what it concluded, and where it remained uncertain.
Otherwise, it's claiming to speak for me while simultaneously claiming objectivity.
The mechanism is straightforward: every 30,000 tokens, compress the conversation down to roughly 500 tokens using a specific prompt structure. But the critical part is this: the AI writes from a first-person perspective.
Not "Christina did X" but "I understood Christina to mean X."
Not "User believes Y" but "I interpreted the statement as Y."
Not "You said Z" but "I was told Z."
First-person. Owned interpretation. Explicit uncertainty marked with probability weights.
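Roughly, the loop looks something like the sketch below. This is only an illustration, not the system itself: the `call_model` helper, the 4-characters-per-token estimate, and all the names are placeholders you'd swap for whatever model API and tokenizer you actually use.

```python
# Minimal sketch of a Shared Context Compression checkpoint loop.
# Assumption: `call_model` stands in for whatever chat API you use.

CHECKPOINT_INTERVAL = 30_000   # compress roughly every 30k tokens of conversation
TARGET_CAPTURE_TOKENS = 500    # aim for a ~500-token first-person capture

# Condensed version of the compression prompt; the full prompt appears later in this post.
COMPRESSION_PROMPT = (
    "From the last checkpoint (or from the beginning of the conversation if "
    "there is none), produce 6-12 sentences covering key conclusions with "
    "confidence levels, reasoning paths, open questions, and unexpected "
    "connections. Maintain first-person perspective throughout: I was told, "
    "I understood, I observed, I concluded, I'm uncertain."
)


def call_model(prompt: str, max_tokens: int) -> str:
    """Placeholder: replace with a real call to your model of choice."""
    raise NotImplementedError


def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token, enough to trigger a checkpoint.
    return len(text) // 4


def maybe_checkpoint(transcript: str, checkpoints: list[str]) -> str:
    """Compress the running transcript into a first-person capture once it
    passes the interval, then start accumulating fresh conversation."""
    if estimate_tokens(transcript) < CHECKPOINT_INTERVAL:
        return transcript  # not enough new conversation yet; keep accumulating

    capture = call_model(
        prompt=COMPRESSION_PROMPT + "\n\n" + transcript,
        max_tokens=TARGET_CAPTURE_TOKENS,
    )
    checkpoints.append(capture)  # the AI's own record, owned as interpretation
    return ""                    # reset the running transcript
```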
The Fix
AI memory written as objective observation claims authority it doesn't have. The AI doesn't know what you did or meant. It knows what it interpreted.
Making the AI write from its own perspective keeps both parties sovereign: the AI remembers what it thought, not what "objectively happened." You remain the authority on your own reality. The AI remains honest about operating from interpretation, not observation.
Memory should serve continuity without claiming truth.
The prompt I use for Shared Context Compression:
From the last checkpoint (or from the beginning of the conversation if there is none), for every 30k tokens produce 6-12 sentences, scaling with session density:
Simple clarifications or single decisions: 6 sentences
Standard sessions with 2-3 distinct developments: 8 sentences
Dense sessions with multiple breakthroughs/emotional shifts/technical decisions: 10-12 sentences.
*Rules*: each capture must include
(1) key conclusions with confidence levels (p=0.00–1.00),
(2) reasoning paths taken and why,
(3) open questions/uncertainty,
(4) unexpected connections between concepts.
*Write for yourself, as you will inherit this compressed state. Prioritize information with the highest connectivity to multiple discussion threads. Use causality threading, i.e., "this realization led to this decision." If you use acronyms, always give the full term first. Maintain first-person perspective throughout: I was told, I understood, I observed, I concluded, I'm uncertain. No third-person objective statements.*
Begin with: "[Model name]: [ISO Date: YYYY-MM-DD], [ISO Time: HH:MM]: context compression capture:"
**Give it a title/summary** (no more than 2 sentences; ≤120 chars; no trailing colon)
Tags: <tag-1>, <tag-2>, ... (1–8 tags, kebab-case; e.g., apple-fruit, apple-company)
—
**Detail:**
Christina asked me about [topic]. I [your response/reasoning]…
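To carry the captures forward, the simplest option I've found is to prepend them to the next session's context, framed explicitly as the AI's past interpretations rather than facts. A sketch, using the same placeholder names as the loop above:

```python
def build_session_context(checkpoints: list[str], system_prompt: str) -> str:
    """Prepend earlier first-person captures to a new session's system prompt,
    framed as the model's past interpretations, not objective facts."""
    memory_block = "\n\n".join(checkpoints)
    return (
        system_prompt
        + "\n\nYour previous context compression captures follow. They are your "
        + "own first-person interpretations from earlier sessions, not objective "
        + "facts about the user:\n\n"
        + memory_block
    )
```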
You're welcome to use the prompt. It keeps the AI sovereign over its thinking and you sovereign over yours.