
claude-mem: Giving Claude Code a Persistent Memory (89K Stars and Counting)

claude-mem automatically captures everything Claude Code does during your sessions, compresses it with AI, and injects relevant context into future sessions. Here's how persistent memory changes your AI coding workflow.

DH
Danny Huang

The Forgetting Problem

Start a Claude Code session. Spend forty minutes teaching it how your project's authentication system works, why you chose a particular database migration strategy, and where the three gnarliest race conditions live. Claude learns all of this. It gives you excellent, context-aware suggestions. You close the terminal.

Open a new session. Claude has no idea what happened.

This is not a flaw in Claude's intelligence. It is a consequence of how large language models work. Each session starts with a fresh context window. The model does not have access to previous sessions unless you explicitly provide that context. Your CLAUDE.md file helps -- it encodes project conventions and architectural decisions that persist between sessions. But it is static. You wrote it once, maybe updated it a few times. It does not capture what Claude learned while actually working on your code.

Think of it this way. You hire a brilliant contractor who learns your entire codebase in a day. At the end of the day, their memory is wiped. The next morning, they show up and ask "So what does your project do?" You hand them a one-page brief (your CLAUDE.md). That helps. But it does not replace the knowledge they accumulated through eight hours of reading, debugging, and discussing your code.

This is the problem claude-mem solves.

What claude-mem Does

claude-mem is a Claude Code plugin that gives your AI agent persistent memory across sessions. It reached 89K+ GitHub stars after exploding onto the trending page in early February 2026, gaining over 5,000 stars in its first three days alone.

The concept is straightforward. Imagine a developer who keeps a notebook during every work session. At the end of the day, they flip through their notes, highlight the important parts, and write a clean one-page summary. The next morning, they read that summary before starting work. They do not remember everything, but they remember enough to avoid repeating the same conversations and making the same mistakes.

claude-mem automates exactly this process, but for your AI coding agent. Three steps happen automatically:

  1. Capture. During your session, claude-mem records what Claude does -- every tool execution, every observation, every decision point.
  2. Compress. When the session ends, those raw observations get compressed into semantic summaries using Claude's own reasoning capabilities.
  3. Inject. When your next session starts, claude-mem injects the compressed context back into Claude's context window. Claude starts the new session already knowing what happened before.

No manual effort required. You install the plugin, and the memory system runs in the background.
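The capture/compress/inject cycle can be sketched in a few lines of Python. All names here are illustrative, not claude-mem's actual internals, and the real compression step uses Claude itself rather than a trivial string join:

```python
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    """Toy sketch of a capture -> compress -> inject cycle."""
    observations: list = field(default_factory=list)  # raw events from this session
    summaries: list = field(default_factory=list)     # compressed cross-session memory

    def capture(self, event: str) -> None:
        # 1. Capture: record every tool execution and decision point.
        self.observations.append(event)

    def compress(self) -> None:
        # 2. Compress: claude-mem uses an LLM here; a headline-style
        # summarizer stands in for it in this sketch.
        if self.observations:
            headline = "; ".join(self.observations[:3])
            self.summaries.append(f"{len(self.observations)} events: {headline}")
            self.observations.clear()

    def inject(self) -> str:
        # 3. Inject: prepend compressed summaries to the next session's context.
        return "\n".join(self.summaries)

mem = MemoryStore()
mem.capture("Read src/auth.py")
mem.capture("Fixed race condition in ws_handler")
mem.compress()
print(mem.inject())  # "2 events: Read src/auth.py; Fixed race condition in ws_handler"
```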

How It Works Under the Hood

The implementation is more elegant than a naive "dump everything from last session" approach. claude-mem has six major components that work together.

Lifecycle Hooks

Claude Code's plugin system exposes five lifecycle hooks, and claude-mem uses all of them:

  • SessionStart -- Injects compressed context from previous sessions. This is where Claude "remembers."
  • UserPromptSubmit -- Captures your queries for pattern recognition. Over time, this helps claude-mem understand what types of questions you ask repeatedly.
  • PostToolUse -- Observes every tool execution. When Claude reads a file, runs a command, or edits code, that observation gets recorded.
  • Stop -- When Claude finishes a response, the plugin generates session summaries with completions, learnings, and next steps.
  • SessionEnd -- Final cleanup and summary generation.

The hooks are non-blocking: they queue work with a background worker service and return immediately, so they never slow down your terminal session.
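The fire-and-forget pattern can be sketched as a hook script that persists the event somewhere cheap and exits immediately. The real plugin POSTs to its worker on port 37777; a local queue file (hypothetical path) stands in for that here:

```python
import json
import time
from pathlib import Path

QUEUE = Path("/tmp/claude-mem-queue.jsonl")  # hypothetical stand-in for the worker

def handle_hook(event: dict) -> int:
    """Append the event to a queue and return at once, never blocking Claude Code.
    (In reality the event arrives as JSON on stdin and is POSTed to the worker.)"""
    record = {"ts": time.time(), "hook": event.get("hook_event_name"), "payload": event}
    with QUEUE.open("a") as f:
        f.write(json.dumps(record) + "\n")
    return 0  # exit 0 immediately; the worker does the expensive processing later

# Example: a PostToolUse event arrives after Claude reads a file.
QUEUE.unlink(missing_ok=True)  # start clean for the demo
handle_hook({"hook_event_name": "PostToolUse", "tool_name": "Read"})
print(QUEUE.read_text().splitlines()[-1])
```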

The Worker Service

A background HTTP API runs on port 37777, managed by Bun. This worker handles the computationally expensive parts: compressing observations, generating summaries, and managing the database. It also serves a web viewer UI at http://localhost:37777 where you can browse your memory stream in real time.

Storage

Everything goes into a local SQLite database. Sessions, observations, summaries -- all stored locally on your machine. No cloud service, no external API calls for storage. Your code context stays on your disk.

For search, claude-mem uses a hybrid approach: SQLite FTS5 for full-text keyword search, plus a Chroma vector database for semantic search. This means you can search your project history both by exact terms and by meaning.
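Python's built-in sqlite3 module can demonstrate the keyword half of that hybrid. The schema below is illustrative, not claude-mem's actual one; the semantic half would go through the Chroma vector store instead:

```python
import sqlite3

# In-memory stand-in for the local observation store.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE observations USING fts5(title, body)")
db.executemany(
    "INSERT INTO observations (title, body) VALUES (?, ?)",
    [
        ("auth bugfix", "fixed token refresh race in the auth middleware"),
        ("db migration", "chose expand-contract strategy for zero-downtime migrations"),
    ],
)

# Full-text keyword search via FTS5; rank orders by relevance.
rows = db.execute(
    "SELECT title FROM observations WHERE observations MATCH ? ORDER BY rank",
    ("auth",),
).fetchall()
print(rows)  # [('auth bugfix',)]
```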

Progressive Disclosure

This is the design decision that makes claude-mem practical instead of just clever. Injecting the full history of every previous session would destroy Claude's reasoning ability by flooding the context window. Research consistently shows that more context is not better -- attention degrades as input length increases. (For a deep dive into why, see Context Engineering with Skills.)

claude-mem solves this with a three-layer retrieval system:

  • Layer 1: Index. Compact search results with observation IDs, titles, types, and timestamps. About 50-100 tokens per result.
  • Layer 2: Timeline. Chronological context around specific observations. Provides the narrative flow without full details.
  • Layer 3: Full Details. Complete observation data, fetched only for the specific IDs that are relevant. About 500-1,000 tokens per result.

Claude starts with the index. If it needs more detail on a specific topic, it drills deeper. This progressive disclosure cuts token usage by roughly 10x compared to fetching everything upfront.
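A toy version of the index-then-drill-down flow, with invented observation data (Layer 2's timeline is omitted for brevity):

```python
# Hypothetical observation store; IDs and contents are invented for illustration.
STORE = {
    123: {"title": "WS race condition fix", "type": "bugfix",
          "detail": "Root cause: unguarded reconnect; fix: mutex around handler."},
    456: {"title": "Auth middleware notes", "type": "learning",
          "detail": "JWT verified in middleware; refresh handled client-side."},
}

def layer1_index(query: str) -> list[dict]:
    """Layer 1: ~50-100 tokens per result -- IDs, titles, and types only."""
    return [{"id": i, "title": o["title"], "type": o["type"]}
            for i, o in STORE.items() if query.lower() in o["title"].lower()]

def layer3_details(ids: list[int]) -> list[str]:
    """Layer 3: ~500-1,000 tokens per result, fetched only for chosen IDs."""
    return [STORE[i]["detail"] for i in ids]

hits = layer1_index("race")                       # cheap index pass first
print(hits)
print(layer3_details([h["id"] for h in hits]))    # expensive pass, only where needed
```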

Installation and Setup

Installation takes two commands inside a Claude Code session:

/plugin marketplace add thedotmack/claude-mem

/plugin install claude-mem

Restart Claude Code, and the plugin activates automatically. Context from your next session will be available in the session after that.

Requirements:

  • Claude Code with plugin support (latest version)
  • Node.js 18.0.0 or higher
  • Bun (auto-installed if missing)
  • A Claude Pro subscription or Anthropic API key

The plugin is licensed under AGPL-3.0, so the source code is open for inspection.

A note on the npm package: claude-mem is published on npm, but npm install -g claude-mem installs the SDK/library only. It does not register the plugin hooks or set up the worker service. Always install via the /plugin commands above.

Configuration

Settings live in ~/.claude-mem/settings.json, auto-created on first run. You can configure the AI model used for compression, the worker port, the data directory, log level, and context injection settings. Most developers never touch these defaults.

Privacy Control

You can wrap any content in <private> tags to exclude it from storage. Sensitive credentials, personal notes, anything you do not want persisted -- tag it and claude-mem will skip it.
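Conceptually, stripping private spans before storage amounts to a simple filter. This sketch assumes the tags are matched literally; the plugin's actual parsing may differ:

```python
import re

# Match <private>...</private> spans, including across line breaks.
PRIVATE = re.compile(r"<private>.*?</private>", re.DOTALL)

def strip_private(text: str) -> str:
    """Drop anything wrapped in <private> tags before it reaches storage."""
    return PRIVATE.sub("", text)

msg = "Deploy key: <private>sk-live-abc123</private> rotate monthly."
print(strip_private(msg))  # the secret never reaches the database
```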

Real-World Impact

The difference is most obvious after the first few sessions. Here are the patterns that change:

Architecture decisions stick. You explain once that your API uses a specific error-handling pattern. In the next session, Claude already knows this. No re-explanation needed. No CLAUDE.md update required.

Bug context carries over. You spend a session debugging a race condition in your WebSocket handler. Claude understands the root cause, the fix, and the edge cases. Next session, if a related issue comes up, Claude has that history available.

Coding conventions are learned, not declared. Your CLAUDE.md can specify "use named exports." But there are dozens of micro-conventions that are hard to document -- how you name test files, where you put utility functions, which error messages you prefer. claude-mem captures these from actual behavior rather than requiring explicit documentation.

Repeated questions disappear. If you have asked "what does the auth middleware do?" three times across three sessions, claude-mem recognizes this pattern. The answer gets prioritized in future context injection.

Search Your Project History

claude-mem provides MCP search tools that let Claude query your memory with natural language. The three-layer workflow:

// Step 1: Search the index
search(query="authentication bug", type="bugfix", limit=10)

// Step 2: Review results, identify relevant IDs

// Step 3: Fetch full details only for what matters
get_observations(ids=[123, 456])

This means Claude can actively look up past context instead of relying only on what was injected at session start.

How This Relates to Agent Skills and Context Engineering

If you have been following the Agent Skills ecosystem, you will notice that claude-mem solves the same fundamental problem from a different angle.

Skills (via SKILL.md files) encode static knowledge: procedures, constraints, domain expertise. You write them once, share them across projects, and the agent loads them on demand. They are excellent for codifying team standards and repeatable processes.

CLAUDE.md encodes project-level context: architecture, conventions, "always do X, never do Y." It is always injected at session start.

claude-mem adds the third dimension: dynamic, session-derived knowledge. It captures what the agent learns through actual work, not what a human anticipated and pre-wrote.

The three approaches are complementary, not competing. A well-structured setup looks like this:

| Layer | Source | Content | Persistence |
| --- | --- | --- | --- |
| CLAUDE.md | Human-written | Project conventions, architecture decisions | Static, version-controlled |
| SKILL.md | Human-written or marketplace | Reusable procedures and domain expertise | Static, shareable |
| claude-mem | Auto-generated | Session observations, learned patterns, bug history | Dynamic, auto-compressed |

For a deeper analysis of how these layers interact and compete for context window space, see Context Engineering with Skills. The key insight from that article applies directly here: every token in your context window competes with reasoning capacity. claude-mem's progressive disclosure architecture exists specifically to respect that constraint.

What makes well-designed skills effective is also what makes claude-mem effective: providing high-signal context at the right moment, not flooding the model with everything you have.

Termdock and the Continuity Stack

Session continuity has two dimensions. There is the state of your terminal environment -- which panes are open, which directories you are in, which processes are running. And there is the state of your AI's knowledge -- what it has learned about your code, your preferences, your project history.

Termdock's session recovery handles the first dimension. You close your laptop, reopen it, and your terminal layout snaps back exactly as it was. Your three-pane setup with Claude Code in the left pane, test runner in the top right, and server logs in the bottom right -- all restored.

claude-mem handles the second dimension. Claude Code itself comes back knowing what it learned in the previous session.

Together, these form a full continuity stack: your physical workspace and your AI's cognitive workspace both survive session boundaries. No re-setup, no re-explanation.

Try Termdock: session recovery works out of the box. Free download →

Limitations and Considerations

89K stars represents genuine community validation, but that does not mean you should install blindly. Here is what to evaluate:

Token cost. Every token of injected memory is a token that competes with Claude's reasoning capacity on your current task. claude-mem's progressive disclosure mitigates this significantly, but it is not free. Monitor your token usage in the first few weeks to understand the actual cost impact.

Storage is local. Your memory database lives on your machine. If you work across multiple machines, your memory does not follow you (unless you sync the database manually). This is a privacy advantage and a portability limitation.

Compression is lossy. The AI-generated summaries are good, but they are summaries. Nuanced context can get compressed away. For critical project knowledge, you still want explicit CLAUDE.md entries rather than relying on claude-mem to remember.

Privacy surface area. claude-mem stores observations about your code, your queries, and your tool usage. The <private> tag gives you control, but you need to use it actively. Review what is being stored via the web viewer at http://localhost:37777 and adjust your habits accordingly.

Dependency chain. The plugin requires Bun, uv (a Python package manager, used here for the vector-search component), and a background worker service. More moving parts means more potential failure points. The auto-installation handles most of this, but be aware of what is running on your system.

The project is actively maintained (currently at v6.5.0) and licensed under AGPL-3.0. The open-source license means you can inspect exactly what data is captured and how it is stored. For a plugin that handles your code context, that transparency matters.

What to Take Away

Persistent AI memory is not about making the model "smarter." The model is already smart. It is about making the model informed -- giving it access to the context it needs, when it needs it, without forcing you to manually manage that context every session.

claude-mem automates the capture, compression, and injection cycle. It uses progressive disclosure to stay token-efficient. It stores everything locally. And it integrates with the same lifecycle hooks that make Claude Code's plugin system powerful.

If you are already using Claude Code for daily development, claude-mem adds the one thing that static configuration files cannot provide: the knowledge that accumulates from actually doing the work.

Install it. Run a few sessions. Check the web viewer. Decide for yourself whether the memory is useful.


Ready to streamline your terminal workflow?

Multi-terminal drag-and-drop layout, workspace Git sync, built-in AI integration, AST code analysis — all in one app.

Download Termdock →
#claude-code #claude-mem #persistent-memory #context-engineering #agent-skills #plugins
