MemPalace/mempalace: local-first AI memory that stores verbatim, not summaries

mempalace is an open-source memory system for AI agents that makes two deliberate choices: it is local-first (no API keys, runs on your machine) and it stores conversation history verbatim rather than summarizing it, then retrieves with semantic search. The README leads with a benchmark number, and the most interesting thing about this project is not the number itself but how carefully it has corrected its own claims. This page covers the design, the install, and that honesty record, which is rare enough to be a selling point.

The design

The metaphor is a memory palace, and it is load-bearing:

A Palace is the whole memory store.
Wings hold people and projects.
Rooms hold topics.
Drawers hold the original verbatim content.
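
To make the nesting concrete, here is a minimal sketch of the hierarchy as plain Python dataclasses. It is an illustration only, not mempalace's actual schema: the class names mirror the metaphor, while the `store` method and every field name are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Drawer:
    """Holds one piece of original content, stored verbatim."""
    text: str


@dataclass
class Room:
    """A topic; groups the drawers filed under it."""
    topic: str
    drawers: list[Drawer] = field(default_factory=list)


@dataclass
class Wing:
    """A person or project; groups rooms by topic."""
    name: str
    rooms: dict[str, Room] = field(default_factory=dict)


@dataclass
class Palace:
    """The whole memory store."""
    wings: dict[str, Wing] = field(default_factory=dict)

    def store(self, wing: str, room: str, text: str) -> Drawer:
        # Hypothetical helper: create the wing and room on first use,
        # then file the text unchanged (no summarizing step).
        w = self.wings.setdefault(wing, Wing(wing))
        r = w.rooms.setdefault(room, Room(room))
        drawer = Drawer(text)
        r.drawers.append(drawer)
        return drawer


palace = Palace()
palace.store("myapp", "api-design", "We switched to GraphQL to cut round trips.")
print(palace.wings["myapp"].rooms["api-design"].drawers[0].text)
```

The point of the sketch is that the drawer keeps the text exactly as written; wings and rooms only decide where it is filed.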

On top of that sit a knowledge graph (entity-relationship edges, with time-validity windows and decay dynamics) and a set of 29 MCP tools that expose palace reads and writes, graph operations, and per-agent diaries to your agent. Storage is pluggable: ChromaDB by default, with sqlite-exact, Qdrant, and pgvector options. As of v3.3.6 (2026-06) the default embedding model moved to a multilingual one, which sharply improves cross-language retrieval.
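
As a rough picture of what "time-validity windows and decay dynamics" can mean for a graph edge, the sketch below models one edge with a validity interval and an exponential decay weight. Everything here is assumed for illustration: the `Edge` class, its field names, and the half-life formula are not taken from mempalace, whose real decay rule may differ.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Edge:
    subject: str
    relation: str
    obj: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means "still true"

    def is_valid(self, at: datetime) -> bool:
        # The edge holds from valid_from (inclusive) to valid_to (exclusive).
        if at < self.valid_from:
            return False
        return self.valid_to is None or at < self.valid_to

    def weight(self, at: datetime, half_life_days: float = 30.0) -> float:
        # Assumed decay rule: weight halves every half_life_days after
        # valid_from, and is zero outside the validity window.
        if not self.is_valid(at):
            return 0.0
        age_days = (at - self.valid_from).total_seconds() / 86400
        return 0.5 ** (age_days / half_life_days)


edge = Edge("myapp", "uses", "GraphQL",
            valid_from=datetime(2026, 1, 1),
            valid_to=datetime(2026, 3, 1))
print(edge.weight(datetime(2026, 1, 31)))  # 30 days old, inside the window
print(edge.weight(datetime(2026, 3, 2)))   # after valid_to
```

A window answers "was this true at time T?", while decay answers "how much should an old but still valid fact count?"; keeping them separate lets a query filter on one and rank on the other.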

Install

# Recommended, isolated
uv tool install mempalace
mempalace init ~/projects/myapp

# Or pipx / pip
pipx install mempalace

mempalace mine ~/projects/myapp                 # ingest project files
mempalace mine ~/.claude/projects/ --mode convos  # ingest Claude Code conversations
mempalace search "why did we switch to GraphQL"
mempalace wake-up                               # load context

A Docker image runs it as an MCP server, and pip install "mempalace[pgvector]" adds the Postgres backend (quote the brackets so shells such as zsh do not treat them as a glob pattern).

The honesty record, and why it matters

Most projects inflate benchmarks. mempalace has publicly walked its own claims back, and that is the strongest reason to trust it. Per its HISTORY and changelog: an early version compared its retrieval recall (R@5) against competitors’ end-to-end QA accuracy, two metrics that are not comparable. After a community audit it rewrote the tables to separate the two, and it now states 96.6% R@5 on LongMemEval in raw mode (pure semantic, no LLM calls), with a hybrid pipeline reaching higher on a limited set. It retracted a “100%” claim that had been reached by inspecting specific wrong answers (teaching to the test), and it retracted an overstated compression headline. The results are reproducible from committed JSONL files. So when the description says “best-benchmarked,” read it as a self-claim, but one the project has been unusually rigorous about qualifying.
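
To see why retrieval recall and end-to-end QA accuracy cannot share a table column, it helps to compute both on toy data. The sketch below uses the textbook definition of recall@k (fraction of relevant items found in the top k) and a simple exact-match accuracy; the benchmark's own scoring scripts may define either metric differently, and the ids and answers are made up for illustration.

```python
def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the relevant items that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    hits = relevant & set(ranked_ids[:k])
    return len(hits) / len(relevant)


def qa_accuracy(predicted, gold):
    """Fraction of final answers that match the gold answer (case-insensitive)."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    correct = sum(p.strip().lower() == g.strip().lower()
                  for p, g in zip(predicted, gold))
    return correct / len(gold)


# Toy retrieval results: the relevant drawer "d1" is in the top 5 for the
# first query but only at rank 6 for the second.
queries = [
    {"ranked": ["d3", "d7", "d1", "d9", "d4", "d2"], "relevant": ["d1"]},
    {"ranked": ["d5", "d6", "d8", "d0", "d2", "d1"], "relevant": ["d1"]},
]
mean_recall = sum(recall_at_k(q["ranked"], q["relevant"]) for q in queries) / len(queries)

# Toy end-to-end answers: a downstream model can still answer wrongly even
# when the right drawer was retrieved.
accuracy = qa_accuracy(["REST", "SOAP"], ["GraphQL", "gRPC"])

print(mean_recall, accuracy)
```

Recall@k scores only whether the right memory was surfaced; QA accuracy also depends on the model that reads it. A high value of one says little about the other.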

When it fits, and when it does not

It fits agents and workflows that need durable, local memory without sending data to a cloud, and anyone who wants verbatim recall rather than lossy summaries. It fits less well if you want a managed, hosted memory service with zero ops (a cloud product needs less setup), or if you cannot tolerate the rough edges of a fast-moving local-first project (see the gotchas from the issue tracker below).

How it compares

Project	Approach	Stars (2026-06)
MemPalace/mempalace	Local-first, verbatim, semantic + graph	~55k
mem0ai/mem0	AI memory, cloud option	~58k
letta-ai/letta (ex-MemGPT)	Agent framework with long-term memory	~23k

The cleanest contrast is with mem0: similar mission, but mempalace is local-first and verbatim while mem0 offers a hosted path. mempalace deliberately declines head-to-head benchmark comparisons with mem0 and others, on the grounds that they publish different metrics on different splits. letta is more of an agent framework whose memory is token-compression based.

Gotchas from the issue tracker

It is local-first and fast-moving, so the rough edges are operational:

A ChromaDB compaction error (Rust u64 incompatible with SQL BLOB) could fail all write tools (#1714).
HNSW index files could grow unbounded from idempotency issues (#1712), and a self-repair routine sometimes quarantined healthy small indexes as corrupt (#1716).
A single MCP write could fragment into multiple chunk rows, breaking get/delete by returned id (#1763).

The throughline: the retrieval idea is solid; the storage-engine plumbing (ChromaDB, HNSW, FTS5 rebuilds) is where care is needed. Back up your palace before upgrading or repairing, and run repair only when you know what it is about to change.
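
One low-tech way to act on the backup advice is a timestamped archive of the palace directory. The sketch below is generic Python, not a mempalace feature: `backup_directory` is a hypothetical helper, and the location of your palace data is an assumption you need to check against your own configuration. Stopping any running mempalace process first is a sensible precaution, so the storage files are not copied mid-write.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def backup_directory(palace_dir: Path, backup_root: Path) -> Path:
    """Write a timestamped .tar.gz of palace_dir into backup_root."""
    if not palace_dir.is_dir():
        raise FileNotFoundError(f"not a directory: {palace_dir}")
    # Keep backup_root outside palace_dir, or the archive would include itself.
    backup_root.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    base = backup_root / f"palace-{stamp}"
    # make_archive appends the extension and returns the archive path.
    archive = shutil.make_archive(str(base), "gztar", root_dir=palace_dir)
    return Path(archive)


# Demo on a throwaway directory; point palace_dir at your real data instead.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "palace"
    src.mkdir()
    (src / "example.txt").write_text("verbatim content")
    archive = backup_directory(src, Path(tmp) / "backups")
    print(archive.name)
```

Restoring is the reverse: unpack the archive into an empty directory and point mempalace at it, which also gives you a known-good copy to compare against after a repair run.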

FAQ

Is mempalace free? Yes. mempalace is MIT-licensed, local-first, and runs without API keys, so there is no per-call cost in raw mode.

Does mempalace send my data to the cloud? No. mempalace is local-first by design and stores data on your machine; raw-mode retrieval makes zero API calls.

What storage backends does mempalace support? ChromaDB by default, plus sqlite-exact for local exact search, and Qdrant or pgvector for larger or shared deployments.

Is mempalace better than mem0? They make different choices: mempalace is local-first and verbatim, mem0 offers a hosted path. mempalace deliberately declines head-to-head benchmark claims, citing incompatible metrics, so pick by whether you want local control or a managed service.

For the token side of agent efficiency see chopratejas/headroom; for memory used inside a broader agent workflow see obra/superpowers.
