
Your AI agent forgets everything between sessions.

ThoughtLayer is a memory layer that actually retrieves. Vector search, keyword search, freshness decay, importance scoring. Local SQLite, no cloud required. Works with any LLM.

$ npm install -g thoughtlayer

Without memory

// Every session starts from zero
context = loadAllFiles()
// 847 files × ~200 tokens each
// = 169,400 tokens per query
// = $0.42 per question
// At 10K files: breaks entirely

agent.ask("what database are we using?")
// Timeout. Context too large.

With ThoughtLayer

// Retrieves exactly what's needed
results = thoughtlayer.query("what database are we using?")
// Vector + keyword search
// 3 results × ~100 tokens
// = 300 tokens per query
// = $0.0006 per question
// Works at 10K, 100K, 1M entries

→ "Postgres with pgvector, chosen for JSON + embeddings"
  score: 0.89 (vec:0.72, fts:0.95)
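The arithmetic behind the two panels can be checked in a few lines of TypeScript. The token counts come from the panels above; the $2.50-per-million-token price is an illustrative assumption, not a quoted rate.

```typescript
// Rough cost model for the two approaches above.
// tokensPerQuery: total prompt tokens sent to the LLM per question.
function costPerQuery(tokensPerQuery: number, pricePerMillionTokens: number): number {
  return (tokensPerQuery / 1_000_000) * pricePerMillionTokens;
}

// Without memory: every file goes into the context window.
const fullContextTokens = 847 * 200;   // 169,400 tokens

// With retrieval: only the top results are included.
const retrievedTokens = 3 * 100;       // 300 tokens

// At an assumed $2.50 per million input tokens:
const dollarsWithout = costPerQuery(fullContextTokens, 2.5); // ≈ $0.42

// The savings ratio is independent of whatever price you assume.
const tokenSavings = fullContextTokens / retrievedTokens;    // ≈ 565×
```

The ratio is the point: retrieval sends roughly 565× fewer tokens per query, whatever your provider charges.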

Four stages. One query.

Each stage contributes a different signal. The combination outperforms any single approach.

Stage 1
Vector Search
Semantic similarity via embeddings. Finds conceptually related knowledge.
Stage 2
Keyword Search
FTS5 with BM25 ranking. Catches exact terms vectors miss.
Stage 3
Score Fusion
Weighted blend of vector, keyword, freshness, and importance signals.
Stage 4
Top-K
Ranked results with a full per-signal score breakdown. Output size stays fixed no matter how large the store grows.
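Score fusion can be sketched as a weighted sum over the four signals. The weights and field names below are illustrative placeholders, not ThoughtLayer's actual defaults.

```typescript
// Illustrative weighted score fusion over the four retrieval signals.
interface Signals {
  vec: number;        // vector similarity, 0..1
  fts: number;        // normalized BM25 keyword score, 0..1
  freshness: number;  // decays with entry age, 0..1
  importance: number; // assigned at curation time, 0..1
}

// Weights are placeholder values; a real system would tune them.
function fuse(
  s: Signals,
  w = { vec: 0.4, fts: 0.4, freshness: 0.1, importance: 0.1 }
): number {
  return w.vec * s.vec + w.fts * s.fts + w.freshness * s.freshness + w.importance * s.importance;
}

const score = fuse({ vec: 0.72, fts: 0.95, freshness: 0.8, importance: 0.9 });
```

Because each signal is normalized to 0..1 and the weights sum to 1, the fused score stays in 0..1 and a per-signal breakdown (like the vec/fts split shown earlier) falls out for free.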

Benchmarked, not claimed

Public dataset. 50 entries, 40 queries, ground-truth labels. Reproduce it yourself.

MRR: 96.3%
Recall@3: 93.8%
Precision@1: 95.0%
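For reference, the metrics above can be computed like this. A generic sketch, not the benchmark harness itself; with one relevant entry per query, Recall@k reduces to a hit rate.

```typescript
// Mean Reciprocal Rank: average of 1/rank of the first relevant result per query.
function mrr(rankedIds: string[][], relevant: Set<string>[]): number {
  const rr = rankedIds.map((ids, q) => {
    const rank = ids.findIndex(id => relevant[q].has(id));
    return rank === -1 ? 0 : 1 / (rank + 1);
  });
  return rr.reduce((a, b) => a + b, 0) / rr.length;
}

// Recall@k: fraction of queries whose top-k results contain a relevant entry.
function recallAtK(rankedIds: string[][], relevant: Set<string>[], k: number): number {
  const hits = rankedIds.filter((ids, q) => ids.slice(0, k).some(id => relevant[q].has(id)));
  return hits.length / rankedIds.length;
}
```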

What memory costs elsewhere

Verified March 2026. ThoughtLayer runs on your machine; you only pay for embedding API calls.

Provider        Cost            What you get
ThoughtLayer    ~$0.002/query   Unlimited entries. Runs locally. You own the data.
Mem0 Starter    $19/mo          50K memories, 5K retrievals/mo, cloud-only
Mem0 Pro        $249/mo         Unlimited memories, 50K retrievals/mo
Zep Flex        $475/mo         300K credits, cloud-only

Try it

Three commands. Real results.

$ thoughtlayer init
$ thoughtlayer add --domain decisions --title "Database Choice" \
    "Team decided on Postgres with pgvector for the v2 rewrite."
$ thoughtlayer query "what database are we using"

What you get

Local-first storage
SQLite + Markdown files. No external database. Everything works offline, syncs with git, and stays human-readable.
MCP server
Works with Claude Desktop, Cursor, Windsurf, and any Model Context Protocol client. Six tools, zero configuration.
BYOLLM
Bring your own LLM for knowledge extraction. OpenAI, Anthropic, OpenRouter, local models via Ollama. We never lock you in.
Auto-curate
Feed in conversations, meeting notes, or code reviews. The LLM extracts and structures knowledge automatically. Memory that compounds.
Freshness decay
Recent knowledge ranks higher by default. Old knowledge fades gracefully, never disappears. Configurable half-life.
TypeScript SDK
Full programmatic API. init, add, curate, query, search, list, health. Everything the CLI does, your code can do.
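The freshness decay described above is standard exponential half-life decay; this sketch assumes that form, with the half-life as the configurable parameter (the actual default is not specified here).

```typescript
// Hedged sketch of the freshness signal: exponential half-life decay.
// After halfLifeDays, an entry's freshness contribution halves;
// it approaches zero asymptotically but never disappears entirely.
function freshness(ageDays: number, halfLifeDays: number): number {
  return Math.pow(0.5, ageDays / halfLifeDays);
}

const today = freshness(0, 30);    // 1.0: brand-new entries rank at full strength
const oneHalf = freshness(30, 30); // 0.5: one half-life old
const old = freshness(90, 30);     // 0.125: three half-lives old, still retrievable
```

A longer half-life suits slow-moving knowledge like architecture decisions; a shorter one suits fast-moving context like sprint notes.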

Works where you work

Three integration paths. Pick the one that fits your stack.

OpenClaw

Agent framework

Persistent memory across agent sessions. Agents query ThoughtLayer for context before handling tasks, and auto-curate knowledge from conversations. Memory survives compaction, restarts, and session boundaries.

# In your agent's workspace
thoughtlayer init
thoughtlayer add --domain decisions \
  --title "Database Choice" \
  "Team decided on Postgres with pgvector \
for the v2 rewrite. No more MongoDB."

# Agent queries before responding
thoughtlayer query "what database are we using"
→ Database Choice (score: 0.91)

Drop bin/thoughtlayer-query into your agent workspace. Agents call it when they need context for a task, not on every turn. Saves tokens, returns only what's relevant.

MCP Server

Claude Desktop · Cursor · Windsurf

Six tools exposed via the Model Context Protocol. Claude and Cursor can query, add, and curate knowledge directly. Your AI assistant remembers everything you've told it across sessions.

// claude_desktop_config.json
{
  "mcpServers": {
    "thoughtlayer": {
      "command": "thoughtlayer-mcp",
      "env": {
        "THOUGHTLAYER_PROJECT_ROOT": "/path/to/project",
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}

Tools: thoughtlayer_query, thoughtlayer_add, thoughtlayer_curate, thoughtlayer_search, thoughtlayer_list, thoughtlayer_health. Resources expose all entries as browsable thoughtlayer:// URIs.

TypeScript SDK

Any Node.js agent

Programmatic API for custom agent frameworks. LangChain, AutoGen, CrewAI, or your own. Import, initialise, query.

import { ThoughtLayer } from 'thoughtlayer';

const memory = ThoughtLayer.load('.');

// Retrieve context for a task
const results = await memory.query(
  "current infrastructure setup"
);

// Auto-curate from conversation
await memory.curate(
  "Team decided we're using Postgres \
for the new API backend."
);

Also works as a CLI: thoughtlayer query, thoughtlayer add, thoughtlayer curate. Same engine, different interface.

Open core. MIT licensed.

The engine is free forever. Cloud adds team features and hosted infrastructure.

Open Source
$0
forever
  • ✓ Full CLI + SDK
  • ✓ MCP server
  • ✓ Unlimited local entries
  • ✓ All retrieval features
  • ✓ BYOLLM
  • ✓ MIT licensed
Get Started
COMING SOON
Pro
$19/mo
for individuals
  • ✓ Everything in Free
  • ✓ Cloud sync across devices
  • ✓ Hosted embeddings (no API key needed)
  • ✓ Web dashboard
  • ✓ Usage analytics
  • ✓ Priority support
COMING SOON
Team
$49/mo
per seat
  • ✓ Everything in Pro
  • ✓ Shared team knowledge bases
  • ✓ Role-based access control
  • ✓ Audit log
  • ✓ SSO (SAML/OIDC)
  • ✓ SLA + dedicated support