
Your AI agent forgets everything between sessions.

ThoughtLayer is a memory layer that actually retrieves. Vector search, keyword search, freshness decay, importance scoring. Local SQLite, no cloud required. Works with any LLM.

$ npm install -g thoughtlayer
MIT Licensed · Local-first · Zero config

Without memory

// Every session starts from zero
context = loadAllFiles()
// 847 files × ~200 tokens each
// = 169,400 tokens per query
// = $0.42 per question
// At 10K files: breaks entirely

agent.ask("what database are we using?")
// Timeout. Context too large.

With ThoughtLayer

// Retrieves exactly what's needed
results = thoughtlayer.query("what database are we using?")
// Vector + keyword search
// 3 results × ~100 tokens
// = 300 tokens per query
// = $0.0006 per question
// Works at 10K, 100K, 1M entries

→ "Postgres with pgvector, chosen for JSON + embeddings"
  score: 0.89 (vec:0.72, fts:0.95)

Four stages. One query.

Each stage contributes a different signal. The combination outperforms any single approach.

Stage 1 · Vector Search
Semantic similarity via embeddings. Finds conceptually related knowledge.

Stage 2 · Keyword Search
FTS5 with BM25 ranking. Catches exact terms vectors miss.

Stage 3 · Score Fusion
Weighted blend of vector, keyword, freshness, and importance signals.

Stage 4 · Top-K
Ranked results with full score breakdown. Constant time.
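
As a rough illustration of Stage 3, here is what a weighted blend of those signals can look like in TypeScript. The weights, the freshness half-life, and the normalisation are invented for this example; they are not ThoughtLayer's actual parameters, and the shipped engine also folds in the rank-fusion, entity, and graph signals described further down.

// Illustrative score fusion. Weights and half-life are example values,
// not ThoughtLayer's defaults.
interface Signals {
  vec: number;        // cosine similarity from vector search, 0..1
  fts: number;        // BM25 keyword score, normalised to 0..1
  ageDays: number;    // days since the entry was last updated
  importance: number; // assigned importance weight, 0..1
}

const WEIGHTS = { vec: 0.5, fts: 0.3, freshness: 0.1, importance: 0.1 };
const HALF_LIFE_DAYS = 90; // freshness halves every ~3 months

function fuse(s: Signals): number {
  const freshness = Math.pow(0.5, s.ageDays / HALF_LIFE_DAYS);
  return WEIGHTS.vec * s.vec
       + WEIGHTS.fts * s.fts
       + WEIGHTS.freshness * freshness
       + WEIGHTS.importance * s.importance;
}

// Stage 4 is then just a sort + slice over the fused scores (Top-K).
const topK = <T extends { score: number }>(scored: T[], k = 3): T[] =>
  [...scored].sort((a, b) => b.score - a.score).slice(0, k);

With vec 0.72 and fts 0.95, plus freshness and importance near 1, a blend like this lands around 0.8-0.9, which is the shape of the score breakdown shown above.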

Benchmarked, not claimed

Public dataset. 50 entries, 240 queries (80 held-out validation). Keyword-only, no embeddings, no LLM. Reproduce it yourself.

MRR: 81.9%
Recall@3: 83.1%
Recall@1: 68.8%

Measured on the held-out validation set, keyword-only. These numbers are deliberately conservative: embeddings add roughly 5-10%, and LLM reranking another 10-15% on top.

What memory costs elsewhere

Verified March 2026. ThoughtLayer runs on your machine; you only pay for embedding API calls.

Provider        Cost             What you get
ThoughtLayer    ~$0.002/query    Unlimited entries. Runs locally. You own the data.
Mem0 Starter    $19/mo           50K memories, 5K retrievals/mo, cloud-only
Mem0 Pro        $249/mo          Unlimited memories, 50K retrievals/mo
Zep Flex        $475/mo          300K credits, cloud-only

Try it

Three commands. Real results.

Terminal
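
A minimal sketch of that session, using the install command above and two of the CLI subcommands documented below (thoughtlayer add, thoughtlayer query). The arguments shown are illustrative; what comes back depends on what you've stored.

# illustrative session; arguments and output are examples, not verbatim CLI output
$ npm install -g thoughtlayer
$ thoughtlayer add "We use Postgres with pgvector for JSON + embeddings"
$ thoughtlayer query "what database are we using?"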

What you get

Knowledge graph
Relationships extracted automatically at ingest time. 15 patterns: is-a, reports-to, manages, uses, owns. Graph traversal boosts related entries in search.
Automatic learning
Feed conversations in, get structured memories out. Decisions, facts, preferences, action items extracted without an LLM. No manual entry required.
Local-first storage
SQLite + Markdown files. No external database. Everything works offline, syncs with git, is human-readable.
LLM reranking
Optional second-stage reranker for higher precision. Supports OpenAI, Anthropic, Ollama, OpenRouter. Falls back gracefully if unavailable.
6 retrieval signals
Vector search, keyword match, entity resolution, knowledge graph, freshness decay, importance weighting. Combined via Reciprocal Rank Fusion.
6 framework integrations
MCP (Cursor, Claude Desktop, Windsurf), OpenClaw, LangChain, Vercel AI SDK, OpenAI Agents, CrewAI. Plus a full CLI and TypeScript SDK.
Entity resolution
"John" finds "John Smith, backend engineer". Partial names, aliases, and fuzzy matching are built in.
BYOLLM
Bring your own LLM. OpenAI, Anthropic, OpenRouter, Ollama. Or use nothing at all: keyword retrieval alone hits 68.8% Recall@1.
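
To make the entity-resolution card above concrete, here is a rough sketch of partial-name and alias matching. The Entity shape, the normalisation, and the matching rule are invented for illustration; this is not ThoughtLayer's built-in resolver.

// Illustrative entity resolution: map a short mention ("John") onto a stored
// entity ("John Smith, backend engineer") via aliases and token prefixes.
// These types are examples, not ThoughtLayer internals.
interface Entity {
  canonical: string;   // "John Smith, backend engineer"
  aliases: string[];   // ["John", "J. Smith", "jsmith"]
}

const norm = (s: string) => s.toLowerCase().replace(/[^a-z0-9\s]/g, ' ').trim();

function resolve(mention: string, entities: Entity[]): Entity | undefined {
  const m = norm(mention);
  return entities.find(e => {
    const names = [e.canonical, ...e.aliases].map(norm);
    // exact alias hit, or the mention is a prefix of a token in the canonical name
    return names.includes(m)
        || norm(e.canonical).split(/\s+/).some(token => token.startsWith(m));
  });
}

// resolve("John", team) picks the "John Smith, backend engineer" entry.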

Works where you work

Three integration paths. Pick the one that fits your stack.

OpenClaw

Agent framework

Native plugin: four tools registered directly into OpenClaw. Sub-millisecond queries via library API. Memory survives compaction, restarts, and session boundaries.

# One command to install
thoughtlayer openclaw-install

# Or add to openclaw config:
plugins:
  entries:
    thoughtlayer:
      source: thoughtlayer
      config:
        projectDir: /path/to/project

Four native tools: thoughtlayer_query, thoughtlayer_add, thoughtlayer_ingest, thoughtlayer_health. Uses library API directly: no CLI exec, no shell, sub-millisecond after first load.

MCP Server

Claude Desktop · Cursor · Windsurf

Six tools exposed via the Model Context Protocol. Claude and Cursor can query, add, and curate knowledge directly. Your AI assistant remembers everything you've told it across sessions.

// claude_desktop_config.json
{
  "mcpServers": {
    "thoughtlayer": {
      "command": "thoughtlayer-mcp",
      "env": {
        "THOUGHTLAYER_PROJECT_ROOT": "/path/to/project",
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}

Tools: thoughtlayer_query, thoughtlayer_add, thoughtlayer_curate, thoughtlayer_search, thoughtlayer_list, thoughtlayer_health. Resources expose all entries as browsable thoughtlayer:// URIs.

TypeScript SDK

Any Node.js agent

Programmatic API for custom agent frameworks. LangChain, AutoGen, CrewAI, or your own. Import, initialise, query.

import { ThoughtLayer } from 'thoughtlayer';

const memory = ThoughtLayer.load('.');

// Retrieve (keyword + vector + graph)
const results = await memory.query(
  "who does the CTO report to?"
);

// Learn from conversations automatically
await memory.learn(
  "What queue should we use?",
  "We decided on Kafka for replication."
);

Also works as a CLI: thoughtlayer query, thoughtlayer add, thoughtlayer curate. Same engine, different interface.

Open core. MIT licensed.

The engine is free forever. Cloud adds team features and hosted infrastructure.

Open Source
$0 forever
  • ✓ Full CLI + SDK
  • ✓ MCP server
  • ✓ Unlimited local entries
  • ✓ All retrieval features
  • ✓ BYOLLM
  • ✓ MIT licensed
Pro (coming soon)
$19/mo for individuals
  • ✓ Everything in Free
  • ✓ Cloud sync across devices
  • ✓ Hosted embeddings (no API key needed)
  • ✓ Web dashboard
  • ✓ Usage analytics
  • ✓ Priority support
Team (coming soon)
$49/mo per seat
  • ✓ Everything in Pro
  • ✓ Shared team knowledge bases
  • ✓ Role-based access control
  • ✓ Audit log
  • ✓ SSO (SAML/OIDC)
  • ✓ SLA + dedicated support