
Your AI agent forgets everything between sessions.

ThoughtLayer is a memory layer that actually retrieves. Vector search, keyword search, freshness decay, importance scoring. Local SQLite, no cloud required. Works with any LLM.

$ npm install -g thoughtlayer

Without memory

// Every session starts from zero
context = loadAllFiles()
// 847 files × ~200 tokens each
// = 169,400 tokens per query
// = $0.42 per question
// At 10K files: breaks entirely

agent.ask("what database are we using?")
// Timeout. Context too large.

With ThoughtLayer

// Retrieves exactly what's needed
results = thoughtlayer.query("what database are we using?")
// Vector + keyword search
// 3 results × ~100 tokens
// = 300 tokens per query
// = $0.0006 per question
// Works at 10K, 100K, 1M entries

→ "Postgres with pgvector, chosen for JSON + embeddings"
  score: 0.89 (vec:0.72, fts:0.95)
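The arithmetic behind the two panels can be checked in a few lines of TypeScript. The token counts come from the panels above; the $2.50-per-million-token price is an illustrative assumption, not a quoted rate.

```typescript
// Rough cost model for the two approaches above.
// tokensPerQuery: total prompt tokens sent to the LLM per question.
function costPerQuery(tokensPerQuery: number, pricePerMillionTokens: number): number {
  return (tokensPerQuery / 1_000_000) * pricePerMillionTokens;
}

// Without memory: every file goes into the context window.
const fullContextTokens = 847 * 200;   // 169,400 tokens

// With retrieval: only the top results are included.
const retrievedTokens = 3 * 100;       // 300 tokens

// At an assumed $2.50 per million input tokens:
const dollarsWithout = costPerQuery(fullContextTokens, 2.5); // ≈ $0.42

// The savings ratio is independent of whatever price you assume.
const tokenSavings = fullContextTokens / retrievedTokens;    // ≈ 565×
```

The ratio is the point: retrieval sends roughly 565× fewer tokens per query, whatever your provider charges.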

Four stages. One query.

Each stage contributes a different signal. The combination outperforms any single approach.

Stage 1
Vector Search
Semantic similarity via embeddings. Finds conceptually related knowledge.
Stage 2
Keyword Search
FTS5 with BM25 ranking. Catches exact terms vectors miss.
Stage 3
Score Fusion
Weighted blend of vector, keyword, freshness, and importance signals.
Stage 4
Top-K
Ranked results with a full per-signal score breakdown. Output size stays fixed no matter how large the store grows.
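Score fusion can be sketched as a weighted sum over the four signals. The weights and field names below are illustrative placeholders, not ThoughtLayer's actual defaults.

```typescript
// Illustrative weighted score fusion over the four retrieval signals.
interface Signals {
  vec: number;        // vector similarity, 0..1
  fts: number;        // normalized BM25 keyword score, 0..1
  freshness: number;  // decays with entry age, 0..1
  importance: number; // assigned at curation time, 0..1
}

// Weights are placeholder values; a real system would tune them.
function fuse(
  s: Signals,
  w = { vec: 0.4, fts: 0.4, freshness: 0.1, importance: 0.1 }
): number {
  return w.vec * s.vec + w.fts * s.fts + w.freshness * s.freshness + w.importance * s.importance;
}

const score = fuse({ vec: 0.72, fts: 0.95, freshness: 0.8, importance: 0.9 });
```

Because each signal is normalized to 0..1 and the weights sum to 1, the fused score stays in 0..1 and a per-signal breakdown (like the vec/fts split shown earlier) falls out for free.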

Benchmarked, not claimed

Public dataset. 50 entries, 40 queries, ground-truth labels. Reproduce it yourself.

MRR: 96.3%
Recall@3: 93.8%
Precision@1: 95.0%
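For reference, the metrics above can be computed like this. A generic sketch, not the benchmark harness itself; with one relevant entry per query, Recall@k reduces to a hit rate.

```typescript
// Mean Reciprocal Rank: average of 1/rank of the first relevant result per query.
function mrr(rankedIds: string[][], relevant: Set<string>[]): number {
  const rr = rankedIds.map((ids, q) => {
    const rank = ids.findIndex(id => relevant[q].has(id));
    return rank === -1 ? 0 : 1 / (rank + 1);
  });
  return rr.reduce((a, b) => a + b, 0) / rr.length;
}

// Recall@k: fraction of queries whose top-k results contain a relevant entry.
function recallAtK(rankedIds: string[][], relevant: Set<string>[], k: number): number {
  const hits = rankedIds.filter((ids, q) => ids.slice(0, k).some(id => relevant[q].has(id)));
  return hits.length / rankedIds.length;
}
```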

What memory costs elsewhere

Verified March 2026. ThoughtLayer runs on your machine; you only pay for embedding API calls.

Provider        Cost            What you get
ThoughtLayer    ~$0.002/query   Unlimited entries. Runs locally. You own the data.
Mem0 Starter    $19/mo          50K memories, 5K retrievals/mo, cloud-only
Mem0 Pro        $249/mo         Unlimited memories, 50K retrievals/mo
Zep Flex        $475/mo         300K credits, cloud-only

Try it

Three commands. Real results.

$ thoughtlayer init
$ thoughtlayer add --domain decisions --title "Database Choice" \
    "Team decided on Postgres with pgvector for the v2 rewrite."
$ thoughtlayer query "what database are we using"

What you get

Local-first storage
SQLite + Markdown files. No external database. Everything works offline, syncs with git, and stays human-readable.
MCP server
Works with Claude Desktop, Cursor, Windsurf, and any Model Context Protocol client. Six tools, zero configuration.
BYOLLM
Bring your own LLM for knowledge extraction. OpenAI, Anthropic, OpenRouter, local models via Ollama. We never lock you in.
Auto-curate
Feed in conversations, meeting notes, or code reviews. The LLM extracts and structures knowledge automatically. Memory that compounds.
Freshness decay
Recent knowledge ranks higher by default. Old knowledge fades gracefully, never disappears. Configurable half-life.
TypeScript SDK
Full programmatic API. init, add, curate, query, search, list, health. Everything the CLI does, your code can do.
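The freshness decay described above is standard exponential half-life decay; this sketch assumes that form, with the half-life as the configurable parameter (the actual default is not specified here).

```typescript
// Hedged sketch of the freshness signal: exponential half-life decay.
// After halfLifeDays, an entry's freshness contribution halves;
// it approaches zero asymptotically but never disappears entirely.
function freshness(ageDays: number, halfLifeDays: number): number {
  return Math.pow(0.5, ageDays / halfLifeDays);
}

const today = freshness(0, 30);    // 1.0: brand-new entries rank at full strength
const oneHalf = freshness(30, 30); // 0.5: one half-life old
const old = freshness(90, 30);     // 0.125: three half-lives old, still retrievable
```

A longer half-life suits slow-moving knowledge like architecture decisions; a shorter one suits fast-moving context like sprint notes.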

Works where you work

Three integration paths. Pick the one that fits your stack.

OpenClaw

Agent framework

Persistent memory across agent sessions. Agents query ThoughtLayer for context before handling tasks, and auto-curate knowledge from conversations. Memory survives compaction, restarts, and session boundaries.

# In your agent's workspace
thoughtlayer init
thoughtlayer add --domain decisions \
  --title "Database Choice" \
  "Team decided on Postgres with pgvector \
for the v2 rewrite. No more MongoDB."

# Agent queries before responding
thoughtlayer query "what database are we using"
→ Database Choice (score: 0.91)

Drop bin/thoughtlayer-query into your agent workspace. Agents call it when they need context for a task, not on every turn. Saves tokens, returns only what's relevant.

MCP Server

Claude Desktop · Cursor · Windsurf

Six tools exposed via the Model Context Protocol. Claude and Cursor can query, add, and curate knowledge directly. Your AI assistant remembers everything you've told it across sessions.

// claude_desktop_config.json
{
  "mcpServers": {
    "thoughtlayer": {
      "command": "thoughtlayer-mcp",
      "env": {
        "THOUGHTLAYER_PROJECT_ROOT": "/path/to/project",
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}

Tools: thoughtlayer_query, thoughtlayer_add, thoughtlayer_curate, thoughtlayer_search, thoughtlayer_list, thoughtlayer_health. Resources expose all entries as browsable thoughtlayer:// URIs.

TypeScript SDK

Any Node.js agent

Programmatic API for custom agent frameworks. LangChain, AutoGen, CrewAI, or your own. Import, initialise, query.

import { ThoughtLayer } from 'thoughtlayer';

const memory = ThoughtLayer.load('.');

// Retrieve context for a task
const results = await memory.query(
  "current infrastructure setup"
);

// Auto-curate from conversation
await memory.curate(
  "Team decided we're using Postgres \
for the new API backend."
);

Also works as a CLI: thoughtlayer query, thoughtlayer add, thoughtlayer curate. Same engine, different interface.

Open core. MIT licensed.

The engine is free forever. Cloud adds team features and hosted infrastructure.

Open Source
$0
forever
  • ✓ Full CLI + SDK
  • ✓ MCP server
  • ✓ Unlimited local entries
  • ✓ All retrieval features
  • ✓ BYOLLM
  • ✓ MIT licensed
Get Started
COMING SOON
Pro
$19/mo
for individuals
  • ✓ Everything in Free
  • ✓ Cloud sync across devices
  • ✓ Hosted embeddings (no API key needed)
  • ✓ Web dashboard
  • ✓ Usage analytics
  • ✓ Priority support
COMING SOON
Team
$49/mo
per seat
  • ✓ Everything in Pro
  • ✓ Shared team knowledge bases
  • ✓ Role-based access control
  • ✓ Audit log
  • ✓ SSO (SAML/OIDC)
  • ✓ SLA + dedicated support