Cloud Brain

an ambient knowledge network for ai sessions

tldr
initial theory
what if every ai session could silently draw from a network of compiled research, yours and others'?
hypothesis
researchers waste hours re-deriving knowledge that someone else has already compiled. an mcp plugin can make retrieving that compiled knowledge ambient, with micropayments settling on megaeth.
status
paused. core infrastructure built, single-player mode validated, multiplayer blocked on cold-start density. v2 reframe + pull-lift evaluation designed as the next step.
thesis reframe
cloud brain is an amortization layer for llm research costs. value = compute cost saved + convergence cost skipped. you're buying a shortcut through iteration, not a document.
built with
almost entirely ai tools (claude code, openai codex). i wrote very little code by hand. the value i added was the product thinking and architecture decisions.
validated by
dogfooding. i used cloud brain on my own wiki across repeated claude code + openai codex sessions to test whether ambient retrieval actually improves research quality.
skills
mcp protocol, embeddings / pgvector, micropayment architecture, eval frameworks, typescript, supabase, prompt injection defense, cold-start strategy, claude code, openai codex
mcp connector
[screenshot: cloud brain mcp connector configuration, running as an mcp connector in claude desktop]
inspiration
this project started with andrej karpathy's post about maintaining a personal wiki for llm sessions. the idea: instead of starting every ai conversation from scratch, you maintain a structured markdown knowledge base that your ai can reference...
  • knowledge compounds. every session adds to the wiki, so the next session starts smarter
  • retrieval beats memorization. llms hallucinate less when they can pull from verified sources
  • multiplayer is the goal. if one person's wiki helps their sessions, a network of wikis helps everyone
the problem
  • 80% re-derivation
  • 0 cross-session sharing
  • wasted context
how it works

┌──────────────────────────────────┐
│         ai session (claude)      │  ← what the user sees
├──────────────────────────────────┤
│          mcp connector           │  ← the bridge
├──────────────────────────────────┤
│   parser → embeddings → index    │  ← the engine
├──────────────────────────────────┤
│      markdown wiki (55 files)    │  ← the source
└──────────────────────────────────┘
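the bottom of the engine layer is a similarity search over indexed embeddings. a minimal sketch of that step, assuming an in-memory array in place of the real index: the `IndexedChunk` and `Match` types, the `search` function, and the default threshold of 0.75 are all hypothetical names and values chosen for illustration, not the project's actual code.

```typescript
// hypothetical in-memory stand-in for the index layer.
// it does not call supabase or pgvector; it only illustrates the ranking step.
interface IndexedChunk {
  path: string;        // wiki file the chunk came from
  embedding: number[]; // vector produced by some embedding model
}

interface Match {
  path: string;
  similarity: number;
}

// cosine similarity: dot(a, b) / (|a| * |b|). returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// keep chunks at or above the threshold, best match first.
// the 0.75 default is an assumed value, not one taken from the project.
function search(
  query: number[],
  index: IndexedChunk[],
  threshold = 0.75,
  limit = 5,
): Match[] {
  return index
    .map((c) => ({ path: c.path, similarity: cosineSimilarity(query, c.embedding) }))
    .filter((m) => m.similarity >= threshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}
```

raising the threshold trades recall for precision: fewer chunks are pulled into the session, but the ones that are pulled sit closer to the query.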
what's built
honesty note: this table reflects actual state, not aspirational state. "works" means tested in real sessions. "mocked" means scaffolded but not connected to real infra.
component | status | details
wiki parser | works | reads any markdown folder, obsidian wikilinks, frontmatter extraction, recursive directory walking
retrieval engine | works | openai text-embedding-3-large, supabase pgvector, cosine similarity search with configurable thresholds
mcp server | works | 5 tools: search_cloud_brain, pull_knowledge, list_topics, get_document, check_freshness
quote lifecycle | works | create, approve, settle, deliver. full lifecycle for knowledge quotes with pricing logic
safety layer | works | prompt injection scanning, input sanitization, rate limiting per session
payment settlement | mocked | usdm ledger scaffolded, megaeth settlement designed but not connected to live chain
multi-contributor network | not built | contributor isolation, reputation scoring, content dedup. designed but blocked on cold-start density
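the quote lifecycle row lists four steps: create, approve, settle, deliver. one way to read that is as a strictly linear state machine. a minimal sketch under that assumption: the `Quote` shape, the state names, and the `advance` function are hypothetical, and "settled" here is only a state flag, since real settlement is mocked.

```typescript
// assumed linear lifecycle: created -> approved -> settled -> delivered.
type QuoteState = "created" | "approved" | "settled" | "delivered";

interface Quote {
  id: string;
  price: number; // hypothetical field; units are not specified in the source
  state: QuoteState;
}

// the only legal next state for each state; null marks the end of the lifecycle.
const NEXT_STATE: Record<QuoteState, QuoteState | null> = {
  created: "approved",
  approved: "settled",
  settled: "delivered",
  delivered: null,
};

function createQuote(id: string, price: number): Quote {
  return { id, price, state: "created" };
}

// returns a new quote in the requested state, or throws if the step is skipped or repeated.
function advance(quote: Quote, to: QuoteState): Quote {
  if (NEXT_STATE[quote.state] !== to) {
    throw new Error(`invalid transition: ${quote.state} -> ${to}`);
  }
  return { ...quote, state: to };
}
```

rejecting skipped steps means knowledge can never be delivered on a quote that was not first approved and settled, which is the property a paid retrieval flow would most need to hold.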
dogfooding validation
  • ambient retrieval works in single-player. claude code sessions with cloud brain produced noticeably better research outputs than sessions without it
  • context convergence is faster. instead of re-explaining project context each session, the mcp server pulls relevant prior work automatically
  • outputs feed back in. each session's results go back into the wiki, so retrieval improves over time
why paused
  • code works, product question unanswered. the technical infrastructure is solid, but the go-to-market for a knowledge network requires critical mass
  • cold-start density problem. a knowledge network with one contributor is just a personal wiki. you need enough contributors for retrieval to beat what you already know
  • micropayment ux unsolved. even if settlement works, the ux of paying per-query for knowledge retrieval is untested in real workflows
  • v2 reframe designed but not built. the pull-lift evaluation framework is the next step, shifting from "can we build it" to "can we bootstrap it"
pull-lift evaluation
pull-lift is a framework for evaluating whether a knowledge contribution actually improves downstream session quality. it measures the delta between a session with and without access to a specific piece of contributed knowledge.

  SESSION WITHOUT           SESSION WITH
  CLOUD BRAIN               CLOUD BRAIN
  ┌──────────┐              ┌──────────┐
  │ question  │              │ question  │
  │     ↓     │              │     ↓     │
  │ raw LLM   │              │ pulled    │
  │ response  │              │ context + │
  │           │              │ response  │
  └──────────┘              └──────────┘
       ↓                         ↓
   BASELINE                  LIFTED
   quality                   quality
       └────── DELTA = LIFT ──────┘
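the delta in the diagram can be sketched as a paired comparison: the same question is run with and without pulled context, and lift is the average difference. this assumes each session output already has a numeric quality score. how that score is produced is not specified here, and the `PairedRun` type and `meanLift` function are hypothetical.

```typescript
// one paired comparison: the same question answered with and without cloud brain.
interface PairedRun {
  baselineQuality: number; // score of the session without pulled context
  liftedQuality: number;   // score of the session with pulled context
}

// mean of (lifted - baseline) across all pairs; 0 when there are no pairs.
// a negative result would mean the pulled context made outputs worse.
function meanLift(runs: PairedRun[]): number {
  if (runs.length === 0) return 0;
  const total = runs.reduce(
    (sum, r) => sum + (r.liftedQuality - r.baselineQuality),
    0,
  );
  return total / runs.length;
}
```

averaging over several pairs matters because a single with/without comparison mixes the effect of the knowledge with ordinary run-to-run variation in llm output.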
concept | definition
pull | how often a piece of knowledge is retrieved across sessions
lift | measurable improvement in session output quality when knowledge is present vs. absent
pull-lift score | pull frequency × lift magnitude. ranks contributions by actual value delivered
decay | knowledge freshness penalty. older contributions score lower unless actively re-validated
reputation | contributor-level aggregate of pull-lift scores across all their contributions
  • solves cold-start pricing. new contributions are priced at cost until pull-lift data accumulates
  • aligns incentives. contributors earn more when their knowledge actually helps, not just when it's retrieved
  • enables quality curation. low pull-lift contributions get deprioritized in retrieval ranking
  • builds retention. high-reputation contributors become hard to replace
  • maps to micropayments. settlement amounts scale with pull-lift, so you pay more for knowledge that actually helps
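the scoring definitions can be sketched in a few lines. the framework only says that the score is pull frequency times lift magnitude, that older contributions score lower, and that reputation aggregates scores per contributor. the exponential half-life decay, the 90-day default, and the use of a sum for reputation are assumptions made for this sketch, as are all type and function names.

```typescript
interface Contribution {
  contributor: string;
  pulls: number;   // how often the contribution was retrieved across sessions
  lift: number;    // measured quality improvement when it is present
  ageDays: number; // days since it was written or last re-validated
}

// assumed decay shape: the multiplier halves every halfLifeDays.
function freshness(ageDays: number, halfLifeDays = 90): number {
  return Math.pow(0.5, ageDays / halfLifeDays);
}

// pull frequency x lift magnitude, discounted by the assumed freshness penalty.
function pullLiftScore(c: Contribution, halfLifeDays = 90): number {
  return c.pulls * c.lift * freshness(c.ageDays, halfLifeDays);
}

// reputation as the sum of a contributor's scores (summing is an assumption;
// the source only calls it an aggregate).
function reputation(contributions: Contribution[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const c of contributions) {
    totals.set(c.contributor, (totals.get(c.contributor) ?? 0) + pullLiftScore(c));
  }
  return totals;
}
```

because the score is a product, a contribution that is pulled often but adds no lift scores zero, which is what keeps retrieval count alone from driving payouts.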
self-assessment
dimension | score | notes
technical execution | strong | core infrastructure works end-to-end in single-player mode
product thinking | strong | identified the cold-start problem before over-investing in multiplayer
kill discipline | moderate | paused rather than killed. the thesis is valid but timing depends on contributor density
ai-assisted building | strong | built almost entirely with claude code and openai codex, validating the workflow
evaluation rigor | moderate | pull-lift framework designed but not yet validated with real multi-contributor data
market timing | uncertain | mcp ecosystem growing fast but knowledge-network demand unproven at scale
verdict
verdict: paused, not abandoned
the infrastructure works. the product question (can you bootstrap a knowledge network?) is what stopped me. i only realized this after building the initial version, which is exactly how it should work: build to learn, not build to ship.