Second Brain
A personal knowledge base that holds what I read, then quizzes me on it.
What this shows: I can turn a growing pile of notes into a system that answers with sources and quizzes me on what matters. The same approach works on a company wiki or policy docs.
Everything I read becomes searchable and gets quizzed back to me. Ask across the whole vault, get answers that cite the actual note.
Every answer cites the note it came from, and no relevant note means no answer. Kale Bot runs this same pipeline live and is scored on it; this vault is young and its counts are real.
Retrieval keeps answers honest: every reply is grounded in a real note, and what I flag gets quizzed back until it sticks.
Answer grounded across the vault
Retrieval, step by step (RAG)
The vault is too big to read in one go, so a question never sees all of it, just the few notes that actually matter. That’s retrieval-augmented generation (RAG): embed everything once, pull the most relevant passages for each question, then answer only from those and cite them. Because the match is on meaning rather than exact words, a question about “forgetting curves” can surface a note that only ever said “spaced repetition”.
- 1Chunk & embed
Each note is split into passages and embedded as vectors, so meaning is searchable, not just keywords
embeddings - 2Retrieve
The question is embedded too; the closest passages are pulled, a handful, not the whole vault
top-k - 3Ground the answer
Claude answers from only those passages, so it can't drift into notes it never retrieved
grounded - 4Cite the source
Every claim links back to the note it came from, so I can check it
citations
The trade-off is real: an embedding index to keep in sync, and a retrieval step that can miss the right note. Kale Bot runs this exact pipeline live on a small public vault (and is scored on it); Second Brain is where it has to scale, across a personal vault that grows every week.
How I learn a new topic
I write rough notes in Obsidian. A Claude coworker runs on a schedule: it tidies and links what I wrote, builds recall prompts, quizzes and grades me over time, and pushes hardest on the topics I keep missing. Reading turns into recall, spaced out so it sticks.
At a glance
- 14
- Notes in the vault
- 30
- Links between notes
- 5
- Topics in rotation
- 5
- Fresh quiz questions a day
How it works
- 1
Capture
Notes, highlights, and ideas land in my Obsidian vault.
- 2
Connect
A scheduled Claude agent links each new note to what is already there.
- 3
Retrieve
I can ask questions across everything and get sourced answers.
- 4
Quiz
It quizzes me on what I flagged to remember, spaced over time.
Quiz mode
Reading turns into recall. The vault asks, I answer. A few real ones:
What is RAG, in one line?
Search first, write second: find the most relevant notes, then answer only from what was found.
Why stream answers token by token?
First words land in under a second instead of after a long pause. Same answer, no frozen screen.
What did I conclude about grounding?
Stricter retrieval beats a bigger model. Most of the hallucinations came from answering when nothing was relevant.
Brier score, in one line?
Mean squared error on probabilities. Lower is better, and it punishes confident misses.
Problem
I read and think more than I can hold in my head. I wanted one place that remembers it all, connects it, and helps it actually stick.
Approach
Obsidian is the store, plain markdown I own. Claude Cowork, an AI agent that works over my files on a schedule, sits on top: it links new notes into the graph, answers questions with retrieval so every answer cites its notes, and runs the spaced quizzing loop. Same pipeline Kale Bot runs live (embed, retrieve, ground the answer), scaled to a vault that keeps growing.
Eval results
Real and early: 14 notes, 30 links, five topics in rotation. Each morning the agent writes five fresh quiz questions from the vault and grades my answers in the evening. Retrieval is already scored for real in Kale Bot, 95% hit-rate on a 48-prompt eval; this vault is where it has to scale.
What broke
Auto-linking first connected everything to everything. Tightening the similarity threshold made the graph readable again.