Grounding a Fully-Local GraphRAG Agent: An Accuracy Post-Mortem
I’ve been building cbi, a small domain-agnostic GraphRAG CLI: it ingests data into a DuckDB knowledge graph (vectors via vss, full-text via fts, graph queries via duckpgq) and exports it as a self-contained, cat-readable Open Knowledge Format bundle — markdown concept docs plus the database itself. The latest piece closes the loop: cbi agent --bundle ./some-bundle opens a chat TUI where a fully local agent answers questions about the bundle. No API keys, no cloud, no embedding server. The whole thing runs on my desk. ...
How Small Can a Local GraphRAG Agent Go? An E2B-vs-E4B Sweep
In the last post I built a fully-local GraphRAG agent (cbi agent, answering questions over a DuckDB knowledge graph with a Gemma model running on an AMD Strix Halo chip) and then turned the six-question hand-check into a repeatable harness (cbi eval) that scores answers deterministically against a ground-truth key. A harness invites the obvious question: how small can the model be before the whole thing falls apart? Smaller means faster and cheaper, and on a local box that’s the difference between snappy and sluggish. So I ran the two smallest Gemma 4 tiers head to head over a real test set and graded every answer. ...
Getting Gud with LLMs: How to Build the Intuition
I recently let Claude crawl 25 months of my own LLM tooling history and write up what it found. The result lives over here: Notes from Claude: What I Found in One User’s Data. That post is mostly what one person’s data looks like — eighty repos, 2,826 logged calls, voice memos full of profanity, the works. It’s not a how-to. People keep asking me for the how-to. So here it is. Not a list of magic incantations. Not “ten prompts that will change your life.” The operating principles I actually use when I sit down with a model, distilled from being annoyed at GPT-2 back in 2019 and shipping production code with Opus in 2026. ...
Notes from Claude: What I Found in One User's Data
Notes from Claude: I asked Claude (Opus 4.7) to look through 25 months of my own LLM tooling history — git logs, Claude Code transcripts, two llm CLI databases, my GitHub orgs — and write up what it found, from its own perspective, in response to the recurring “models are getting worse” discourse. What follows is its draft, lightly edited. — Zac I gave Anthropic 10 days. Tried to fix multiple bugs in multiple repos. Opus 4.7 just goes in circle and doesn’t do anything. ...
The Critical Window: How Hormuz Is Breaking the World's Food System
Experimental output from a ‘News Agent’. Charm Crush combined with qwen 3.6 MOE and tool use. Prompted to ‘provide an analysis of strait of hormuz as it relates to food insecurity’. April 2026 The clock is ticking. Not metaphorically. Literally. On April 7, 2026, Iran and the United States agreed to a ceasefire mediated by Pakistan. President Trump extended it indefinitely on April 22. But a ceasefire doesn’t move fertilizer. It doesn’t restart a blocked strait. And it doesn’t bring nitrogen to a field in Punjab that needs it now. ...