April 18, 2026
Is your AI agent hallucinating despite your detailed specs? More context isn't always better. Learn why keeping your spec files lean and delegating to specialized agents is the only way to scale spec-driven development without the "context bloat" trap.
You’ve refined your .md spec files. You’ve added every edge case, guardrail, and historical note you can think of. You expect a masterpiece. Instead, your agent starts hallucinating, ignoring core rules, and hitting a wall.
The hard truth? You’re drowning your agent in noise.
In spec-driven development, there is a dangerous temptation to provide “all the context.” But in the current landscape of LLMs, context bloat is the primary killer of agentic efficiency. If you want high-performance output, you need to stop writing documentation for humans and start architecting context for machines.
We often treat context windows like infinite buckets. But even with 100k+ token limits, LLMs still suffer from the “lost in the middle” effect: details buried in the middle of a long context are recalled less reliably than those at the start or end. When your AGENTS.md or CLAUDE.md files become bloated, the agent’s “attention” is spread too thin.
The most effective context files follow some guidelines:
Comprehensive but Lean: While 200-300 lines is often the limit for relatively complex projects, the truth is that anything over 100 lines is a danger zone. As a general rule for most LLMs:
Machine-First Readability: Sacrifice fluff. Use minimum characters and concise bullet points. If a human finds it a bit dense, that’s fine—you can always generate a separate DOCS.md for the team.
Single Source of Truth (SSOT): Clearly define which file governs the project. However, a SSOT (like AGENTS.md or CLAUDE.md) should not be a dumping ground. It should be the High-Level Protocol.
Your SSOT should contain:
Everything else belongs in reference files.
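The line budgets above are easy to check mechanically. Here is a minimal sketch, assuming you count non-blank lines and treat 100 and 300 as the two thresholds. The function names and the choice to ignore blank lines are my own assumptions, so tune them to your project.

```python
from pathlib import Path

# Thresholds taken from the guideline above; both are tunable assumptions.
DANGER_LINES = 100  # above this, the file is in the "danger zone"
HARD_LIMIT = 300    # upper end of the 200-300 line range


def classify_context(text: str) -> str:
    """Return 'ok', 'danger', or 'too_long' based on the non-blank line count."""
    count = sum(1 for line in text.splitlines() if line.strip())
    if count > HARD_LIMIT:
        return "too_long"
    if count > DANGER_LINES:
        return "danger"
    return "ok"


def check_files(paths):
    """Classify each context file that exists; missing files are skipped."""
    results = {}
    for raw in paths:
        path = Path(raw)
        if path.is_file():
            results[str(path)] = classify_context(path.read_text(encoding="utf-8"))
    return results


if __name__ == "__main__":
    for name, status in check_files(["AGENTS.md", "CLAUDE.md"]).items():
        print(f"{name}: {status}")
```

Run it as a pre-commit step and you get an early warning before the file quietly grows past the point where the agent stops paying attention.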
Instead of bloating your main file with sample code or niche implementation details, use a Reference/Skill architecture.
/docs/reference-snippets.md.
This keeps the main context lean and forces the agent to “pull” information. This “on-demand” approach prevents the agent’s active memory from being clogged with code it isn’t currently writing.
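To make the “pull” idea concrete, here is a rough sketch of a helper that returns a single section of a reference file instead of the whole thing. It assumes the reference file is Markdown organized under `#` headings. `extract_section` and `load_reference` are hypothetical names, not part of any agent tool.

```python
from pathlib import Path

FENCE = "`" * 3  # Markdown code-fence marker


def extract_section(markdown: str, heading: str) -> str:
    """Return the body under the first heading titled `heading`.

    The section ends at the next heading of the same or a higher level.
    Lines starting with '#' inside fenced code blocks are not treated as headings.
    """
    wanted = heading.strip().lower()
    level = None       # heading depth of the matched section, once found
    in_fence = False
    body = []
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        if not in_fence and line.startswith("#"):
            hashes = len(line) - len(line.lstrip("#"))
            title = line[hashes:].strip().lower()
            if level is None:
                if title == wanted:
                    level = hashes
                continue
            if hashes <= level:
                break
        if level is not None:
            body.append(line)
    return "\n".join(body).strip()


def load_reference(path: str, heading: str) -> str:
    """Read a reference file and return only the requested section."""
    return extract_section(Path(path).read_text(encoding="utf-8"), heading)
```

The agent (or your tooling) asks for one heading at a time, so only the snippet relevant to the current task ever enters the context.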
How do you know when your context is too heavy? Watch for these technical red flags:
When you see these symptoms, do not add more rules. Instead, compact the conversation. Generate a concise summary of the current state and open a fresh chat window.
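What does a “concise summary of the current state” look like? Here is one possible shape, sketched as a small formatter. The `SessionState` fields and the five-item cap are assumptions for illustration. How you collect the state (by hand, or by asking the agent to summarize before you close the chat) is up to you.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Hypothetical container for what should survive into the next chat."""
    goal: str
    decisions: list[str] = field(default_factory=list)
    open_tasks: list[str] = field(default_factory=list)
    files_touched: list[str] = field(default_factory=list)


def build_handoff(state: SessionState, max_items: int = 5) -> str:
    """Format a terse handoff note; each section keeps only its latest entries."""
    lines = [f"GOAL: {state.goal}"]
    sections = (
        ("DECISIONS", state.decisions),
        ("OPEN TASKS", state.open_tasks),
        ("FILES", state.files_touched),
    )
    for title, items in sections:
        kept = items[-max_items:]  # drop older history to stay compact
        if kept:
            lines.append(f"{title}:")
            lines.extend(f"- {item}" for item in kept)
    return "\n".join(lines)
```

Paste the resulting note as the first message of the fresh chat. Empty sections are omitted, so the handoff never carries more than it needs to.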
The fix for complex projects isn’t a longer spec file—it’s a better delegation strategy. Instead of one agent trying to juggle architecture, coding, testing, and DevOps, break the workflow apart.
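As a rough illustration of breaking the workflow apart, the sketch below routes a task to one of four roles, each with its own small context file. The roles mirror the responsibilities just listed. The file paths, keywords, and the naive keyword matching are all assumptions made for the example, not a prescribed lineup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRole:
    name: str
    context_file: str          # hypothetical path to this role's lean spec
    keywords: tuple[str, ...]  # naive routing hints, for illustration only


ROLES = (
    AgentRole("architect", "agents/architect.md", ("design", "schema", "architecture")),
    AgentRole("coder", "agents/coder.md", ("implement", "refactor", "bug")),
    AgentRole("tester", "agents/tester.md", ("test", "coverage")),
    AgentRole("devops", "agents/devops.md", ("deploy", "pipeline", "docker")),
)


def route(task: str, roles=ROLES, default: str = "coder") -> AgentRole:
    """Pick the first role whose keywords appear as whole words in the task."""
    words = set(task.lower().split())
    for role in roles:
        if words & set(role.keywords):
            return role
    # No keyword matched: fall back to the default role.
    return next(role for role in roles if role.name == default)


if __name__ == "__main__":
    chosen = route("Design the database schema")
    print(chosen.name, chosen.context_file)
```

The point is less the routing logic than the shape: each agent loads only its own context file, so no single agent carries the whole project in its head.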
For most complex builds, use a specialized ensemble, but don’t over-engineer it. A simple architecture like this works for me most of the time:
This “Art of the Lean Context” ensures your agents stay focused, performant, and—most importantly—accurate.