There are two ways to interact with AI.
The first is System 1 thinking — fast, intuitive, cheap. You type a prompt. You get a response. You react. This is how most people use AI today, and it works for most of what they need.
The second is System 2 thinking — deliberate, structural, expensive in cognitive load. You don't type prompts. You architect systems. You define what an agent should do before it exists, build constraints for decisions it hasn't faced yet, and create workflows that run without you in the room.
The shift from the first to the second is the shift that changed everything about how I work. But it took me six months, a lot of money, and one very humbling conversation with a colleague to figure out what was actually going on.
The $100-a-Day Phase
When Claude Code first came out, I burned through $100 a day in credits. Fifteen days straight. I was building everything — agents for contract analysis, property data pipelines, document automation. The excitement was intoxicating. For the first time, I could describe what I wanted and watch something get built in front of me.
But here's what nobody tells you about working with AI agents: the dopamine of seeing instant output is the enemy of building something that actually lasts.
I'd spend a session getting a workflow to work perfectly. Next session, different data, different edge case — it breaks. I'd fix the prompt. It breaks again. I'd add more context to the prompt. The prompt gets so long that Claude starts ignoring half of it. So I'd start over. This cycle repeated for weeks.
The messy reality of deploying AI inside a traditional company made everything harder. The data is never clean. I'd estimate that 80% of my agent's time — and mine — was spent cleaning, normalizing, and wrestling with data formats that no one had thought about in a decade. Rent escalation clauses in PDF contracts that scan differently depending on which regional office created them. Property measurements in three different unit systems across the same portfolio. Before the agent could do anything intelligent, it had to survive the mess.
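To make the "survive the mess" step concrete, here is a minimal sketch of the kind of normalization layer that has to sit in front of the intelligent work. All names, unit choices, and conversion factors are illustrative, not the production pipeline:

```python
# Illustrative normalization layer: three records from the "same portfolio",
# recorded in three different unit systems, reduced to one canonical unit.

UNIT_TO_SQM = {
    "sqm": 1.0,        # square meters (canonical)
    "sqft": 0.092903,  # square feet
    "ping": 3.30579,   # ping, common in East Asian property records
}

def normalize_area(value: float, unit: str) -> float:
    """Convert a property measurement to square meters."""
    key = unit.strip().lower()
    if key not in UNIT_TO_SQM:
        # Surface unknown units instead of guessing — the agent (or a human)
        # has to make this call, not the cleaning code.
        raise ValueError(f"Unknown unit: {unit!r}")
    return round(value * UNIT_TO_SQM[key], 2)

records = [(120.0, "sqm"), (1291.7, "sqft"), (36.3, "ping")]
normalized = [normalize_area(v, u) for v, u in records]
print(normalized)  # all three turn out to be the same ~120 sqm property
```

The point isn't the arithmetic. It's that nothing downstream can be trusted until every record has passed through a layer like this.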
I was treating the AI like a chat partner — re-explaining context every session, debugging prompt failures instead of building infrastructure. I was deep in System 1 thinking, convinced I was doing System 2 work.
The Joel OS Mistake
The turning point wasn't a success. It was a failure I almost didn't recognize.
Being the systems thinker I am — INTP, fractal-obsessed, the type who wants to see the whole architecture before trusting any part — I did what felt natural. I built a comprehensive system. I called it the Joel OS.
It was elaborate. A digital brain of interconnected nodes — people, projects, processes, preferences. Connected to Claude through CLAUDE.md files, skills, hooks, the full stack. It was designed to give the agent everything it could possibly need to know about how I work, what I care about, how decisions get made in my organization. The idea was: if the agent has complete context, it will make better decisions.
It took weeks to build. And it worked — for me. Barely.
Then I showed it to a colleague. She's in charge of AI compliance at our company. Her workflow is straightforward: meet with business teams, understand their requirements, fill in a business requirement document template, send it for approval, then hand off to development. It's tedious, documentation-heavy, and exactly the kind of work AI should accelerate.
We spent thirty minutes setting up the Joel OS for her. Many moving parts. By the end, I could see it in her face: this was too much. She didn't need a digital brain. She didn't need my personality profile encoded into an AI system. She didn't need nodes of interconnected information.
She needed one thing: a way to go from meeting notes to a completed BRD in the correct template, with an automatic notification to the approval chain when it was done.
I uninstalled the Joel OS from her setup that same day. Then I built something simpler: a single skill that handles the document translation, a CLAUDE.md with just enough context about the template structure, and a hook that sends a Slack notification when the document is ready for review.
It took an afternoon to build. It works every time. And it taught me the most important lesson I've learned about AI: the system doesn't need to know everything. It needs to know exactly what to do right now.
The Principle: Progressive Disclosure
There's a name for what I stumbled into. Anthropic calls it progressive disclosure, and it's the core design principle behind their entire Agent Skills architecture.
The idea comes from UX design. You don't show a user every feature on the first screen. You reveal complexity gradually — only when they need it, only as much as they need. Applied to AI agents, it means the same thing: don't front-load the agent with everything you know. Structure your knowledge in layers, and let the agent pull only what's relevant for the task at hand.
Here's how it works in practice, and why it matters more than any prompting technique you'll learn:
Layer 1: CLAUDE.md — What the agent always knows. This is your persistent context. The project's architecture. Your naming conventions. The constraints that never change. Anthropic's own best practices warn that the biggest mistake people make is over-specifying this file — "if your CLAUDE.md is too long, Claude ignores half of it because important rules get lost in the noise." The fix is ruthless pruning. If the agent already does something correctly without the instruction, delete it.
My colleague's CLAUDE.md is twelve lines. Project name, template location, output folder, approval chain contacts, and the one formatting rule that matters. That's it. The Joel OS version was four pages. The twelve-line version produces better results.
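A file of that shape might look something like this. The names and paths are invented for illustration; her actual file differs, but the discipline is the same:

```markdown
# BRD Generator

- Project: Compliance BRD automation
- Template: templates/brd_template.docx
- Output: output/brd/
- Approval chain: #ai-compliance-approvals (Slack), then the IT dev lead
- Formatting rule: every requirement gets an ID of the form BRD-<nnn>

Do not invent requirements that are not in the meeting notes.
If a template field has no source in the notes, leave it blank and flag it.
```

Everything here is a constraint the agent can't infer on its own. Anything it already does correctly doesn't earn a line.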
Layer 2: Skills — What the agent loads when relevant. Skills are folders of instructions and resources that Claude discovers and invokes automatically based on what you're doing. The magic is in the architecture: at startup, only the skill's name and description are loaded. If Claude thinks the skill is relevant, it reads the full instructions. If those instructions reference additional files, Claude reads those only when needed.
This means you can bundle unlimited specialized knowledge into a skill without paying any context cost until it's actually used. For my colleague, the skill contains the BRD template logic, the field mapping rules, and three example documents — none of which consume a single token until she actually asks the agent to generate a document.
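In Anthropic's Agent Skills format, a skill is a folder whose SKILL.md opens with a short name and description — the only part loaded at startup. A sketch of what a skill like my colleague's might look like (file and folder names are illustrative):

```text
brd-generator/
├── SKILL.md          # name + description load at startup; body reads on demand
├── template_map.md   # field mapping rules, read only when the skill runs
└── examples/         # three worked BRDs, read only if the agent needs them

--- SKILL.md frontmatter ---
name: brd-generator
description: Turn meeting notes into a completed BRD in the company template,
  with required fields mapped per template_map.md.
```

Only the frontmatter costs tokens at startup. The template logic and examples cost nothing until she actually asks for a document.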
Layer 3: Hooks — What must always happen. Hooks are deterministic triggers. Unlike skills, which rely on the agent's judgment about when to activate, hooks execute automatically at specific moments — before a tool runs, after a file is written, when a session starts, when the agent finishes its task. They're guardrails that ensure predictability.
For my colleague's workflow, one hook fires after the document is generated: it sends a notification to the approval channel. This isn't a suggestion the agent might follow. It's a rule that executes every time. That's the difference between an agent you hope will do the right thing and a system you know will.
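Claude Code hooks are configured in `.claude/settings.json`. A sketch of a hook like hers might look like the following, where `notify_approvals.sh` is a hypothetical script that posts to the Slack approval channel (the hook receives the tool call's details as JSON on stdin):

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write",
        "hooks": [
          {
            "type": "command",
            "command": "./notify_approvals.sh"
          }
        ]
      }
    ]
  }
}
```

Because this fires on the event itself, not on the agent's judgment, the notification goes out every time a document is written. No prompt can forget it.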
Layer 4: Subagents — What runs without polluting your main context. When a task requires exploration or heavy computation, spinning it off to a subagent keeps your primary session clean. The subagent inherits the CLAUDE.md, does its work in isolation, and returns only the result. This is how I handle the data cleaning problem — a subagent parses and normalizes the messy input, surfaces only the clean data to the main workflow.
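In Claude Code, subagents are defined as markdown files under `.claude/agents/`. A sketch of what a data-cleaning subagent like mine might look like (the name and wording are illustrative):

```markdown
---
name: data-normalizer
description: Parse and normalize messy property data before the main
  workflow sees it. Use proactively whenever raw input files arrive.
tools: Read, Bash
---

You clean data. Take the raw input you are given, normalize units and
field names, and return only the cleaned records — never the raw mess,
never your intermediate reasoning.
```

The main session never sees the thousand lines of parsing noise. It sees clean records, and nothing else.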
In a recent interview, the Claude Code team described CLAUDE.md as "a shared, durable memory for teams" — a way of encoding compounding knowledge that makes the agent increasingly effective for specific workflows. That's true. But the lesson I learned the hard way is that durable memory is only valuable when it's disciplined. The moment it becomes comprehensive, it becomes noise.
From Meeting Notes to Approved BRD
A compliance team's documentation workflow, rebuilt with progressive disclosure.
The Zoom In, Zoom Out Problem
There's a cognitive pattern that matters here, and I haven't seen anyone talk about it.
When you're working with AI agents, you're constantly moving between levels of abstraction. One moment you're looking at the micro level — did this specific document field populate correctly? The next, you're one layer up — is the template logic right? Then another layer — is the whole workflow architecture sound? Then another — should this process even exist, or should we redesign it entirely?
This zoom in/zoom out is the actual skill of working with agents. Not prompting. Not coding. Navigating levels of abstraction while keeping the system consistent across all of them.
The danger is getting pulled into execution. The agent produces something plausible, and the dopamine hits — "it's doing the thing!" But plausible isn't correct, and execution-level satisfaction blinds you to architectural problems. I've learned to treat decision points differently from execution points. If a subagent encounters something routine, it executes. If it encounters a decision that requires judgment, it surfaces the question up — to the main agent, then to me. That's the chain of command. The agent does the work. I make the calls.
Using a voice transcriber — SuperWhisper, in my case, though the specific tool doesn't matter — changed this dynamic completely. Instead of typing carefully constructed prompts (System 2 thinking that exhausts me), I talk through what I want (System 1 thinking) and let the transcriber convert it to text. This sounds trivial. It isn't. It shifted me from being an operator to being a manager. I direct the work. The system executes it.
What This Means for You
Don't build a Joel OS.
Build a skill for your colleague's worst Friday. The one where she's drowning in documentation. The one where the process has six steps and four handoffs and nobody's data is in the right format.
Start there. Map the workflow — not the ideal version, the actual one with all the mess. Build a CLAUDE.md with just enough context: twelve lines, not four pages. Create a skill that handles the specific transformation your team does every week. Add a hook that sends the notification everyone always forgets.
That's a system. It's not glamorous. But it works every time, and the person who uses it doesn't need to understand Claude Code or progressive disclosure or agent architecture. They need to say "generate the BRD from today's meeting notes" and get a completed document with a Slack ping to the approval chain.
Consumer, Creator, Developer
Three phases, one journey everyone takes. The cognitive shift between them is what matters.
The progression is always the same: consumer (you chat with the AI) → creator (you customize it for your workflow) → developer (you build systems for other people's workflows). That journey took me six months. With Cowork and plugins lowering the barriers, I'm watching new people compress it to weeks.
But the cognitive shift is identical regardless of speed. Stop optimizing your prompts. Start architecting your agents. The best AI system you build won't be the most complex one. It'll be the one simple enough that a zero-context agent — one who's never seen your data, never talked to your team, never heard of your company — can pick it up and do the work. Because you gave it exactly what it needed. Nothing more.
For the strategic analysis of why Anthropic's platform architecture is a Blue Ocean play against Google and Microsoft, see the companion piece: Anthropic's Blue Ocean: How a company with no moat is building the only one that matters.