What we've learned shipping agents into the messy reality of other people's companies. Specific, opinionated, occasionally wrong.
The agents that endure are the ones that act inside the tools the team was already using. Here's the architecture we now default to — and why we delete the UI in our first project meeting.
If you can't grade the output, you can't ship the system. A small ritual that's saved us six months of arguing about quality.
What we run, why we run it, and why no agent of ours has ever gone straight to auto-action.
It will write more of it, faster. Here's why that's actually the win — and how we tune voice without losing it.
Our heuristic for picking between LangGraph and a flat tool-calling loop. (We use the loop more than you'd think.)
What we learned from a year of triage agents: people forgive an AI that admits what it can't do.
An honest accounting of what we lose by scaling slowly. And what we'd lose more of by scaling fast.
The prompts, the rubrics, the false-positive rates, and the moment we throw it all out and grade by hand.
Why we send a one-page daily digest to every client — and why it does more for retention than any feature.
Input validation, confidence scoring, and the dead-man's-switch we put in every agent we own.
"The agents that endure are not the smartest. They're the ones whose authors stayed in the room long enough to learn what 'good' meant for that business."
No promo, no roundups. Just the writing — sent to the operators building with AI.