What’s your working theory?
We should build a software factory model for software engineering. Not only does it make economic sense; it is a bounded unit that architects and CFOs can reason about. Centralizing into a factory model also provides exactly the fuel a continuously learning, "ambient" factory needs: endless, often easily labeled data, built-in governance, and the network effects that keep reinforcing loops spinning.
There are a lot of “if” conditions between here and there. It will take hard work – but the trajectory is clear, and we can build structural advantage now.
Context Switching
The standard workflow for agentic AI looks something like this: an agent runs, hits something uncertain, and pauses to ask a human. The industry has formalized this pattern under the label “human-in-the-loop” (HITL), and most teams treat these interrupts as a safety feature. They’re not wrong about the safety part – the presence of people in the system is a feature, not a bug. But we can build guardrails and teach the systems to do far better than they do today.
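To make the pattern concrete, here is a minimal sketch of that loop. The propose() step stands in for a real agent and everything in it is hypothetical; the point is the synchronous pause.

```python
# The standard HITL loop in miniature; propose() is a hypothetical stand-in
# for a real agent step. The input() call is the synchronous interrupt.
def propose(task: str) -> tuple[str, float]:
    """Hypothetical agent step: returns a proposed action and a confidence score."""
    return f"patch for {task}", 0.55


def hitl_step(task: str, threshold: float = 0.8) -> str:
    action, confidence = propose(task)
    if confidence < threshold:
        # Synchronous interrupt: a human must context-switch to answer.
        answer = input(f"Agent is unsure about '{action}'. Proceed? [y/n] ")
        if answer.strip().lower() != "y":
            return "aborted"
    return f"applied {action}"
```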
What if we keep pulling at this thread? Most interrupts carry signal. They tell you something specific: this is the place where the model, the system's context, the reasoning, or the tooling missed the mark.
When a coding agent asks you "should I use pattern A or pattern B?", it is telling you that it doesn't have enough architectural context to decide. When a deployment agent asks "proceed with rollout?", it might mean that the risk model is incomplete. Is this a permanent and inevitable cost of using AI for codegen?
Why This Is Hard (But Not Impossible)
I said there are a lot of conditions stacked here. Let me be specific about what they are:
The models need to be smart enough. They have to reason about tradeoffs, understand organizational context, and make judgment calls that align with team norms. We’re not fully there today, but the rate of improvement is steep. Gartner predicts that by 2029, agentic AI will autonomously resolve 80% of common service issues without human intervention. That’s a three-year window from “mostly needs help” to “mostly doesn’t.”
The context needs to be rich enough. This is where most teams seem to come up short. You can’t expect an agent to make architectural decisions if your architectural decisions aren’t written down. You can’t expect it to follow team conventions if those conventions live only in people’s heads. What would an “onboarding process” for agents include?
Context engineering, getting the right information to the right agent at the right time, is going to be one of the highest-leverage practices we invest in.
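As a rough illustration of what that could look like in practice, here is a sketch of a context-assembly step that runs before an agent sees a task. The directory layout, the AgentContext fields, and the assemble_context helper are all assumptions, not a specific tool.

```python
# Hypothetical context-assembly step: gather written-down knowledge so the
# agent can decide on its own. Directory names and fields are illustrative.
from dataclasses import dataclass
from pathlib import Path


@dataclass
class AgentContext:
    task: str
    decisions: list[str]      # relevant ADRs, verbatim
    conventions: list[str]    # team coding standards
    policies: list[str]       # deployment and risk policies


def assemble_context(task: str, repo_root: Path) -> AgentContext:
    """Collect the externalized knowledge an agent needs to decide without asking."""
    def read_all(subdir: str) -> list[str]:
        folder = repo_root / subdir
        if not folder.is_dir():
            return []
        return [p.read_text() for p in sorted(folder.glob("*.md"))]

    return AgentContext(
        task=task,
        decisions=read_all("docs/adr"),
        conventions=read_all("docs/conventions"),
        policies=read_all("docs/policies"),
    )


# The bundle is injected into the agent's prompt or tool context, so
# "should I use pattern A or pattern B?" is answered before it is asked.
ctx = assemble_context("refactor the payment service", Path("."))
```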
The orchestration needs to be robust enough. The infrastructure to support autonomous operation at scale, including governance, observability, and rollback, is still being built. Gartner predicts that over 40% of agentic AI projects will be canceled by the end of 2027 due to escalating costs, unclear business value, or inadequate risk controls. The projects that survive will be the ones that built the plumbing right, and likely the ones whose companies started from a strong data foundation.
The autonomous path needs to succeed often enough. Let’s talk about measuring agents. If an interrupt is at once the most expensive thing that can happen to an engineer (a context switch) and often preventable, then we should measure it and invest in putting downward pressure on it, to the point where human review becomes the exception: toward zero interrupts.
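A minimal sketch of what measuring this could mean, assuming each agent run logs a day, an interrupt count, and a completion flag; the field names are invented.

```python
# Hypothetical interrupt-rate KPI: interrupts per completed agent task per day,
# tracked over time so the team can apply sustained downward pressure.
from collections import defaultdict
from datetime import date


def interrupt_rate(runs: list[dict]) -> dict[date, float]:
    """runs: one dict per agent run, e.g. {"day": date(2025, 6, 2), "interrupts": 2, "completed": True}."""
    interrupts: dict[date, int] = defaultdict(int)
    completed: dict[date, int] = defaultdict(int)
    for run in runs:
        interrupts[run["day"]] += run["interrupts"]
        completed[run["day"]] += 1 if run["completed"] else 0
    return {d: interrupts[d] / completed[d] for d in completed if completed[d] > 0}
```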
Distracted Driving
Every time an agent pauses to ask a human a question, that human has to load the agent’s context into their own working memory, make a decision, and then return to whatever they were doing before. For a single agent, this is manageable. For a team running five, ten, or twenty agents in parallel, humans become the bottleneck. Merge queues grow. QA cycles slow the automation down. The cost of oversight scales linearly with the number of agents while AI output scales exponentially.
Deloitte’s 2025 research shows where this is heading: only 11% of organizations have agentic AI actively in production, while 42% are still developing their strategy. The gap between “piloting” and “production” is almost entirely an interrupt management problem. Teams can’t scale agents if every agent needs a human babysitter.
The math gets worse as agents get more capable. If your agents are doing ten times more work but still interrupt you at the same rate per task, you just bought yourself ten times more interrupts. Agentic productivity without interrupt reduction is a net-negative for the humans in the system and for the company.
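The back-of-the-envelope version of that math, with made-up numbers:

```python
# Illustrative numbers only: ten times the agent throughput at an unchanged
# interrupt rate means ten times the human context switches per week.
tasks_per_week = 40        # one agent's throughput
interrupts_per_task = 0.3  # unchanged as agents scale
agents = 10

before = tasks_per_week * interrupts_per_task           # 12 interrupts per week
after = agents * tasks_per_week * interrupts_per_task   # 120 interrupts per week
```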
What Actually Reduces Interrupts
If you accept the framing that most interrupts are avoidable, then the engineering response is clear: track them, categorize them, and reduce them systematically.
Make context explicit and machine-readable. Architectural Decision Records (ADRs), team coding standards, deployment policies, risk tolerances. Write them down. Make them structured. Give agents access to them. Every decision that lives only in someone’s head is a future interrupt.
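One sketch of what “structured” could mean here; the DecisionRecord schema and the ADR contents are invented for illustration.

```python
# Hypothetical machine-readable ADR record an agent can query instead of
# interrupting a human with "pattern A or pattern B?".
from dataclasses import dataclass, field


@dataclass
class DecisionRecord:
    id: str
    decision: str
    applies_to: list[str] = field(default_factory=list)  # services, layers, languages
    rationale: str = ""
    status: str = "accepted"                              # accepted | superseded


ADRS = [
    DecisionRecord(
        id="ADR-012",
        decision="Use the outbox pattern for cross-service events",
        applies_to=["payments", "orders"],
        rationale="Avoids the dual-write inconsistencies behind a past incident.",
    ),
]


def relevant_decisions(component: str) -> list[DecisionRecord]:
    """Return only the accepted decisions that apply to the component at hand."""
    return [a for a in ADRS if component in a.applies_to and a.status == "accepted"]
```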
Build feedback loops from interrupt data. When an agent asks a human for help, log it. What did it ask? Why couldn’t it decide on its own? What information would have been sufficient? This is your backlog. The agents will implement it. Every resolved interrupt category is a class of work that now runs at the speed you can afford (value-per-token).
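A sketch of what that log might capture so the backlog falls straight out of the data; the Interrupt fields and root-cause labels are assumptions.

```python
# Hypothetical interrupt log entry plus a tally by root cause; the biggest
# bucket becomes the next piece of context or tooling to build.
from collections import Counter
from dataclasses import dataclass


@dataclass
class Interrupt:
    agent: str
    question: str
    root_cause: str        # e.g. "missing ADR", "unclear risk policy", "flaky tool"
    missing_context: str   # what would have let the agent decide on its own


def interrupt_backlog(interrupts: list[Interrupt]) -> list[tuple[str, int]]:
    """Most frequent root causes first: each is a whole class of interrupts to eliminate."""
    return Counter(i.root_cause for i in interrupts).most_common()
```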
Push context into the system, not into the conversation. The most self-defeating pattern in agentic development is a human who becomes a walking context oracle. If you’re answering the same kinds of questions repeatedly, the problem is that the knowledge hasn’t been externalized into the system. Here we need a system of shared memory with quality gates.
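One way to read “shared memory with quality gates,” as a sketch: an answer only becomes reusable context after it passes a review gate, so the same question is never answered twice. The MemoryEntry type and the record_answer/recall helpers are invented.

```python
# Hypothetical shared-memory write path: a human answer becomes reusable
# context only after it clears a quality gate (review plus provenance).
from dataclasses import dataclass


@dataclass
class MemoryEntry:
    question: str
    answer: str
    source: str          # who answered, and in which interrupt
    reviewed: bool = False


STORE: dict[str, MemoryEntry] = {}


def record_answer(entry: MemoryEntry) -> None:
    if not entry.reviewed:
        raise ValueError("quality gate: unreviewed answers do not enter shared memory")
    STORE[entry.question.strip().lower()] = entry


def recall(question: str) -> str | None:
    """Agents check shared memory before interrupting a human."""
    hit = STORE.get(question.strip().lower())
    return hit.answer if hit else None
```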
The Human Role Evolves, It Doesn’t Disappear
Humans stop being synchronous checkpoints and start being asynchronous quality reviewers, system designers, and context engineers.
Think of it this way: the best SRE teams moved from manually approving every deployment to building systems that deploy automatically with monitoring, alerting, and rollback. Humans designed the system that makes those actions safe by default. The same shift is happening with AI agents.
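The SRE analogy in miniature, as a sketch: the human designs the policy once, and the system applies it on every deploy. The deploy(), check_health(), and rollback() hooks are hypothetical.

```python
# Sketch of "safe by default": a deploy goes out automatically, and a health
# check decides between promote and roll back, with no synchronous approval.
import time
from typing import Callable


def deploy_with_guardrails(version: str,
                           deploy: Callable[[str], None],
                           check_health: Callable[[], bool],
                           rollback: Callable[[str], None],
                           bake_minutes: int = 10) -> bool:
    deploy(version)
    deadline = time.time() + bake_minutes * 60
    while time.time() < deadline:
        if not check_health():   # error rate, latency, saturation, and so on
            rollback(version)
            return False         # humans review the incident asynchronously
        time.sleep(30)
    return True                  # promoted without a single interrupt
```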
Ismael Faro, VP of Quantum and AI at IBM Research, describes this as moving from “vibe coding to Objective-Validation Protocol,” where users define goals and validate outcomes while agents execute autonomously, requesting human approval only at critical checkpoints. That’s the trajectory: fewer, higher-quality interrupts, with us focusing on the decisions that actually require our judgment.
Why I’m Pushing This Direction
Back to the original question. Why push in this direction?
Because building systems bounded by human attention ultimately prevents both the AI and the engineering team from scaling. Our goal is to enable agentic systems that are fundamentally designed for scale, which benefits everyone on the team.
The organizations that win will be the ones that treat KPIs like interrupt reduction as a first-class engineering discipline, not a quarterly checkbox. They’ll instrument and obsess over agent trajectories. They’ll build the context infrastructure that makes autonomous operation safe. They’ll design systems where human judgment is reserved for the decisions that genuinely need it, not wasted on questions the system could have answered with better context or discovered on its own.
The output of these teams is not code. It is systems that produce code. The compounding effect of that distinction is the entire strategy.
Sources
- Gartner Predicts Agentic AI Will Autonomously Resolve 80% of Common Customer Service Issues Without Human Intervention by 2029 – Gartner, March 2025
- Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027 – Gartner, June 2025
- Human-in-the-Loop Is Out, Agent-in-the-Loop Is In – Analytics India Magazine, December 2025
- Agentic AI Strategy – Deloitte Insights, December 2025
- The Trends That Will Shape AI and Tech in 2026 – IBM Think, January 2026
- Gartner Predicts 40% of Enterprise Apps Will Feature Task-Specific AI Agents by 2026 – Gartner, August 2025