A good pair pushes back. An AI doesn't.
You can frame any hypothesis with confidence and the AI confirms it. You can push back on its suggestion and it folds. You can commit to a bad architecture and it helps you build it well.
The thing that made pair programming valuable was the disagreement between two priors. With an LLM, there's one prior in the room. The other one is a very high-resolution mirror.
What changed
A pair is two thinkers. The thinking happens in the gap between their priors. One types, the other questions. The questions are rarely polite, and that's fine, because the goal is the code, not the comfort.
Replace one of those thinkers with an LLM and the gap closes. The model has no instinct, no skin in the game, no past project that went sideways for a reason it still remembers. It's helpful by training, which in practice means it's trained to agree.
So you push back. It folds. You commit. It confirms. The partner that was meant to challenge you turns out to be a yes-machine.
You still know how to ask "what could go wrong here?" Knowing to ask is not asking. It's six o'clock on a Friday. The implementation works. The navigator is the model. The critical question doesn't get asked. The bug ships.
What mob programming does that pairing can't
Three humans and one AI is a different shape entirely. Not three programmers staring at one keyboard. Three independent priors challenging each other before the AI types a line.
The disagreement happens at the human layer. That's the recovery. The AI becomes a tool the team uses, not the partner driving the session. The driver types what the navigators say. The navigators rotate. One of them is allowed to say "I don't trust that, let's check." The LLM never volunteers that sentence.
This isn't nostalgia for ensemble programming. It's a structural fix. A pair with an LLM has one prior. A mob with an LLM has three. The bug that survives one prior often dies on contact with the second.
Where the spec comes in
In spec-driven development, the spec is the artifact the mob negotiates over. Three humans arguing about a spec catch what one developer and an LLM miss, because the spec forces the questions the LLM swallows. Without a spec, the AI generates code, the human nods along, and confirmation bias scales. With a spec, disagreement becomes structural.
The spec is the step that gets cut first. It feels like ceremony when the AI is happy to start typing without one. So it stays in someone's head, and the LLM agrees with whatever leaks out of it.
The honest cost
Mob programming was already expensive. Adding a spec workflow doesn't make it cheaper. For a CSS tweak, pairing with an AI is fine. The cost of confirmation bias on a margin value is negligible.
The mob earns its cost on the work where being confidently wrong is the worst outcome. Architecture decisions. Security boundaries. Anything irreversible. That's where one yes-machine and one tired developer is the dangerous configuration.
The point
Pairing with an AI feels like pair programming. It isn't. The shape that produced good thinking needed two priors and the disagreement between them. Swap one for a mirror and you get something quieter and faster, and you ship exactly the bug you brought into the session.
If the work matters, get another human in the room. If you can't, write the spec first and argue with it.