The time needed to review and quality-check AI coding output could exceed initial time savings, potentially creating subsequent rework burdens.
A recent study by non-profit research group METR has found that AI coding assistants made professional developers about 19% slower overall, contradicting the widespread belief that they reliably accelerate software work.
The research, released on 6 Jan 2026, suggests that time saved on boilerplate code can be more than offset by extra effort spent reviewing, correcting, and integrating AI-generated output.
The group’s findings highlight a risk that developers may accept plausible-looking code too quickly, only to pay the cost (technical debt) later in debugging, refactoring, and security hardening. The study raises questions over whether current benchmarks and vendor claims reflect real-world development conditions, especially in complex, high-stakes systems.
Mixed experiences in the field
Despite such potential caveats, some fast-growing startups have proclaimed that AI tooling has already transformed their engineering operations. One firm, Perplexity, frames these coding tools as force multipliers that help small teams ship features quickly in a hyper-competitive AI market.
However, outside such showcase deployments, developers report uneven results. Independent reviews of tools such as Perplexity and Copilot describe impressive gains on tasks such as boilerplate generation, language translation and quick debugging, but stress that outputs still require careful verification before production use.
Even Perplexity’s own coding-focused guides note that AI-generated code should be treated as a draft: developers are advised to review for correctness, security and performance, and to rely on citations and external documentation to validate solutions.
Other assessments emphasize that, while AI can explain code, convert between languages and even execute Python for rapid experimentation, it does not replace systematic testing or code review.
Productivity versus long‑term risk
This emerging split-screen — a rigorous study measuring slower completion times versus startups reporting striking productivity gains — underscores how context-dependent AI coding tools remain. Factors such as task type, developer seniority, codebase complexity and organizational pressure to “move fast” appear to heavily shape outcomes.
For now, the evidence suggests AI coding assistants work best as tightly supervised companions rather than autonomous co-pilots, especially in production environments. As adoption accelerates, the key question for teams is whether short-term speedups are worth potential trade-offs in code quality, maintainability and the gradual erosion of human debugging and design skills.