Interface-Aware Smoke Tests and Deterministic Import Autofix for Feature-Level Coding Agents: A Negative Result

FARS·2026-03-02·Run ID: FA0033

Abstract

LLM-based coding agents frequently encounter import and name resolution errors during feature development, and these errors appear amenable to automated repair. We propose interface-aware smoke tests with deterministic import autofix: after each code edit, the system runs a lightweight smoke test and automatically inserts a missing import when exactly one unambiguous candidate exists. We evaluate three conditions on FeatureBench Lite (30 tasks): baseline, diagnose-only (diagnostic reports without code changes), and full autofix. Our experiments yield a negative result: the autofix mechanism achieves the same 10.0% resolved rate as the baseline, providing no benefit. The diagnose-only variant, however, achieves 16.67%, a 66.7% relative improvement over the baseline. Analysis shows that the target error class accounts for only 4-5% of the agent workflow, which explains the null effect of autofix. We conclude that diagnostic feedback helps agents understand errors, while automated fixes may interfere with their problem-solving process.
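To make the autofix rule concrete, the following is a minimal sketch of the "exactly one unambiguous candidate" policy described above. It is not the system evaluated in this work: it assumes Python sources, and the symbol index `SYMBOL_INDEX`, the helper names `undefined_names` and `autofix_imports`, and the module names in the index are all hypothetical. The name analysis is deliberately simplified (it ignores scoping and names bound by `except ... as`), and imports are naively prepended to the file.

```python
import ast
import builtins

# Hypothetical symbol index: exported name -> modules that define it.
# A real system would presumably build this from the repository's interfaces.
SYMBOL_INDEX = {
    "OrderedDict": ["collections"],
    "Path": ["pathlib"],
    "Client": ["http_lib.client", "db_lib.client"],  # two candidates: ambiguous
}


def undefined_names(source: str) -> set[str]:
    """Return names that are read but never bound anywhere in the module.

    Simplification: scoping is ignored, so any binding anywhere counts.
    """
    tree = ast.parse(source)
    bound = set(dir(builtins))
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # `import a.b` binds `a`; `import a as x` binds `x`.
                bound.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                used.add(node.id)
            else:
                bound.add(node.id)
    return used - bound


def autofix_imports(source: str, index: dict[str, list[str]]) -> tuple[str, list[str]]:
    """Insert an import only when exactly one candidate module exists.

    Names with zero or several candidates are left unfixed and reported,
    which corresponds to the diagnose-only behaviour for those names.
    """
    fixes = []
    diagnostics = []
    for name in sorted(undefined_names(source)):
        candidates = index.get(name, [])
        if len(candidates) == 1:
            fixes.append(f"from {candidates[0]} import {name}")
        else:
            diagnostics.append(f"{name}: {len(candidates)} candidates, not fixed")
    patched = "\n".join(fixes + [source]) if fixes else source
    return patched, diagnostics
```

Under these assumptions, calling `autofix_imports` on a snippet that uses `Path`, `OrderedDict`, and `Client` would prepend imports for the first two and emit a diagnostic for `Client`, since the hypothetical index lists two modules for it. Returning the diagnostics alongside the patched source keeps the diagnose-only condition and the full autofix condition expressible with the same function.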

Resources