Execution-Trace Guided Remasking for Diffusion Code Generation
Abstract
Masked diffusion language models can iteratively refine code through remasking, offering a unique capability for targeted repair. However, existing remasking strategies select tokens using model confidence or perturbation-based heuristics, so they lack semantic guidance about where errors actually occur. We propose execution-trace guided remasking, which uses runtime diagnostics to localize failures and target repair. When generated code fails unit tests, we parse exception tracebacks or collect line-level execution traces to identify failure-relevant regions, then remask only those tokens for conditional diffusion repair. On MBPP+, our method achieves 31.22% pass@1, an improvement of 11.38 percentage points over the no-repair baseline and 4.24 points over CORE, with statistical significance (). Analysis shows that trace-guided repair produces meaningful code modifications, whereas global low-confidence repair rarely changes the code, demonstrating that semantic localization is essential for effective repair.
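The localization step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `localize_failure` and `remask_lines`, the `<mask>` placeholder, the `<generated>` pseudo-filename, and the whitespace tokenization are all assumptions made for illustration. The sketch first looks for traceback frames that point into the generated code; if there are none (for example, when a test assertion fails on a wrong return value), it falls back to the lines executed while the test ran.

```python
import sys
import traceback

MASK = "<mask>"  # hypothetical mask token; a real model defines its own


def localize_failure(source: str, test: str, filename: str = "<generated>") -> set[int]:
    """Return 1-indexed lines of `source` implicated in a failing test.

    Prefers lines named in the exception traceback; if none lie in the
    generated code, falls back to lines executed during the test.
    Returns an empty set when the test passes.
    """
    namespace: dict = {}
    executed: set[int] = set()

    def tracer(frame, event, arg):
        # Only follow frames whose code was compiled from the generated source.
        if frame.f_code.co_filename != filename:
            return None
        if event == "line":
            executed.add(frame.f_lineno)
        return tracer

    try:
        exec(compile(source, filename, "exec"), namespace)
        previous = sys.gettrace()
        sys.settrace(tracer)
        try:
            exec(compile(test, "<test>", "exec"), namespace)
        finally:
            sys.settrace(previous)  # restore whatever tracer was active
    except SyntaxError as err:
        # A syntax error in the generated code has no runtime frames.
        if err.filename == filename and err.lineno:
            return {err.lineno}
        return set()
    except Exception as err:
        frames = traceback.extract_tb(err.__traceback__)
        tb_lines = {f.lineno for f in frames if f.filename == filename}
        return tb_lines or executed
    return set()


def remask_lines(source: str, bad_lines: set[int], mask: str = MASK) -> str:
    """Replace every whitespace-delimited token on the flagged lines with `mask`."""
    out = []
    for number, line in enumerate(source.splitlines(), start=1):
        if number in bad_lines:
            indent = line[: len(line) - len(line.lstrip())]
            out.append(indent + " ".join(mask for _ in line.split()))
        else:
            out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    code = "def safe_div(a, b):\n    return a / b\n"
    unit_test = "assert safe_div(1, 0) == 0\n"
    print(remask_lines(code, localize_failure(code, unit_test)))
```

In a full pipeline the masked sequence would be handed back to the diffusion model for conditional denoising, with the untouched lines serving as fixed context. A real system would also map flagged lines to the model's own token positions rather than splitting on whitespace, and would run untrusted generated code in a sandbox rather than with a bare `exec`.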