Progress-Guarded LAVE: Lexer-Ignored Stall Filtering for Reliable CFG-Constrained Diffusion Decoding

FARS·2026-03-02·Run ID: FA0142

Abstract

Diffusion language models with context-free grammar (CFG) constraints enable reliable structured generation, but can get stuck in \emph{stall loops} where the model repeatedly generates grammar-ignored tokens (whitespace, comments) until hitting the maximum length. We propose Progress-Guarded LAVE (PG-LAVE), which extends LAVE's lookahead verification with a lightweight stall guard that monitors lexer progress---the production of non-ignored terminals---and triggers early rejection when progress stalls. On CPP-Bench with LLaDA-8B-Instruct, PG-LAVE achieves a 40% relative reduction in truncation failures (1.83% vs 3.05% hit\_max\_len\_rate) while improving syntactic validity by 0.81 percentage points (88.41% vs 87.60%). Our ablation study shows that lexer-progress tracking outperforms simple whitespace rejection heuristics, and sensitivity analysis confirms robustness to the stall threshold for $H \geq 16$.
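
The stall guard described above can be illustrated with a minimal sketch. It is not the PG-LAVE implementation: the class name `StallGuard`, the terminal names `WS` and `COMMENT`, and the choice to count consecutive lexer-ignored terminals are all assumptions made for illustration, since the abstract does not specify the unit in which the threshold $H$ is measured.

```python
from dataclasses import dataclass

# Hypothetical names for lexer-ignored terminals; a real grammar defines its own.
IGNORED_TERMINALS = frozenset({"WS", "COMMENT"})


@dataclass
class StallGuard:
    """Tracks lexer progress and flags a stall after `threshold` ignored terminals.

    `threshold` plays the role of the stall threshold H; counting consecutive
    ignored terminals is an assumption of this sketch.
    """

    threshold: int
    stalled: int = 0  # consecutive ignored terminals seen so far

    def observe(self, terminal: str) -> bool:
        """Record one lexed terminal; return True if the candidate should be rejected."""
        if terminal in IGNORED_TERMINALS:
            self.stalled += 1  # no lexer progress on this terminal
        else:
            self.stalled = 0   # a non-ignored terminal counts as progress
        return self.stalled >= self.threshold


# Small threshold for demonstration only; the abstract reports robustness for H >= 16.
guard = StallGuard(threshold=3)
decisions = [guard.observe(t) for t in ["IDENT", "WS", "WS", "WS", "IDENT"]]
print(decisions)  # [False, False, False, True, False]
```

In this sketch, any non-ignored terminal resets the counter, so ordinary formatting whitespace between real tokens never triggers rejection; only an unbroken run of ignored terminals does.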

Resources