Risk-Controlled Early Exit for Diffusion Language Models
Abstract
Diffusion language models (DLLMs) enable parallel text generation but require hundreds of diffusion steps, making inference slow. Early-exit strategies can reduce computation by finalizing tokens once their predictions stabilize, but existing methods rely on fixed thresholds without formal quality guarantees. We propose RC-Jot, a calibration framework that applies conformal risk control to automatically select early-exit thresholds satisfying user-specified risk constraints with distribution-free guarantees. Using UCB-HB bounds for high-probability control, RC-Jot selects the least conservative threshold that keeps accuracy degradation within budget. On GSM8K, RC-Jot achieves a 1.36× speedup with a 0% violation rate at the target risk level. On HumanEval, it achieves a 1.32× speedup with a 1% violation rate, while naive threshold selection yields a 52% violation rate. Our analysis shows that UCB-HB offers the best balance between guarantee strength and speedup, and that medium-granularity threshold grids suffice for effective calibration.
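The threshold-selection step described above can be illustrated with a minimal sketch. The code below is an assumption-laden simplification, not RC-Jot itself: it uses a plain Hoeffding upper confidence bound in place of the paper's UCB-HB bound, assumes per-example risk values in [0, 1] measured on a held-out calibration set, and assumes the risk is monotone in the exit threshold (lower thresholds exit earlier and incur more risk). The function name `select_threshold` and the data layout are hypothetical.

```python
import math


def select_threshold(risks_by_threshold, alpha, delta):
    """Pick the least conservative early-exit threshold whose upper
    confidence bound on risk stays within the budget alpha.

    risks_by_threshold: dict mapping threshold -> list of per-example
        risk values in [0, 1] from a calibration set.
    alpha: user-specified risk budget.
    delta: failure probability of the high-probability guarantee.

    Uses a Hoeffding UCB (a stand-in for the UCB-HB bound):
        UCB(lam) = mean_risk(lam) + sqrt(log(1/delta) / (2 n)).
    """
    chosen = None
    # Walk from the most conservative (highest) threshold downward.
    for lam in sorted(risks_by_threshold, reverse=True):
        risks = risks_by_threshold[lam]
        n = len(risks)
        r_hat = sum(risks) / n  # empirical risk at this threshold
        ucb = r_hat + math.sqrt(math.log(1.0 / delta) / (2.0 * n))
        if ucb <= alpha:
            chosen = lam  # still within budget; try a less conservative lam
        else:
            break  # risk assumed monotone, so all smaller lam also fail
    return chosen
```

For example, with 1000 calibration examples per threshold, a risk budget of 0.1, and delta = 0.1, a threshold with empirical risk 0.05 passes the bound while one with empirical risk 0.2 does not, so the selector returns the former.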