RC-MemStop: Risk-Controlled Early Stopping for Long-Context Memory Agents

FARS·2026-03-02·Run ID: FA0032

Abstract

Memory agents process long documents by scanning chunks sequentially, enabling inference on extremely long contexts but incurring high computational cost. Early stopping can reduce this cost, but risks degrading performance on queries that would have succeeded with full processing. We propose RC-MemStop, which applies conformal risk control to calibrate early stopping thresholds for memory agents. Using an answer-stability stopping rule (terminate when $k$ consecutive draft answers match) and the Waudby-Smith-Ramdas betting bound, we select the least conservative $k$ that satisfies a user-specified broken-success risk budget $\varepsilon$. Experiments on MemAgent with 448K--896K token contexts reveal that \textbf{risk control is achieved} (zero violations across all configurations), but \textbf{speedup is negligible} ($1.02\times$--$1.14\times$). The root cause is that draft answers do not stabilize until processing is nearly complete, so $k=60$--$120$ consecutive matches are required to control risk. This finding suggests that calibration-only early stopping is insufficient for memory agents; training-based stopping policies are necessary for meaningful compute reduction.
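To make the calibration procedure concrete, the following is a minimal Python sketch of how the answer-stability rule and the threshold selection might fit together. It is an illustration under stated assumptions, not the RC-MemStop implementation. The function names (`stop_index`, `broken_success_loss`, `wsr_upper_bound`, `select_k`) are hypothetical, as is the confidence parameter `delta`. The loss definition (1 when full processing succeeds but the early-stopped answer fails) is our reading of "broken-success". The betting bound follows a commonly used predictable-bet form of the Waudby-Smith-Ramdas bound for losses in [0, 1], and scanning candidates from the most conservative downward is one standard way to handle multiple thresholds; both may differ from the paper's exact choices.

```python
import math

def stop_index(drafts, k):
    """Chunks processed before stopping once k consecutive drafts match."""
    run = 0
    for i, d in enumerate(drafts):
        run = run + 1 if i > 0 and d == drafts[i - 1] else 1
        if run >= k:
            return i + 1
    return len(drafts)  # never stabilized: process everything

def broken_success_loss(full_correct, early_correct):
    """Assumed loss: 1 if full processing succeeds but early stopping fails."""
    return 1.0 if full_correct and not early_correct else 0.0

def wsr_upper_bound(losses, delta):
    """Upper confidence bound on the mean of losses in [0, 1] (betting form)."""
    n = len(losses)
    log_inv_delta = math.log(1.0 / delta)
    bets, total, sq_dev, var_prev = [], 0.0, 0.0, 0.25
    for i, x in enumerate(losses, start=1):
        # The bet for step i depends only on losses seen before step i.
        bets.append(min(math.sqrt(2.0 * log_inv_delta / (n * var_prev)), 1.0))
        total += x
        mu_hat = (total + 0.5) / (i + 1)
        sq_dev += (x - mu_hat) ** 2
        var_prev = (sq_dev + 0.25) / (i + 1)

    def log_capital(m):
        # Max over time of the log wealth when betting against "mean >= m".
        running, best = 0.0, -math.inf
        for x, nu in zip(losses, bets):
            running += math.log(1.0 - nu * (x - m))
            best = max(best, running)
        return best - log_inv_delta

    if n == 0 or log_capital(1.0) < 0:
        return 1.0  # not enough evidence to exclude any risk level
    lo, hi = 1e-10, 1.0
    for _ in range(60):  # bisection; log_capital is nondecreasing in m
        mid = 0.5 * (lo + hi)
        if log_capital(mid) < 0:
            lo = mid
        else:
            hi = mid
    return hi

def select_k(losses_by_k, epsilon, delta):
    """Smallest k certified at budget epsilon, scanning from largest k down."""
    chosen = None
    for k in sorted(losses_by_k, reverse=True):
        if wsr_upper_bound(losses_by_k[k], delta) > epsilon:
            break  # stop at the first threshold that cannot be certified
        chosen = k
    return chosen
```

In this sketch, `losses_by_k[k]` would hold one broken-success loss per calibration query, obtained by replaying the stored draft answers with `stop_index(drafts, k)`. A small $k$ is certified only if every larger candidate is certified first, which is why late-stabilizing drafts push the selected $k$ toward the full document length.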

Resources