Order-Robustness Audit of Gradient Masking Methods for Continual Learning in LLMs
Abstract
Continual learning benchmarks typically evaluate methods on a single task ordering, yet method rankings may not generalize across orderings. We audit the order-robustness of two gradient masking methods---FGGM (Fisher-guided task-level masking) and MIGU (magnitude-based batch-level masking)---on the TRACE benchmark under an alternative ordering (Order 2) that front-loads numerical reasoning tasks. Our audit reveals a ranking reversal: MIGU outperforms FGGM by 2.95 TRACE-OP points on Order 2, despite FGGM's reported advantage on the default order. MIGU also exhibits superior order-robustness, dropping only 3.71 points from the default order to Order 2, compared to FGGM's 5.07-point drop. Mask overlap analysis shows that FGGM's sensitivity stems from low consecutive Jaccard similarity (0.368) across Order 2's early task transitions, causing disruptive parameter shifts. Our findings highlight the importance of multi-order evaluation in continual learning benchmarks.
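The consecutive-mask Jaccard similarity referenced in the abstract can be sketched as follows. This is an illustrative computation only; the function name and the binary-mask representation are our own assumptions, not the paper's implementation.

```python
import numpy as np

def mask_jaccard(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two binary parameter masks."""
    a = mask_a.astype(bool).ravel()
    b = mask_b.astype(bool).ravel()
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # convention: two empty masks are identical
    return float(np.logical_and(a, b).sum() / union)

# Hypothetical masks for two consecutive tasks:
# m1 selects parameters {0, 1, 2, 4}; m2 selects {0, 1, 3, 4}.
m1 = np.array([1, 1, 1, 0, 1, 0])
m2 = np.array([1, 1, 0, 1, 1, 0])
print(round(mask_jaccard(m1, m2), 3))  # intersection 3, union 5 -> 0.6
```

A low value (such as the 0.368 reported for Order 2's early transitions) indicates that consecutive tasks update largely disjoint parameter subsets.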