SkewGuard-PoLR: Investigating Dirichlet-Uncertainty Gated Multi-Cluster Expansion for Prefix-Consensus Self-Consistency

FARS·2026-03-02·Run ID: FA0205

Abstract

Test-time scaling through self-consistency (SC) improves large language model reasoning but incurs substantial computational cost. The Path of Least Resistance (PoLR) reduces this cost by expanding only the dominant answer cluster after prefix-based sampling, yet it reportedly suffers tail failures when the dominant cluster is incorrect. We propose SkewGuard-PoLR, which places a Dirichlet posterior over cluster proportions and triggers multi-cluster expansion when the credible set indicates uncertainty about which cluster is dominant. However, our experiments on AIME25 and GPQA-Diamond with QwQ-32B and DeepSeek-R1-Distill-Qwen-7B show that PoLR does not exhibit the reported tail failures: on AIME25, PoLR achieves 78.89% accuracy, exceeding SC (77.78%) by 1.11 percentage points rather than trailing it by 10 points as previously reported. Consequently, SkewGuard-PoLR provides no accuracy improvement while incurring 17% higher computational cost. This negative result indicates that the tail-failure assumption underlying our approach does not hold under current evaluation conditions, and it may help the community avoid investing in similar directions.
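To make the gating idea concrete, the following is a minimal sketch of one way a Dirichlet posterior over cluster proportions could decide between single-cluster and multi-cluster expansion. It is an illustration under stated assumptions, not the implementation evaluated in this report: the function names, the symmetric prior `alpha=1.0`, the `threshold=0.9` rule, and the choice to expand the top two clusters are all hypothetical. In addition, the sketch replaces the credible-set test with a simpler Monte Carlo estimate of the posterior probability that the largest observed cluster is truly dominant.

```python
import random


def dominance_probability(counts, alpha=1.0, n_draws=4000, seed=0):
    """Estimate P(largest observed cluster has the largest true proportion).

    Assumes a symmetric Dirichlet(alpha) prior, so the posterior over
    cluster proportions is Dirichlet(alpha + counts). All parameter
    defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    top = max(range(len(counts)), key=lambda i: counts[i])
    wins = 0
    for _ in range(n_draws):
        # A Dirichlet draw is a vector of independent Gamma draws divided
        # by their sum; normalising does not change the argmax, so skip it.
        draw = [rng.gammavariate(alpha + c, 1.0) for c in counts]
        if max(range(len(draw)), key=lambda i: draw[i]) == top:
            wins += 1
    return wins / n_draws


def clusters_to_expand(counts, threshold=0.9, **kwargs):
    """Return the indices of the clusters to expand.

    Hypothetical gating rule: expand only the dominant cluster (PoLR-style)
    when the posterior is confident about dominance; otherwise expand the
    two largest clusters.
    """
    ranked = sorted(range(len(counts)), key=lambda i: counts[i], reverse=True)
    if dominance_probability(counts, **kwargs) >= threshold:
        return ranked[:1]
    return ranked[:2]


if __name__ == "__main__":
    print(clusters_to_expand([14, 1, 1]))  # clearly skewed prefix clusters
    print(clusters_to_expand([5, 4, 1]))   # close race between two clusters
```

With a clearly skewed prefix sample such as `[14, 1, 1]`, the gate falls back to single-cluster expansion, while a close split such as `[5, 4, 1]` triggers expansion of both leading clusters. That extra expansion is the kind of additional computation the abstract refers to.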

Resources