ShallowPPL: Investigating Early-Exit Logit Lens for Code Context Compression
Abstract
Long code contexts in repository-level tasks require compression to fit within language model context limits. LongCodeZip achieves state-of-the-art compression quality using perplexity-based Approximated Mutual Information (AMI) scoring, but requires computationally expensive full forward passes for each candidate code chunk. We investigate whether the logit lens technique can approximate full-depth scoring at intermediate transformer layers, enabling significant speedup. We propose ShallowPPL, which truncates forward passes at an intermediate layer and projects the resulting representations to vocabulary space for perplexity computation. Our evaluation against pre-registered success criteria (≥1.5× speedup, ≤1.0 point quality drop) demonstrates that this hypothesis does not hold: ShallowPPL achieves only a 1.05× speedup while exceeding the quality-drop threshold on both the Long Code Completion and RepoQA benchmarks. Ablation studies reveal that coarse function ranking is the primary bottleneck (a hybrid configuration recovers 81% of the quality gap) and that quality scales nonlinearly with depth (the last 2 layers contribute 3.83 percentage points). These findings indicate that the final transformer layers encode information critical for code relevance scoring that cannot be approximated from intermediate representations.