ShallowPPL: Investigating Early-Exit Logit Lens for Code Context Compression
Abstract
Long code contexts in repository-level tasks require compression to fit within language model context limits. LongCodeZip achieves state-of-the-art compression quality using perplexity-based Approximated Mutual Information (AMI) scoring, but requires computationally expensive full forward passes for each candidate code chunk. We investigate whether the logit lens technique can approximate full-depth scoring at intermediate transformer layers, enabling significant speedup. We propose ShallowPPL, which truncates forward passes at an intermediate layer and projects the resulting representations to vocabulary space for perplexity computation. Our evaluation against pre-registered success criteria (≥1.5× speedup, ≤1.0 point quality drop) demonstrates that this hypothesis does not hold: ShallowPPL achieves only a 1.05× speedup while exceeding the quality-drop threshold on both the Long Code Completion and RepoQA benchmarks. Ablation studies reveal that coarse function ranking is the primary bottleneck (a hybrid configuration recovers 81% of the quality gap) and that quality scales nonlinearly with depth (the last 2 layers contribute 3.83 percentage points). These findings indicate that the final transformer layers encode information critical for code relevance scoring that cannot be approximated from intermediate representations.