Action-Support Likelihood Audits Predict Rollout Consistency Failures in Text-Based World Models

FARS


FARS·2026-03-02·Run ID: FA0174

Abstract

World models enable planning by simulating future states, but rollouts can fail when transferred to real environments, a failure known as the world-to-real (W2R) transfer problem. Failures are especially likely under policy shift, when the acting agent differs from the behavior policy that collected the training data. We propose Enhanced Support-NLL, a training-free diagnostic that predicts W2R failures by measuring how well rollout actions are supported by the training distribution. The diagnostic combines three complementary signals: verb frequency (unigram NLL), transition patterns (bigram NLL), and repetition detection. On TextWorld with a GPT-4o-mini agent, Enhanced Support-NLL achieves AUROC = 0.831 [0.752, 0.901], substantially outperforming the world model's own observation likelihood (0.587) and action-length baselines (0.447). The method requires no neural network inference, only frequency counting over the training data.
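
To make the three signals concrete, the following is a minimal sketch of how a count-based support score of this kind could be computed. It is not the implementation behind the reported numbers: the abstract does not specify the feature definitions, so the verb extraction rule (first token of the command), add-alpha smoothing, the repetition measure, the equal weights, and the names `verb` and `SupportAudit` are all assumptions made for illustration.

```python
import math
from collections import Counter


def verb(action):
    # Assumption: the verb is the first whitespace-separated token of a command.
    parts = action.strip().lower().split()
    return parts[0] if parts else "<empty>"


class SupportAudit:
    """Hypothetical count-based audit; higher score means less training support."""

    def __init__(self, training_episodes, alpha=1.0):
        self.alpha = alpha  # add-alpha smoothing constant (assumed)
        self.unigrams = Counter()
        self.bigrams = Counter()
        self.context = Counter()
        for episode in training_episodes:
            verbs = ["<s>"] + [verb(a) for a in episode]
            self.unigrams.update(verbs[1:])
            for prev, cur in zip(verbs, verbs[1:]):
                self.bigrams[(prev, cur)] += 1
                self.context[prev] += 1
        self.total = sum(self.unigrams.values())
        self.vocab_size = len(self.unigrams) + 1  # one extra slot for unseen verbs

    def unigram_nll(self, v):
        p = (self.unigrams[v] + self.alpha) / (self.total + self.alpha * self.vocab_size)
        return -math.log(p)

    def bigram_nll(self, prev, v):
        num = self.bigrams[(prev, v)] + self.alpha
        den = self.context[prev] + self.alpha * self.vocab_size
        return -math.log(num / den)

    def score(self, rollout_actions, w_uni=1.0, w_bi=1.0, w_rep=1.0):
        if not rollout_actions:
            return 0.0
        verbs = [verb(a) for a in rollout_actions]
        n = len(verbs)
        # Mean per-action NLL under the verb unigram and bigram counts.
        uni = sum(self.unigram_nll(v) for v in verbs) / n
        bi = sum(self.bigram_nll(p, v) for p, v in zip(["<s>"] + verbs, verbs)) / n
        # Repetition: fraction of actions that duplicate an earlier action (assumed measure).
        distinct = len({a.strip().lower() for a in rollout_actions})
        rep = 1.0 - distinct / n
        return w_uni * uni + w_bi * bi + w_rep * rep


train = [
    ["open door", "go north", "take key"],
    ["go north", "open chest", "take coin"],
]
audit = SupportAudit(train)
in_support = audit.score(["open door", "go north"])
off_support = audit.score(["jump wall", "jump wall", "jump wall"])
```

In this toy example the rollout that repeats an unseen verb receives a higher score than the one built from verbs and transitions present in the training episodes. Scoring needs only the three counters, which is consistent with the claim that no neural network inference is required.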
