    1. With gated LoRA, ISD enables bit-for-bit lossless acceleration. Why introspective consistency? Key insight: autoregressive (AR) training unifies generation and introspection in a single forward pass. Existing DLMs miss this: they learn to denoise but not to introspect.

      The authors identify the core advantage of autoregressive training: generation and introspection are unified in a single forward pass. Existing DLMs learn only to denoise, not to introspect, which is the root cause of their lagging performance. This insight not only explains the design philosophy of I-DLM but also offers guidance for future language-model architecture design.
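      The "one forward pass" point can be made concrete with a minimal NumPy sketch. This is not the paper's implementation: `ar_forward` below is a hypothetical stand-in for an AR transformer that returns a next-token distribution at every position. The same pass then yields both generation (the distribution at the last position) and introspection (the probability the model assigned to each token it already committed to), with no extra compute.

      ```python
      import numpy as np

      rng = np.random.default_rng(0)
      VOCAB = 10  # toy vocabulary size (assumption for illustration)

      def ar_forward(tokens):
          """Stand-in for an AR transformer forward pass: one next-token
          distribution per position (random toy model, not a real LM)."""
          logits = rng.normal(size=(len(tokens), VOCAB))
          e = np.exp(logits - logits.max(axis=-1, keepdims=True))
          return e / e.sum(axis=-1, keepdims=True)  # softmax per position

      tokens = [3, 1, 4, 1, 5]
      probs = ar_forward(tokens)  # a single forward pass

      # Generation: the next-token distribution comes from the last position.
      next_dist = probs[-1]

      # Introspection: the same pass scores every committed token. probs[t]
      # is the model's prediction for tokens[t + 1], so we can read off its
      # confidence in each emitted token for free.
      scores = [probs[t][tokens[t + 1]] for t in range(len(tokens) - 1)]
      ```

      A denoising-only DLM has no analogous per-token self-score for a committed sequence, which is one way to read the annotation's claim that such models "learn to denoise but not to introspect."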