1 Matching Annotations
  1. Last 7 days
    1. The NLA consists of the AV and AR, which, together, form a round trip: original activation → text explanation → reconstructed activation. We score the NLA on how similar the reconstructed activation is to the original.

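      The NLA forms a closed loop through an activation explainer (AV) and an activation reconstructor (AR), and it judges how accurate an explanation is by the quality of the reconstruction. This approach offers a new paradigm for interpreting the internal representations of AI models.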
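
The round trip described above can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the quoted passage does not say which similarity measure is used, so cosine similarity stands in for it, and `toy_av`, `toy_ar`, and `round_trip_score` are hypothetical placeholders, not the actual AV and AR models.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def round_trip_score(
    activation: Vector,
    av: Callable[[Vector], str],
    ar: Callable[[str], List[float]],
) -> float:
    """original activation -> text explanation -> reconstructed activation."""
    explanation = av(activation)      # AV: activation to text
    reconstructed = ar(explanation)   # AR: text back to activation
    return cosine_similarity(activation, reconstructed)


def toy_av(activation: Vector) -> str:
    """Hypothetical stand-in for the AV: names the largest component."""
    top = max(range(len(activation)), key=lambda i: activation[i])
    return f"feature {top} is most active"


def toy_ar(explanation: str, dim: int = 3) -> List[float]:
    """Hypothetical stand-in for the AR: one-hot vector for the named feature."""
    top = int(explanation.split()[1])
    return [1.0 if i == top else 0.0 for i in range(dim)]


if __name__ == "__main__":
    print(round_trip_score([3.0, 4.0, 0.0], toy_av, toy_ar))
```

The toy explanation keeps only the index of the largest component, so any activation that is not aligned with a single axis scores below 1. That loss is the point of the setup: a text explanation that drops information about the activation yields a worse reconstruction and therefore a lower score.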