1 Matching Annotations
  1. Jan 2025
    1. LLM2Attn (Figure 1 (c)). We replace the language model with a single randomly-initialized multi-head attention layer.
       • LLM2Trsf (Figure 1 (d)). We replace the language model with a single randomly-initialized transformer block.

      What are the parameters of the attention layer and the transformer block? They don't seem to be mentioned anywhere; a rough sketch of what the swap might look like is below.
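
      A minimal PyTorch sketch of the two ablations as the excerpt describes them (swap the LLM backbone for one attention layer or one transformer block). The widths here (d_model, n_heads, d_ff) are placeholders I chose for illustration, precisely because the quoted text does not state them.

      ```python
      import torch
      import torch.nn as nn

      class LLM2Attn(nn.Module):
          """Ablation: LLM backbone replaced by one randomly-initialized
          multi-head attention layer (hyperparameters are assumptions)."""
          def __init__(self, d_model=768, n_heads=12):
              super().__init__()
              self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

          def forward(self, x):
              # plain self-attention over the input token/patch embeddings
              out, _ = self.attn(x, x, x)
              return out

      class LLM2Trsf(nn.Module):
          """Ablation: LLM backbone replaced by one randomly-initialized
          transformer block (widths are illustrative only)."""
          def __init__(self, d_model=768, n_heads=12, d_ff=3072):
              super().__init__()
              self.block = nn.TransformerEncoderLayer(
                  d_model, n_heads, dim_feedforward=d_ff, batch_first=True)

          def forward(self, x):
              return self.block(x)

      # illustrative shapes: batch of 32 sequences, 96 embeddings of width 768
      x = torch.randn(32, 96, 768)
      print(LLM2Attn()(x).shape, LLM2Trsf()(x).shape)
      ```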