Significant probability changes are observed in deeper layers
Why does it lean so strongly towards yes before becoming uncertain? Maybe the model tends to favor positive responses, as was mentioned earlier for another model. But I hypothesized that there would also be relation hallucinations with yes and no flipped compared to the Figure. Since the plot is averaged, are those cases so rare that they no longer show up after averaging?
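To get a feeling for the averaging question, here is a minimal sketch of how much a rare flipped subgroup would shift an averaged curve at a single layer. All numbers are made up for illustration and are not taken from the paper: I assume typical samples have P(yes) = 0.9, flipped samples have P(yes) = 0.1, and the hypothetical helper `mean_yes_probability` simply averages over both groups.

```python
def mean_yes_probability(n_total, n_flipped, p_yes_typical=0.9, p_yes_flipped=0.1):
    """Average P(yes) at one layer over n_total samples.

    n_flipped samples are the hypothesized 'flipped' cases that lean towards no.
    The two probabilities are illustrative assumptions, not values from the paper.
    """
    n_typical = n_total - n_flipped
    return (n_typical * p_yes_typical + n_flipped * p_yes_flipped) / n_total


# Compare the averaged value for different shares of flipped cases.
for n_flipped in (0, 10, 50, 200):
    avg = mean_yes_probability(1000, n_flipped)
    print(f"{n_flipped:4d} flipped of 1000 -> mean P(yes) = {avg:.3f}")
```

Under these assumptions, 1% flipped cases move the mean only from 0.900 to 0.892, which would be invisible in an averaged plot. So rare flipped cases could exist without showing up; checking this would need per-sample curves or a histogram instead of the mean.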
Why does it become uncertain only once the last layers are reached? They call it a sharp change in Figure 8b.