- Mar 2025
-
proceedings.mlr.press
-
Examples of mistakes where we can use attention to gain intuition into what the model saw.
Perhaps the best use of this approach is finding mistakes, or understanding why a model does badly on certain data instances.
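A minimal sketch of what this looks like in practice, assuming a Hugging Face sequence-classification model; the checkpoint name and example sentence are placeholders, and averaging the last layer's heads is just one simple way to summarize attention:

```python
# Pull attention weights out of a classifier to see which tokens a
# (possibly wrong) prediction attended to. Checkpoint/sentence are placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

inputs = tokenizer("The plot was thin but the acting saved it.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_attentions=True)

# out.attentions holds one tensor per layer, shaped (batch, heads, seq, seq).
# Average the last layer's heads and read attention from the [CLS] position,
# a rough hint at which tokens drove the prediction.
last_layer = out.attentions[-1].mean(dim=1)[0]        # (seq, seq)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for tok, w in sorted(zip(tokens, last_layer[0].tolist()), key=lambda p: -p[1])[:5]:
    print(f"{tok:>12}  {w:.3f}")
```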
-
- Aug 2023
-
arxiv.org
-
Title: Delays, Detours, and Forks in the Road: Latent State Models of Training Dynamics
Authors: Michael Y. Hu, Angelica Chen, Naomi Saphra, Kyunghyun Cho
Note: This paper seems cool, using older, interpretable machine learning models (graphical models) to understand what is going on inside a deep neural network.
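A rough sketch of the core idea as I read it: fit a latent-state model over per-checkpoint training metrics so that discrete states play the role of training "phases". The three-state Gaussian HMM and the synthetic metric trajectory below are my simplifications, not the paper's exact model:

```python
# Fit a Gaussian HMM over (checkpoints, metrics) so latent states mark
# training phases. Metric values are synthetic placeholders.
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(0)
# Fake trajectory over 120 checkpoints: loss and gradient norm.
loss = np.concatenate([rng.normal(4.0, 0.10, 40),   # plateau ("delay")
                       rng.normal(2.0, 0.30, 40),   # rapid descent
                       rng.normal(0.5, 0.05, 40)])  # convergence
grad = np.concatenate([rng.normal(1.0, 0.10, 40),
                       rng.normal(3.0, 0.50, 40),
                       rng.normal(0.2, 0.05, 40)])
X = np.column_stack([loss, grad])                   # (checkpoints, metrics)

hmm = GaussianHMM(n_components=3, covariance_type="diag",
                  n_iter=100, random_state=0)
hmm.fit(X)
states = hmm.predict(X)   # one latent phase label per checkpoint
print(states)             # delays and detours appear as runs of a state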
-
- Feb 2023
-
clementneo.com
-
The code to reproduce our results can be found here.
-
- Jan 2023
-
ar5iv.labs.arxiv.org
-
This input embedding is the initial value of the residual stream, which all attention layers and MLPs read from and write to.
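A minimal sketch of that picture in PyTorch; the dimensions and pre-LayerNorm layout are generic stand-ins, not any specific model's implementation:

```python
# The residual stream as a running tensor: the embedding initializes it,
# and every attention layer and MLP reads it and additively writes back.
import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))

    def forward(self, resid: torch.Tensor) -> torch.Tensor:
        x = self.ln1(resid)
        attn_out, _ = self.attn(x, x, x)   # read from the stream...
        resid = resid + attn_out           # ...write back by addition
        resid = resid + self.mlp(self.ln2(resid))
        return resid

embed = nn.Embedding(50257, 512)
tokens = torch.randint(0, 50257, (1, 8))
resid = embed(tokens)                      # initial value of the residual stream
for block in [Block(512, 8) for _ in range(2)]:
    resid = block(resid)                   # stream accumulates each layer's writes
```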
-
- Apr 2022
-
distill.pub
-
Starting from random noise, we optimize an image to activate a particular neuron (layer mixed4a, unit 11).
And then we use that image as a kind of variable name to refer to the neuron, in a way that is more helpful than the layer number and neuron index within the layer. This explanation is from one of Chris Olah's YouTube videos (https://www.youtube.com/watch?v=gXsKyZ_Y_i8).
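A rough sketch of that optimization loop in PyTorch; torchvision's GoogLeNet stands in for the Inception model (its inception4a module is only an analogue of mixed4a), and real feature visualization adds regularizers and transformations that are omitted here:

```python
# Activation maximization: gradient-ascend an image (initialized as noise)
# on one channel's mean activation. Layer and unit choices are illustrative.
import torch
from torchvision.models import googlenet, GoogLeNet_Weights

model = googlenet(weights=GoogLeNet_Weights.DEFAULT).eval()
for p in model.parameters():
    p.requires_grad_(False)

acts = {}
model.inception4a.register_forward_hook(
    lambda module, inp, out: acts.update(feat=out))

img = torch.randn(1, 3, 224, 224, requires_grad=True)
opt = torch.optim.Adam([img], lr=0.05)
unit = 11
for _ in range(200):
    opt.zero_grad()
    model(img)
    loss = -acts["feat"][0, unit].mean()   # maximize the unit's activation
    loss.backward()
    opt.step()
# img now crudely shows what the unit responds to; Distill's versions add
# transformation robustness and decorrelated parameterizations for clarity.
```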
-
- Jun 2020
-
psyarxiv.com
-
Moreau, David, and Kristina Wiebels. ‘Assessing Change in Intervention Research: The Benefits of Composite Outcomes’, 2 June 2020. https://doi.org/10.31234/osf.io/t9hw3.
-
- Jun 2019
-
towardsdatascience.com
-
To interpret a model, we require the following insights (see the sketch below):
  - Features in the model which are most important.
  - For any single prediction from a model, the effect of each feature in the data on that particular prediction.
  - The effect of each feature over a large number of possible predictions.
Machine learning interpretability
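A minimal sketch of those three levels with scikit-learn; a linear model is used so that per-prediction effects are exactly coefficient times (feature minus its mean), and the dataset is just an illustration, not the article's example:

```python
# Three interpretation levels on one model: global importance, the effect of
# each feature on a single prediction, and effects across many predictions.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.inspection import permutation_importance
from sklearn.linear_model import Ridge

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = Ridge().fit(X, y)

# 1. Most important features: score drop when each column is shuffled.
imp = permutation_importance(model, X, y, n_repeats=10, random_state=0)
print(dict(zip(X.columns, imp.importances_mean.round(3))))

# 2. One prediction: each feature's additive contribution for a single row.
row = X.iloc[0].values
print(dict(zip(X.columns, (model.coef_ * (row - X.mean().values)).round(2))))

# 3. Many predictions: mean absolute contribution across the dataset.
contrib = model.coef_ * (X.values - X.mean().values)
print(dict(zip(X.columns, np.abs(contrib).mean(axis=0).round(2))))
```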
-