Hypothesis

10 Matching Annotations

Feb 2023
colah.github.io colah.github.io

Neural Networks, Manifolds, and Topology -- colah's blog

1
1. mshook 10 Feb 2023
  
  in Public
  
  The manifold hypothesis is that natural data forms lower-dimensional manifolds in its embedding space.
  
  colah manifold hypothesis dimension
Visit annotations in context

Tags

manifold

dimension

hypothesis

colah

Annotators

mshook

URL

colah.github.io/posts/2014-03-NN-Manifolds-Topology/
Jan 2023
transformer-circuits.pub transformer-circuits.pub

A Mathematical Framework for Transformer Circuits

2
1. mshook 26 Jan 2023
  
  in Public
  
  One of the main features of the high level architecture of a transformer is that each layer adds its results into what we call the “residual stream.”Constructing models with a residual stream traces back to early work by the Schmidhuber group, such as highway networks and LSTMs, which have found significant modern success in the more recent residual network architecture . In transformers, the residual stream vectors are often called the “embedding.” We prefer the residual stream terminology, both because it emphasizes the residual nature (which we believe to be important) and also because we believe the residual stream often dedicates subspaces to tokens other than the present token, breaking the intuitions the embedding terminology suggests. The residual stream is simply the sum of the output of all the previous layers and the original embedding. We generally think of the residual stream as a communication channel, since it doesn't do any processing itself and all layers communicate through it.
  
  transformer residual architecture explanation colah attention image
2. mshook 26 Jan 2023
  
  in Public
  
  A transformer starts with a token embedding, followed by a series of “residual blocks”, and finally a token unembedding. Each residual block consists of an attention layer, followed by an MLP layer. Both the attention and MLP layers each “read” their input from the residual stream (by performing a linear projection), and then “write” their result to the residual stream by adding a linear projection back in. Each attention layer consists of multiple heads, which operate in parallel.
  
  transformer residual architecture alternative colah explanation
Visit annotations in context

Tags

transformer

explanation

alternative

colah

attention

image

architecture

residual

Annotators

mshook

URL

transformer-circuits.pub/2021/framework/index.html
Sep 2022
transformer-circuits.pub transformer-circuits.pub

Toy Models of Superposition

1
1. mshook 16 Sep 2022
  
  in Public
  
  Consider a toy model where we train an embedding of five features of varying importanceWhere “importance” is a scalar multiplier on mean squared error loss. in two dimensions, add a ReLU afterwards for filtering, and vary the sparsity of the features.
  
  colah autoencoder toy model ml nn
Visit annotations in context

Tags

toy

nn

autoencoder

model

colah

ml

Annotators

mshook

URL

transformer-circuits.pub/2022/toy_model/index.html
Apr 2022
distill.pub distill.pub

Feature Visualization

1
1. mshook 02 Apr 2022
  
  in Public
  
  Starting from random noise, we optimize an image to activate a particular neuron (layer mixed4a, unit 11).
  
  And then we use that image as a kind of variable name to refer to the neuron in a way that more helpful than the the layer number and neuron index within the layer. This explanation is via one of Chris Olah's YouTube videos (https://www.youtube.com/watch?v=gXsKyZ_Y_i8)
  
  ml feature visualization colah good nn cnn inception interpretability
Visit annotations in context

Tags

inception

good

cnn

feature

interpretability

colah

nn

ml

visualization

Annotators

mshook

URL

distill.pub/2017/feature-visualization
Nov 2021
distill.pub distill.pub

The Building Blocks of Interpretability

1
1. mshook 17 Nov 2021
  
  in Public
  
  The cube of activations that a neural network for computer vision develops at each hidden layer. Different slices of the cube allow us to target the activations of individual neurons, spatial positions, or channels.
  
  This is first explanation of
  
  colah ml nn image gmlp
Visit annotations in context

Tags

gmlp

nn

image

colah

ml

Annotators

mshook

URL

distill.pub/2018/building-blocks
Oct 2021
colah.github.io colah.github.io

Visualizing Representations: Deep Learning and Human Beings - colah's blog

2
1. mshook 31 Oct 2021
  
  in Public
  
  This approach, visualizing high-dimensional representations using dimensionality reduction, is an extremely broadly applicable technique for inspecting models in deep learning.
  
  dimension ml colah nn
2. mshook 31 Oct 2021
  
  in Public
  
  These layers warp and reshape the data to make it easier to classify.
  
  nn layer ml example colah
Visit annotations in context

Tags

dimension

colah

nn

example

layer

ml

Annotators

mshook

URL

colah.github.io/posts/2015-01-Visualizing-Representations/
Sep 2021
colah.github.io colah.github.io

Visualizing MNIST: An Exploration of Dimensionality Reduction - colah's blog

1
1. mshook 24 Sep 2021
  
  in Public
  
  One popular theory among machine learning researchers is the manifold hypothesis: MNIST is a low dimensional manifold, sweeping and curving through its high-dimensional embedding space. Another hypothesis, more associated with topological data analysis, is that data like MNIST consists of blobs with tentacle-like protrusions sticking out into the surrounding space.
  
  mnist dimension colah manifold
Visit annotations in context

Tags

manifold

dimension

mnist

colah

Annotators

mshook

URL

colah.github.io/posts/2014-10-Visualizing-MNIST/
Aug 2021
colah.github.io colah.github.io

Deep Learning, NLP, and Representations - colah's blog

1
1. mshook 06 Aug 2021
  
  in Public
  
  Recursive Neural Networks
  
  rnn colah ml nn
Visit annotations in context

Tags

rnn

nn

colah

ml

Annotators

mshook

URL

colah.github.io/posts/2014-07-NLP-RNNs-Representations/

Tags

Annotators

URL

Tags

Annotators

URL

Tags

Annotators

URL

Tags

Annotators

URL

Tags

Annotators

URL

Tags

Annotators

URL

Tags

Annotators

URL

Tags

Annotators

URL