Hypothesis

12 Matching Annotations

Jun 2017
arxiv.org

1612.01175v1.pdf

1
1. pranava 21 Jun 2017
  
  in Public
  
  Who is Mistaken? Benjamin Eysenbach (MIT, bce@mit.edu), Carl Vondrick (MIT, vondrick@mit.edu), Antonio Torralba (MIT, torralba@csail.mit.edu). Figure 1: Can you determine who believes something incorrectly in this scene? In this paper, we study how to recognize when a person in a scene is mistaken. Above, the woman is mistaken about the chair being pulled away from her in the third frame, causing her to fall down. The red arrow indicates false belief. We introduce a new dataset of abstract scenes to study when people have false beliefs. We propose approaches to learn to recognize who is mistaken and when they are mistaken. Abstract: Recognizing when people have false beliefs is crucial for understanding their actions. We introduce the novel problem of identifying when people in abstract scenes have incorrect beliefs. We present a dataset of scenes, each visually depicting an 8-frame story in which a character has a mistaken belief. We then create a representation of characters’ beliefs for two tasks in human action understanding: predicting who is mistaken, and when they are mistaken. Experiments suggest that our method for identifying mistaken characters performs better on these tasks than simple baselines. Diagnostics on our model suggest it learns important cues for recognizing mistaken beliefs, such as gaze. We believe models of people’s beliefs will have many
  
  Interesting
  
  Vision
Visit annotations in context

Tags

Vision

Annotators

pranava

URL

arxiv.org/pdf/1612.01175v1.pdf
arxiv.org

()

1
1. pranava 21 Jun 2017
  
  in Public
  
  The analysis shows that, although they are superficially similar, NCE is a general parameter estimation technique that is asymptotically unbiased, while negative sampling is best understood as a family of binary classification models that are useful for learning word representations but not as a general-purpose estimator
  
  I think NCE is slightly different from CE. Unfortunately, Chris sort of ignores Noah's work on CE in this explanation, although the connection between NCE and NS is nicely explained.
  
  NegSampling
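The connection between NCE and NS mentioned in the note can be sketched numerically. This is a minimal illustration, not code from the paper: the function names, the score value, and the vocabulary size are all assumptions. It uses the standard forms of the two binary posteriors, where `u = exp(score)`, NCE uses `u / (u + k * q(w))`, and negative sampling uses `u / (u + 1)`.

```python
import math

def nce_posterior(score, k, q_w):
    """P(D=1 | w, c) under NCE: u / (u + k * q(w)), with u = exp(score)."""
    u = math.exp(score)
    return u / (u + k * q_w)

def neg_sampling_posterior(score):
    """P(D=1 | w, c) under negative sampling: sigmoid(score) = u / (u + 1)."""
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical values chosen only for illustration.
vocab_size = 10_000
score = 0.7

# With k = |V| noise samples and a uniform noise distribution q(w) = 1/|V|,
# the product k * q(w) equals 1 and the two posteriors coincide.
p_nce_uniform = nce_posterior(score, k=vocab_size, q_w=1.0 / vocab_size)
p_ns = neg_sampling_posterior(score)

# With a small k and a non-uniform q(w), the two posteriors differ.
p_nce_other = nce_posterior(score, k=5, q_w=0.001)
```

Under these assumptions, negative sampling looks like NCE with the `k * q(w)` term fixed to 1, which is one way to read the claim that it is not a general-purpose estimator.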
Visit annotations in context

Tags

NegSampling

Annotators

pranava

URL

arxiv.org/pdf/1410.8251.pdf
pdfs.semanticscholar.org

e2c498be92fa7074738b15a48808fa48cece.pdf

1
1. pranava 21 Jun 2017
  
  in Public
  
  We present an extension to Jaynes’ maximum entropy principle that handles latent variables. The principle of latent maximum entropy we propose is different from both Jaynes’ maximum entropy principle and maximum likelihood estimation, but often yields better estimates in the presence of hidden variables and limited training data. We first show that solving for a latent maximum entropy model poses a hard nonlinear constrained optimization problem in general. However, we then show that feasible solutions to this problem can be obtained efficiently for the special case of log-linear models—which forms the basis for an efficient approximation to the latent maximum entropy principle. We derive an algorithm that combines expectation-maximization with iterative scaling to produce feasible log-linear solutions. This algorithm can be interpreted as an alternating minimization algorithm in the information divergence, and reveals an intimate connection between the latent maximum entropy and maximum likelihood principles. To select a final model, we generate a series of feasible candidates, calculate the entropy of each, and choose the model that attains the highest entropy. Our experimental results show that estimation based on the latent maximum entropy principle generally gives better results than maximum likelihood when estimating latent variable models on small observed data samples.
  
  Towards intelligent negative sampling
  
  NegSampling
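The final selection step described in the quoted abstract (generate feasible candidates, compute the entropy of each, keep the highest) can be sketched as follows. This is only an illustration under loud assumptions: the candidate distributions are made-up stand-ins for models that EM with iterative scaling would produce, and `entropy` and `select_max_entropy` are hypothetical helper names.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution given as a list."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def select_max_entropy(candidates):
    """Return the candidate distribution with the highest entropy."""
    return max(candidates, key=entropy)

# Hypothetical feasible candidates, e.g. from different EM restarts.
candidates = [
    [0.7, 0.2, 0.1],
    [0.4, 0.35, 0.25],
    [0.9, 0.05, 0.05],
]

best = select_max_entropy(candidates)
```

In the paper's setting every candidate already satisfies the feature constraints, so the entropy comparison only chooses among feasible solutions.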
Visit annotations in context

Tags

NegSampling

Annotators

pranava

URL

pdfs.semanticscholar.org/f2a3/e2c498be92fa7074738b15a48808fa48cece.pdf
pdfs.semanticscholar.org

smith+eisner.acl05.pdf

5
1. pranava 21 Jun 2017
  
  in Public
  
  Wang et al. (2002) discuss the latent maximum entropy principle. They advocate running EM many times and selecting the local maximum that maximizes entropy. One might do the same for the local maxima of any CE objective, though theoretical and experimental support for this idea remain for future work.
  
  Interesting proposal, quite similar to the neg. sampling with 'exploration / exploitation'.
  
  Definitely worth at least a couple of reads!
  
  NegSampling
2. pranava 21 Jun 2017
  
  in Public
  
  One can envision a mixed objective function that tries to fit the labeled examples while discriminating unlabeled examples from their neighborhoods.
  
  Interesting - a mixed objective function -> this seems like a multi-task framework!
  
  --> Re-read and understand
  
  NegSampling
3. pranava 21 Jun 2017
  
  in Public
  
  We have presented contrastive estimation, a new probabilistic estimation criterion that forces a model to explain why the given training data were better than bad data implied by the positive examples.
  
  This is again an interesting way to see it: "... forces a model to explain why the given training data were better than bad data implied by the positive examples."
  
  NegSampling
4. pranava 21 Jun 2017
  
  in Public
  
  Viewed as a CE method, this approach (though effective when there are few hypotheses) seems misguided; the objective says to move mass to each example at the expense of all other training examples
  
  A very cool remark and makes sense!!
  
  NegSampling
5. pranava 21 Jun 2017
  
  in Public
  
  An alternative is to restrict the neighborhood to the set of observed training examples rather than all possible examples (Riezler, 1999; Johnson et al., 1999; Riezler et al., 2000):
  
  This equation is reminiscent of the loss proposed by Nickel and Kiela, 2017 (the Poincaré Embeddings paper). In particular, look at their negative sampling.
  
  NegSampling
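To make the resemblance noted above concrete, the two objectives can be placed side by side. This is a sketch written from the general form of each objective, not a reproduction of the equation that followed the quoted sentence; the symbols (the neighborhood N, hidden structure y, distance d, and relation set D) are the usual notation and are assumptions here.

```latex
% Contrastive estimation, general form: each example x_i competes
% against its neighborhood N(x_i); y ranges over hidden structures.
\[
\max_{\theta} \; \sum_{i} \log
  \frac{\sum_{y} p(x_i, y \mid \theta)}
       {\sum_{x' \in \mathcal{N}(x_i)} \sum_{y} p(x', y \mid \theta)}
\]

% Poincare embeddings loss: each observed pair (u, v) competes
% against sampled negatives v' in N(u), which also contains v.
\[
\mathcal{L}(\Theta) \;=\; \sum_{(u,v) \in \mathcal{D}} \log
  \frac{e^{-d(u,v)}}
       {\sum_{v' \in \mathcal{N}(u)} e^{-d(u,v')}}
\]
```

Both are log-ratios of a positive item's score to the total score over a neighborhood. Restricting the neighborhood to the observed training examples, as in the quoted sentence, would correspond to taking N(x_i) to be the training set itself.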
Visit annotations in context

Tags

NegSampling

Annotators

pranava

URL

pdfs.semanticscholar.org/29c3/4a034f6f35915a141dac98cabf625bea2b3c.pdf
beamandrew.github.io

You can probably use deep learning even if your data isn't that big

1
1. pranava 05 Jun 2017
  
  in Public
  
  Data vs. Deep Learning
Visit annotations in context

Annotators

pranava

URL

beamandrew.github.io//deeplearning/2017/06/04/deep_learning_works.html
smerity.com

Peeking into the neural network architecture used for Google's Neural Machine Translation

1
1. pranava 05 Jun 2017
  
  in Public
  
  Google's NMT.
Visit annotations in context

Annotators

pranava

URL

smerity.com/articles/2016/google_nmt_arch.html
news.ycombinator.com

Foundations of deep learning | Hacker News

1
1. pranava 05 Jun 2017
  
  in Public
  
  On Foundations
Visit annotations in context

Annotators

pranava

URL

news.ycombinator.com/item
www.alexirpan.com

On The Perils of Batch Norm

1
1. pranava 05 Jun 2017
  
  in Public
  
  Implementation issues in batch norm.
Visit annotations in context

Annotators

pranava

URL

alexirpan.com/2017/04/26/perils-batch-norm.html
