8 Matching Annotations
  1. Jul 2021
    1. Recommendations: DON'T use shifted PPMI with SVD. DON'T use SVD "correctly", i.e. without eigenvector weighting (performance drops 15 points compared to with eigenvalue weighting with p = 0.5). DO use PPMI and SVD with short contexts (window size of 2). DO use many negative samples with SGNS. DO always use context distribution smoothing (raise unigram distribution to the power of α = 0.75) for all methods. DO use SGNS as a baseline (robust, fast and cheap to train). DO try adding context vectors in SGNS and GloVe.
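
      A minimal sketch of two of these recommendations, context distribution smoothing (α = 0.75) and eigenvalue weighting (p = 0.5), applied to a toy word-context co-occurrence matrix; the function names and the matrix are illustrative, not code from the paper:

      ```python
      import numpy as np

      def smoothed_ppmi(counts, alpha=0.75):
          """PPMI with context distribution smoothing: raise context counts to the power alpha."""
          total = counts.sum()
          p_wc = counts / total                              # joint probabilities
          p_w = counts.sum(axis=1) / total                   # word marginals
          smoothed = counts.sum(axis=0) ** alpha             # smoothed context counts
          p_c = smoothed / smoothed.sum()                    # smoothed context marginals
          with np.errstate(divide="ignore"):
              pmi = np.log(p_wc / (p_w[:, None] * p_c[None, :]))
          return np.maximum(pmi, 0.0)                        # clip at zero -> positive PMI

      def svd_embeddings(ppmi, d=2, p=0.5):
          """Truncated SVD with eigenvalue weighting: rows of U_d scaled by Sigma_d ** p."""
          U, S, _ = np.linalg.svd(ppmi, full_matrices=False)
          return U[:, :d] * (S[:d] ** p)

      counts = np.array([[2.0, 1.0, 0.0],
                         [1.0, 3.0, 1.0],
                         [0.0, 1.0, 2.0]])                   # toy co-occurrence counts
      print(svd_embeddings(smoothed_ppmi(counts)))
      ```
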
  2. Jun 2017
  3. Apr 2017
    1. $\arg\max_{v_w, v_c} \sum_{(w,c) \in D} \log \frac{1}{1 + e^{-v_c \cdot v_w}}$

      maximise the log probability.
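
      A small numpy sketch of this objective for a handful of (word, context) index pairs; the matrices `W_vecs` and `C_vecs` and the pair list are illustrative stand-ins, not the paper's data:

      ```python
      import numpy as np

      def positive_pair_objective(W_vecs, C_vecs, pairs):
          """Sum of log(1 / (1 + exp(-v_c . v_w))) over observed (word, context) pairs."""
          total = 0.0
          for w, c in pairs:
              score = W_vecs[w] @ C_vecs[c]                   # dot product v_c . v_w
              total += np.log(1.0 / (1.0 + np.exp(-score)))   # log-sigmoid term
          return total

      rng = np.random.default_rng(0)
      W_vecs = rng.normal(size=(5, 3))        # word vectors (5 words, d = 3)
      C_vecs = rng.normal(size=(5, 3))        # context vectors
      print(positive_pair_objective(W_vecs, C_vecs, [(0, 1), (2, 3)]))
      ```

      On its own this term is trivially maximised by making every dot product large, which is why negative sampling is introduced alongside it.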

    2. $p(D = 1 \mid w, c)$ the probability that $(w, c)$ came from the data, and by $p(D = 0 \mid w, c) = 1 - p(D = 1 \mid w, c)$ the probability that $(w, c)$ did not.

      The probability that a (word, context) pair occurs in the text or not.
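
      In the paper's formulation this probability is modelled with a sigmoid over the dot product, $p(D = 1 \mid w, c; \theta) = \frac{1}{1 + e^{-v_c \cdot v_w}}$, so the objective above is just the log of this quantity summed over the observed pairs.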

    3. Loosely speaking, we seek parameter values (that is, vector representations for both words and contexts) such that the dot product $v_w \cdot v_c$ associated with “good” word-context pairs is maximized.
    4. In the skip-gram model, each word $w \in W$ is associated with a vector $v_w \in \mathbb{R}^d$ and similarly each context $c \in C$ is represented as a vector $v_c \in \mathbb{R}^d$, where $W$ is the words vocabulary, $C$ is the contexts vocabulary, and $d$ is the embedding dimensionality.

      Factors involved in the skip-gram model.
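
      Concretely (a sketch with illustrative sizes, not taken from the paper), these factors amount to two trainable matrices, one row per word in W and one per context in C:

      ```python
      import numpy as np

      n_words, n_contexts, d = 10_000, 10_000, 300           # |W|, |C|, embedding dimensionality
      rng = np.random.default_rng(42)
      word_vectors = rng.normal(scale=0.1, size=(n_words, d))        # v_w for each w in W
      context_vectors = rng.normal(scale=0.1, size=(n_contexts, d))  # v_c for each c in C

      n_parameters = word_vectors.size + context_vectors.size        # (|W| + |C|) * d
      print(n_parameters)                                            # 6,000,000 for these sizes
      ```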

  4. Jun 2016
    1. Neural Word Embedding Methods
    2. dimension of embedding vectors strongly depends on applications and uses, and is basically determined based on the performance and memory space (or calculation speed) trade-off
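
      As a rough illustration of the memory side of that trade-off (assuming float32 storage and a hypothetical 100,000-word vocabulary): at d = 300 the embedding table takes 100,000 × 300 × 4 bytes ≈ 120 MB, while at d = 50 it takes only about 20 MB.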