9 Matching Annotations
  1. Feb 2025
    1. Word embeddings are a method that consists of deriving, by means of various algorithms, numerical vector representations of the terms present in a corpus, according to their contextual relations [7] (Michael Gavin et al., "Spaces of Meaning: Conceptual History, Vector Semantics, and Close Reading", Debates in the Digital Humanities 2019, ed. Matthew K. Gold and Lauren F. Klein (Minneapolis; London: University of Minnesota Press, 2019), 243-67). This method follows the principles of distributional semantics [8].
  2. Jul 2021
    1. Recommendations: DON'T use shifted PPMI with SVD. DON'T use SVD "correctly", i.e. without eigenvector weighting (performance drops 15 points compared to eigenvalue weighting with $p = 0.5$). DO use PPMI and SVD with short contexts (window size of 2). DO use many negative samples with SGNS. DO always use context distribution smoothing (raise the unigram distribution to the power of $\alpha = 0.75$) for all methods. DO use SGNS as a baseline (robust, fast and cheap to train). DO try adding context vectors in SGNS and GloVe.
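
The context distribution smoothing recommended above can be illustrated with a minimal sketch. The function name `smoothed_context_distribution` and the toy counts are assumptions made for illustration; they do not come from the annotated article. The sketch only shows the effect of raising unigram counts to the power $\alpha = 0.75$ before normalising.

```python
from collections import Counter


def smoothed_context_distribution(context_counts, alpha=0.75):
    """Return P_alpha(c) = count(c)**alpha / sum over c' of count(c')**alpha.

    Hypothetical helper for illustration; alpha=1.0 gives the plain
    unigram distribution.
    """
    powered = {c: n ** alpha for c, n in context_counts.items()}
    total = sum(powered.values())
    return {c: v / total for c, v in powered.items()}


# Toy context counts (made up for the example).
counts = Counter({"the": 1000, "embedding": 10, "eigenvalue": 1})

raw = smoothed_context_distribution(counts, alpha=1.0)
smooth = smoothed_context_distribution(counts, alpha=0.75)

for c in counts:
    print(f"{c:>10}  raw={raw[c]:.5f}  smoothed={smooth[c]:.5f}")
```

With $\alpha < 1$, probability mass moves from frequent contexts to rare ones, so rare contexts are drawn more often as negative samples than their raw frequency would suggest.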
  3. Jun 2017
  4. Apr 2017
    1. $\arg\max_{v_w, v_c} \sum_{(w,c) \in D} \log \frac{1}{1 + e^{-v_c \cdot v_w}}$

      maximise the log probability.

    2. $p(D = 1 \mid w, c)$ the probability that $(w, c)$ came from the data, and by $p(D = 0 \mid w, c) = 1 - p(D = 1 \mid w, c)$ the probability that $(w, c)$ did not.

      The probability that a (word, context) pair is present in the text, or not.

    3. Loosely speaking, we seek parameter values (that is, vector representations for both words and contexts) such that the dot product $v_w \cdot v_c$ associated with "good" word-context pairs is maximized.
    4. In the skip-gram model, each word $w \in W$ is associated with a vector $v_w \in \mathbb{R}^d$ and similarly each context $c \in C$ is represented as a vector $v_c \in \mathbb{R}^d$, where $W$ is the words vocabulary, $C$ is the contexts vocabulary, and $d$ is the embedding dimensionality.

      Factors involved in the skip-gram model.
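
The quantities highlighted above (the vectors $v_w, v_c \in \mathbb{R}^d$, the probability $p(D = 1 \mid w, c)$ and the log-probability objective) can be tied together in a small NumPy sketch. It assumes, consistently with the objective in the first highlight, that $p(D = 1 \mid w, c) = 1 / (1 + e^{-v_c \cdot v_w})$. The toy vocabularies, the dimensionality `d = 4`, the learning rate and all function names are assumptions made for illustration, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabularies (hypothetical): W holds words, C holds contexts.
words = ["cat", "dog"]
contexts = ["purrs", "barks"]
d = 4  # embedding dimensionality

# One d-dimensional vector per word (v_w) and per context (v_c).
V_w = rng.normal(scale=0.1, size=(len(words), d))
V_c = rng.normal(scale=0.1, size=(len(contexts), d))


def p_from_data(v_w, v_c):
    """p(D = 1 | w, c) = 1 / (1 + exp(-v_c . v_w))."""
    return 1.0 / (1.0 + np.exp(-np.dot(v_c, v_w)))


def log_likelihood(pairs, V_w, V_c):
    """Sum of log p(D = 1 | w, c) over the observed pairs."""
    return sum(np.log(p_from_data(V_w[w], V_c[c])) for w, c in pairs)


# Observed "good" pairs as (word index, context index).
D = [(0, 0), (1, 1)]
before = log_likelihood(D, V_w, V_c)

# One gradient-ascent step on each pair:
# d/dv_w log p = (1 - p) * v_c   and   d/dv_c log p = (1 - p) * v_w
lr = 0.5
for w, c in D:
    g = 1.0 - p_from_data(V_w[w], V_c[c])
    grad_w = g * V_c[c]
    grad_c = g * V_w[w]
    V_w[w] += lr * grad_w
    V_c[c] += lr * grad_c

after = log_likelihood(D, V_w, V_c)
print(f"log-likelihood before: {before:.4f}, after: {after:.4f}")
```

Each step increases the dot product of the observed pairs, which is the "loosely speaking" reading in the third highlight. Because this objective contains only positive pairs, it can be pushed up indefinitely by making all dot products large, which is the motivation for the negative samples used in SGNS.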

  5. Jun 2016
    1. Neural Word Embedding Methods
    2. dimension of embedding vectors strongly depends on applications and uses, and is basically determined based on the performance and memory space (or calculation speed) trade-off
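
The memory side of this trade-off can be estimated with simple arithmetic: a dense embedding table stores one value per vocabulary entry and dimension. The sketch below is an illustration only; the helper name `embedding_memory_mib`, the 100,000-word vocabulary and the 4-byte (float32) storage are assumptions, not figures from the annotated paper.

```python
def embedding_memory_mib(vocab_size, dim, bytes_per_value=4):
    """Approximate size of a dense embedding table in MiB.

    Assumes one value of `bytes_per_value` bytes (4 for float32)
    per (word, dimension) entry; ignores any index or metadata overhead.
    """
    return vocab_size * dim * bytes_per_value / 2**20


# Hypothetical vocabulary size, chosen only to make the scaling visible.
vocab_size = 100_000
for dim in (50, 100, 300):
    print(f"d={dim:>3}: {embedding_memory_mib(vocab_size, dim):.1f} MiB")
```

Memory grows linearly with the dimension, so doubling $d$ doubles the table size; whether the extra dimensions improve task performance has to be checked empirically for the application at hand.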