230 Matching Annotations
  1. Dec 2018
    1. A semantic treebank is a collection of natural language sentences annotated with a meaning representation. These resources use a formal representation of each sentence's semantic structure.
  2. Nov 2018
    1. Language GANs Falling Short

      One of the Paper Summaries of this paper is especially well written!


      This paper’s high-level goal is to evaluate how well GAN-type structures for generating text are performing, compared to more traditional maximum likelihood methods. In the process, it zooms into the ways that the current set of metrics for comparing text generation fail to give a well-rounded picture of how models are performing.

      In the old paradigm of maximum likelihood estimation, models were both trained and evaluated on maximizing the likelihood of each word, given the prior words in a sequence. That is, models were good when they assigned high probability to true tokens, conditioned on past tokens. However, GANs work in a fundamentally different framework, in that they aren’t trained to increase the likelihood of the next (ground truth) word in a sequence, but to generate a word that will make a discriminator more likely to see the sentence as realistic. Since GANs don’t directly model the probability of token t given prior tokens, you can’t evaluate them using this maximum likelihood framework.
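      The MLE objective described above can be sketched in a few lines. This is a toy illustration, not the paper's setup: the conditional probabilities are invented for the example, and a real model would condition on the whole prefix rather than just the previous token.

```python
import math

# Toy bigram model: P(next token | previous token).
# All probabilities here are made up for illustration.
cond_prob = {
    ("the", "cat"): 0.4,
    ("cat", "sat"): 0.5,
    ("sat", "down"): 0.3,
}

def sequence_log_likelihood(tokens):
    """Sum of log P(token_t | token_{t-1}) -- the quantity MLE training maximizes."""
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log(cond_prob[(prev, cur)])
    return total

ll = sequence_log_likelihood(["the", "cat", "sat", "down"])
```

      A GAN generator has no such per-token probability to sum, which is exactly why this evaluation recipe does not transfer.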

      This paper surveys a range of prior work that has evaluated GANs and MLE models on two broad categories of metrics, occasionally showing GANs to perform better on one or the other, but not really giving a way to trade off between the two.

      • The first type of metric, shorthanded as “quality”, measures how aligned the generated text is with some reference corpus of text: to what extent your generated text seems to “come from the same distribution” as the original. BLEU, a heuristic frequently used in translation, and also leveraged here, measures how frequently the n-grams of the generated text also occur in the reference text. N typically goes up to 4, and so in addition to comparing the distributions of single tokens in the reference and generated text, BLEU also compares shared bigrams, trigrams, and 4-grams to measure more precise similarity of text.

      • The second metric, shorthanded as “diversity”, measures how different generated sentences are from one another. If you want to design a model to generate text, you presumably want it to be able to generate a diverse range of text - in probability terms, you want to fully sample from the distribution, rather than just taking the expected or mean value. Linguistically, a failure of diversity would show up as a generator that just produces the same sentence over and over again. That sentence can be highly representative of the original text, but the output lacks diversity. One metric used for this is the same kind of BLEU score, but computed for each generated sentence against a corpus of previously generated sentences, and, here, the goal is for the overlap to be as low as possible.

      The trouble with these two metrics is that, in their raw state, they’re pretty incommensurable, and hard to trade off against one another.
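      The two metrics above can be sketched with the same n-gram machinery. This is a minimal illustration of clipped n-gram precision (the core of BLEU, without its brevity penalty or geometric mean over n), reused once against a reference for “quality” and once against another generated sample for “diversity”; the sentences are invented examples.

```python
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_precision(generated, reference, n):
    """Fraction of the generated text's n-grams that also appear in the
    reference (clipped counts, as in BLEU's modified precision)."""
    gen, ref = Counter(ngrams(generated, n)), Counter(ngrams(reference, n))
    if not gen:
        return 0.0
    overlap = sum(min(count, ref[g]) for g, count in gen.items())
    return overlap / sum(gen.values())

# "Quality": generated sentence scored against a reference sentence.
ref = "the cat sat on the mat".split()
gen = "the cat sat on a mat".split()
quality = ngram_precision(gen, ref, 2)       # 3 of 5 bigrams shared -> 0.6

# "Diversity" (self-BLEU style): one sample scored against another sample;
# lower overlap means more diverse output.
other = "a dog ran in the park".split()
self_overlap = ngram_precision(gen, other, 2)  # no shared bigrams -> 0.0
```

      A high quality score and a low self-overlap score pull in opposite directions, which is the incommensurability the paragraph above describes.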

      For more, I'll need to read the Paper Summary...

  3. Jul 2017
    1. This third research question led to the formulation of agile text mining, a new methodology to support the development of efficient TMAs. Agile text mining copes with the unpredictable realities of creating text-mining applications.
  4. Jun 2017
  5. Apr 2017
    1. In the skip-gram model, each word w ∈ W is associated with a vector v_w ∈ R^d and similarly each context c ∈ C is represented as a vector v_c ∈ R^d, where W is the words vocabulary, C is the contexts vocabulary, and d is the embedding dimensionality.

      Factors involved in the Skip gram model
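      The setup quoted above can be sketched directly: one d-dimensional vector per word and per context, with a (word, context) pair scored by the dot product v_w · v_c, as in the skip-gram model. The vocabularies and dimensionality here are toy values, not from the paper.

```python
import random

random.seed(0)
d = 50                              # embedding dimensionality
words = ["cat", "sat", "mat"]       # toy word vocabulary W
contexts = ["the", "on", "sat"]     # toy context vocabulary C

# One vector in R^d per word and per context, as in the quoted setup.
v_w = {w: [random.gauss(0, 1) for _ in range(d)] for w in words}
v_c = {c: [random.gauss(0, 1) for _ in range(d)] for c in contexts}

def score(word, context):
    """Skip-gram scores a (word, context) pair by the dot product v_w · v_c."""
    return sum(a * b for a, b in zip(v_w[word], v_c[context]))

s = score("cat", "the")
```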

    1. Algorithmically, these models are similar, except that CBOW predicts target words (e.g. 'mat') from source context words ('the cat sits on the'), while the skip-gram does the inverse and predicts source context-words from the target words. This inversion might seem like an arbitrary choice, but statistically it has the effect that CBOW smoothes over a lot of the distributional information (by treating an entire context as one observation)
    2. Word2vec is a particularly computationally-efficient predictive model for learning word embeddings from raw text. It comes in two flavors, the Continuous Bag-of-Words model (CBOW) and the Skip-Gram model (Section 3.1 and 3.2 in Mikolov et al.).
    1. if your goal is word representation learning, you should consider both NCE and negative sampling

      Wonder if anyone has compared these two approaches

  6. Oct 2016
    1. Distributional Hypothesis, which states that words that appear in the same contexts share semantic meaning

      The Distributional Hypothesis: words that appear in the same contexts share the same meaning.

    1. CBOW: The input to the model could be w_{i-2}, w_{i-1}, w_{i+1}, w_{i+2}, the preceding and following words of the current word we are at. The output of the neural network will be w_i. Hence you can think of the task as "predicting the word given its context". Note that the number of words we use depends on your setting for the window size. Skip-gram: The input to the model is w_i, and the output could be w_{i-1}, w_{i-2}, w_{i+1}, w_{i+2}. So the task here is "predicting the context given a word". Also, the context is not limited to its immediate context; training instances can be created by skipping a constant number of words in the context, for example w_{i-3}, w_{i-4}, w_{i+3}, w_{i+4}, hence the name skip-gram.

      The CBOW and skip-gram models.
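      The contrast in the quote above comes down to how training instances are built around a position i. A minimal sketch (toy sentence, fixed symmetric window; real implementations also subsample and vary the window):

```python
def training_pairs(tokens, i, window=2, mode="skip-gram"):
    """Build training instances around position i.
    CBOW: one (context words -> w_i) instance, treating the whole
    context as a single observation; skip-gram: one (w_i -> context
    word) instance per context word."""
    context = [tokens[j]
               for j in range(max(0, i - window),
                              min(len(tokens), i + window + 1))
               if j != i]
    if mode == "cbow":
        return [(tuple(context), tokens[i])]
    return [(tokens[i], c) for c in context]

tokens = "the quick brown fox jumps".split()
cbow = training_pairs(tokens, 2, mode="cbow")       # context -> "brown"
sg = training_pairs(tokens, 2, mode="skip-gram")    # "brown" -> each context word
```

      Note how CBOW yields a single instance per position while skip-gram yields one per context word, which is exactly the smoothing-versus-detail trade-off mentioned in the annotation from Jun 2017 above.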

  7. Jul 2016
    1. TS. Lê Hồng Phương
    2. PGS.TS. Nguyễn Lê Minh

      I should have let thầy Đồng co-advise me; at least I would have graduated, even though NLP was never among my interests. Sometimes you have to endure doing work you don't much like in order to get to do what you like most later.

  8. Apr 2016
    1. TextpressoCentral

      Could this be used as a front end for adding content to Wikidata?

    2. Described DARPA-funded NLP research. 'Big Mechanism'. Crowd annoyed that they used untrained humans in the study, thus setting up the machines to look better.

  9. Oct 2015