Hypothesis

9 Matching Annotations

Dec 2016
www.aclweb.org

Using Universal Linguistic Knowledge to Guide Grammar Induction

1
1. fraserisverycool 09 Dec 2016
  
  in Public
  
  generative story
  
  I've seen this term around, and I'm sure one paper a few weeks ago had it too, but what exactly does "generative story" mean? Is it a general term for inducing something like a grammar, or is it something more exact?

Annotators

fraserisverycool

URL

aclweb.org/anthology/D10-1120
Nov 2016
www.aclweb.org

Simple Unsupervised Grammar Induction from Raw Text with Cascaded Finite State Models

4
1. fraserisverycool 26 Nov 2016
  
  in Public
  
  there is in our now
  
  How does it know where a pseudoword appears in a chunk? "is no asbestos" can't have appeared that many times in the training data
2. fraserisverycool 26 Nov 2016
  
  in Public
  
  The PRLG chunker systematically gets DT JJ NN trigrams as chunks.
  
  So the adjective is never assigned the "B" tag - it's like saying that adjectives can't be at the end of a constituent. And the transition probability from a previous determiner word would make sure this is the case - did I understand that right?
3. fraserisverycool 26 Nov 2016
  
  in Public
  
  That is, the models will learn to associate terms like the and a, which often occur at the beginnings of sentences and rarely at the end, with the tag B, which cannot occur at the end of a sentence. Likewise common nouns like company or asset, which frequently occur at the ends of sentences, but rarely at the beginning, will come to be associated with the I tag, which cannot occur at the beginning
  
  OK! Does this mean that each word in the sequence will be given one of these four tags? And each constituent will have a B and I, kind of like in the previous paper, where they predicted how likely it would be to be at the beginning/end of a constituent?
  
  I  eat  the  cake
  B  O    B    I     STOP
  
  something like that?
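To make the tagging question above concrete, here is a minimal sketch of how a B/I/O/STOP tag sequence could be turned back into chunks. The function name `extract_chunks` is hypothetical, and the sketch only follows the annotator's toy example: it does not enforce whatever transition constraints the paper places on the tags.

```python
from typing import List, Tuple


def extract_chunks(words: List[str], tags: List[str]) -> List[Tuple[str, ...]]:
    """Group words into chunks: a chunk opens at a B tag and extends over following I tags.

    Hypothetical helper for illustration only. The trailing STOP tag has no
    word attached, so zip() simply drops it.
    """
    chunks: List[Tuple[str, ...]] = []
    current: List[str] = []
    for word, tag in zip(words, tags):
        if tag == "B":
            # A new chunk begins; close any chunk that is still open.
            if current:
                chunks.append(tuple(current))
            current = [word]
        elif tag == "I" and current:
            # Continue the open chunk.
            current.append(word)
        else:
            # "O" (outside), or an "I" with no open chunk: close what is open.
            if current:
                chunks.append(tuple(current))
            current = []
    if current:
        chunks.append(tuple(current))
    return chunks


words = ["I", "eat", "the", "cake"]
tags = ["B", "O", "B", "I", "STOP"]
chunks = extract_chunks(words, tags)
```

Under this reading, every word gets exactly one tag, and "the cake" comes out as one chunk because its B is followed by an I.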
4. fraserisverycool 26 Nov 2016
  
  in Public
  
  However, without making this independence assumption, we can model right linear rules directly
  
  All the emission/transition probabilities for each word x and their hidden state y are independent? I suppose it doesn't matter

Annotators

fraserisverycool

URL

aclweb.org/anthology/P11-1108
asv.informatik.uni-leipzig.de

HaenigBordagQuasthoffFinal

2
1. fraserisverycool 18 Nov 2016
  
  in Public
  
  Using POS tags and positional preferences
  
  Is their data tagged? Is it worth tagging automatically, or will this reduce accuracy if the tagging isn't perfect? Also, just how useful are tags, when nouns, adjectives and verbs can all appear at the beginning/end/middle of constituents?
2. fraserisverycool 18 Nov 2016
  
  in Public
  
  the new separation value sep(i) is then the minimum of the separation values of all pairs of words where the one word is anything from n_0 to n_i and the other word from n_{i+1} to n_m
  
  So am I right in thinking that it looks at each word in the whole sentence and compares it with the words at the boundary you're calculating the separation for? And it does it by picking the minimum? I don't see how this helps rare words be a part of a constituent
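One possible reading of the quoted definition, as a minimal sketch: for a boundary after position i, every word on the left side is paired with every word on the right side, and the smallest pairwise separation value is kept. The names `boundary_separation` and `pair_sep`, and all the scores in `toy_scores`, are assumptions made up for illustration; they are not taken from the paper.

```python
from typing import Callable, Dict, Sequence, Tuple


def boundary_separation(
    words: Sequence[str],
    i: int,
    pair_sep: Callable[[str, str], float],
) -> float:
    """Return the minimum pairwise separation across the boundary after words[i].

    Left side is words[0..i], right side is words[i+1..m].
    """
    left = words[: i + 1]
    right = words[i + 1 :]
    if not left or not right:
        raise ValueError("boundary must have at least one word on each side")
    return min(pair_sep(a, b) for a in left for b in right)


# Hypothetical pairwise separation scores (invented numbers, not from the paper).
toy_scores: Dict[Tuple[str, str], float] = {
    ("the", "old"): 0.2,
    ("the", "man"): 0.3,
    ("the", "sleeps"): 0.8,
    ("old", "man"): 0.1,
    ("old", "sleeps"): 0.7,
    ("man", "sleeps"): 0.9,
}


def toy_pair_sep(a: str, b: str) -> float:
    return toy_scores[(a, b)]


sentence = ["the", "old", "man", "sleeps"]
sep_after_the = boundary_separation(sentence, 0, toy_pair_sep)
sep_after_man = boundary_separation(sentence, 2, toy_pair_sep)
```

With these invented scores, the adjacent pair ("man", "sleeps") scores 0.9 on its own, but the boundary value drops to 0.7 because another cross-boundary pair scores lower. So under this reading, a boundary is not judged by its two adjacent words alone.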

Annotators

fraserisverycool

URL

asv.informatik.uni-leipzig.de/publication/file/132/lrec_unsuparse.pdf
www.aclweb.org

A Bayesian Mixture Model for PoS Induction Using Multiple Features

2
1. fraserisverycool 11 Nov 2016
  
  in Public
  
  In addition to monolingual context features, we also explore the use of alignment features for those languages where we have parallel corpora
  
  I don't really understand how features from parallel corpora help decide syntactic structure?
2. fraserisverycool 11 Nov 2016
  
  in Public
  
  This property is not strictly true of linguistic data, but is a good approximation: as Lee et al. (2010) note, assigning each word type to its most frequent part of speech yields an upper bound accuracy of 93% or more for most languages
  
  But if we assign each word type only one tag, it'll never be perfect! 93% is a lot but the word "run" will never be perfectly represented. I suppose it's the drawback of "unsupervised" methods
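The upper bound mentioned in the quote can be illustrated with a small sketch: give every word type its single most frequent gold tag and count how many tokens that gets right. The function name `one_tag_per_type_upper_bound` and the toy corpus are assumptions for illustration; the 93% figure in the quote comes from real corpora, not from anything like this example.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def one_tag_per_type_upper_bound(tagged_tokens: List[Tuple[str, str]]) -> float:
    """Token accuracy when each word type is always given its most frequent tag."""
    if not tagged_tokens:
        raise ValueError("need at least one tagged token")
    tag_counts: Dict[str, Counter] = defaultdict(Counter)
    for word, tag in tagged_tokens:
        tag_counts[word][tag] += 1
    # For each word type, only the tokens carrying its majority tag are correct.
    correct = sum(counts.most_common(1)[0][1] for counts in tag_counts.values())
    return correct / len(tagged_tokens)


# Toy gold-tagged corpus (invented): "run" is a verb twice and a noun once.
toy_corpus = [
    ("I", "PRON"),
    ("run", "VERB"),
    ("a", "DET"),
    ("run", "NOUN"),
    ("I", "PRON"),
    ("run", "VERB"),
]
upper_bound = one_tag_per_type_upper_bound(toy_corpus)
```

Here the single noun use of "run" is the only token that a one-tag-per-type model must get wrong, so the bound is 5 out of 6 tokens. That is the same effect the annotation describes: ambiguous word types cap the accuracy below 100%.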

Annotators

fraserisverycool

URL

aclweb.org/website/old_anthology/D/D11/D11-1059.pdf
