 Aug 2018

Significant variations of n-gram models exist, for example smoothing (words or word sequences that never show up in the corpus are assigned a nonzero probability).
Seems like you need a probability distribution over each frequency when estimating from data. You would obtain counts of word transitions, then estimate transition probabilities from those counts, with uncertainty on each transition probability.
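A minimal sketch of that idea for bigrams, assuming add-k smoothing read as a symmetric Dirichlet prior (my assumption, not something the note specifies). Under that reading the smoothed estimate is the posterior mean and the uncertainty is the standard deviation of the Beta marginal. The names `bigram_model`, `prob`, `std`, and `k` are made up for illustration.

```python
from collections import Counter, defaultdict

def bigram_model(tokens, k=1.0):
    """Return (prob, std) functions for add-k smoothed bigram transitions."""
    vocab_size = len(set(tokens))
    counts = defaultdict(Counter)
    # Count word transitions prev -> nxt.
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1

    def prob(prev, nxt):
        # Add-k estimate: unseen transitions still get nonzero probability.
        total = sum(counts[prev].values())
        return (counts[prev][nxt] + k) / (total + k * vocab_size)

    def std(prev, nxt):
        # Assumption: symmetric Dirichlet(k) prior, so the marginal
        # posterior of one transition probability is Beta(a, b).
        total = sum(counts[prev].values())
        a = counts[prev][nxt] + k
        n = total + k * vocab_size
        b = n - a
        return (a * b / (n ** 2 * (n + 1))) ** 0.5

    return prob, std

tokens = "the cat sat on the mat".split()
prob, std = bigram_model(tokens)
print(prob("the", "cat"), std("the", "cat"))  # seen transition
print(prob("the", "sat"), std("the", "sat"))  # unseen, still nonzero
```

Calling `bigram_model` once per text would give one model per text; pooling all tokens first would give a single shared model.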
Do you have different ngram models per text?

 Aug 2017

The other consists in making a subject out of a predicate. Instead of saying, Opium puts people to sleep, you say it has dormitive virtue. This is an important proceeding in mathematics. For example, take all "symbolic" methods, in which operations are operated upon. This may be called subjectal abstraction.
Peirce also calls this "hypostatic abstraction." Mathematical category theory applies this notion throughout: a functor is an operation on functions, etc.
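A small sketch of "operations operated upon", using the list functor as the example (my choice of illustration; the names `fmap` and `compose` are hypothetical helpers). Here `fmap` takes a function as its subject and returns a new function on lists, and it respects composition, which is one of the functor laws.

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")

def fmap(f: Callable[[A], B]) -> Callable[[List[A]], List[B]]:
    """Lift a function on elements to a function on lists."""
    def lifted(xs: List[A]) -> List[B]:
        return [f(x) for x in xs]
    return lifted

def compose(g, f):
    """Return the function x -> g(f(x))."""
    return lambda x: g(f(x))

def double(x):
    return 2 * x

def inc(x):
    return x + 1

xs = [1, 2, 3]
# Lifting a composite equals composing the lifted functions.
lhs = fmap(compose(inc, double))(xs)
rhs = compose(fmap(inc), fmap(double))(xs)
print(lhs == rhs)
```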
