With respect to the predictive text portion of ChatGPT, a good non-technical (non-mathematical) description of a related mathematical model is described in chapter 3 of:
Pierce, John Robinson. An Introduction to Information Theory: Symbols, Signals and Noise. Second, Revised. Dover Books on Mathematics. 1961. Reprint, Mineola, N.Y: Dover Publications, Inc., 1980. https://www.amazon.com/Introduction-Information-Theory-Symbols-Mathematics/dp/0486240614.