This trend has borne out historically: before the deep learning era, the amount of compute used by AI models doubled roughly every 21.3 months; once deep learning took hold as a paradigm around 2010, the compute used by models started doubling every 5.7 months[12]. Since 2015, however, the trend has split in two: compute used in large-scale models has been doubling roughly every 9.9 months, while compute used in regular-scale models has continued doubling about every 5.7 months.
If compute in regular-scale models is doubling faster, how long before they overtake the large-scale ones? The maths is hard to do in your head, but easy to set up.
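Here is a minimal sketch of that calculation, assuming both trends keep growing exponentially at the stated doubling times (5.7 and 9.9 months). The starting gap between the two trend lines is a hypothetical 100× chosen purely for illustration, not a figure from the trend data:

```python
import math

# Doubling times from the trends above (months).
REGULAR_DOUBLING = 5.7   # regular-scale models
LARGE_DOUBLING = 9.9     # large-scale models

def crossover_months(initial_gap: float) -> float:
    """Months until the regular-scale trend catches the large-scale one.

    Both trends grow as 2**(t / doubling_time); the regular-scale line
    closes a gap of `initial_gap` (large-scale compute divided by
    regular-scale compute at t = 0) when
        2**(t / 5.7) / 2**(t / 9.9) = initial_gap.
    Solving for t gives the expression below.
    """
    doublings_gained_per_month = 1 / REGULAR_DOUBLING - 1 / LARGE_DOUBLING
    return math.log2(initial_gap) / doublings_gained_per_month

# Hypothetical example: large-scale models start with 100x the compute.
gap = 100
months = crossover_months(gap)
print(f"{gap}x gap closed in {months:.0f} months (~{months / 12:.1f} years)")
# -> 100x gap closed in 89 months (~7.4 years)
```

At these rates the regular-scale line gains about 0.074 doublings per month on the large-scale line, so each extra factor of two in the starting gap adds roughly 13.4 months to the crossover; the answer is driven almost entirely by what you assume that gap to be.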