1 Matching Annotations
  1. Jan 2021
    1. The model is trained on 147M multi-turn dialogues from Reddit discussion threads. The largest model can be trained in several hours on a machine with 8 V100 GPUs (though this hardware is not required), using distributed training and the FP16 option.

      github-dialogpt training problem
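
The FP16 option mentioned in the quoted passage trades numeric range for speed and memory, and that trade-off is a common source of training problems. The sketch below is not taken from the DialoGPT code. It uses only Python's standard library to show how a very small gradient can round to zero in half precision, and how multiplying by a loss scale before the cast can preserve it. The names `to_fp16`, `fp16_grad`, and `LOSS_SCALE`, along with the value 1024, are assumptions chosen for illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # The struct format code "e" packs a 16-bit half-precision float.
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Hypothetical loss scale; real mixed-precision setups often tune this dynamically.
LOSS_SCALE = 1024.0


def fp16_grad(grad: float, scale: float = 1.0) -> float:
    """Simulate storing a gradient in FP16, optionally scaled first.

    The gradient is multiplied by `scale`, rounded to FP16, then divided by
    `scale` in full precision to recover the original magnitude.
    """
    return to_fp16(grad * scale) / scale


tiny_grad = 1e-8  # smaller than the smallest positive FP16 value (about 6e-8)

unscaled = fp16_grad(tiny_grad)            # rounds to zero in FP16
scaled = fp16_grad(tiny_grad, LOSS_SCALE)  # survives the cast after scaling

print("without loss scaling:", unscaled)
print("with loss scaling:   ", scaled)
```

With the scale applied, the recovered value stays close to the original gradient instead of vanishing, which is the basic idea behind loss scaling in FP16 training.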