1 Matching Annotations
  1. Nov 2021
    1. Bulk synchronous parallels (BSP): Workers sync data at the end of every minibatch. It prevents model weights staleness and good learning efficiency but each machine has to halt and wait for others to send gradients.

      BSP 是什么以及其优缺点?