Data Science Archive | United States America (US)

Training tricks on a massive GPU cluster: they appear to have fairly good control over the mini-batch size, plus a 2D-Torus all-reduce to synchronize gradient updates across GPUs. Just submitted to arXiv, from a SONY team. The paper title is fun, too: ImageNet/ResNet-50 Training in 224 Seconds.

This work uses 1,088 Tesla V100 GPUs and 2x InfiniBand EDR, achieving 91.62% GPU scaling efficiency.

https://arxiv.org/abs/1811.05233
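To make the 2D-Torus idea concrete, here is a minimal simulation sketch (not the paper's actual NCCL/MPI implementation): GPUs are laid out on a hypothetical X x Y grid, and the reduction proceeds in three phases, reduce-scatter along each row, all-reduce along each column, then all-gather along each row. Each "GPU" below is just a NumPy array; the function name and grid layout are illustrative assumptions.

```python
import numpy as np

def torus_allreduce(grads, X, Y):
    """Simulate a 2D-Torus all-reduce over an X x Y grid of 'GPUs'.

    grads: list of X*Y equal-length 1-D arrays; grads[y*X + x] lives
    on the simulated GPU at grid position (x, y).
    Returns the fully reduced (summed) gradient for every GPU.
    """
    n = grads[0].size
    assert n % X == 0, "gradient length must split into X equal chunks"
    chunk = n // X
    grid = [[grads[y * X + x] for x in range(X)] for y in range(Y)]

    # Phase 1: reduce-scatter along each row. After this phase, GPU (x, y)
    # owns chunk x of the gradient, summed over the X GPUs in its row.
    owned = [[sum(grid[y][i][x * chunk:(x + 1) * chunk] for i in range(X))
              for x in range(X)]
             for y in range(Y)]

    # Phase 2: all-reduce along each column. Chunk x becomes the sum over
    # all Y rows, so it is now identical down each column.
    col = [sum(owned[y][x] for y in range(Y)) for x in range(X)]

    # Phase 3: all-gather along each row. Every GPU reassembles the full
    # reduced gradient from the X chunks held across its row.
    full = np.concatenate(col)
    return [full.copy() for _ in range(X * Y)]

# Tiny usage example on a 2x2 grid.
X, Y = 2, 2
rng = np.random.default_rng(0)
grads = [rng.standard_normal(8) for _ in range(X * Y)]
out = torus_allreduce(grads, X, Y)
```

The appeal of the 2D scheme is bandwidth: the row-wise reduce-scatter and all-gather each move data among only X peers instead of all X*Y, which is part of how the paper keeps scaling efficiency high at 1,088 GPUs.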


