Microsoft’s ZeRO-2 with DeepSpeed trains neural networks with up to 170 billion parameters

At its all-digital Build conference, Microsoft announced ZeRO-2, a memory optimization technique in its DeepSpeed library that allows for the distributed training of models with up to 170 billion parameters.
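In practice, ZeRO optimizations are enabled through a DeepSpeed configuration file. A minimal sketch of such a config, assuming typical settings (batch size, optimizer, and precision values here are illustrative, not from the announcement):

```json
{
  "train_batch_size": 32,
  "fp16": {
    "enabled": true
  },
  "zero_optimization": {
    "stage": 2
  }
}
```

Setting `"stage": 2` activates ZeRO-2, which partitions optimizer states and gradients across data-parallel workers to reduce per-GPU memory use.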
