How to Achieve High-Performance, Scalable and Distributed DNN Training on Modern HPC Systems
by Rich Brueckner, insideHPC
DK Panda from Ohio State University gave this talk at the Stanford HPC Conference. "This talk will focus on a range of solutions being carried out in my group to address these challenges. The solutions will include: 1) MPI-driven Deep Learning, 2) Co-designing Deep Learning Stacks with High-Performance MPI, 3) Out-of-core DNN training, and 4) Hybrid (Data and Model) parallelism. Case studies to accelerate DNN training with popular frameworks like TensorFlow, PyTorch, MXNet and Caffe on modern HPC systems will be presented."
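To give a flavor of the first item, MPI-driven deep learning, here is a minimal, hypothetical sketch of the basic data-parallel pattern: each MPI rank computes gradients on its own shard of the data, the gradients are averaged across ranks with Allreduce, and every rank applies the same update. This uses mpi4py and NumPy with a stand-in `local_gradient` function; it is not the high-performance MPI stack described in the talk.

```python
# Data-parallel SGD via MPI_Allreduce (illustrative sketch only).
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Model weights start identical on every rank.
weights = np.zeros(1000, dtype=np.float64)

def local_gradient(weights, shard):
    # Hypothetical stand-in for a real backward pass
    # over this rank's shard of the training data.
    return shard.mean(axis=0) - weights

for step in range(10):
    # Each rank draws a different data shard.
    shard = np.random.default_rng(rank + step).standard_normal((32, 1000))
    grad = local_gradient(weights, shard)

    # Sum gradients across all ranks, then average;
    # every rank ends up with the same averaged gradient.
    avg_grad = np.empty_like(grad)
    comm.Allreduce(grad, avg_grad, op=MPI.SUM)
    avg_grad /= size

    # Identical update on every rank keeps the replicas in sync.
    weights -= 0.01 * avg_grad
```

Launched with, e.g., `mpirun -np 4 python train.py`, this keeps all model replicas synchronized after each step; production stacks of the kind discussed in the talk optimize exactly this Allreduce step on modern interconnects and GPUs.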