Abstract: It remains challenging to train billion-scale DNN models on a single modern multi-GPU server due to the GPU memory wall. Unfortunately, existing memory-saving techniques such as GPU-CPU swap ...
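For readers unfamiliar with the technique named above, the sketch below illustrates the basic idea of GPU-CPU swap in PyTorch: offloading a tensor from GPU memory to pinned host memory and bringing it back later. This is a generic illustration with assumed tensor names and sizes, not the paper's implementation.

```python
import torch

# A minimal sketch of GPU-CPU swap (tensor offloading), assuming PyTorch
# and an available CUDA device; names and sizes are illustrative only.
activation = torch.randn(1024, 1024, device="cuda")

# Swap out: copy the tensor into pinned host memory so the transfer can
# overlap with compute, then free the GPU copy.
host_buffer = torch.empty_like(activation, device="cpu").pin_memory()
host_buffer.copy_(activation, non_blocking=True)
torch.cuda.synchronize()  # ensure the copy finished before dropping the GPU tensor
del activation
torch.cuda.empty_cache()

# Swap in: move the tensor back to the GPU when it is needed again.
activation = host_buffer.to("cuda", non_blocking=True)
```

The abstract's point is that techniques of this kind save memory at the cost of extra PCIe traffic and synchronization, which is what makes billion-scale training on a single multi-GPU server challenging despite them.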