Paper
Abstract
These are my paper-reading notes; apologies if they are not well written >_<|||
- BiS-KM: Enabling Any-Precision K-Means on FPGAs
- ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
- Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
- GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
- Mixed Precision Training
- FAST: DNN Training Under Variable Precision Block Floating Point with Stochastic Rounding
- Softmax Acceleration with Adaptive Numeric Format for both Training and Inference
Last updated: April 5, 2024, 00:24:16
Created: February 4, 2024, 19:47:15