Modalities: a PyTorch-native Framework For Large-scale LLM Training and Research
1️⃣ One-sentence summary
This paper proposes Modalities, a PyTorch-native framework that integrates state-of-the-art parallelization strategies with a modular design to efficiently support trillion-token, billion-parameter LLM pretraining and systematic ablation studies, addressing the shortcomings of existing frameworks in experiment management and reproducibility.
Today's LLM (pre-)training and research workflows typically allocate a significant amount of compute to large-scale ablation studies. Despite the substantial compute costs of these ablations, existing open-source frameworks provide limited tooling for such experiments, often forcing researchers to write their own wrappers and scripts. We propose Modalities, an end-to-end PyTorch-native framework that integrates data-driven LLM research with large-scale model training from two angles. First, by integrating state-of-the-art parallelization strategies, it enables both efficient pretraining and systematic ablations at trillion-token and billion-parameter scale. Second, Modalities adopts a modular design with declarative, self-contained configuration, enabling levels of reproducibility and extensibility that are difficult to achieve out-of-the-box with existing LLM training frameworks.
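The abstract's "declarative, self-contained configuration" pattern can be illustrated generically. The sketch below is not Modalities' actual API; it is a minimal, hypothetical registry-based component builder showing how a single declarative config can fully specify an experiment, so that the config file alone reproduces a run. All names (`REGISTRY`, `build`, the `"adamw"`/`"fsdp"` keys) are illustrative assumptions.

```python
# Hypothetical illustration of declarative, self-contained configuration
# (NOT Modalities' real API): components are registered by name, and a
# plain dict/YAML-style config is resolved into concrete objects.
from dataclasses import dataclass

REGISTRY = {}  # maps a config "type" string to a component class


def register(name):
    """Decorator that records a component class under a config name."""
    def deco(cls):
        REGISTRY[name] = cls
        return cls
    return deco


@register("adamw")
@dataclass
class AdamWConfig:
    lr: float = 3e-4


@register("fsdp")
@dataclass
class FSDPConfig:
    shard_degree: int = 8


def build(config: dict) -> dict:
    """Instantiate every component named in a declarative config dict."""
    return {
        key: REGISTRY[spec["type"]](**spec.get("args", {}))
        for key, spec in config.items()
    }


# A self-contained experiment description: everything needed to
# reproduce the run lives in this one structure.
experiment = {
    "optimizer": {"type": "adamw", "args": {"lr": 1e-4}},
    "parallelism": {"type": "fsdp", "args": {"shard_degree": 16}},
}

components = build(experiment)
```

Because the config names components rather than constructing them in code, an ablation becomes a diff between two config files instead of a code change, which is the reproducibility property the paper emphasizes.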
Source: arXiv: 2602.08387