LoongForge: A high-performance training framework for LLM, VLM, DiT, and VLA models
LoongForge is a high-performance training framework for LLMs, VLMs, VLAs, and diffusion models. It reports significant speedups over mainstream open-source baselines and supports both NVIDIA GPUs and Kunlun XPUs. The framework is part of Baidu Baige's Loong open-source series and aims to improve training efficiency across multiple domains.
- LoongForge provides up to a 5.04× training speedup compared to mainstream open-source baselines.
- It supports production training for enterprise customers in sectors like Education and Computer Vision.
- The framework includes features like flexible multi-modal composition and heterogeneous parallelism.
Opening excerpt (first ~120 words):
A modular, scalable, high-performance training framework for LLMs, VLMs, diffusion, and embodied models.
- 🚀 Up to 5.04× training speedup
- 🌐 Native NVIDIA GPU & Kunlun XPU support
💡 Why LoongForge?
🐉 LoongForge is part of Baidu Baige's Loong open-source series, named after the traditional Chinese loong boat (龙舟), a symbol of coordinated power and forward momentum.
LoongForge is a unified training framework for LLMs, VLMs, VLAs, and diffusion models, covering pre-training, continued pre-training, and SFT. Built upon Megatron-LM with deep systemic enhancements across model coverage, training performance, and hardware support, it delivers significant speedups over mainstream open-source baselines.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at GitHub.