Projects
MegatronApp
A Toolchain Built Around Megatron-LM for Distributed Training
MegatronApp is a toolchain built around the Megatron-LM training framework. It gives practitioners additional capabilities for distributed training, including performance tuning, slow-node detection, and training-process visualization.
InstantTensor
Source Code · 2026
An Ultra-fast, Distributed Safetensors Loader
InstantTensor is an ultra-fast, distributed Safetensors loader designed to maximize I/O throughput when moving model weights from Safetensors files to GPU memory. It can be installed via pip and is natively supported in vLLM and SGLang.