Publications
* indicates equal contribution
- [Preprint] Unleashing Efficient Asynchronous RL Post-Training via Staleness-Constrained Rollout Coordination. Haoyang Li*, Sheng Lin*, Fangcheng Fu, Yuming Zhou, Xiaodong Ji, Yanfeng Zhao, Lefeng Wang, Jie Jiang, Bin Cui
- [Preprint, ChinaSys 2025 Oral] Hetu v2: A General and Scalable Deep Learning System with Hierarchical and Heterogeneous Single Program Multiple Data Annotations. Haoyang Li, Fangcheng Fu, Hao Ge, Sheng Lin, Xuanyu Wang, Jiawen Niu, Xupeng Miao, Bin Cui
- [PPoPP 2026] Elastor: Elastic and Efficient Model Partitioning and Checkpointing for Fault-tolerant Distributed DL Training. Xuanyu Wang, Fangcheng Fu, Haoyang Li, Hao Ge, Sheng Lin, Jiawen Niu, Bin Cui
- [SIGMOD 2026] Hydraulis: Balancing Large Transformer Model Training via Co-designing Parallel Strategies and Data Assignment. Haoyang Li, Fangcheng Fu, Sheng Lin, Hao Ge, Xuanyu Wang, Jiawen Niu, Jinbao Xue, Yangyu Tao, Di Wang, Jie Jiang, Bin Cui
- [SIGMOD 2025] Malleus: Straggler-Resilient Hybrid Parallel Training of Large-scale Models via Malleable Data and Model Parallelization. Haoyang Li*, Fangcheng Fu*, Hao Ge, Sheng Lin, Xuanyu Wang, Jiawen Niu, Yujie Wang, Hailin Zhang, Xiaonan Nie, Bin Cui
- [VLDB 2025] LobRA: Multi-tenant Fine-tuning over Heterogeneous Data. Sheng Lin*, Fangcheng Fu*, Haoyang Li, Hao Ge, Xuanyu Wang, Jiawen Niu, Yaofeng Tu, Bin Cui
- [SOSP 2024] Enabling Parallelism Hot Switching for Efficient Training of Large Language Models. Hao Ge*, Fangcheng Fu*, Haoyang Li, Xuanyu Wang, Sheng Lin, Yujie Wang, Xiaonan Nie, Hailin Zhang, Xupeng Miao, Bin Cui
- [AAAI 2024, NeurIPS 2023 MLSys Workshop] Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference. Zihao Yu, Haoyang Li, Fangcheng Fu, Xupeng Miao, Bin Cui
