Publications

You can also find my articles on my Google Scholar profile.

Conference Papers


Toppings: CPU-Assisted, Rank-Aware Adapter Serving for LLM Inference

Published in 2025 USENIX Annual Technical Conference (USENIX ATC 25), 2025

Suyi Li*, Hanfeng Lu*, Tianyuan Wu, Minchen Yu, Qizhen Weng, Xusheng Chen, Yizhou Shan, Binhang Yuan, Wei Wang (* Equal contribution).

Recommended citation: Suyi Li, Hanfeng Lu, Tianyuan Wu, Minchen Yu, Qizhen Weng, Xusheng Chen, Yizhou Shan, Binhang Yuan, and Wei Wang. "CaraServe: CPU-Assisted and Rank-Aware LoRA Serving for Generative LLM Inference." arXiv preprint arXiv:2401.11240 (2024).

Greyhound: Hunting Fail-Slows in Hybrid-Parallel Training at Scale

Published in 2025 USENIX Annual Technical Conference (USENIX ATC 25), 2025

Tianyuan Wu, Wei Wang, Yinghao Yu, Siran Yang, Wenchao Wu, Qinkai Duan, Guodong Yang, Jiamang Wang, Lin Qu, Liping Zhang.

Recommended citation: Tianyuan Wu, Wei Wang, Yinghao Yu, Siran Yang, Wenchao Wu, Qinkai Duan, Guodong Yang, Jiamang Wang, Lin Qu, and Liping Zhang. "FALCON: Pinpointing and Mitigating Stragglers for Large-Scale Hybrid-Parallel Training." arXiv preprint arXiv:2410.12588 (2024).

A Data Optimizer for Region-Aware Self-describing Files in Scientific Computing

Published in 2024 ACM Symposium on Cloud Computing (SoCC 24), 2024

Yanjie Song*, Tianyuan Wu*, Yuanhao Li, Guancheng Li, Yuchen Liu, Shu Yin, Wei Xue, Junchao Wang (* Equal contribution).

Recommended citation: Yanjie Song, Tianyuan Wu, Yuanhao Li, Guancheng Li, Yuchen Liu, Shu Yin, Wei Xue, and Junchao Wang. "A Data Optimizer for Region-Aware Self-describing Files in Scientific Computing." In Proceedings of the 15th ACM Symposium on Cloud Computing, pp. 431-446. 2024.

Portus: Efficient DNN Checkpointing to Persistent Memory with Zero-Copy

Published in 44th IEEE International Conference on Distributed Computing Systems (ICDCS 24), 2024

Yuanhao Li*, Tianyuan Wu*, Guancheng Li, Yanjie Song, Shu Yin (* Equal contribution).

Recommended citation: Yuanhao Li, Tianyuan Wu, Guancheng Li, Yanjie Song, and Shu Yin. "Portus: Efficient DNN Checkpointing to Persistent Memory with Zero-Copy." In 2024 IEEE 44th International Conference on Distributed Computing Systems (ICDCS), pp. 59-70. IEEE, 2024.

Journal Articles


Reproducibility: Performance Evaluation of MemXCT on Azure CycleCloud Platform

Published in IEEE Transactions on Parallel and Distributed Systems, 2021

Yuchen Liu*, Yixuan Meng*, Kaiyuan Xu*, Zijun Xu*, Tianyuan Wu*, Yiwei Yang*, Shu Yin* (* All authors contributed equally).

Recommended citation: Yuchen Liu, Yixuan Meng, Kaiyuan Xu, Zijun Xu, Tianyuan Wu, Yiwei Yang, and Shu Yin. "Reproducibility: Performance Evaluation of MemXCT on Azure CycleCloud Platform." IEEE Transactions on Parallel and Distributed Systems 33, no. 9 (2021): 2047-2049.

Preprints


Adaptra: Straggler-Resilient Hybrid-Parallel Training with Pipeline Adaptation

Published as an arXiv preprint, 2025

Tianyuan Wu*, Lunxi Cao*, Hanfeng Lu, Xiaoxiao Jiang, Yinghao Yu, Siran Yang, Guodong Yang, Jiamang Wang, Lin Qu, Liping Zhang, Wei Wang (* Equal contribution).

Recommended citation: Tianyuan Wu, Lunxi Cao, Hanfeng Lu, Xiaoxiao Jiang, Yinghao Yu, Siran Yang, Guodong Yang, Jiamang Wang, Lin Qu, Liping Zhang, and Wei Wang. "Adaptra: Straggler-Resilient Hybrid-Parallel Training with Pipeline Adaptation." arXiv preprint arXiv:2504.19232 (2025).