Publications
Publications by category in reverse chronological order. 1 denotes a co-first author.
2025
- [ASPLOS] Helix: Distributed Serving of Large Language Models via Max-Flow on Heterogeneous GPUs. Proceedings of ASPLOS Conference 2025.
- [ICLR] NetMoE: Accelerating MoE Training through Dynamic Sample Placement (Spotlight). Proceedings of ICLR Conference 2025.
2024
- [SOSP] Enabling Parallelism Hot Switching for Efficient Training of Large Language Models. Proceedings of SOSP Conference 2024.
- [SC] Atlas: Hierarchical Partitioning for Quantum Circuit Simulation on GPUs. Proceedings of SC Conference 2024.
- [ASPLOS] SpotServe: Serving Generative Large Language Models on Preemptible Instances (Distinguished Artifact Award, IEEE Micro Top Picks Honorable Mention). Proceedings of ASPLOS Conference 2024.
- [ASPLOS] SpecInfer: Accelerating Generative Large Language Model Serving with Speculative Inference and Token Tree Verification. Proceedings of ASPLOS Conference 2024.
- [ASPLOS] Optimal Kernel Orchestration for Tensor Programs with Korch. Proceedings of ASPLOS Conference 2024.
- [VLDB] Experimental Analysis of Large-scale Learnable Vector Storage Compression. Proc. VLDB Endow. 2024.
- [ICDE] MFIX: An Efficient and Reliable Index Advisor via Multi-Fidelity Bayesian Optimization. Proceedings of ICDE Conference 2024.
2023
- [OSDI] EinNet: Optimizing Tensor Programs with Derivation-Based Transformations. Proceedings of OSDI Conference 2023.
- [VLDB] SDPipe: A Semi-Decentralized Framework for Heterogeneity-aware Pipeline-parallel Training. Proc. VLDB Endow. 2023.
- [VLDB] Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism. Proc. VLDB Endow. 2023.
- [SIGMOD] FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement. Proceedings of SIGMOD Conference 2023.
2022
- [VLDB] HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework (Best Scalable Data Science Paper Award). Proc. VLDB Endow. 2022.
- [VLDB] Towards Communication-efficient Vertical Federated Learning Training via Cache-enabled Local Updates. Proc. VLDB Endow. 2022.
- [SIGMOD] HET-GMP: A Graph-based System Approach to Scaling Large Embedding Model Training. Proceedings of SIGMOD Conference 2022.
- [VLDBJ] P2CG: A Privacy Preserving Collaborative Graph Neural Network Training Framework. The VLDB Journal 2022.