Di Chai


Ph.D. Student of Computer Science & Engineering at HKUST

View My GitHub Profile

Biography

Di Chai is a Ph.D. student of computer science and engineering at the Hong Kong University of Science and Technology, supervised by Prof. Kai Chen and Prof. Qiang Yang. His research focuses on high-performance machine learning systems, including efficient distributed matrix decomposition and training systems for large language models.

Email: dchai[at]cse[dot]ust[dot]hk

Curriculum Vitae and Research Statement

Ongoing Project

Enhancing Token Filtering Efficiency in Large Language Model Training with Collider

Token filtering aims to improve the utility of large language models (LLMs) by removing inconsequential tokens during training. However, previous methods achieved only limited gains because filtering solely at the output layer yields little sparsity, and the resulting sparse GEMM operations are inefficient. We address these issues by filtering tokens across all layers and optimizing the GEMM operations, reducing backpropagation time by more than 30% and overall training time by more than 20%. Evaluations show that our system improves model utility compared to standard training while significantly reducing training time, and it can be easily integrated into existing frameworks. [Paper]
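As a rough illustration of the general idea behind token filtering (not Collider's actual implementation, which filters across all layers and uses optimized sparse GEMM), a common loss-based variant keeps only the hardest fraction of tokens when computing the training loss; all names here are hypothetical:

```python
def filter_tokens(per_token_losses, keep_ratio=0.7):
    """Return a 0/1 mask keeping the highest-loss fraction of tokens.

    Tokens with small loss are treated as inconsequential and dropped;
    masked tokens contribute nothing to the backward pass, which is the
    sparsity a system like Collider can exploit.
    """
    n_keep = max(1, int(len(per_token_losses) * keep_ratio))
    threshold = sorted(per_token_losses, reverse=True)[n_keep - 1]
    return [1 if loss >= threshold else 0 for loss in per_token_losses]

# Toy example: per-token cross-entropy losses for a 5-token sequence.
losses = [0.1, 2.3, 0.05, 1.7, 0.9]
mask = filter_tokens(losses, keep_ratio=0.6)
# Average loss over only the kept (high-loss) tokens.
filtered_loss = sum(l * m for l, m in zip(losses, mask)) / sum(mask)
```

In an actual training loop the mask would be applied to per-token losses inside the framework so that filtered tokens are skipped during backpropagation rather than merely zeroed out.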

Publications

Conference Paper

Journal Paper

Workshop Paper