About Me
Hello! I am Zhenglin Cheng, a Ph.D. student in the LINs Lab at Westlake University (through a joint program with ZJU), advised by Prof. Tao LIN. Before that, I received my bachelor's degree in Software Engineering from Zhejiang University (ZJU).
I enjoy writing and posting, from technical notes to everyday life. I also occasionally practice traditional Chinese calligraphy to relax.
Research Interests
My research interests lie broadly in the efficiency and effectiveness of AI systems (such as large language models and diffusion models), specifically in model architecture optimization, training acceleration techniques, and efficient inference paradigms. Actively exploring!
Publications/Manuscripts (* denotes equal contribution)
Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Yongxin Guo*, Zhenglin Cheng*, Xiaoying Tang, Zhaopeng Tu, Tao Lin
DynMoE removes the burden of pivotal hyper-parameter selection for MoE training by enabling each token to activate a different number of experts and by adjusting the number of experts automatically during training.
Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model (inherited from my undergraduate thesis)
Wenqi Zhang*, Zhenglin Cheng*, Yuanyu He, Mengna Wang, …, Weiming Lu, Yueting Zhuang
Multimodal Self-Instruct leverages LLMs and their code capabilities to synthesize massive abstract images and visual reasoning instructions across daily scenarios such as charts, graphs, and visual puzzles.
News
- 2024/09: Multimodal Self-Instruct is accepted by EMNLP'24 (Main) as Oral!
- 2024/07: Excited to intern at Baichuan AI on multimodal LLM pretraining.
- 2024/06: Successfully defended my undergraduate thesis, ready to graduate.
Education
- 2024/09 - 2029/06, Westlake University, College of Engineering.
- 2020/09 - 2024/06, Zhejiang University, College of Computer Science and Technology.