About Me
Hello! I am Zhenglin Cheng, a second-year Ph.D. student at LINs lab, Westlake University (through a joint program with ZJU), advised by Prof. Tao LIN. I am also honored to be affiliated with Shanghai Innovation Institute (SII), a new force in the GenAI era. Before that, I received my bachelor's degree in Software Engineering from Zhejiang University (ZJU).
I love to write and post things, from technical notes to everyday life. I also practice traditional Chinese calligraphy to relax occasionally.
News
- 2025/01, 🥳 Dynamic Mixture of Experts (DynMoE) is accepted to ICLR'25, see you in Singapore 🇸🇬!
Research Interests
My long-term research goal is to build efficient multimodal agents that can understand the physical world, reason about real-world problems, and generate novel ideas, while also learning from experience and evolving in constantly changing environments.
At present, I focus on:
- Unified multimodal models: how can we effectively and efficiently combine the diffusion and autoregressive paradigms?
- Few-step generation: how can we effectively train/distill continuous diffusion generators into 1-NFE ones, and can the same be done for dLLMs?
Publications/Manuscripts (* denotes equal contribution)

Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Yongxin Guo*, Zhenglin Cheng*, Xiaoying Tang, Zhaopeng Tu, Tao Lin
DynMoE frees MoE training from the burden of pivotal hyper-parameter selection by letting each token activate a different number of experts and adjusting the number of experts automatically, achieving stronger sparsity while maintaining performance.

GMem: A Modular Approach for Ultra-Efficient Generative Models
Yi Tang*, Peng Sun*, Zhenglin Cheng*, Tao Lin
GMem decouples diffusion modeling into a network for generalization and an external memory bank for memorization, achieving a 50× training speedup over SiT and a 25× speedup over REPA.

Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model (based on my undergraduate thesis)
Wenqi Zhang*, Zhenglin Cheng*, Yuanyu He, …, Weiming Lu, Yueting Zhuang
Multimodal Self-Instruct uses LLMs and their coding capabilities to synthesize massive numbers of abstract images and visual reasoning instructions across daily scenarios such as charts, graphs, and visual puzzles.
Experiences
- 2025/07 - Present, Ant Research (Advisor: Dr. Jianguo Li).
Academic Services
- Conference Reviewer: ICLR.
Education
- 2024/09 - 2029/06, Westlake University, College of Engineering.
- 2020/09 - 2024/06, Zhejiang University, College of Computer Science and Technology.
