About Me

May oneko lead you to my latest work!

Hello! I am Zhenglin Cheng, a second-year Ph.D. student at LINs Lab, Westlake University (through a joint program with Zhejiang University, ZJU), advised by Prof. Tao Lin. I am also honored to be affiliated with the Shanghai Innovation Institute (SII), a new force in the GenAI era. Before that, I received my bachelor's degree in Software Engineering from ZJU.

I love writing and posting, from technical notes to everyday life. I also practice traditional Chinese calligraphy to relax occasionally.

News

Research Interests

My long-term research goal is to build efficient multimodal agents that can understand the physical world, reason about real-world problems, and generate novel ideas, while also learning from experience and evolving in constantly changing environments.

At present, my focus is on:

  • Unified multimodal models: how can we effectively and efficiently combine the diffusion and autoregressive paradigms?
  • Few-step generation: how can we effectively train or distill continuous diffusion generators into 1-NFE ones, and can the same be done for dLLMs? (A sketch of what 1-NFE means follows this list.)
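
For readers unfamiliar with the jargon, here is a minimal illustrative sketch (not drawn from any specific paper) contrasting a multi-step sampler with a one-step generator; `velocity_net` and `generator_net` are hypothetical stand-ins for trained networks:

```python
import torch

def multi_step_sample(velocity_net, x, steps=50):
    # Euler integration of a flow/diffusion ODE: `steps` network calls (NFEs).
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((x.size(0),), i * dt)
        x = x + dt * velocity_net(x, t)
    return x

def one_nfe_sample(generator_net, noise):
    # A distilled one-step generator: a single network function evaluation.
    return generator_net(noise)
```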

Publications/Manuscripts (* denotes equal contribution)

ICLR'25

📖 Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models

Yongxin Guo*, Zhenglin Cheng*, Xiaoying Tang, Zhaopeng Tu, Tao Lin

GitHub Repo · HF Checkpoints

👉 DynMoE removes the burden of pivotal hyper-parameter selection in MoE training by letting each token activate a different number of experts and adjusting the total number of experts automatically, achieving stronger sparsity while maintaining performance.
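
As a rough illustration of the gating idea, here is a minimal hypothetical sketch of a "top-any" gate, where each token activates every expert whose score clears a learnable per-expert threshold; names and shapes are my own, not the DynMoE reference implementation:

```python
import torch
import torch.nn as nn

class TopAnyGate(nn.Module):
    """Each token may activate anywhere from 0 to num_experts experts."""
    def __init__(self, d_model: int, num_experts: int):
        super().__init__()
        self.expert_emb = nn.Parameter(torch.randn(num_experts, d_model))
        self.threshold = nn.Parameter(torch.zeros(num_experts))  # learnable per-expert gate

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model); score tokens against expert embeddings.
        scores = torch.cosine_similarity(x.unsqueeze(1), self.expert_emb.unsqueeze(0), dim=-1)
        mask = (scores > self.threshold).float()  # binary routing decisions
        weights = scores * mask
        # Normalize over the activated experts to get mixture weights.
        return weights / weights.sum(dim=-1, keepdim=True).clamp(min=1e-9)
```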

arXiv'24

📖 GMem: A Modular Approach for Ultra-Efficient Generative Models

Yi Tang*, Peng Sun*, Zhenglin Cheng*, Tao Lin

GitHub Repo · HF Checkpoints

👉 GMem decouples diffusion modeling into a network for generalization and an external memory bank for memorization, achieving a 50× training speedup over SiT and a 25× speedup over REPA.
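
To make the decoupling concrete, here is a minimal hypothetical sketch under my own simplifying assumptions: `MemoryBank` stands in for the external memory (random placeholder entries here; in actual training they would be derived from data) and `Denoiser` for the network that learns generalizable dynamics. Neither mirrors the GMem codebase:

```python
import torch
import torch.nn as nn

class MemoryBank:
    """External memorization: a fixed bank of snippet vectors."""
    def __init__(self, num_entries: int, dim: int):
        self.entries = torch.randn(num_entries, dim)  # placeholder contents

    def sample(self, batch_size: int) -> torch.Tensor:
        idx = torch.randint(0, self.entries.size(0), (batch_size,))
        return self.entries[idx]

class Denoiser(nn.Module):
    """Generalization: a small network conditioned on a retrieved snippet."""
    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * dim + 1, hidden), nn.SiLU(), nn.Linear(hidden, dim))

    def forward(self, x_t, t, snippet):
        # Predict the denoising target from (noisy sample, timestep, memory snippet).
        return self.net(torch.cat([x_t, snippet, t.unsqueeze(-1)], dim=-1))

bank, model = MemoryBank(1024, 32), Denoiser(32)
x_t, t = torch.randn(8, 32), torch.rand(8)
out = model(x_t, t, bank.sample(8))  # (8, 32)
```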

EMNLP'24 (Main)

📖 Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model (based on my undergraduate thesis)

Wenqi Zhang*, Zhenglin Cheng*, Yuanyu He, …, Weiming Lu, Yueting Zhuang

Project Page · GitHub Repo · HF Datasets

👉 Multimodal Self-Instruct leverages LLMs and their coding abilities to synthesize massive abstract images and visual reasoning instructions covering daily scenarios such as charts, graphs, and visual puzzles.
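
As a rough sketch of the synthesis loop, the hypothetical snippet below prompts an LLM for plotting code plus a question/answer pair, then executes the code to render the abstract image; `call_llm`, the prompt, and the JSON schema are placeholders, not the paper's actual pipeline:

```python
import json
import subprocess
import tempfile

PROMPT = (
    "Reply with a JSON object with keys 'code' (a self-contained matplotlib "
    "script that saves an abstract chart to 'chart.png'), 'question', and "
    "'answer' (a visual-reasoning Q/A about that chart)."
)

def synthesize_example(call_llm):
    # `call_llm` is a stand-in for any LLM API that returns the JSON above.
    spec = json.loads(call_llm(PROMPT))
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(spec["code"])          # the LLM-written plotting script
        script = f.name
    subprocess.run(["python", script], check=True)  # render chart.png
    return "chart.png", spec["question"], spec["answer"]
```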

Experience

Academic Services

  • Conference Reviewer: ICLR.

Education

  • 2024/09 - 2029/06, Westlake University, College of Engineering.
  • 2020/09 - 2024/06, Zhejiang University, College of Computer Science and Technology.