Yean Cheng

chengyean1999[at]163.com

Haidian, Beijing, China

I am a research engineer at Zhipu.AI, where I work on improving the performance of vision-language models (VLMs) on video understanding, alignment, reasoning, and related areas. My research interests lie in visual understanding and generation, language modeling, and world modeling.

I received my Master’s degree from the School of Computer Science at Peking University, where I worked at CILab & AIIC and was advised by Prof. Boxin Shi and Mr. Ming Lei. I received my Bachelor of Engineering degree in Automation and Bachelor of Arts degree in Economics from Tsinghua University in 2021. My academic research covers 3D modeling with neural implicit representations, computational photography, and image quality enhancement.

Deep learning is a useful tool (arguably more useful than most people think) for real-world applications. I enjoy tackling a variety of tasks (e.g., molecule design, quantitative trading, interior design, recommendation systems, image quality enhancement) with AI techniques and seeing the real-world impact of my work. I have worked (mostly as an intern) at wonderful AI start-ups (Collov, QuanMol), large companies (Alibaba, ByteDance), and a quantitative investment firm (Definitive Capital Management). Along the way, I have met many mentors and peers in the field of intelligence. Please feel free to contact me if you are interested.

news

Mar 3, 2024 Our paper “MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models” is accepted by CVPR 2025.
Jan 22, 2024 I will join Zhipu.AI as a Research Engineer.
Dec 9, 2023 Our paper “Colorizing Monochromatic Radiance Fields” is accepted by The 38th AAAI Conference on Artificial Intelligence and selected for oral presentation.
Oct 21, 2023 Our paper “SPLiT: Single Portrait Lighting Estimation Via a Tetrad of Face Intrinsics” is accepted by T-PAMI.

selected publications

  1. MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models
    Wenyi Hong*, Yean Cheng*, Zhuoyi Yang*, and 6 more authors
    CVPR, 2025
  2. CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
    Zhuoyi Yang, Jiayan Teng, Wendi Zheng, and 8 more authors
    ICLR, 2025
  3. CogVLM2: Visual Language Models for Image and Video Understanding
    Wenyi Hong, Weihan Wang, Ming Ding, and 8 more authors
    Technical Report, 2024
  4. [Oral] Colorizing Monochromatic Radiance Fields
    Yean Cheng, Renjie Wan, Shuchen Weng, and 3 more authors
    AAAI, 2024
  5. SPLiT: Single Portrait Lighting Estimation Via a Tetrad of Face Intrinsics
    Fei Fan*, Yean Cheng*, Yongjie Zhu, and 4 more authors
    IEEE T-PAMI, 2023
  6. Fault Diagnosis of Energy Networks Based on Improved Spatial–Temporal Graph Neural Network With Massive Missing Data
    Jingfei Zhang, Yean Cheng, and Xiao He
    IEEE Transactions on Automation Science and Engineering, 2023
  7. Fault Diagnosis of Energy Networks: A Graph Embedding Learning Approach
    Jingfei Zhang, Yean Cheng, and Xiao He
    IEEE Transactions on Instrumentation and Measurement, 2022
  8. Structure-Preserving Super Resolution With Gradient Guidance
    Cheng Ma, Yongming Rao, Yean Cheng, and 3 more authors
    CVPR, 2020