Zhenfang Chen   陈振方

I am a researcher at MIT-IBM Watson AI Lab in Cambridge, MA, USA. I received my Ph.D. degree from the Department of Computer Science at The University of Hong Kong, where I was advised by Prof. Kenneth K.Y. Wong.

My interests center on machine learning and its applications to vision and language. My ultimate research goal is to develop an autonomous agent that can perceive the physical world, reason and plan about it, and communicate with humans in natural language.

Email  /  Github  /  LinkedIn  /  DBLP  /  Twitter  /  Google Scholar

News
    [July 2024] Happy to serve as an Area Chair (AC) for WACV 2025.
    [July 2024] Happy to serve as a Senior Program Committee (SPC) Member for AAAI 2025.
    [July 2024] Papers have been accepted at ICML 2024, ICLR 2024, ECCV 2024, CVPR 2024 and AAAI 2024.
    [January 2023] A paper about competition-level code generation has been accepted by ICLR 2023.
    [January 2023] A paper about face video inpainting has been accepted by TIP 2023.
    [September 2022] Serving as a Senior Program Committee (SPC) Member for AAAI 2023.
    [September 2022] A paper (S3-NeRF) about photometric stereo has been accepted by NeurIPS 2022.
    [September 2022] A paper about Embodied Concept Learning has been accepted by CoRL 2022.
    [July 2022] We are organizing a workshop on Machine Visual Common Sense at ECCV 2022.
    [July 2022] A paper about Multi-View Photometric Stereo was accepted by ECCV 2022. A paper about Mask-Free Face Recognition was accepted by ICIP 2022.
    [January 2022] A paper (ComPhy) about physical reasoning has been accepted by ICLR 2022.
Recent Research Highlight

ICML 2024: ContPhy

ICLR 2024: GENOME

ICLR 2024: CoVLM

AAAI 2024: Visual CoT

ICLR 2022: ComPhy

ICLR 2023: PG-TD

ICLR 2021: DCL

CoRL 2022: ECL

NeurIPS 2022: S3-NeRF

Publications (  * denotes equal contribution, † denotes corresponding author.)

FlexAttention for Efficient High-Resolution Vision-Language Models

Junyan Li, Delin Chen, Tianle Cai, Peihao Chen, Yining Hong, Zhenfang Chen, Yikang Shen, Chuang Gan

ECCV2024

Paper Project

ContPhy: Continuum Physical Concept Learning and Reasoning from Videos

Zhicheng Zheng*, Xin Yan*, Zhenfang Chen*, Jingzhou Wang, Qin Zhi Eddie Lim, Joshua B. Tenenbaum, Chuang Gan

ICML2024

Paper Project

SOK-Bench: A Situated Video Reasoning Benchmark with Aligned Open-World Knowledge

Andong Wang*, Bo Wu*, Sunli Chen, Zhenfang Chen, Haotian Guan, Wei-Ning Lee, Li Erran Li, Chuang Gan

CVPR2024

Paper Project

GENOME: Generative Neuro-Symbolic Visual Reasoning by Growing and Reusing Modules

Zhenfang Chen*, Rui Sun*, Wenjun Liu*, Yining Hong, Chuang Gan

ICLR2024

Paper Project

CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding

Junyan Li, Delin Chen, Yining Hong, Zhenfang Chen, Peihao Chen, Yikang Shen, Chuang Gan

ICLR2024

Paper Project

SALMON: Self-Alignment with Principle-Following Reward Models

Zhiqing Sun, Yikang Shen, Qinhong Zhou, Hongxin Zhang, Zhenfang Chen, David Cox, Yiming Yang, Chuang Gan

ICLR2024

Paper Project

Visual Chain-of-Thought Prompting for Knowledge-based Visual Reasoning

Zhenfang Chen*, Qinhong Zhou*, Yikang Shen, Yining Hong, Zhiqing Sun, Dan Gutfreund, Chuang Gan

AAAI2024

Paper Project

Sparse Universal Transformer

Shawn Tan*, Yikang Shen*, Zhenfang Chen, Aaron Courville, Chuang Gan

EMNLP2023

Paper Project

Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties

Hsiao-Yu Tung*, Mingyu Ding*, Zhenfang Chen, Daniel M. Bear, Chuang Gan, Joshua B. Tenenbaum, Daniel L. K. Yamins, Judith Fan, Kevin A. Smith

NeurIPS2023 (Datasets and Benchmarks Track)

Paper Project

Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision

Zhiqing Sun, Yikang Shen, Qinhong Zhou, Hongxin Zhang, Zhenfang Chen, David Cox, Yiming Yang, and Chuang Gan

NeurIPS2023 (Spotlight)

Paper Project

3D-LLM: Injecting the 3D World into Large Language Models

Yining Hong, Haoyu Zhen, Peihao Chen, Shuhong Zheng, Yilun Du, Zhenfang Chen, Chuang Gan

NeurIPS2023 (Spotlight)

Paper Project

TextPSG: Panoptic Scene Graph Generation from Textual Descriptions

Chengyang Zhao, Yikang Shen, Zhenfang Chen, Mingyu Ding, Chuang Gan

ICCV2023

Paper Project

Mod-Squad: Designing Mixture of Experts As Modular Multi-Task Learners

Zitian Chen, Yikang Shen, Mingyu Ding, Zhenfang Chen, Hengshuang Zhao, Erik Learned-Miller, Chuang Gan

CVPR2023

Paper Project

Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention

Mingyu Ding, Yikang Shen, Lijie Fan, Zhenfang Chen, Zitian Chen, Ping Luo, Joshua B. Tenenbaum, Chuang Gan

CVPR2023

Paper Project

3D Concept Learning and Reasoning from Multi-View Images

Yining Hong, Chunru Lin, Yilun Du, Zhenfang Chen, Joshua B. Tenenbaum, Chuang Gan

CVPR2023

Paper Project

Planning with Large Language Models for Code Generation

Shun Zhang, Zhenfang Chen, Yikang Shen, Mingyu Ding, Joshua B. Tenenbaum, and Chuang Gan

ICLR2023

Paper Project

Deep Face Video Inpainting via UV Mapping

Wenqi Yang, Zhenfang Chen, Chaofeng Chen, Guanying Chen, Kwan-Yee K. Wong

IEEE Transactions on Image Processing (TIP), 2023

Paper Project

S3-NeRF: Neural Reflectance Field from Shading and Shadow under a Single Viewpoint

Wenqi Yang, Guanying Chen, Chaofeng Chen, Zhenfang Chen, Kwan-Yee K. Wong

NeurIPS2022

Paper Project Code

Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following

Mingyu Ding, Yan Xu, Zhenfang Chen, David Daniel Cox, Ping Luo, Joshua B. Tenenbaum, Chuang Gan

CoRL2022

Paper Project Code

PS-NeRF: Neural Inverse Rendering for Multi-view Photometric Stereo

Wenqi Yang, Guanying Chen, Chaofeng Chen, Zhenfang Chen, Kwan-Yee K. Wong

ECCV2022

Paper Project Code

A Unified Framework for Masked and Mask-Free Face Recognition via Feature Rectification

Shaozhe Hao, Chaofeng Chen, Zhenfang Chen, Kwan-Yee K. Wong

ICIP2022

Paper Github

ComPhy: Compositional Physical Reasoning of Objects and Events from Videos

Zhenfang Chen, Kexin Yi, Yunzhu Li, Mingyu Ding, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan

ICLR2022

Paper Project Code

Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language

Mingyu Ding, Zhenfang Chen, Tao Du, Ping Luo, Joshua B. Tenenbaum, Chuang Gan

NeurIPS2021

Paper Project Code

STAR: A Benchmark for Situated Reasoning in Real-World Videos

Bo Wu, Shoubin Yu, Zhenfang Chen, Joshua B. Tenenbaum, Chuang Gan

NeurIPS2021 (Datasets and Benchmarks Track)

Paper Project

The Blessings of Unlabeled Background in Untrimmed Videos

Yuan Liu, Jingyuan Chen, Zhenfang Chen, Bing Deng, Jianqiang Huang, Hanwang Zhang

CVPR2021

Paper Code

Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning

Zhenfang Chen, Jiayuan Mao, Jiajun Wu, Kwan-Yee K. Wong, Joshua B. Tenenbaum, Chuang Gan

ICLR2021

Paper Project Code

Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension

Zhenfang Chen, Peng Wang, Lin Ma, Kwan-Yee K. Wong, Qi Wu

CVPR2020

Paper Code & Data

Weakly-Supervised Spatio-Temporally Grounding Natural Sentence in Video

Zhenfang Chen, Lin Ma, Wenhan Luo, Kwan-Yee K. Wong

ACL2019 (Oral Presentation)

Paper Code

Learning Local Similarity with Spatial Relations for Object Retrieval

Zhenfang Chen, Zhanghui Kuang, Wayne Zhang, Kwan-Yee K. Wong

MM2019

Paper

Boosting up Scene Text Detectors with Guided CNN

Xiaoyu Yue, Zhanghui Kuang, Zhaoyang Zhang, Zhenfang Chen, Pan He, Yu Qiao and Wayne Zhang

BMVC2018 (Oral Presentation)

Paper

Aggregated Deep Feature from Activation Clusters for Particular Object Retrieval

Zhenfang Chen, Zhanghui Kuang, Kwan-Yee K. Wong, Wayne Zhang

MM2017 (Thematic Workshop)

Paper
Preprints

Improving Reinforcement Learning from Human Feedback with Efficient Reward Model Ensemble

Shun Zhang, Zhenfang Chen, Sunli Chen, Yikang Shen, Zhiqing Sun, and Chuang Gan

Arxiv

Paper Project

ModuleFormer: Learning Modular Large Language Models From Uncurated Data

Yikang Shen, Zheyu Zhang, Tianyou Cao, Shawn Tan, Zhenfang Chen, Chuang Gan

Arxiv

Paper Project

Look Closer to Ground Better: Weakly-Supervised Temporal Grounding of Sentence in Video

Zhenfang Chen, Lin Ma, Wenhan Luo, Peng Tang, Kwan-Yee K. Wong

Arxiv

Paper
PhD Dissertation

Deep learning for visual retrieval, visual grounding and visual reasoning

Zhenfang Chen

Dept. of Computer Science, The University of Hong Kong, 2021


HKU Thesis Online
Professional Services
  • Workshop Organizer:
      Machine Visual Common Sense, CVPR 2023
      Machine Visual Common Sense, ECCV 2022
  • Area Chair:
      WACV 2025
  • Senior Program Committee (SPC) Member:
      AAAI 2025
      AAAI 2024
      AAAI 2023
  • Conference Reviewer:
      CVPR, ICCV, ACL, EMNLP, IJCAI, AAAI, NeurIPS, ICME, ICLR
  • Journal Reviewer:
      Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
      International Journal of Computer Vision (IJCV)
      Transactions on Image Processing (TIP)
      Transactions on Multimedia Computing Communications and Applications (TOMM)
      Neurocomputing
      Pattern Recognition (PR)
      Transactions on Neural Networks and Learning Systems (TNNLS)
Honors and Awards
  • M. Braun Postgraduate Prizes, HKU 2019-2020
  • Postgraduate Scholarships (PGS), HKU 2016-2020
Teaching Experience at HKU
  • [Spring, 2020]: COMP3270 Artificial Intelligence
  • [Spring, 2019]: COMP7404 Computational Intelligence and Machine Learning
  • [Spring, 2018]: COMP7404 Computational Intelligence and Machine Learning
  • [Summer, 2017]: COMP7502 Image Processing and Computer Vision
Previous Experience
The University of Adelaide, 2019: Visiting student, working with Dr. Qi Wu and Dr. Peng Wang
Tencent AI Lab, 2018: Research intern, working with Dr. Lin Ma and Dr. Wenhan Luo
SenseTime, 2015: Research intern, working with Dr. Zhanghui Kuang and Dr. Wayne Zhang
Microsoft Research Asia, 2015: Research intern, working with Dr. Lei Sun

Website adapted from Jon Barron