Zhenfang Chen   陈振方
I am a researcher at MIT-IBM Watson AI Lab in Cambridge, MA, USA.
I received my Ph.D. degree from the Department of Computer Science at The University of Hong Kong, where I was advised by Prof. Kenneth K.Y. Wong.
My interests are centered around machine learning and its applications to vision and language. My ultimate research goal is to develop an autonomous agent that can perceive, reason and plan about the physical world, and communicate with humans in natural language.
Email / Github / LinkedIn / DBLP / Twitter / Google Scholar
[July 2024] Happy to serve as an Area Chair (AC) for WACV 2025.
[July 2024] Happy to serve as a Senior Program Committee (SPC) Member for AAAI 2025.
[July 2024] Papers have been accepted at ICML 2024, ICLR 2024, ECCV 2024, CVPR 2024, and AAAI 2024.
[January 2023] A paper about competition-level code generation has been accepted by ICLR 2023.
[January 2023] A paper about face video inpainting has been accepted by TIP 2023.
[September 2022] Serving as a Senior Program Committee (SPC) Member for AAAI 2023.
[September 2022] A paper (S3-NeRF) about photometric stereo has been accepted by NeurIPS 2022.
[September 2022] A paper about Embodied Concept Learning has been accepted by CoRL 2022.
[July 2022] We are organizing a workshop on Machine Visual Common Sense at ECCV 2022.
[July 2022] A paper about Multi-View Photometric Stereo was accepted by ECCV 2022. A paper for Mask-Free Face Recognition was accepted by ICIP 2022.
[January 2022] A paper (ComPhy) about physical reasoning has been accepted by ICLR 2022.
Recent Research Highlight
ICLR 2024: GENOME
ICLR 2024: CoVLM
AAAI 2024: Visual CoT
ICLR 2022: ComPhy
ICLR 2023: PG-TD
ICLR 2021: DCL
CoRL 2022: ECL
NeurIPS 2022: S3-NeRF
Publications (* denotes equal contribution, † denotes corresponding author.)
FlexAttention for Efficient High-Resolution Vision-Language Models
Junyan Li, Delin Chen, Tianle Cai, Peihao Chen, Yining Hong, Zhenfang Chen, Yikang Shen, Chuang Gan
ECCV2024
Paper
Project
ContPhy: Continuum Physical Concept Learning and Reasoning from Videos
Zhicheng Zheng*, Xin Yan*, Zhenfang Chen*†, Jingzhou Wang, Qin Zhi Eddie Lim, Joshua B. Tenenbaum, Chuang Gan
ICML2024
Paper
Project
SOK-Bench: A Situated Video Reasoning Benchmark with Aligned Open-World Knowledge
Andong Wang*, Bo Wu*, Sunli Chen, Zhenfang Chen, Haotian Guan, Wei-Ning Lee, Li Erran Li, Chuang Gan
CVPR2024
Paper
Project
GENOME: Generative Neuro-Symbolic Visual Reasoning by Growing and Reusing Modules
Zhenfang Chen*, Rui Sun*, Wenjun Liu*, Yining Hong, Chuang Gan
ICLR2024
Paper
Project
CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding
Junyan Li, Delin Chen, Yining Hong, Zhenfang Chen, Peihao Chen, Yikang Shen, Chuang Gan
ICLR2024
Paper
Project
SALMON: Self-Alignment with Principle-Following Reward Models
Zhiqing Sun, Yikang Shen, Qinhong Zhou, Hongxin Zhang, Zhenfang Chen, David Cox, Yiming Yang, Chuang Gan
ICLR2024
Paper
Project
Visual Chain-of-Thought Prompting for Knowledge-based Visual Reasoning
Zhenfang Chen*, Qinhong Zhou*, Yikang Shen, Yining Hong, Zhiqing Sun, Dan Gutfreund, Chuang Gan
AAAI2024
Paper
Project
Sparse Universal Transformer
Shawn Tan*, Yikang Shen*, Zhenfang Chen, Aaron Courville, Chuang Gan
EMNLP2023
Paper
Project
Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties
Hsiao-Yu Tung*, Mingyu Ding*, Zhenfang Chen, Daniel M. Bear, Chuang Gan, Joshua B. Tenenbaum, Daniel L. K. Yamins, Judith Fan, Kevin A. Smith
NeurIPS2023 (Datasets and Benchmarks Track)
Paper
Project
Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
Zhiqing Sun, Yikang Shen, Qinhong Zhou, Hongxin Zhang, Zhenfang Chen, David Cox, Yiming Yang, and Chuang Gan
NeurIPS2023 (Spotlight)
Paper
Project
3D-LLM: Injecting the 3D World into Large Language Models
Yining Hong, Haoyu Zhen, Peihao Chen, Shuhong Zheng, Yilun Du, Zhenfang Chen, Chuang Gan
NeurIPS2023 (Spotlight)
Paper
Project
TextPSG: Panoptic Scene Graph Generation from Textual Descriptions
Chengyang Zhao, Yikang Shen, Zhenfang Chen, Mingyu Ding, Chuang Gan
ICCV2023
Paper
Project
Mod-Squad: Designing Mixture of Experts As Modular Multi-Task Learners
Zitian Chen, Yikang Shen, Mingyu Ding, Zhenfang Chen, Hengshuang Zhao, Erik Learned-Miller, Chuang Gan
CVPR2023
Paper
Project
Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention
Mingyu Ding, Yikang Shen, Lijie Fan, Zhenfang Chen, Zitian Chen, Ping Luo, Joshua B. Tenenbaum, Chuang Gan
CVPR2023
Paper
Project
3D Concept Learning and Reasoning from Multi-View Images
Yining Hong, Chunru Lin, Yilun Du, Zhenfang Chen, Joshua B. Tenenbaum, Chuang Gan
CVPR2023
Paper
Project
Planning with Large Language Models for Code Generation
Shun Zhang, Zhenfang Chen, Yikang Shen, Mingyu Ding, Joshua B. Tenenbaum, and Chuang Gan
ICLR2023
Paper
Project
Deep Face Video Inpainting via UV Mapping
Wenqi Yang, Zhenfang Chen, Chaofeng Chen, Guanying Chen, Kwan-Yee K. Wong
IEEE Transactions on Image Processing (TIP), 2023
Paper
Project
S3-NeRF: Neural Reflectance Field from Shading and Shadow under a Single Viewpoint
Wenqi Yang, Guanying Chen, Chaofeng Chen, Zhenfang Chen, Kwan-Yee K. Wong
NeurIPS2022
Paper
Project
Code
Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following
Mingyu Ding, Yan Xu, Zhenfang Chen, David Daniel Cox, Ping Luo, Joshua B. Tenenbaum, Chuang Gan
CoRL2022
Paper
Project
Code
PS-NeRF: Neural Inverse Rendering for Multi-view Photometric Stereo
Wenqi Yang, Guanying Chen, Chaofeng Chen, Zhenfang Chen, Kwan-Yee K. Wong
ECCV2022
Paper
Project
Code
A Unified Framework for Masked and Mask-Free Face Recognition via Feature Rectification
Shaozhe Hao, Chaofeng Chen, Zhenfang Chen, Kwan-Yee K. Wong
ICIP2022
Paper
Github
ComPhy: Compositional Physical Reasoning of Objects and Events from Videos
Zhenfang Chen, Kexin Yi, Yunzhu Li, Mingyu Ding, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan
ICLR2022
Paper
Project
Code
Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language
Mingyu Ding, Zhenfang Chen, Tao Du, Ping Luo, Joshua B. Tenenbaum, Chuang Gan
NeurIPS2021
Paper
Project
Code
STAR: A Benchmark for Situated Reasoning in Real-World Videos
Bo Wu, Shoubin Yu, Zhenfang Chen, Joshua B. Tenenbaum, Chuang Gan
NeurIPS2021 (Datasets and Benchmarks Track)
Paper
Project
The Blessings of Unlabeled Background in Untrimmed Videos
Yuan Liu, Jingyuan Chen, Zhenfang Chen, Bing Deng, Jianqiang Huang, Hanwang Zhang
CVPR2021
Paper
Code
Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning
Zhenfang Chen, Jiayuan Mao, Jiajun Wu, Kwan-Yee K. Wong, Joshua B. Tenenbaum, Chuang Gan
ICLR2021
Paper
Project
Code
Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension
Zhenfang Chen, Peng Wang, Lin Ma, Kwan-Yee K. Wong, Qi Wu
CVPR2020
Paper
Code & Data
Weakly-Supervised Spatio-Temporally Grounding Natural Sentence in Video
Zhenfang Chen, Lin Ma, Wenhan Luo, Kwan-Yee K. Wong
ACL2019 (Oral Presentation)
Paper
Code
Learning Local Similarity with Spatial Relations for Object Retrieval
Zhenfang Chen, Zhanghui Kuang, Wayne Zhang, Kwan-Yee K. Wong
MM2019
Paper
Boosting up scene text detectors with guided CNN
Xiaoyu Yue, Zhanghui Kuang, Zhaoyang Zhang, Zhenfang Chen, Pan He, Yu Qiao and Wayne Zhang
BMVC2018 (Oral Presentation)
Paper
Aggregated deep feature from activation clusters for particular object retrieval
Zhenfang Chen, Zhanghui Kuang, Kwan-Yee K. Wong, Wayne Zhang
MM17 (Thematic Workshop)
Paper
Improving Reinforcement Learning from Human Feedback with Efficient Reward Model Ensemble
Shun Zhang, Zhenfang Chen, Sunli Chen, Yikang Shen, Zhiqing Sun, and Chuang Gan
arXiv
Paper
Project
ModuleFormer: Learning Modular Large Language Models From Uncurated Data
Yikang Shen, Zheyu Zhang, Tianyou Cao, Shawn Tan, Zhenfang Chen, Chuang Gan
arXiv
Paper
Project
Look Closer to Ground Better: Weakly-Supervised Temporal Grounding of Sentence in Video
Zhenfang Chen, Lin Ma, Wenhan Luo, Peng Tang, Kwan-Yee K. Wong
arXiv
Paper
Deep learning for visual retrieval, visual grounding and visual reasoning
Zhenfang Chen
Dept. of Computer Science, The University of Hong Kong, 2021
HKU Thesis Online
- Workshop Organizers:
Machine Visual Common Sense, CVPR 2023.
Machine Visual Common Sense, ECCV 2022.
- Area Chair:
WACV 2025
- Senior Program Committee (SPC) Member:
AAAI 2025
AAAI 2024
AAAI 2023
- Conference Reviewer:
CVPR, ICCV, ACL, EMNLP, IJCAI, AAAI, NeurIPS, ICME, ICLR
- Journal Reviewer:
Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
International Journal of Computer Vision (IJCV)
Transactions on Image Processing (TIP)
Transactions on Multimedia Computing Communications and Applications (TOMM)
Neurocomputing
Pattern Recognition (PR)
Transactions on Neural Networks and Learning Systems (TNNLS)
- M. Braun Postgraduate Prizes, HKU 2019-2020
- Postgraduate Scholarships (PGS), HKU 2016-2020
Teaching Experience at HKU
- [Spring, 2020]: COMP3270 Artificial Intelligence
- [Spring, 2019]: COMP7404 Computational Intelligence and Machine Learning
- [Spring, 2018]: COMP7404 Computational Intelligence and Machine Learning
- [Summer, 2017]: COMP7502 Image Processing and Computer Vision