Kaifeng Zhang

I am a Ph.D. student in Computer Science at Columbia University, advised by Yunzhu Li.

I am interested in teaching robots to interact with the world by understanding not only how it looks, but also how it moves and responds to actions. My work explores this through digital twins, physics-informed world models, and sim-to-real transfer.

Before Columbia, I received my Bachelor's degree from Tsinghua University (Yao Class), and spent a year in Urbana-Champaign pursuing my Ph.D. before transferring. Along the way, I have been fortunate to receive mentorship from Kris Hauser, Xiaolong Wang, Yang Gao, and Li Yi.

My research is partially supported by the Qualcomm Innovation Fellowship. I'm currently a research intern at World Labs, and previously interned at SceniX.

Email / Google Scholar / GitHub / Twitter / LinkedIn / CV

Publications

* indicates equal contribution. Representative papers are highlighted.

Real-to-Sim Robot Policy Evaluation with Gaussian Splatting Simulation of Soft-Body Interactions
Kaifeng Zhang*, Shuo Sha*, Hanxiao Jiang, Matthew Loper, Hyunjong Song, Guangyan Cai, Zhuo Xu, Xiaochen Hu, Changxi Zheng, Yunzhu Li
International Conference on Robotics & Automation (ICRA), 2026
website / arXiv / pdf / code

We propose a framework for evaluating robot policies in simulation, using Gaussian Splatting for rendering and a soft-body digital twin for dynamics.

BoxTwin: Learning Elastoplastic Articulated Object Dynamics from Videos
Heng Zhang*, Gehan Zheng*, Kaifeng Zhang, Hyunjong Song, Shivansh Patel, Xiaochen Hu, Yunzhu Li, Changxi Zheng, Peter Yichen Chen
IROS 2025 RoDGE Workshop

We present an interactive digital twin construction (real-to-sim) framework that learns the full dynamics of elastoplastic articulated objects from videos.

PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos
Hanxiao Jiang, Hao-Yu Hsu, Kaifeng Zhang, Hsin-Ni Yu, Shenlong Wang, Yunzhu Li
International Conference on Computer Vision (ICCV), 2025
website / arXiv / pdf / code

We optimize a spring-mass physics model of deformable objects and integrate the model with 3D Gaussian Splatting for real-time re-simulation with rendering.

Particle-Grid Neural Dynamics for Learning Deformable Object Models from RGB-D Videos
Kaifeng Zhang, Baoyu Li, Kris Hauser, Yunzhu Li
Robotics: Science and Systems (RSS), 2025
website / arXiv / pdf / code / demo

We propose a neural particle-grid model for training dynamics models on real-world sparse-view RGB-D videos, enabling high-quality future prediction and rendering.

Dynamic 3D Gaussian Tracking for Graph-Based Neural Dynamics Modeling
Mingtong Zhang*, Kaifeng Zhang*, Yunzhu Li
Conference on Robot Learning (CoRL), 2024
website / arXiv / pdf / code / demo

We learn neural dynamics models of objects from real perception data and combine the learned model with 3D Gaussian Splatting for action-conditioned predictive rendering.

AdaptiGraph: Material-Adaptive Graph-Based Neural Dynamics for Robotic Manipulation
Kaifeng Zhang*, Baoyu Li*, Kris Hauser, Yunzhu Li
Robotics: Science and Systems (RSS), 2024
ICRA 2024 RMDO Workshop (Best Abstract Award)
website / arXiv / pdf / code

We learn a material-conditioned neural dynamics model using a graph neural network to enable predictive modeling of diverse real-world objects and achieve efficient manipulation via model-based planning.

4DRecons: 4D Neural Implicit Deformable Objects Reconstruction from a single RGB-D Camera with Geometrical and Topological Regularizations
Xiaoyan Cong, Haitao Yang, Liyan Chen, Kaifeng Zhang, Li Yi, Chandrajit Bajaj, Qixing Huang
arXiv, 2024

We achieve 4D neural implicit reconstruction from only a single-view scan using deformation and topology regularizations.

Self-Supervised Geometric Correspondence for Category-Level 6D Object Pose Estimation in the Wild
Kaifeng Zhang, Yang Fu, Shubhankar Borse, Hong Cai, Fatih Porikli, Xiaolong Wang
International Conference on Learning Representations (ICLR), 2023
website / arXiv / pdf / code

We propose a fully self-supervised method for category-level 6D object pose estimation by learning dense 2D-3D geometric correspondences. Our method can train on image collections without any 3D annotations.

Semantic-Aware Fine-Grained Correspondence
Yingdong Hu, Renhao Wang, Kaifeng Zhang, Yang Gao
European Conference on Computer Vision (ECCV), 2022 (Oral)
arXiv / pdf / code

We show that fusing fine-grained features learned with low-level contrastive objectives and semantic features from image-level objectives can improve SSL pretraining.

Template borrowed from Jon Barron