I am interested in teaching robots to interact with the world by understanding not only how it looks, but also how it moves and responds to actions.
My work explores this through digital twins, physics-informed world models, and sim-to-real transfer.
We propose a framework for evaluating robot policies in simulation,
using Gaussian Splatting for rendering and soft-body digital twins for dynamics.
We present an interactive digital twin construction (real-to-sim) framework that learns
the full dynamics of elastoplastic articulated objects from videos.
We optimize a spring-mass physics model of deformable objects and
integrate the model with 3D Gaussian Splatting for real-time re-simulation with rendering.
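
As a rough illustration of what a spring-mass model computes, here is a minimal sketch of one simulation step in NumPy. It is not the implementation from the paper: the function name `spring_mass_step`, its parameters, and the choice of semi-implicit Euler integration are all assumptions made for this example. In a setting like the one described above, quantities such as `stiffness` would be the parameters fitted to video observations.

```python
import numpy as np

def spring_mass_step(pos, vel, springs, rest_len, stiffness,
                     mass=1.0, damping=0.0, dt=1e-3):
    """Advance a spring-mass system by one semi-implicit Euler step.

    pos, vel  : (N, 3) particle positions and velocities
    springs   : (S, 2) integer array of particle index pairs (i, j)
    rest_len  : (S,) rest length of each spring
    stiffness : scalar or (S,) spring stiffness (hypothetical fitted parameter)
    """
    i, j = springs[:, 0], springs[:, 1]
    delta = pos[j] - pos[i]                          # vector from i to j
    length = np.linalg.norm(delta, axis=1)
    direction = delta / np.maximum(length, 1e-12)[:, None]

    # Hooke's law: a stretched spring pulls i toward j and j toward i.
    f = (stiffness * (length - rest_len))[:, None] * direction

    force = np.zeros_like(pos)
    np.add.at(force, i, f)     # accumulate, since a particle can appear in many springs
    np.add.at(force, j, -f)
    force -= damping * vel     # simple velocity damping

    vel = vel + dt * force / mass   # update velocity first,
    pos = pos + dt * vel            # then position with the new velocity
    return pos, vel
```

Because every operation here is differentiable in `stiffness`, the same step could in principle be written in an autodiff framework and optimized against observed trajectories; that extension is not shown.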
We propose a neural particle-grid model for training dynamics models on real-world sparse-view RGB-D videos, enabling
high-quality future prediction and rendering.
We learn neural dynamics models of objects from real perception data
and combine the learned model with 3D Gaussian Splatting for action-conditioned predictive rendering.
We learn a material-conditioned neural dynamics model using a graph neural network to
enable predictive modeling of diverse real-world objects and achieve efficient manipulation via model-based planning.
We propose a fully self-supervised method for category-level 6D object pose estimation
by learning dense 2D-3D geometric correspondences. Our method can be trained on image collections
without any 3D annotations.
We show that fusing fine-grained features learned with low-level contrastive objectives and semantic features
from image-level objectives can improve SSL pretraining.