Vid2Sim bridges the sim-to-real gap by converting monocular videos into photorealistic, physically interactable 3D simulation environments. Using neural 3D scene reconstruction and simulation, it enables RL training of visual navigation agents in complex urban environments.
It addresses the major challenge of sim-to-real transfer for robot learning by creating realistic digital twins from minimal video input, enabling scalable, cost-efficient training for urban navigation applications such as food delivery robots and assistive vehicles.
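To illustrate the kind of training setup this enables, below is a minimal sketch (not the Vid2Sim API) of a Gymnasium-style environment stub standing in for a scene reconstructed from monocular video, together with a rollout loop of the sort an RL navigation agent would be trained with. The class name, observation layout, and action convention are hypothetical placeholders.

```python
# Hypothetical sketch: a Gymnasium-style wrapper around a reconstructed scene.
# The rendering and physics internals are placeholders, not Vid2Sim code.
import gymnasium as gym
import numpy as np


class ReconstructedSceneEnv(gym.Env):
    """Stub environment: agent sees rendered RGB frames from a neural
    reconstruction and acts with (linear, angular) velocity commands."""

    def __init__(self, image_size=(128, 128)):
        super().__init__()
        self.image_size = image_size
        self.observation_space = gym.spaces.Box(
            low=0, high=255, shape=(*image_size, 3), dtype=np.uint8
        )
        self.action_space = gym.spaces.Box(
            low=-1.0, high=1.0, shape=(2,), dtype=np.float32
        )

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        return self._render_frame(), {}

    def step(self, action):
        # Placeholder: a real environment would move the camera through the
        # reconstructed scene, check collisions, and reward progress to a goal.
        obs = self._render_frame()
        reward, terminated, truncated = 0.0, False, False
        return obs, reward, terminated, truncated, {}

    def _render_frame(self):
        # Stand-in for rendering the photorealistic reconstruction.
        return self.np_random.integers(
            0, 256, size=(*self.image_size, 3), dtype=np.uint8
        )


if __name__ == "__main__":
    env = ReconstructedSceneEnv()
    obs, _ = env.reset(seed=0)
    for _ in range(10):  # random-policy rollout; an RL algorithm would replace this
        obs, reward, terminated, truncated, _ = env.step(env.action_space.sample())
```

In practice such an environment would be driven by an off-the-shelf RL library (e.g. a PPO implementation with an image-based policy); the stub above only shows the interface an image-observation navigation agent would train against.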