Pre-Trained Video Generative Models as World Simulators (DWS) (2025)
Overview
He et al. propose DWS, a method for converting pre-trained video generative models into action-conditioned world simulators. It adds a lightweight action-conditioned module and introduces a motion-reinforced loss to improve dynamic consistency. The authors demonstrate applications in games and robotics, with improved action controllability.
Why it matters
Repurposing existing generative models reduces the need to train from scratch and leverages internet-scale pretraining, while adding the controllability that real-world tasks such as robotics, planning, and simulation require.
Key trade-offs / limitations
- The pre-trained backbone may still struggle with fine detail, or with target domains that differ from its pretraining data.
- Because the added action module is lightweight, it may have limited influence on complex dynamics.
- Visual quality may trade off against controllability and dynamic correctness.
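To make the motion-reinforced loss from the overview more concrete, here is a minimal sketch of one plausible reading: a reconstruction loss that up-weights pixels whose target values change between consecutive frames. This is an illustration under stated assumptions, not the formulation from He et al. The names `motion_weights` and `motion_weighted_mse`, the `alpha` parameter, and the `1 + alpha * |frame difference|` weighting are all hypothetical.

```python
from typing import List

# Assumption: each frame is flattened to a list of pixel values,
# and a video is a list of such frames of equal length.
Frame = List[float]


def motion_weights(target: List[Frame], alpha: float = 1.0) -> List[List[float]]:
    """Per-pixel weights 1 + alpha * |target[t] - target[t-1]|.

    The first frame has no predecessor, so its weights are all 1.
    (Hypothetical weighting scheme, chosen only for illustration.)
    """
    weights = [[1.0] * len(target[0])]
    for prev, cur in zip(target, target[1:]):
        weights.append([1.0 + alpha * abs(c - p) for p, c in zip(prev, cur)])
    return weights


def motion_weighted_mse(pred: List[Frame], target: List[Frame],
                        alpha: float = 1.0) -> float:
    """Weighted mean squared error that emphasizes moving regions."""
    weights = motion_weights(target, alpha)
    weighted_error = 0.0
    weight_total = 0.0
    for w_frame, p_frame, t_frame in zip(weights, pred, target):
        for w, p, t in zip(w_frame, p_frame, t_frame):
            weighted_error += w * (p - t) ** 2
            weight_total += w
    return weighted_error / weight_total


if __name__ == "__main__":
    # Two frames of two pixels; only the second pixel moves.
    target = [[0.0, 0.0], [0.0, 1.0]]
    error_on_static_pixel = [[0.0, 0.0], [0.5, 1.0]]
    error_on_moving_pixel = [[0.0, 0.0], [0.0, 0.5]]
    print(motion_weighted_mse(error_on_static_pixel, target))
    print(motion_weighted_mse(error_on_moving_pixel, target))
```

In this sketch, an error of the same size costs more on the moving pixel than on the static one, and setting `alpha=0` recovers plain mean squared error. The third limitation above shows up directly: a larger `alpha` shifts the objective toward dynamic correctness at the expense of static visual detail.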