How Far is Video Generation from World Model: A Physical Law Perspective (2024)

Overview

Kang et al. assess whether video generation models learn principles of classical mechanics (such as collisions and laws of motion) using synthetic 2D testbeds. They evaluate generalization in three settings: in-distribution, out-of-distribution, and combinatorial. They find that the models often fail to abstract general physical laws.
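
The in-distribution versus out-of-distribution comparison can be illustrated with a toy sketch. This is a hedged illustration, not the authors' testbed or model: the uniform-motion setup, the velocity ranges, and the names simulate_uniform_motion, NearestCasePredictor, and velocity_error are all assumptions made for this example. The "model" is a hypothetical memorizer that replays the closest training velocity, which is one simple way a predictor can look accurate inside the training range without having learned the underlying law.

```python
# Toy stand-in for an in-distribution vs. out-of-distribution check.
# All names and numeric ranges are illustrative assumptions, not from the paper.

def simulate_uniform_motion(x0, v, n_frames, dt=0.1):
    """Ground-truth positions of a ball moving at constant velocity."""
    return [x0 + v * k * dt for k in range(n_frames)]

def estimate_velocity(positions, dt=0.1):
    """Average velocity over a trajectory, from first and last positions."""
    return (positions[-1] - positions[0]) / ((len(positions) - 1) * dt)

class NearestCasePredictor:
    """Hypothetical memorizing 'model': replays the closest training velocity."""

    def __init__(self, train_velocities):
        self.train_velocities = list(train_velocities)

    def predict(self, context, n_future, dt=0.1):
        v_obs = estimate_velocity(context, dt)
        # Pick the memorized velocity nearest to the observed one.
        v_near = min(self.train_velocities, key=lambda v: abs(v - v_obs))
        last = context[-1]
        return [last + v_near * (k + 1) * dt for k in range(n_future)]

def velocity_error(model, v_true, n_context=3, n_future=10, dt=0.1):
    """Absolute gap between the true velocity and the rollout's velocity."""
    truth = simulate_uniform_motion(0.0, v_true, n_context + n_future, dt)
    context = truth[:n_context]
    rollout = model.predict(context, n_future, dt)
    v_pred = estimate_velocity([context[-1]] + rollout, dt)
    return abs(v_pred - v_true)

# Assumed training velocities: 1.0, 1.5, ..., 4.0
train_velocities = [1.0 + 0.5 * i for i in range(7)]
model = NearestCasePredictor(train_velocities)

in_dist_tests = [1.2, 2.6, 3.9]  # inside the training range
ood_tests = [6.0, 8.0]           # outside the training range

in_dist_err = max(velocity_error(model, v) for v in in_dist_tests)
ood_err = min(velocity_error(model, v) for v in ood_tests)
print(f"worst in-distribution velocity error: {in_dist_err:.2f}")
print(f"best out-of-distribution velocity error: {ood_err:.2f}")
```

In this sketch even the best out-of-distribution case has a larger velocity error than the worst in-distribution case, because the memorizer cannot extrapolate beyond the velocities it has seen. A predictor that had actually abstracted x(t) = x0 + v·t would show no such gap.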

Why it matters

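The study tests whether video generation models, viewed as world models, do more than memorize or overfit to their training data, that is, whether they generalize fundamental dynamics. Such generalization is essential for physical plausibility in robotics, simulation, and other safety-critical systems.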

Key trade-offs / limitations

  • Synthetic environments may not capture the complexity of natural physics (texture, lighting, unmodeled forces, etc.).
  • Results may differ on noisy real-world video.
  • The testbeds are simplified, which limits the range of physical phenomena tested.

arXiv:2411.02385