ProphetDWM: An End-to-End Driving World Model for Joint Video and Action Prediction (2025)
Overview
ProphetDWM is an autonomous driving world model that jointly predicts future video frames and driving actions. It combines a diffusion-based transition module with an action learning module, and the two are trained jointly so that the predicted frames and actions stay aligned.
Why it matters
It brings world models closer to real-world use by combining video imagination with action prediction, which is useful for self-driving and planning systems.
Key trade-offs / limitations
- Specialized for driving; generalization to other domains untested.
- Requires large datasets like nuScenes for training.
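
To make the joint-training idea from the Overview concrete, here is a minimal sketch of a combined objective: a denoising loss for the video branch plus a regression loss for the action branch. This is an illustration under stated assumptions, not ProphetDWM's actual loss. The function names, the `action_weight` parameter, and the DDPM-style noise mixing are all hypothetical stand-ins chosen for the example.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def add_noise(latent, noise, alpha):
    """Generic DDPM-style forward step (an assumption, not confirmed for ProphetDWM).

    alpha in [0, 1] is the fraction of signal kept: alpha=1 returns the
    clean latent, alpha=0 returns pure noise.
    """
    signal_scale = alpha ** 0.5
    noise_scale = (1.0 - alpha) ** 0.5
    return [signal_scale * x + noise_scale * n for x, n in zip(latent, noise)]


def joint_loss(pred_noise, true_noise, pred_actions, true_actions, action_weight=1.0):
    """Weighted sum of a video denoising loss and an action prediction loss.

    action_weight is a hypothetical balancing coefficient.
    """
    video_loss = mse(pred_noise, true_noise)
    action_loss = mse(pred_actions, true_actions)
    return video_loss + action_weight * action_loss


if __name__ == "__main__":
    # Toy values standing in for a frame latent, sampled noise, and model outputs.
    latent = [0.5, -0.2, 0.1, 0.9]
    noise = [0.3, -0.1, 0.4, 0.0]
    noisy_latent = add_noise(latent, noise, alpha=0.8)

    pred_noise = [0.2, 0.0, 0.5, 0.1]   # what a denoiser might output
    pred_actions = [0.10, 0.95]         # e.g. steering, speed (hypothetical)
    true_actions = [0.12, 1.00]

    print(noisy_latent)
    print(joint_loss(pred_noise, noise, pred_actions, true_actions, action_weight=0.5))
```

Because both terms feed a single scalar, one optimizer step updates both branches together, which is the basic mechanism behind training the two modules jointly. How the real model weights or couples the two terms is not stated in this summary.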