Overview
BEVControl introduces a two-stage generative method that separates geometric control from appearance by using Bird's-Eye View (BEV) sketch layouts. It aims to produce realistic street-view images with consistent foreground and background content, and it supports human-editable sketch input. The method also proposes a multi-level evaluation protocol to fairly assess geometric fidelity at the scene, object, and background levels.
Why it matters
This work offers fine-grained control over image synthesis for autonomous driving and scene understanding. By conditioning on BEV layouts, it lets downstream perception models (e.g., for detection or segmentation) be trained on data that is both controllable and consistent. It reports improved performance over BEVGen, especially in foreground object consistency.
Key trade-offs / limitations
- Mostly suited to street and road scenarios; not tested on indoor or unstructured environments.
- High-fidelity detail degrades in more complex or far-field parts of a scene.
- Sketch-style input is flexible but may still require nontrivial editing by human users.