Object scene flow for autonomous vehicles

Menze, Moritz; Geiger, Andreas · 2015 · OpenAlex-citations

DOI: 10.1109/cvpr.2015.7298925

archive: archived · pipeline: cataloged · verified

Summary

This paper addresses dense 3D scene flow estimation for autonomous driving, a task central to scene understanding and obstacle avoidance. The authors identify a gap in existing research: earlier methods were mostly evaluated on synthetic data or static scenes, and no realistic benchmark offered ground truth for dynamically moving objects. To close this gap, the paper proposes an "Object Scene Flow" model that exploits a structural prior, namely that outdoor scenes decompose into a small number of rigidly moving objects plus a background. The authors also introduce a new dataset derived from the KITTI raw data collection to enable quantitative evaluation.

The method models the 3D structure of the scene as a collection of planar superpixels and its motion as a set of rigidly moving objects. This representation shrinks the parameter space: each superpixel needs only four parameters (three for plane geometry, one for the object index), and each object needs a few parameters for its rigid motion. Estimation is formulated as a discrete-continuous Conditional Random Field (CRF) whose data term decomposes into pairwise potentials between superpixels and objects. The data term combines stereo, optical flow, and cross-term matching costs computed from dense Census descriptors and sparse feature correspondences. The smoothness term encourages coherence in depth, orientation, and motion, with weights that account for 3D discontinuities. Optimization uses max-product particle belief propagation.

To evaluate the model, the authors created a dataset of 400 dynamic scenes (200 training, 200 test) from KITTI. Ground truth was generated by correcting laser scans for rolling shutter effects, removing dynamic objects to recover the static background, and fitting detailed 3D CAD models to the vehicles in motion. This process yielded semi-dense disparity and optical flow ground truth.
The authors also define an evaluation metric that jointly considers depth and motion errors and accounts for annotation inaccuracies at image boundaries. Experiments show that the Object Scene Flow model significantly outperforms state-of-the-art baselines, including variational approaches, sparse methods, and piece-wise rigid scene flow models. On the proposed dataset, the method achieves lower errors than its competitors in 2D flow, disparity, and combined scene flow. Ablation studies support the object decomposition assumption: performance degrades as the number of allowed object hypotheses decreases. The results indicate that explicitly modeling objects improves robustness in textureless or ambiguous regions and yields a segmentation of the scene into dynamic components as a by-product. The paper concludes that this approach offers a more realistic and effective solution for scene flow estimation in autonomous driving.
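To make the representation in the summary concrete, the following is a minimal sketch of how one planar superpixel and one rigid object motion together determine disparity and optical flow at a pixel. It is an illustration under stated assumptions, not the authors' implementation: the function name, the plane parameterization (a scaled normal n with n·X = 1 for points X on the plane), the pinhole intrinsics K, and the stereo baseline value are all hypothetical choices made for this example.

```python
import numpy as np

def scene_flow_from_plane_and_motion(pixel, normal, R, t, K, baseline):
    """Disparity and optical flow implied by a plane and a rigid motion.

    Assumptions (hypothetical, for illustration only):
      - pinhole camera with intrinsics K, rectified stereo with given baseline
      - plane given by a scaled normal with normal . X = 1 for points X on it
      - the object moves rigidly between frames: X1 = R @ X0 + t
    """
    u, v = pixel
    # Back-project the pixel to a viewing ray with unit z component.
    ray = np.linalg.solve(K, np.array([u, v, 1.0]))
    # Intersect the ray with the plane: normal . (depth * ray) = 1.
    depth = 1.0 / float(normal @ ray)
    X0 = depth * ray
    # Apply the rigid motion of the object this superpixel belongs to.
    X1 = R @ X0 + t
    # Project the moved point back into the reference camera.
    proj = K @ X1
    p1 = proj[:2] / proj[2]
    focal = K[0, 0]
    return {
        "disparity_t0": focal * baseline / X0[2],
        "disparity_t1": focal * baseline / X1[2],
        "flow": p1 - np.array([u, v], dtype=float),
    }

# Example with made-up numbers: a fronto-parallel plane 10 m away,
# translated 1 m sideways between frames.
K = np.array([[700.0, 0.0, 600.0],
              [0.0, 700.0, 180.0],
              [0.0, 0.0, 1.0]])
result = scene_flow_from_plane_and_motion(
    pixel=(600.0, 180.0),
    normal=np.array([0.0, 0.0, 0.1]),
    R=np.eye(3),
    t=np.array([1.0, 0.0, 0.0]),
    K=K,
    baseline=0.5,
)
print(result)
```

In this sketch every pixel of a superpixel shares the same plane and the same object motion, which is why the summary can describe the per-superpixel state with only four numbers: three for the plane and one index selecting the object.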

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.

Stage	Outcome	Tool	Model	Prompt	Attempts	Completed
discover	success	OpenAlex-citations	—	—	1	2026-06-25
archive	success	semantic_scholar	—	—	6	2026-06-26
extract	success	cached	—	—	2	2026-06-26
clean	success	clean	—	—	1	2026-06-25
chunk	success	chunk	—	—	1	2026-06-25
embed	success	embed	Qwen/Qwen3-Embedding-8B	—	1	2026-06-25
promote	success	—	—	—	1	2026-06-25
summarize	success	llm	qwen3.6-27b-prismaquant	summ-v5	1	2026-06-26
tag	success	vector_similarity	—	—	6	2026-06-25
verify	success	—	—	—	1	2026-06-26

Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.

Topics

Ranked by relevance to this paper.

situational awareness