Robust Multimodal and Multi-Object Tracking for Autonomous Driving Applications
DOI: 10.1109/icar58858.2023.10406433
archive: archived; pipeline: cataloged; verified
Summary
This paper presents a robust Multi-Object Tracking (MOT) method for autonomous driving applications that addresses the challenges of integrating unsynchronized multimodal sensor data. The primary motivation is the need for accurate environmental perception in Advanced Driver-Assistance Systems (ADAS), where existing tracking-by-detection methods often fail to handle the localization errors, object misclassifications, and partial bounding-box detections inherent in real-world sensor data. The authors propose a pipeline that fuses detections from cameras, radars, and lidars without requiring temporal synchronization, using a Kalman filter and dedicated tracklet management logic to improve robustness.
The methodology processes raw sensor data through modality-specific detectors to generate 3D detections, which are then fused in the MOT module. A key innovation is a model of the location error in camera-based depth estimation, which lets the system account for frontal positioning errors that increase with distance. The system handles misclassifications by associating detections from similar classes (identified via confusion matrices) and using temporal consistency to determine the correct class. To address partial bounding boxes, particularly from radars and lidars, the method reconstructs complete boxes by assuming the closest detected point remains fixed while resizing dimensions to meet class-specific minimums. The tracking state includes position, velocity, and acceleration, with acceleration inferred by filtering velocity differences over time. Data association uses the Mahalanobis distance to account for detection covariance, and tracklets are managed based on confidence thresholds and time-outs.
Experimental validation was conducted in two real-world scenarios: a traffic jam chauffeur function on proving grounds and highway traffic monitoring. In the traffic jam tests, the system used a front radar and camera, and its results were compared against the CBMOT baseline. The proposed method achieved lower mean absolute errors in position and velocity across five scenarios, including rainy conditions where camera performance degraded. The fusion output was smoother and more accurate than that of the individual sensors or of CBMOT, which failed to account for sensor-specific errors. In the highway monitoring experiment, which used cameras, radars, and two lidars, the system successfully tracked vehicles in a truck platoon scenario. Qualitative results showed that the method reconstructed complete bounding boxes and corrected misclassifications (e.g., trucks vs. buses), whereas CBMOT produced erroneous tracklets and higher positional errors because it cannot handle partial detections and sensor noise.
The significance of this work lies in providing a generic, robust MOT solution that improves upon existing baselines by explicitly handling sensor imperfections. By incorporating acceleration estimation and correcting for partial detections and misclassifications, the method enhances the reliability of perception systems for critical ADAS functions. The approach’s independence from sensor synchronization and its adaptability to various sensor configurations make it suitable for diverse real-world driving applications and for deployment in challenging environments.
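To illustrate why a covariance-aware distance matters for the association step summarized above, here is a minimal Python sketch. It is not the authors' implementation. The linear growth of the camera's longitudinal error with range, every numeric value, and the names `camera_covariance`, `mahalanobis_sq`, and `GATE_2D` are assumptions chosen for illustration; the summary only states that camera depth error increases with distance and that the Mahalanobis distance is used for data association.

```python
# Hypothetical sketch: Mahalanobis gating with a range-dependent camera covariance.
# The error model and every number below are illustrative assumptions.

def camera_covariance(range_m, sigma_lat=0.3, sigma0=0.5, growth=0.05):
    """Assumed camera model: longitudinal std grows linearly with range (metres)."""
    sigma_long = sigma0 + growth * range_m
    return [[sigma_long ** 2, 0.0], [0.0, sigma_lat ** 2]]

def mahalanobis_sq(track_pos, track_cov, det_pos, det_cov):
    """Squared Mahalanobis distance between a predicted track position and a
    detection in 2D, using the summed covariance S = track_cov + det_cov."""
    dx = det_pos[0] - track_pos[0]
    dy = det_pos[1] - track_pos[1]
    a = track_cov[0][0] + det_cov[0][0]
    b = track_cov[0][1] + det_cov[0][1]
    c = track_cov[1][0] + det_cov[1][0]
    d = track_cov[1][1] + det_cov[1][1]
    det = a * d - b * c
    # Closed-form 2x2 inverse: S^-1 = (1 / det) * [[d, -b], [-c, a]]
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det

GATE_2D = 5.991  # chi-square 95% quantile for 2 degrees of freedom

track_pos = [80.0, 0.0]                # predicted position (longitudinal, lateral)
track_cov = [[0.25, 0.0], [0.0, 0.25]]
det_pos = [84.0, 0.2]                  # camera detection 4 m further away

# With the range-dependent camera covariance, the 4 m offset is plausible.
d2_camera = mahalanobis_sq(track_pos, track_cov, det_pos,
                           camera_covariance(det_pos[0]))  # about 0.83

# With a fixed, tight covariance the same detection falls outside the gate.
d2_fixed = mahalanobis_sq(track_pos, track_cov, det_pos,
                          [[0.25, 0.0], [0.0, 0.25]])      # about 32.08

accepted_camera = d2_camera <= GATE_2D  # True
accepted_fixed = d2_fixed <= GATE_2D    # False
```

Under these assumed values, the distant camera detection associates with the track only when its larger longitudinal uncertainty is modeled, which is the kind of sensor-specific error handling the summary credits for the improvement over the baseline.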
Provenance
The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.
| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | Crossref | — | — | 1 | 2026-06-18 |
| archive | success | unpaywall | — | — | 2 | 2026-06-25 |
| extract | success | cached | — | — | 2 | 2026-06-26 |
| clean | success | clean | — | — | 1 | 2026-06-20 |
| chunk | success | chunk | — | — | 1 | 2026-06-20 |
| embed | success | embed | Qwen/Qwen3-Embedding-8B | — | 1 | 2026-06-20 |
| enrich | success | openalex | — | — | 1 | 2026-06-20 |
| promote | success | — | — | — | 1 | 2026-06-18 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 1 | 2026-06-26 |
| tag | success | vector_similarity | — | — | 6 | 2026-06-20 |
| verify | success | — | — | — | 1 | 2026-06-26 |
Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.
Topics
Ranked by relevance to this paper.