Building a Vision-Based Mixed-Reality Framework for Autonomous Driving Navigation
DOI: 10.1109/codit58514.2023.10284251
archive: archived pipeline: cataloged verified
Summary
This paper addresses the challenge of validating autonomous vehicle (AV) systems, specifically the "reality gap" that appears when models are transferred from simulation to real-world hardware. Simulation allows safe testing of critical scenarios but lacks the fidelity of real-world conditions, whereas real-world testing is costly, risky, and requires hundreds of millions of miles to prove reliability. The authors propose a vision-based mixed-reality (MR) framework that combines real and virtual environments to enable safer, more efficient testing. The study focuses on the perception block of this framework: it augments an agent's visual input by integrating virtual objects into real-world scenes using depth information from RGB-D cameras, thereby avoiding the high costs and interference issues associated with LiDAR.
The methodology is a three-step image fusion process that combines a real image ($I_1$) with a virtual image ($I_2$) into a final augmented image ($I_{final}$). First, histogram equalization normalizes the pixel intensity distributions of the real and virtual depth maps so that they are comparable. Second, Otsu thresholding is applied to the virtual image to segment objects of interest from the background. Third, a depth comparison algorithm determines the spatial positioning of objects and handles occlusions by retaining, at each coordinate, the pixel from whichever image (real or virtual) is closer.
The system was implemented with the ZED2 stereo camera for real-world data and tested against virtual environments generated by the Gazebo and Unity3D simulators. Experiments used the KITTI dataset and real-time data from the ZED2 sensor. The fusion algorithm integrated virtual elements such as pedestrians, robots, and stop signs into real-world images while correctly managing occlusions. For instance, in tests involving overlapping objects, the algorithm placed a closer virtual robot in front of a real robot based on depth data. To evaluate the augmented scenes, the authors applied a pre-trained Faster R-CNN object detection model. The detector identified both real and virtual objects in the fused images, which the authors take as confirmation that the augmented scenes provide valid perceptual information for AV agents.
The significance of this work lies in offering an alternative to both pure simulation and expensive real-world testing for AV development. By demonstrating that vision-based MR can merge virtual and physical data, the paper supports a method for training and testing autonomous systems in safer, controlled conditions. The authors conclude that the approach shows promise but that future work must address limitations in depth map accuracy, particularly when cameras are positioned close to the ground. Long-term goals include integrating real robots to evaluate whether MR training reduces collision rates and improves overall system performance.
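The three-step fusion described in the summary above can be sketched roughly as follows. This is an illustrative NumPy sketch, not the authors' implementation: the function names (`equalize_hist`, `otsu_threshold`, `composite_by_depth`, `fuse`) are hypothetical, and it assumes 8-bit depth maps in which a smaller value means closer to the camera, a virtual background that is rendered far away, and Otsu thresholding applied to the equalized virtual depth map. The summary does not specify any of these details.

```python
import numpy as np

def equalize_hist(depth: np.ndarray) -> np.ndarray:
    """Step 1: histogram-equalize an 8-bit single-channel depth map."""
    hist = np.bincount(depth.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # CDF value at the first occupied bin
    denom = cdf[-1] - cdf_min
    if denom == 0:                     # constant image: nothing to spread
        return depth.copy()
    lut = np.round((cdf - cdf_min) / denom * 255).clip(0, 255).astype(np.uint8)
    return lut[depth]

def otsu_threshold(img: np.ndarray) -> int:
    """Step 2: Otsu's threshold, maximizing between-class variance."""
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)
    w0 = np.cumsum(hist)               # pixel count with value <= t
    w1 = hist.sum() - w0               # pixel count with value > t
    s0 = np.cumsum(hist * levels)
    with np.errstate(divide="ignore", invalid="ignore"):
        mu0 = s0 / w0
        mu1 = (s0[-1] - s0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
    return int(np.argmax(np.nan_to_num(between)))

def composite_by_depth(real_rgb, real_depth, virt_rgb, virt_depth, virt_mask):
    """Step 3: keep the virtual pixel only where it is an object and closer."""
    use_virtual = virt_mask & (virt_depth < real_depth)
    return np.where(use_virtual[..., None], virt_rgb, real_rgb)

def fuse(real_rgb, real_depth, virt_rgb, virt_depth):
    """Chain the three steps to produce the augmented image."""
    real_eq = equalize_hist(real_depth)
    virt_eq = equalize_hist(virt_depth)
    # Assumption: virtual objects are nearer than the virtual background,
    # so they fall at or below the Otsu threshold.
    virt_mask = virt_eq <= otsu_threshold(virt_eq)
    return composite_by_depth(real_rgb, real_eq, virt_rgb, virt_eq, virt_mask)
```

In this sketch each depth map is equalized independently, so the comparison in the third step is only meaningful when both maps cover a comparable depth range. Sensor-specific handling, such as invalid or missing depth values from a stereo camera, is left out.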
Provenance
The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.
| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | Crossref | — | — | 1 | 2026-06-20 |
| archive | success | unpaywall | — | — | 2 | 2026-06-26 |
| extract | success | cached | — | — | 2 | 2026-06-26 |
| clean | success | clean | — | — | 1 | 2026-06-20 |
| chunk | success | chunk | — | — | 1 | 2026-06-20 |
| embed | success | embed | Qwen/Qwen3-Embedding-8B | — | 1 | 2026-06-20 |
| enrich | success | openalex | — | — | 1 | 2026-06-20 |
| promote | success | — | — | — | 1 | 2026-06-20 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 1 | 2026-06-26 |
| tag | success | vector_similarity | — | — | 6 | 2026-06-20 |
| verify | success | — | — | — | 1 | 2026-06-26 |
Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.
Information type
What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).
- Methodological Resource: tool software