Data-Driven Traffic Simulation: A Comprehensive Review

Chen, Di; Zhu, Meixin; Yang, Hao; Wang, Xuesong; Wang, Yinhai · 2024 · OpenAlex-citations

DOI: 10.1109/tiv.2024.3367919

Archive: archived · Pipeline: cataloged, verified

Summary

This paper presents a comprehensive review of data-driven microscopic traffic simulation, motivated by the challenge of validating autonomous vehicle (AV) algorithms efficiently and safely. On-road tests, track tests, and driving simulators are the standard validation methods, but they suffer from high costs, poor scalability, and an inability to replicate the unpredictable nature of real-world driving. Data-driven simulation addresses these limits by enabling large-scale testing with realistic, reactive background traffic. The authors identify a gap in the existing literature: no prior review sufficiently covers the scope and depth of data-driven methods, even though the field has shifted toward them and away from rule-based models because of the latter’s limited accuracy, poor generalization, and reliance on expert knowledge.

The review is structured around the traffic simulation framework: input data, core modeling, and output evaluation. It details problem formulations using Markov Decision Processes (MDPs) and non-MDP approaches. The authors analyze input modalities (camera, LiDAR, radar, GNSS, HD maps) and compare datasets by view type: Field of View (FOV), which provides immersive, egocentric perspectives, and Bird’s Eye View (BEV), which offers precise semantic localization. Context representation methods are categorized into rasterized, vectorized, and graph-based approaches, each with distinct trade-offs in computational efficiency and geometric precision. Agent modeling is examined through physics-based, statistical, and learning-based methods; learning-based approaches use neural networks for adaptability at a higher computational cost. Interaction modeling is divided into implicit methods, which use latent variables for adaptability, and explicit methods, which offer better interpretability through defined relationships such as pass-and-yield.
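
The contrast between rasterized and vectorized context representations described above can be illustrated with a minimal Python sketch. It is not taken from the reviewed paper: the function names `vectorize` and `rasterize`, the grid size, the cell size, and the sampling density are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[float, float, float, float]


def vectorize(polyline: List[Point]) -> List[Segment]:
    """Vectorized form: one (x0, y0, x1, y1) segment per consecutive point pair.

    Coordinates are kept exactly as given, so no geometric precision is lost.
    """
    return [(x0, y0, x1, y1) for (x0, y0), (x1, y1) in zip(polyline, polyline[1:])]


def rasterize(polyline: List[Point], size: int = 8, cell: float = 1.0,
              samples_per_segment: int = 20) -> List[List[int]]:
    """Rasterized form: a size x size occupancy grid (hypothetical resolution).

    A cell is set to 1 if any sampled point of the polyline falls inside it,
    so geometry is quantized to the cell size.
    """
    grid = [[0] * size for _ in range(size)]
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        for i in range(samples_per_segment + 1):
            t = i / samples_per_segment
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            row, col = int(y // cell), int(x // cell)
            if 0 <= row < size and 0 <= col < size:
                grid[row][col] = 1
    return grid


# Illustrative lane centerline with one right-angle turn (made-up coordinates).
lane = [(0.5, 0.5), (3.5, 0.5), (3.5, 2.5)]
segments = vectorize(lane)
grid = rasterize(lane)
occupied_cells = sum(sum(row) for row in grid)
```

In this toy case the vectorized form stores two exact segments, while the raster stores a fixed 8 x 8 grid whose precision is bounded by the cell size. That is one simple way to see the efficiency versus precision trade-off the summary refers to.
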
The paper evaluates prevalent learning models, including imitation learning, reinforcement learning, deep generative models, and deep learning, summarizing their advantages and limitations. It also reviews the evaluation metrics used to assess simulation performance, categorized into realism (reconstruction ability), reactivity (safe response to dynamic environments), and diversity (coverage of agent policies). The authors distinguish between open-loop evaluation, where predictions do not drive the system forward, and closed-loop evaluation, where they do, which tests long-term stability.

The significance of this work lies in its systematic organization of the rapidly evolving field of data-driven traffic simulation. By critically analyzing current methodologies, datasets, and evaluation metrics, the paper serves as a foundational reference for researchers. It highlights open challenges, such as the lack of transparency in learning-based models and the difficulty of simulating long-horizon interactions, and outlines future research directions. The review aims to guide the development of more realistic, reactive, and diverse simulation environments, thereby accelerating the safe deployment of autonomous vehicles.
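
The open-loop versus closed-loop distinction in the summary can be made concrete with a small Python sketch. Everything in it is an assumption for illustration and does not come from the reviewed paper: `policy` is a hypothetical stand-in for a learned model (a constant-velocity predictor with a small systematic bias), and `log` is a made-up one-dimensional trajectory.

```python
from typing import List


def policy(position: float, velocity: float, dt: float = 1.0, bias: float = 0.1) -> float:
    """Hypothetical stand-in for a learned model: predicts the next position.

    The bias term mimics a small, systematic modeling error.
    """
    return position + (velocity + bias) * dt


def open_loop_errors(log: List[float], velocity: float) -> List[float]:
    """Open loop: every prediction starts from the logged (ground-truth) state,
    so predictions never drive the system forward and errors do not compound."""
    return [abs(policy(log[t], velocity) - log[t + 1]) for t in range(len(log) - 1)]


def closed_loop_errors(log: List[float], velocity: float) -> List[float]:
    """Closed loop: the model's own prediction becomes the next input state,
    so a small per-step error can accumulate over the horizon."""
    errors = []
    state = log[0]
    for t in range(len(log) - 1):
        state = policy(state, velocity)
        errors.append(abs(state - log[t + 1]))
    return errors


# Made-up logged positions of a vehicle moving at 1.0 units per step.
log = [0.0, 1.0, 2.0, 3.0, 4.0]
open_errs = open_loop_errors(log, velocity=1.0)
closed_errs = closed_loop_errors(log, velocity=1.0)
```

With this toy bias, the open-loop error stays near 0.1 at every step, while the closed-loop error grows with each step of the rollout. This is why closed-loop evaluation is the one that exposes long-term stability problems.
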

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.

Stage	Outcome	Tool	Model	Prompt	Attempts	Completed
discover	success	OpenAlex-citations	—	—	1	2026-06-18
archive	success	semantic_scholar	—	—	6	2026-06-25
extract	success	cached	—	—	2	2026-06-26
clean	success	clean	—	—	1	2026-06-18
chunk	success	chunk	—	—	1	2026-06-18
embed	success	embed	Qwen/Qwen3-Embedding-8B	—	1	2026-06-18
promote	success	—	—	—	1	2026-06-18
summarize	success	llm	qwen3.6-27b-prismaquant	summ-v5	1	2026-06-26
tag	success	vector_similarity	—	—	6	2026-06-18
verify	success	—	—	—	1	2026-06-26

Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.

Topics

Ranked by relevance to this paper.

driverless ads

Information type

What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).

Methodological Resource: tool software