Imitating driver behavior with generative adversarial networks

Kuefler, Alex; Morton, Jeremy; Wheeler, Tim A.; Kochenderfer, Mykel J. · 2017 · OpenAlex-citations

DOI: 10.1109/ivs.2017.7995721

Summary

This paper addresses the challenge of accurately simulating human driving behavior for intelligent transportation systems, focusing on the "cascading errors" inherent in traditional Behavioral Cloning (BC). BC methods often fail in long-horizon simulations because small prediction inaccuracies compound, leading the model into states underrepresented in the training data, such as off-road scenarios. To resolve this, the authors apply Generative Adversarial Imitation Learning (GAIL), which trains a policy to mimic expert behavior by deceiving a discriminator, thereby generalizing better to unseen states without requiring an explicit reward function. The study extends GAIL to recurrent neural networks, specifically Gated Recurrent Units (GRUs), to handle the partial observability and temporal dependencies of driving.

The authors compare four neural network policies (GAIL and BC, each trained with both GRU and Multilayer Perceptron (MLP) architectures) against three baselines: a static Gaussian model, a Mixture Regression BC model, and a rule-based controller combining the Intelligent Driver Model (IDM) and MOBIL. The experiments use the Next-Generation Simulation (NGSIM) dataset, comprising real-world trajectories from US Highway 101 and Interstate 80. The input features include vehicle odometry, lane-relative states, and LIDAR-like beams measuring distance and range rate to surrounding vehicles. Policies were optimized using Trust Region Policy Optimization (TRPO) and evaluated in a simulation environment where non-ego vehicles followed recorded trajectories.

The results demonstrate that GAIL-trained models, particularly the GAIL GRU, outperform BC and baseline models in realistic highway simulations. While BC models exhibited superior short-horizon accuracy, they accumulated significant error over time, leading to higher rates of collisions and off-road driving. In contrast, GAIL policies maintained stable trajectories and realistic control over long time horizons. Quantitative metrics showed that GAIL GRU achieved lower Kullback-Leibler divergence for speed, acceleration, and inverse time-to-collision than the other models, indicating better distribution matching. Furthermore, GAIL policies reproduced emergent human behaviors, such as lane change rates, while significantly reducing undesirable outcomes like hard braking and collisions compared to BC approaches.

The significance of this work lies in demonstrating that GAIL can learn robust driving policies that generalize to unseen states, mitigating the compounding errors that plague supervised learning approaches. By extending GAIL to recurrent architectures, the authors show that models can capture the stochastic and temporal nature of human driving more faithfully than feedforward networks. This approach provides a viable method for creating realistic driver simulators, which are essential for testing autonomous vehicle systems and advancing automotive safety research, and it offers a balance between the generalization capabilities of reinforcement learning and the data-driven efficiency of imitation learning.
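
The Kullback-Leibler comparison mentioned in the summary can be illustrated with a small sketch. It assumes the distributions are estimated as histograms over a fixed range, which may differ from the estimator the authors used. The function names (`histogram`, `kl_divergence`, `speed_kl`), the bin count, the speed range, and the smoothing constant are all hypothetical choices made for illustration, not values taken from the paper.

```python
import math

def histogram(samples, lo, hi, n_bins, eps=1e-6):
    """Return a smoothed, normalized histogram of samples over [lo, hi]."""
    counts = [0.0] * n_bins
    width = (hi - lo) / n_bins
    for x in samples:
        # Clamp out-of-range samples into the first or last bin.
        idx = int((x - lo) / width)
        idx = min(max(idx, 0), n_bins - 1)
        counts[idx] += 1.0
    # Additive smoothing (an assumption) keeps every bin strictly
    # positive so the logarithm below is always defined.
    total = sum(counts) + eps * n_bins
    return [(c + eps) / total for c in counts]

def kl_divergence(p, q):
    """D_KL(p || q) for two discrete distributions with the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def speed_kl(expert_speeds, policy_speeds, lo=0.0, hi=40.0, n_bins=20):
    """KL divergence from the policy's speed histogram to the expert's."""
    p = histogram(expert_speeds, lo, hi, n_bins)
    q = histogram(policy_speeds, lo, hi, n_bins)
    return kl_divergence(p, q)

if __name__ == "__main__":
    # Made-up speeds in m/s, purely illustrative (not NGSIM data).
    expert = [12.0, 13.5, 14.0, 15.2, 13.8, 14.4]
    close_policy = [12.4, 13.1, 14.3, 15.0, 13.6, 14.9]
    drifting_policy = [2.0, 3.5, 30.0, 35.0, 1.0, 38.0]
    print("close policy:   ", speed_kl(expert, close_policy))
    print("drifting policy:", speed_kl(expert, drifting_policy))
```

A lower value means the policy's rollouts produce a speed distribution closer to the human data. The same computation would apply to acceleration or inverse time-to-collision by swapping in the corresponding samples and range.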

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.

Stage	Outcome	Tool	Model	Prompt	Attempts	Completed
discover	success	OpenAlex-citations	—	—	1	2026-06-18
archive	success	semantic_scholar	—	—	6	2026-06-25
extract	success	cached	—	—	2	2026-06-26
clean	success	clean	—	—	1	2026-06-18
chunk	success	chunk	—	—	1	2026-06-18
embed	success	embed	Qwen/Qwen3-Embedding-8B	—	1	2026-06-18
promote	success	—	—	—	1	2026-06-18
summarize	success	llm	qwen3.6-27b-prismaquant	summ-v5	1	2026-06-26
tag	success	vector_similarity	—	—	6	2026-06-18
verify	success	—	—	—	1	2026-06-26

Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.

Topics

Ranked by relevance to this paper.

mental model of traffic

Information type

What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).

Methodological Resource: tool software
Theoretical Contribution: computational model