Modeling the Effects of Autonomous Vehicles on Human Driver Car-Following Behaviors Using Inverse Reinforcement Learning

Wen, Xiao; Jian, Sisi; He, Dengbo · 2023 · IEEE Transactions on Intelligent Transportation Systems

DOI: 10.1109/tits.2023.3298150

archive: archived · pipeline: cataloged · verified


Summary

This study addresses the challenge of modeling human driver behavior during the transition period in which human-driven vehicles (HVs) share roads with autonomous vehicles (AVs). Previous research relied on traffic simulations or field experiments, which often simplified complex interactions and yielded biased results due to a lack of empirical data. To overcome these limitations, the authors use the high-resolution (10 Hz) Waymo Open Dataset to extract real-world car-following events, comparing HVs following AVs against HVs following other HVs. The primary objective is to model these microscopic interactions realistically and to understand how human drivers adapt their longitudinal control strategies when interacting with AVs.

The methodology employs Inverse Reinforcement Learning (IRL) to infer the underlying reward functions that drive human behavior, rather than merely mimicking trajectories. The authors propose an Inverse soft-Q Learning (IQ-Learn) algorithm combined with a Deep Reinforcement Learning (DRL) approach known as Soft Actor-Critic (SAC). Unlike adversarial methods such as GAIL or AIRL, which are sensitive to hyperparameters, IQ-Learn approximates a single Q-function that represents both reward and policy, converting the problem into a simpler minimization task. The car-following problem is formulated as a Markov Decision Process in which the state includes vehicle speed, inter-vehicle spacing, and relative speed, while the action is longitudinal acceleration. The proposed model is benchmarked against a conventional physics-based model, the Intelligent Driver Model (IDM), and other data-driven approaches (LSTM, GAIL, and AIRL).

The results demonstrate significant differences in human driving behavior depending on the leading vehicle type. Statistical tests on calibrated IDM parameters reveal that HVs exhibit distinct car-following characteristics when following AVs than when following other HVs. Furthermore, the proposed IQ-Learn with SAC model achieves significantly more accurate trajectory predictions than the benchmark models. Crucially, the recovered reward functions indicate that human drivers have different preferences and strategic goals when following AVs, suggesting that AVs fundamentally alter the decision-making landscape for surrounding human drivers.

The significance of this work lies in its ability to capture the nuanced behavioral adaptations of human drivers in mixed-autonomy traffic using real-world data. By recovering the specific reward functions associated with HV-following-AV interactions, the study provides a deeper understanding of driver preferences that physics-based models cannot capture. These insights are vital for improving the performance of AV controllers, as understanding human reward structures allows AVs to predict and react to human behavior more effectively, thereby enhancing overall traffic safety and efficiency during the transition to full automation.
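
To make the car-following formulation in the summary concrete, the following is a minimal Python sketch, not the authors' code. It pairs the standard Intelligent Driver Model acceleration equation with a one-step kinematic update of the (speed, spacing, relative speed) state at the dataset's 10 Hz rate. The parameter values, the sign convention for relative speed (leader minus follower), the clipping of the desired gap, and the names `IDMParams`, `idm_acceleration`, and `step` are all assumptions made for illustration; the paper's calibrated parameters are not reproduced here.

```python
import math
from dataclasses import dataclass

DT = 0.1  # seconds per step, matching the 10 Hz sampling rate


@dataclass
class IDMParams:
    # Illustrative defaults only; NOT the calibrated values from the paper.
    v0: float = 30.0    # desired speed (m/s)
    T: float = 1.5      # desired time headway (s)
    s0: float = 2.0     # minimum standstill gap (m)
    a_max: float = 1.0  # maximum acceleration (m/s^2)
    b: float = 2.0      # comfortable deceleration (m/s^2)
    delta: float = 4.0  # acceleration exponent


def idm_acceleration(v, spacing, rel_speed, p):
    """Standard IDM acceleration of the follower.

    rel_speed is assumed to be (leader speed - follower speed),
    so a negative value means the follower is closing in.
    """
    closing = -rel_speed  # follower speed minus leader speed
    # Desired dynamic gap; the max(0, ...) clip is a common safeguard,
    # assumed here, that keeps the gap from dropping below s0.
    s_star = p.s0 + max(
        0.0, v * p.T + v * closing / (2.0 * math.sqrt(p.a_max * p.b))
    )
    return p.a_max * (1.0 - (v / p.v0) ** p.delta - (s_star / spacing) ** 2)


def step(state, accel, lead_speed_next, dt=DT):
    """Advance the MDP state (speed, spacing, relative speed) by one step.

    The action is the follower's longitudinal acceleration; the leader's
    next speed comes from recorded data or a simulator.
    """
    v, spacing, rel_speed = state
    v_next = max(0.0, v + accel * dt)  # no reversing
    rel_next = lead_speed_next - v_next
    # Trapezoidal integration of relative speed gives the change in spacing.
    spacing_next = spacing + 0.5 * (rel_speed + rel_next) * dt
    return (v_next, spacing_next, rel_next)


if __name__ == "__main__":
    params = IDMParams()
    state = (20.0, 25.0, -2.0)  # follower at 20 m/s, 25 m gap, closing at 2 m/s
    a = idm_acceleration(*state, params)
    print(a, step(state, a, lead_speed_next=18.0))
```

In this framing, a learned policy such as the IQ-Learn with SAC model described above would stand in for `idm_acceleration` as the mapping from state to acceleration, while `step` would stay the same.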

Key finding

Human drivers exhibit significantly different car-following behaviors and reward function preferences when following autonomous vehicles compared to human-driven vehicles, as demonstrated by calibrated model parameters and superior trajectory predictions using the proposed inverse reinforcement learning approach.

Methodology

modeling

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.

Stage	Outcome	Tool	Model	Prompt	Attempts	Completed
discover	success	—	—	—	1	2026-05-28
archive	success	canonical_url	—	—	1	2026-06-06
extract	success	cached	—	—	3	2026-06-10
clean	success	clean	—	—	1	2026-06-07
chunk	success	chunk	—	—	1	2026-06-07
embed	success	embed	Qwen/Qwen3-Embedding-8B	—	1	2026-06-07
enrich	success	semantic_scholar	—	—	4	2026-06-15
promote	success	—	—	—	1	2026-06-04
summarize	success	llm	qwen3.6-27b-prismaquant	summ-v5	2	2026-06-10
tag	success	vector_similarity	—	—	15	2026-06-11
verify	success	—	—	—	2	2026-06-10

Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified.

Topics

Ranked by relevance to this paper.

following distance

Information type

What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).

Methodological Resource: tool software
Theoretical Contribution: computational model