Towards Driver Behavior Understanding: Weakly-Supervised Risk Perception in Driving Scenes

Agarwal, Nakul; Chen, Yi-Ting; Dariush, Behzad · 2026 · arXiv

archive: archived · pipeline: cataloged · verified

Summary

This paper addresses the challenge of modeling driver risk perception, a critical component for developing intelligent vehicle systems capable of achieving zero-collision mobility. While existing methods often define risk through collision prediction, they fail to capture the complex cognitive processes drivers use to interpret surrounding agents. The authors argue that prior datasets lack diversity and key behavioral cues, particularly pedestrian attentiveness, which is essential for understanding non-verbal communication and mutual intention between drivers and pedestrians. To bridge this gap, the paper introduces RAID (Risk Assessment In Driving scenes), a large-scale dataset curated specifically for research on driver-centric risk perception and contextual risk assessment.

The study utilizes the RAID dataset, comprising 4,691 annotated video clips collected in the San Francisco Bay Area using an instrumented vehicle equipped with cameras, LiDAR, and CAN bus data. The dataset features a four-layer annotation scheme covering driver action, road topology, risk situations, and driver responses, alongside detailed pedestrian attention and face annotations.

The authors propose a weakly supervised framework for risk object identification that models the causal relationship between a driver’s intended maneuver and their behavioral response (e.g., stopping or deviating). The method employs a graph convolutional network to model interactions among traffic agents and a temporal encoder-decoder LSTM structure to predict driver actions. Additionally, the paper introduces a face-based detection method to assess pedestrian attentiveness and explores its integration into joint risk assessment.

Experimental results demonstrate that the proposed method significantly outperforms prior state-of-the-art approaches. On the RAID dataset, the model achieves a 20.6% performance gain, while on the HDDS dataset, it achieves a 23.1% improvement. The analysis reveals that incorporating driver action predictions enhances both risk object identification and response prediction accuracy. Furthermore, the study shows that using face crops for pedestrian attention classification yields superior results compared to body-based methods, with an average precision of 83.76%. Qualitative analysis confirms that pedestrian attentiveness meaningfully influences risk scores; for instance, pedestrians looking toward the ego-vehicle are associated with lower risk scores than those who are not, validating the importance of joint attention in risk perception.

The significance of this work lies in its comprehensive approach to modeling human-centric risk perception by integrating driver behavior with pedestrian attentiveness. By providing the first large-scale dataset with naturalistic driving scenes that includes diverse risk situations and pedestrian attention annotations, RAID enables more realistic and task-driven research. The proposed weakly supervised framework offers a robust baseline for identifying risk sources without requiring explicit collision labels, advancing the field toward more holistic human-AI interaction models. This work highlights the value of considering cognitive factors like attentiveness in autonomous driving systems, paving the way for improved safety and better understanding of driver decision-making processes.
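
The summary mentions a graph convolutional network over traffic agents but gives no architectural details. As a rough illustration only, the sketch below shows one generic graph-convolution step using the widely used symmetric-normalization rule, ReLU(D^-1/2 (A + I) D^-1/2 X W). It is an assumption that anything like this rule applies to the paper; the toy scene, feature values, and the function name `gcn_layer` are hypothetical and are not taken from the authors' implementation.

```python
import math

def gcn_layer(adj, feats, weight):
    """One generic graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adj)
    # Add self-loops so each agent keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Symmetric degree normalization (degrees are >= 1 thanks to self-loops).
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbor features: norm @ feats.
    d_in = len(feats[0])
    agg = [[sum(norm[i][k] * feats[k][f] for k in range(n)) for f in range(d_in)]
           for i in range(n)]
    # Linear projection followed by ReLU: agg @ weight.
    d_out = len(weight[0])
    return [[max(0.0, sum(agg[i][f] * weight[f][o] for f in range(d_in)))
             for o in range(d_out)]
            for i in range(n)]

# Hypothetical toy scene: node 0 = ego vehicle, 1 = pedestrian, 2 = parked car.
adj = [[0, 1, 1],
       [1, 0, 0],
       [1, 0, 0]]
feats = [[1.0, 0.0],
         [0.0, 1.0],
         [0.5, 0.5]]
weight = [[1.0, 0.0],
          [0.0, 1.0]]  # identity projection, chosen only to keep the example readable

out = gcn_layer(adj, feats, weight)
```

After one step, each agent's row in `out` mixes its own features with those of the agents it is connected to, which is the sense in which such a layer models interactions. In practice the weights would be learned and the per-agent outputs would feed a temporal model such as the encoder-decoder LSTM described in the summary.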

Key finding

A weakly supervised model linking the driver’s intended maneuver, the driver’s response, and pedestrian attentiveness improved risk object identification by 20.6% on RAID and 23.1% on HDDS over the prior state of the art.

Methodology

naturalistic

Sample size: 4,691 annotated video clips (no human-participant sample)

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via discover_arxiv on 2026-05-04 (3 acquisition events logged).

Stage	Outcome	Tool	Model	Prompt	Attempts	Completed
discover	success	arxiv	—	—	3	2026-05-04
archive	success	—	—	—	1	2026-05-04
extract	success	cached	—	—	2	2026-06-10
clean	success	—	—	—	1	2026-06-01
chunk	success	—	—	—	1	2026-06-01
embed	success	—	—	—	1	2026-06-02
enrich	success	—	—	—	1	2026-05-04
promote	success	—	—	—	1	2026-05-04
summarize	success	llm	qwen3.6-27b-prismaquant	summ-v5	2	2026-06-10
tag	success	vector_similarity	—	—	16	2026-06-11
verify	success	—	—	—	2	2026-06-10

Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified.

Information type

What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).

Methodological Resource: tool software
Theoretical Contribution: computational model, theory or model