RAIST: Learning Risk Aware Traffic Interactions via Spatio-Temporal Graph Convolutional Networks

Suman, Videsh; Pham, Phu; Bera, Aniket · 2023 · Crossref

DOI: 10.1109/iros55552.2023.10341578

archive: archived pipeline: cataloged verified

Summary

This paper introduces RAIST, a novel framework for learning risk-aware traffic interactions using Spatio-Temporal Graph Convolutional Networks (ST-GCNs). The research addresses the challenge of enabling autonomous vehicles to make safe, tactical driving decisions by understanding complex traffic scenes and identifying potential risks, particularly involving vulnerable road users like pedestrians and cyclists. Existing methods often fail to model the temporal intentions of individual agents or lack interpretability in risk assessment. RAIST aims to bridge this gap by modeling both spatial interactions and temporal behaviors through egocentric video inputs, thereby improving the identification of causal risk objects.

The methodology constructs a dynamic traffic graph where nodes represent road agents and edges represent their interactions. The system extracts features from video sequences using Faster R-CNN for detection, Deep SORT for tracking, and depth estimation to convert 2D pixel positions into 3D coordinates. Appearance features are extracted via I3D and ROIAlign, then combined with static scene context obtained by inpainting dynamic agents from the frames. The graph edges are formulated using parameterized functions of 3D positions and scene-aware appearance features, with specific parameters learned for vulnerable versus non-vulnerable agents. An ST-GCN processes these graphs to learn spatio-temporal representations, aggregating node-wise influence across time steps.

For risk assessment, the framework employs a causal inference approach: it predicts tactical behaviors (Stop/Go) and identifies risk objects by iteratively removing agents from the graph to observe changes in the prediction, a process that requires only frame-level annotations rather than detailed object-level risk labels.

Experiments were conducted on the HDD dataset, which contains 104 hours of real-world driving data with tactical behavior and cause annotations. The results demonstrate that RAIST outperforms several baseline methods in risk object identification, achieving an accuracy of 87.9% in congestion scenarios and 41.3% in parked vehicle scenarios, surpassing previous state-of-the-art methods like Li et al. [26]. The framework showed particular improvement in identifying objects with vulnerable interactions, such as pedestrians and cyclists. The study confirms that the proposed spatio-temporal modeling effectively captures the causal influence of specific road agents on driver behavior, offering a more interpretable and accurate approach to risk assessment in autonomous driving systems.
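
The agent-removal step described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: `identify_risk_object`, `GoPredictor`, and `toy_predict_go` are hypothetical names, the toy predictor stands in for the trained ST-GCN with made-up numbers, and the selection rule (pick the agent whose removal raises the predicted Go probability the most) is an assumption about how "observing changes in the prediction" might be turned into a decision.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interface: a predictor takes the ids of the agents still
# present in the traffic graph and returns the probability of "Go".
GoPredictor = Callable[[List[str]], float]


def identify_risk_object(
    agents: List[str], predict_go: GoPredictor
) -> Tuple[Optional[str], float]:
    """Return (agent, gain) for the agent whose removal most increases P(Go).

    Returns (None, 0.0) if no single removal increases P(Go).
    """
    baseline = predict_go(list(agents))
    best_agent: Optional[str] = None
    best_gain = 0.0
    for agent in agents:
        # Intervention: re-run the prediction with this one agent removed.
        remaining = [a for a in agents if a != agent]
        gain = predict_go(remaining) - baseline
        if gain > best_gain:
            best_agent, best_gain = agent, gain
    return best_agent, best_gain


def toy_predict_go(agents: List[str]) -> float:
    """Stand-in for a trained model; the penalties below are invented."""
    penalty = {"pedestrian_1": 0.6, "car_2": 0.1, "cyclist_3": 0.2}
    p_go = 0.95
    for agent in agents:
        p_go -= penalty.get(agent, 0.0)
    return max(p_go, 0.0)


scene = ["pedestrian_1", "car_2", "cyclist_3"]
risk_agent, gain = identify_risk_object(scene, toy_predict_go)
print(risk_agent, round(gain, 2))
```

Because the loop only needs a scene-level Stop/Go prediction, a procedure of this shape is consistent with the summary's point that frame-level annotations suffice and no per-object risk labels are required.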

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.

| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | Crossref | | | 1 | 2026-06-25 |
| archive | success | semantic_scholar | | | 6 | 2026-06-26 |
| extract | success | cached | | | 2 | 2026-06-26 |
| clean | success | clean | | | 1 | 2026-06-26 |
| chunk | success | chunk | | | 1 | 2026-06-26 |
| embed | success | embed | Qwen/Qwen3-Embedding-8B | | 1 | 2026-06-26 |
| enrich | success | openalex | | | 1 | 2026-06-26 |
| promote | success | | | | 1 | 2026-06-25 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 1 | 2026-06-26 |
| tag | success | vector_similarity | | | 6 | 2026-06-26 |
| verify | success | | | | 1 | 2026-06-26 |

Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.
