Safety-Critical Learning for Long-Tail Events: The TUM Traffic Accident Dataset

Zimmer, Walter; Greer, Ross; Zhou, Xingcheng; Song, Rui; Marc, Pavel; Lehmberg, Daniel; Ghita, Ahmed; Gopalkrishnan, Akshay; Trivedi, Mohan M.; Knoll, Alois · 2025 · ArXiv.org

DOI: 10.48550/arxiv.2508.14567

archive: archived · pipeline: cataloged · verified


Summary

This paper addresses the detection of rare, safety-critical "long-tail" events, specifically high-speed traffic accidents, which are difficult to collect yet essential for robust autonomous driving systems. The authors introduce the TUM Traffic Accident (TUMTraf-A) dataset, a collection of real-world highway accident sequences recorded at the A9 Test Bed for Autonomous Driving in Munich, Germany. The dataset comprises 48,144 labeled frames captured from four roadside cameras and LiDAR sensors operating at 10 Hz. It includes 294,924 2D and 93,012 3D bounding box annotations, track IDs, and trajectory data for ten object classes, including vehicles, pedestrians, and emergency responders. The recorded scenarios include overturning vehicles, fires, and high-speed collisions, and provide ground truth for perception, tracking, and cooperative perception research.

To use this data, the authors propose Accid3nD, a hybrid accident detection framework that combines a rule-based and a learning-based approach. The rule-based component analyzes vehicle trajectories in real time and flags potential accidents using predefined thresholds. When a potential incident is flagged, a learning-based module, a YOLOv8 model trained on the TUMTraf-A dataset, performs image-based verification. To reduce false positives, detections are filtered by a confidence score threshold of 0.8 and must be confirmed across at least three consecutive frames. The system also fuses detection results from multiple camera angles to improve accuracy.

The authors evaluated the framework on 12,290 fifteen-minute video recordings processed from 128 days of data, identifying 3,748 standing vehicles in driving lanes, 138 in shoulder lanes, and 120 breakdown events. The rule-based approach achieves a processing speed of 10.41 ms per frame (95.05 FPS) on an NVIDIA RTX 3090 GPU, enabling real-time analysis. The total processing time for a 15-minute recording was approximately 234 seconds. According to the authors, ablation studies confirm that the hybrid approach outperforms its individual components and achieves state-of-the-art results on the dataset. They note that the rule-based method is currently limited to detecting rear-end collisions, whereas the learning-based component identifies various crash types.

The dataset, model, and development kit are open-sourced to facilitate further research in roadside perception, digital twin creation, and cooperative sensing for autonomous vehicles.
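
The false-positive filter described in the summary (a confidence score threshold of 0.8, confirmed across at least three consecutive frames) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `confirmed_frames`, the use of a single score per frame, and the choice of an inclusive comparison (`>=`) are assumptions made for illustration only.

```python
from typing import Iterable, List

CONF_THRESHOLD = 0.8   # confidence threshold stated in the summary
MIN_CONSECUTIVE = 3    # minimum number of consecutive frames stated in the summary


def confirmed_frames(scores: Iterable[float],
                     conf_threshold: float = CONF_THRESHOLD,
                     min_consecutive: int = MIN_CONSECUTIVE) -> List[int]:
    """Return the indices of frames at which a detection counts as confirmed.

    `scores` holds one hypothetical accident-confidence value per frame.
    A frame is confirmed once it closes a run of at least `min_consecutive`
    frames whose score reaches `conf_threshold` (inclusive, by assumption).
    """
    confirmed: List[int] = []
    run = 0  # length of the current run of above-threshold frames
    for index, score in enumerate(scores):
        if score >= conf_threshold:
            run += 1
        else:
            run = 0  # any low-confidence frame resets the run
        if run >= min_consecutive:
            confirmed.append(index)
    return confirmed


if __name__ == "__main__":
    # Two short above-threshold bursts; only the second lasts long enough.
    per_frame_scores = [0.9, 0.85, 0.5, 0.81, 0.9, 0.95, 0.99, 0.2]
    print(confirmed_frames(per_frame_scores))
```

In this sketch a single low-confidence frame resets the count, so isolated high-confidence detections never trigger a confirmation. A real system might instead tolerate short gaps; the summary does not say which behavior the authors chose.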

Key finding

The proposed Accid3nD framework, which fuses rule-based trajectory checks with learning-based image detection, achieves robust real-time accident detection on the new TUM Traffic Accident dataset.

Methodology

dataset

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via author_sweep_intake on 2026-05-28.

Stage	Outcome	Tool	Model	Prompt	Attempts	Completed
discover	success	author_sweep	—	—	2	2026-05-28
archive	success	canonical_url	—	—	1	2026-06-04
extract	success	cached	—	—	3	2026-06-10
clean	success	clean	—	—	1	2026-06-04
chunk	success	chunk	—	—	1	2026-06-04
embed	success	embed	Qwen/Qwen3-Embedding-8B	—	1	2026-06-04
enrich	success	—	—	—	1	2026-05-28
promote	success	—	—	—	1	2026-06-04
summarize	success	llm	qwen3.6-27b-prismaquant	summ-v5	2	2026-06-10
tag	success	vector_similarity	—	—	15	2026-06-11
verify	success	—	—	—	2	2026-06-10

Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified.

Topics

Ranked by relevance to this paper.

naturalistic crash near crash

Information type

What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).

Empirical Findings: crash risk outcomes
Methodological Resource: dataset resource