Behavior-Based Predictive Safety Analytics Phase II
archive: archived pipeline: cataloged verified
Summary
This report presents the findings of Phase II of the Behavior-Based Predictive Safety Analytics project, funded by the Safety through Disruption (Safe-D) University Transportation Center. The research addresses the challenge of predicting road crash involvement by analyzing individual driver behavior characteristics, personal traits, and environmental influences. Motivated by the fact that approximately 94% of crashes result from driver-related errors and that a small proportion of drivers account for a disproportionate share of crashes, the study aims to identify risky drivers before incidents occur. The work builds upon a pilot study that used naturalistic driving data to develop behavioral indicators for real-time crash risk evaluation, with applications in fleet safety, insurance, and automated driving systems.
The methodology drew on two sources of large-scale naturalistic driving data: the Second Strategic Highway Research Program (SHRP 2) for light vehicles, and large-truck naturalistic driving collections (FAST DASH 2 and OBMS 2) for commercial motor vehicles. The SHRP 2 dataset included 3,546 drivers and incorporated subjective measures such as risk-taking behavior, risk perception, sensation seeking, and driving history. The truck dataset comprised 177 participants who averaged 40,000 miles each, providing extensive data for within-person analysis. The researchers cleaned the data using rolling windows and imputation methods, then calculated specific behavioral indices including longitudinal and lateral accelerations, headway, time-to-collision, speed behaviors, and lane deviations. Analyses were conducted at both the person level (between-subjects) and the trip level (within-subjects) to correlate these behavioral metrics with crash and near-crash (CNC) events.
The results demonstrated significant correlations between specific driving behaviors and crash involvement. Between-subjects analysis revealed that drivers involved in CNCs were more likely to engage in closer following and strong accelerations than those who did not experience such events. Specifically, drivers with CNC involvement spent a higher percentage of time in short headway bins (0–1.0 seconds) and a lower percentage in longer headway bins (>3.5 seconds) than non-involved drivers. Correlation matrices indicated strong positive relationships between CNC involvement and longitudinal deceleration, lateral acceleration, and short headways. Within-subjects analysis further identified high-threshold acceleration events and average headway as key indicators for real-time risk assessment. The study also established exposure metrics, such as speed-based time and distance, to normalize behavioral calculations across different driving contexts.
The significance of this research lies in its contribution to the emerging field of behavior-based predictive safety analytics. By identifying specific behavioral patterns that correlate with crash risk (hard braking, strong lateral acceleration, and close following), the study provides a framework for developing real-time crash risk models. These models can account for individual driver behaviors and contextual roadway information, offering potential improvements in fleet safety management, insurance risk assessment, and the evaluation of automated driving systems. The findings support the notion that enduring personal factors and situational elements interact to produce crashes, and that monitoring observable behavioral patterns can effectively identify high-risk drivers.
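As an illustration of the two kinds of behavioral index the summary describes (share of following time per headway bin, and high-threshold deceleration events normalized by exposure), the following is a minimal Python sketch. It is not the report's implementation. The function names, the intermediate bin edges (only the 0–1.0 s and >3.5 s bins are named in the summary), the -0.45 g threshold, and the per-1,000-mile normalization are all assumptions made for the example.

```python
from bisect import bisect_left

# Assumed bin edges in seconds. Only the 0-1.0 s and >3.5 s bins come from
# the summary; the 2.0 s edge is a hypothetical intermediate boundary.
HEADWAY_EDGES = [1.0, 2.0, 3.5]
BIN_LABELS = ["0-1.0", "1.0-2.0", "2.0-3.5", ">3.5"]


def headway_bin_shares(headways_s):
    """Percentage of valid following samples that fall in each headway bin.

    headways_s: time-headway samples in seconds at a fixed sampling rate.
    None (no lead vehicle) and negative values are skipped.
    """
    counts = [0] * len(BIN_LABELS)
    for h in headways_s:
        if h is None or h < 0:
            continue
        # bisect_left puts a value equal to an edge into the lower bin,
        # so 1.0 s counts as "0-1.0" and 3.5 s as "2.0-3.5".
        counts[bisect_left(HEADWAY_EDGES, h)] += 1
    total = sum(counts)
    if total == 0:
        return {label: 0.0 for label in BIN_LABELS}
    return {label: 100.0 * c / total for label, c in zip(BIN_LABELS, counts)}


def hard_brake_rate(accel_g, distance_miles, threshold_g=-0.45):
    """Deceleration events per 1,000 miles (threshold is an assumed value).

    An event starts when longitudinal acceleration first drops to or below
    threshold_g; consecutive samples below the threshold count as one event.
    """
    events = 0
    in_event = False
    for a in accel_g:
        below = a is not None and a <= threshold_g
        if below and not in_event:
            events += 1
        in_event = below
    if distance_miles <= 0:
        return 0.0
    return 1000.0 * events / distance_miles


if __name__ == "__main__":
    print(headway_bin_shares([0.5, 1.0, 3.0, 4.0, None]))
    print(hard_brake_rate([0.0, -0.5, -0.6, -0.1, -0.5], distance_miles=10.0))
```

Dividing the event count by distance driven is one way to make drivers with very different mileage comparable. The summary mentions both time-based and distance-based exposure, so a time denominator would be an equally plausible choice.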
Key finding
Drivers involved in crashes or near-crashes demonstrated significantly higher rates of strong longitudinal acceleration and closer following distances compared to drivers not involved in such events.
Methodology
naturalistic
Sample size: 3,723 (3,546 SHRP 2 drivers plus 177 truck drivers)
Provenance
The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via bulk_ingest_rosap on 2026-05-23 (6 acquisition events logged).
| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | rosap | — | — | 2 | 2026-05-23 |
| archive | success | — | — | — | 1 | 2026-05-23 |
| extract | success | cached | — | — | 2 | 2026-06-10 |
| clean | success | — | — | — | 1 | 2026-06-01 |
| chunk | success | — | — | — | 1 | 2026-06-01 |
| embed | success | — | — | — | 1 | 2026-06-02 |
| enrich | success | — | — | — | 1 | 2026-05-23 |
| promote | success | — | — | — | 1 | 2026-05-23 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 3 | 2026-06-10 |
| tag | success | vector_similarity | — | — | 19 | 2026-06-11 |
| verify | success | — | — | — | 2 | 2026-06-10 |
Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified.
Topics
Ranked by relevance to this paper.
- telematics crash prediction
- sex gender
- naturalistic crash near crash
- incidence prevalence
- induced exposure
- exposure measurement
Information type
What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).
- Empirical Findings: crash risk outcomes, observational prevalence
- Methodological Resource: dataset resource