Estimating likelihood of future crashes for crash-prone drivers
DOI: 10.1016/j.jtte.2015.03.003
Summary
This study addresses the challenge of identifying and predicting future crash risk for "crash-prone" drivers, a small subset of licensed drivers responsible for a disproportionate share of traffic crashes. In Louisiana, at-fault crash-prone drivers represent only 5% of licensed drivers yet account for 34% of all crashes. Motivated by the need to support targeted safety education and enforcement programs, the research aims to develop a predictive model that estimates the likelihood of a driver being at fault in future crashes based on historical data and specific crash characteristics.

The researchers used eight years of traffic crash data (2004–2011) from Louisiana, comprising approximately 2.08 million records. After filtering for records with driver license information, they categorized the records into at-fault/not-at-fault and crash-prone/non-crash-prone groups. Initial exploratory data analysis revealed that at-fault crash-prone drivers were significantly more likely to be male, younger (15–24 years), and involved in crashes with alcohol, drug impairment, or distraction.

To build the prediction model, the authors employed logistic regression. They initially considered 371 potential variables spanning human factors, crash characteristics, roadway geometry, environmental conditions, and vehicle status. Using a regression subset selection method to eliminate redundancy, the final model retained ten key predictors, excluding variables such as crash hour, day of the week, weather, and vehicle condition. The resulting model demonstrated moderate predictive capability, correctly classifying at-fault crashes with an accuracy of 62.40% and a specificity of 77.25%. It identified important variables associated with crash proneness, confirming that human factors such as driver age, gender, alcohol and drug involvement, and distraction are critical predictors.

Roadway alignment and lighting conditions were also found to influence crash likelihood for this group. The analysis highlighted that at-fault crash-prone drivers are particularly vulnerable in specific roadway conditions, such as elevated curves and dark environments without street lighting, and are more frequently involved in single-vehicle run-off crashes than not-at-fault drivers.

The significance of this research lies in its application to traffic safety management. The model gives traffic agencies a tool to monitor the performance of at-fault crash-prone drivers and to identify high-risk individuals for targeted interventions. The findings support specialized safety programs, including enhanced education and stricter regulations, aimed at this high-risk group. By focusing resources on drivers who repeatedly cause at-fault crashes, authorities can reduce overall crash rates more effectively and move toward strategic safety goals such as the "Destination Zero Deaths" initiative.
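
As a rough illustration of how a logistic regression turns crash and driver characteristics into an at-fault probability, the sketch below applies the logistic function to a weighted sum of 0/1 indicators. The predictor names, the intercept, and every coefficient are hypothetical placeholders chosen for illustration; they are not the paper's ten predictors or its fitted values.

```python
import math

# Hypothetical values for illustration only; NOT the paper's fitted model.
INTERCEPT = -1.0
COEFFICIENTS = {
    "male": 0.3,
    "age_15_24": 0.5,
    "alcohol": 0.8,
    "distracted": 0.4,
    "dark_no_lighting": 0.2,
}


def at_fault_probability(features):
    """Return P(at fault) for a dict of 0/1 indicators (missing keys count as 0)."""
    z = INTERCEPT + sum(
        weight * features.get(name, 0) for name, weight in COEFFICIENTS.items()
    )
    # Logistic (sigmoid) function maps the linear score z to a probability.
    return 1.0 / (1.0 + math.exp(-z))


def classify(features, threshold=0.5):
    """Label a record as at-fault when the probability reaches the threshold."""
    return at_fault_probability(features) >= threshold


baseline = at_fault_probability({})
high_risk = at_fault_probability({name: 1 for name in COEFFICIENTS})
```

The 0.5 threshold is also an assumption. Moving it trades correctly flagged at-fault crashes against false alarms, which is one reason a model's reported specificity depends on the cut-off used.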
Key finding
The logistic regression model correctly classified at-fault crashes with 62.40% accuracy and 77.25% specificity, supporting estimates of future crash likelihood for at-fault drivers.
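
For readers less familiar with these metrics, the following minimal sketch shows how accuracy and specificity are conventionally computed from a binary confusion matrix, treating "at-fault" as the positive class. The counts are made up for the example and are not taken from the paper, whose confusion matrix is not reproduced in this entry.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard metrics from confusion-matrix counts.

    tp: at-fault records predicted at-fault
    fp: not-at-fault records predicted at-fault
    tn: not-at-fault records predicted not-at-fault
    fn: at-fault records predicted not-at-fault
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all records classified correctly
        "sensitivity": tp / (tp + fn),      # share of at-fault records caught
        "specificity": tn / (tn + fp),      # share of not-at-fault records cleared
    }


# Made-up counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=30, fn=20)
```

With these illustrative counts, accuracy is 0.70 and specificity is 0.75. A specificity above the accuracy, as in the reported figures, indicates that the model is better at clearing not-at-fault records than at catching at-fault ones.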
Methodology
modeling
Sample size: 2,076,009
Provenance
The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via author_sweep_intake on 2026-05-27.
| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | author_sweep | — | — | 2 | 2026-05-27 |
| archive | success | canonical_url | — | — | 5 | 2026-06-06 |
| extract | success | cached | — | — | 3 | 2026-06-10 |
| clean | success | clean | — | — | 1 | 2026-06-04 |
| chunk | success | chunk | — | — | 1 | 2026-06-04 |
| embed | success | embed | Qwen/Qwen3-Embedding-8B | — | 1 | 2026-06-04 |
| enrich | success | semantic_scholar | — | — | 2 | 2026-06-04 |
| promote | success | — | — | — | 1 | 2026-06-04 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 2 | 2026-06-10 |
| tag | success | vector_similarity | — | — | 15 | 2026-06-11 |
| verify | success | — | — | — | 2 | 2026-06-10 |
Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified.
Information type
What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).
- Empirical Findings: crash risk outcomes
- Theoretical Contribution: computational model