Estimating likelihood of future crashes for crash-prone drivers

Das, Subasish · 2015 · Journal of Traffic and Transportation Engineering (English Edition)

DOI: 10.1016/j.jtte.2015.03.003

archive: archived · pipeline: cataloged · verified

Summary

This study addresses the challenge of identifying and predicting future crash risk for "crash-prone" drivers, a small subset of licensed drivers responsible for a disproportionate number of traffic incidents. In Louisiana, at-fault crash-prone drivers represent only 5% of licensed drivers yet account for 34% of all crashes. Motivated by the need to support targeted safety education and enforcement programs, the research aims to develop a predictive model that estimates the likelihood of a driver being at fault in future crashes based on historical data and specific crash characteristics.

The researchers used eight years of traffic crash data (2004–2011) from Louisiana, comprising approximately 2.08 million records. After filtering for records with driver license information, the dataset was categorized into at-fault/not-at-fault and crash-prone/non-crash-prone groups. Initial exploratory data analysis revealed that at-fault crash-prone drivers were significantly more likely to be male, younger (15–24 years), and involved in crashes with alcohol, drug impairment, or distraction.

To build the prediction model, the authors employed logistic regression. They initially considered 371 potential variables spanning human factors, crash characteristics, roadway geometry, environmental conditions, and vehicle status. Using a regression subset selection method to eliminate redundancy, the final model retained ten key predictors, excluding variables such as crash hour, day of the week, weather, and vehicle condition.

The developed logistic regression model demonstrated moderate predictive capability, correctly classifying at-fault crashes with an accuracy of 62.40% and a specificity of 77.25%. The model identified important variables associated with crash proneness, confirming that human factors like driver age, gender, alcohol and drug involvement, and distraction are critical predictors. Additionally, roadway alignment and lighting conditions were found to influence crash likelihood for this specific group. The analysis highlighted that at-fault crash-prone drivers are particularly vulnerable in specific roadway conditions, such as elevated curves and dark environments without street lighting, and are more frequently involved in single-vehicle run-off crashes than not-at-fault drivers.

The significance of this research lies in its application to traffic safety management. The model provides a tool for traffic agencies to monitor the performance of at-fault crash-prone drivers and identify high-risk individuals for targeted interventions. The findings support the implementation of specialized safety programs, including enhanced education and stricter regulations, aimed at this high-risk group. By focusing resources on drivers who repeatedly commit at-fault crashes, authorities can more effectively reduce overall crash rates and move toward strategic safety goals, such as the "Destination Zero Deaths" initiative.
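For readers unfamiliar with how a logistic regression model turns crash characteristics into an at-fault likelihood, the following is a minimal sketch. The predictor names loosely echo the factors named in the summary, but every coefficient value, the `COEFFICIENTS` table, and the helper functions `at_fault_probability` and `classify` are hypothetical and chosen only for illustration; they are not the fitted values or the ten predictors reported in the paper.

```python
import math

# Hypothetical coefficients for illustration only.
# These are NOT the fitted values from the paper.
COEFFICIENTS = {
    "intercept": -1.0,
    "age_15_24": 0.6,
    "male": 0.3,
    "alcohol": 0.9,
    "distracted": 0.5,
    "dark_no_street_light": 0.4,
}


def at_fault_probability(features):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = COEFFICIENTS["intercept"]
    for name, value in features.items():
        z += COEFFICIENTS[name] * value
    return 1.0 / (1.0 + math.exp(-z))


def classify(features, threshold=0.5):
    """Label a record as at-fault when the probability reaches the threshold."""
    return at_fault_probability(features) >= threshold


# Illustrative record: young male driver, alcohol involved, daylight, not distracted.
example = {
    "age_15_24": 1,
    "male": 1,
    "alcohol": 1,
    "distracted": 0,
    "dark_no_street_light": 0,
}
print(round(at_fault_probability(example), 3), classify(example))
```

The 0.5 threshold is also an assumption; the summary does not state which cutoff the authors used, and a different cutoff would trade accuracy against specificity.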

Key finding

The developed logistic regression model correctly classified at-fault crashes with 62.40% accuracy and 77.25% specificity, enabling the identification of future crash incidence for at-fault drivers.
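As a reminder of what the two reported metrics measure, here is a small sketch using the standard confusion-matrix definitions. It assumes the paper follows these textbook definitions, which this entry does not confirm, and the counts in the example are made up for illustration rather than taken from the study.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all records classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def specificity(tn, fp):
    """Share of actual negatives (not at fault) classified as negative."""
    return tn / (tn + fp)


# Made-up counts for illustration only; not the paper's confusion matrix.
tp, tn, fp, fn = 50, 30, 10, 10
print(accuracy(tp, tn, fp, fn))
print(specificity(tn, fp))
```

A specificity noticeably higher than the accuracy, as reported here, would be consistent with a model that recognizes not-at-fault records more reliably than at-fault ones.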

Methodology

modeling

Sample size: 2,076,009

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via author_sweep_intake on 2026-05-27.

Stage      Outcome  Tool               Model                     Prompt   Attempts  Completed
discover   success  author_sweep       -                         -        2         2026-05-27
archive    success  canonical_url      -                         -        5         2026-06-06
extract    success  cached             -                         -        3         2026-06-10
clean      success  clean              -                         -        1         2026-06-04
chunk      success  chunk              -                         -        1         2026-06-04
embed      success  embed              Qwen/Qwen3-Embedding-8B   -        1         2026-06-04
enrich     success  semantic_scholar   -                         -        2         2026-06-04
promote    success  -                  -                         -        1         2026-06-04
summarize  success  llm                qwen3.6-27b-prismaquant   summ-v5  2         2026-06-10
tag        success  vector_similarity  -                         -        15        2026-06-11
verify     success  -                  -                         -        2         2026-06-10

Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified.
