Modeling Crashes Severity Using Ensemble Techniques

ALHADIDI, Taqwa; ELHENAWEY, Mohammed · 2023 · Crossref

DOI: 10.55549/epstem.1410227

archive: archived · pipeline: cataloged · verified


Summary

This study addresses traffic crash severity at urban roundabouts in Jordan, where such intersections are often identified as safety hotspots even though they are designed to improve safety. Motivated by the high economic and human costs of traffic accidents in low- and middle-income countries, the research aims to identify the factors that contribute significantly to crash severity and to evaluate how well ensemble machine learning techniques predict it. It also fills a gap in the literature by incorporating driver sociodemographic attributes, such as age and gender, which were previously underused in roundabout crash analysis.

The researchers used a dataset from the Jordanian Traffic Institute covering 30,486 crashes across 15 roundabouts in Amman from 2017 to 2021. After screening to remove missing values, duplicates, and erroneous records, the final dataset comprised 12,971 validated data points with 15 variables. To address class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) was applied. Feature selection identified ten significant variables, including driver fault, age, license type, speed, and lighting conditions. Three machine learning algorithms, K-Nearest Neighbors (KNN), Support Vector Machines (SVM), and Adaptive Boosting (AdaBoost), were implemented and evaluated using 10-fold cross-validation, with metrics including precision, recall, and F1-score.

All models showed high predictive performance, with overall accuracy ranging from 98% for KNN to 99% for both SVM and AdaBoost. Feature importance analysis revealed that driver fault and driver age were the most significant factors influencing crash severity. Other notable contributors included license type, year of occurrence, speed, season, time of day, number of vehicles, lanes, and lighting. Conversely, driver gender, holidays, and roadway surface conditions were found to be insignificant. The models consistently classified crash severity with high precision, though SVM showed the highest recall for non-fatal accidents, while AdaBoost prioritized fatal accident prediction accuracy.

The study concludes that driver behavior, particularly fault and age, is the primary determinant of crash severity at urban roundabouts in Jordan. These findings give traffic safety agencies a basis for targeted regulations and interventions focused on driver behavior. The research validates the effectiveness of ensemble machine learning techniques in handling imbalanced traffic safety data and highlights the importance of including sociodemographic variables in crash modeling. Future work is recommended to test these models on larger, more comprehensive datasets to further refine predictive capabilities.
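The summary names SMOTE and per-class precision, recall, and F1-score without showing what they compute. The sketch below is a minimal, standard-library illustration of both ideas and is not the authors' code: the function names `smote` and `precision_recall_f1`, the default `k=5`, and the toy labels are all assumptions made for this example. SMOTE is shown in its basic form, where each synthetic point lies on the line segment between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative only: `minority` is a list of numeric feature vectors that
    all belong to the under-represented class (hypothetical input format).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical labels: 98 non-fatal and 2 fatal crashes, with a model
    # that always predicts "non-fatal".
    y_true = ["non-fatal"] * 98 + ["fatal"] * 2
    y_pred = ["non-fatal"] * 100
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    print("accuracy:", accuracy)
    print("fatal (precision, recall, f1):",
          precision_recall_f1(y_true, y_pred, "fatal"))
```

The demo at the bottom uses made-up labels to show why overall accuracy alone can mislead on imbalanced crash data: a model that never predicts the rare class still scores high accuracy while its recall for that class is zero. This is the usual motivation for oversampling and for reporting per-class metrics. In a cross-validated setup, oversampling is normally applied to the training folds only, so that synthetic points do not leak into the held-out fold; whether the study did this is not stated in the summary.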

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.

Stage      Outcome  Tool               Model                    Prompt   Attempts  Completed
discover   success  Crossref           -                        -        1         2026-06-19
archive    success  canonical_url      -                        -        1         2026-06-26
extract    success  cached             -                        -        2         2026-06-26
clean      success  clean              -                        -        1         2026-06-20
chunk      success  chunk              -                        -        1         2026-06-20
embed      success  embed              Qwen/Qwen3-Embedding-8B  -        1         2026-06-20
promote    success  -                  -                        -        1         2026-06-19
summarize  success  llm                qwen3.6-27b-prismaquant  summ-v5  1         2026-06-26
tag        success  vector_similarity  -                        -        6         2026-06-20
verify     success  -                  -                        -        1         2026-06-26

Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.
