Supervised association rules mining on pedestrian crashes in urban areas: identifying patterns for appropriate countermeasures

Das, Subasish; Dutta, Anandi K; Avelar, Raul; Dixon, Karen; Sun, Xiaoduan; Jalayer, Mohammad · 2018 · International Journal of Urban Sciences

DOI: 10.1080/12265934.2018.1431146

archive: archived pipeline: cataloged verified


Summary

This study addresses pedestrian safety in urban areas, focusing on vehicle-pedestrian crashes in Louisiana. The motivation is the state's high fatality rate: pedestrians accounted for 17% of traffic fatalities in Louisiana in 2012, and alcohol was involved in nearly 44% of these cases. The research aims to identify significant patterns and unsuspected relationships within crash data. The authors use supervised association rules mining to extract actionable insights for targeted countermeasures, rather than traditional parametric statistical modeling, which often requires predefined assumptions.

The methodology applies the Apriori algorithm for association rules mining to a dataset of 11,503 vehicle-pedestrian crashes recorded between 2004 and 2011, obtained from the Louisiana Department of Transportation and Development. The data were consolidated from multiple tables and filtered to at-fault vehicle crashes involving a single pedestrian. Variable importance was ranked with Random Forest algorithms to select key attributes, including roadway geometry, lighting, collision type, weather, and driver and pedestrian demographics and conditions. The analysis was structured into four supervised cases, each defined by a response variable: fatal crashes, severe crashes, moderate/complaint injury crashes, and impaired pedestrian conditions. Minimum support and confidence thresholds were calibrated by trial and error to balance specificity against data coverage, and the rules were generated with the R package ‘arules’.

The findings reveal several distinct patterns associated with crash severity and occurrence. Roadway lighting was identified as a critical factor: crashes occurring at night without street lights were strongly associated with fatal outcomes, with a lift value of 3.008, meaning that the proportion of fatal crashes under these conditions was about three times the dataset average. The combination of no street lights and single-vehicle collisions yielded the highest lift value (3.733) for fatal crashes. Demographic patterns indicated that male pedestrians had a greater propensity for severe and fatal crashes, and that middle-aged male pedestrians (ages 35–54) were more likely to be involved in crashes. Younger female drivers (ages 15–24) were found to be more crash-prone than other age groups. The study also noted that impaired pedestrians remained vulnerable even on lit roadways at night, and that single-vehicle crashes dominated the dataset.

The significance of this research lies in its application of data mining to uncover non-trivial associations that inform specific safety countermeasures. By showing that lighting alleviates crash severity and by highlighting high-risk demographic groups, the study gives traffic safety professionals evidence-based targets for intervention. The authors recommend raising awareness and implementing improvements tailored to the identified patterns, such as enhancing roadway lighting and directing education or enforcement efforts toward younger female drivers and middle-aged male pedestrians. This approach demonstrates the utility of association rules mining in transportation safety for discovering hidden risk factors in large, complex datasets.
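
To make the support, confidence, and lift values in the summary concrete, the following is a minimal Python sketch of supervised rule mining, where the consequent is fixed to one response value (here, a fatal outcome). It is not the authors' code: the study used the R package ‘arules’ with the Apriori algorithm, whereas this sketch enumerates antecedents by brute force without Apriori's pruning. The records, item labels, thresholds, and the function name `supervised_rules` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy records, NOT data from the study.
# Each crash is a set of "attribute=value" items.
crashes = [
    {"lighting=dark_no_lights", "vehicles=single", "severity=fatal"},
    {"lighting=dark_no_lights", "vehicles=single", "severity=fatal"},
    {"lighting=dark_no_lights", "vehicles=multi", "severity=injury"},
    {"lighting=daylight", "vehicles=single", "severity=injury"},
    {"lighting=daylight", "vehicles=single", "severity=injury"},
    {"lighting=daylight", "vehicles=multi", "severity=injury"},
    {"lighting=dark_lit", "vehicles=single", "severity=fatal"},
    {"lighting=dark_lit", "vehicles=single", "severity=injury"},
]


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    return sum(itemset <= record for record in records) / len(records)


def supervised_rules(records, target, min_support=0.1,
                     min_confidence=0.5, max_len=2):
    """Enumerate rules {antecedent} -> {target} by brute force.

    Returns (antecedent, support, confidence, lift) tuples sorted by
    descending lift. Thresholds are illustrative assumptions.
    """
    target_attr = target.split("=")[0]
    # Candidate antecedent items exclude the response attribute itself.
    candidates = sorted(
        {item for record in records for item in record
         if item.split("=")[0] != target_attr}
    )
    base_rate = support({target}, records)  # P(target) over all records
    rules = []
    for k in range(1, max_len + 1):
        for antecedent in combinations(candidates, k):
            lhs = set(antecedent)
            supp_lhs = support(lhs, records)
            supp_rule = support(lhs | {target}, records)
            if supp_lhs == 0 or supp_rule < min_support:
                continue
            confidence = supp_rule / supp_lhs   # P(target | antecedent)
            if confidence < min_confidence:
                continue
            lift = confidence / base_rate       # ratio to dataset average
            rules.append((antecedent, supp_rule, confidence, lift))
    return sorted(rules, key=lambda rule: rule[3], reverse=True)


for antecedent, supp, conf, lift in supervised_rules(crashes, "severity=fatal"):
    print(f"{set(antecedent)} -> fatal  "
          f"support={supp:.3f} confidence={conf:.3f} lift={lift:.3f}")
```

On this toy data the top rule pairs unlit night conditions with single-vehicle crashes, with a lift of about 2.67. That number describes only the invented records, but it is read the same way as the reported 3.008 and 3.733: the share of fatal crashes among records matching the antecedent, divided by the share of fatal crashes in the whole dataset.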

Key finding

The absence of street lighting at night is significantly associated with an increased likelihood of fatal and severe pedestrian crashes, particularly in single-vehicle collisions.

Methodology

dataset

Sample size: 11,503

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via author_sweep_intake on 2026-05-28.

| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | author_sweep | | | 2 | 2026-05-28 |
| archive | success | canonical_url | | | 7 | 2026-06-06 |
| extract | success | cached | | | 3 | 2026-06-10 |
| clean | success | clean | | | 1 | 2026-06-04 |
| chunk | success | chunk | | | 1 | 2026-06-04 |
| embed | success | embed | Qwen/Qwen3-Embedding-8B | | 1 | 2026-06-04 |
| enrich | success | | | | 1 | 2026-05-28 |
| promote | success | | | | 1 | 2026-06-04 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 2 | 2026-06-10 |
| tag | success | vector_similarity | | | 15 | 2026-06-11 |
| verify | success | | | | 2 | 2026-06-10 |

Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified.
