The Development of Crash Modification Factors: Highway Safety Statistical Paper Synthesis

Donnell, Eric T.; Hanks, Ephraim M.; Porter, Richard J.; Cook, Lawrence J.; Srinivasan, Raghavan; Li, Fan; Nguyen, Maggie; Eccles, Kimberly · 2020 · Unknown



Summary

This report, produced by the Federal Highway Administration (FHWA) as part of the Evaluation of Low-Cost Safety Improvements Pooled Fund Study, synthesizes statistical methods for developing crash modification factors (CMFs). The research addresses the need to better understand the relationships between road safety and the factors that affect crash occurrence and severity. It compares current statistical-analysis methods and data sources with alternative approaches, with the aim of improving the accuracy of safety-effect estimates and crash-frequency predictions. The study is motivated by the transportation-engineering community's shift toward integrating more rigorous quantitative methods into task-development processes.

The methodology is a critical synthesis of existing safety-analysis techniques, supplemented by empirical analyses detailed in appendices. The report compares causal-inference methods, specifically propensity score (PS) matching, against traditional observational before–after methods for estimating the safety effects of centerline and edgeline rumble strips. It also evaluates classification and regression trees (CART) and Random Forests against count regression methods for predicting crash frequencies on freeways. In addition, the study examines methods to account for underreporting in crash-frequency models using New York State Department of Transportation (NYSDOT) geospatial roadway and crash datasets. A further component probabilistically links hospital and crash data from Utah, using the Crash Outcomes Data Evaluation System (CODES), to understand the relationship between crashes and site-specific contributing factors.

Key findings indicate that causal-inference methods such as PS matching offer advantages over traditional before–after studies because they better address selection bias and endogeneity. Simulation-based comparisons demonstrated that PS methods could provide more robust CMF estimates when unobserved heterogeneity is present. For crash-frequency prediction, tree-based methods such as CART and Random Forests were found to have strong predictive power, often comparable to or exceeding that of traditional Negative Binomial regression models, particularly in capturing non-linear relationships and interactions among variables. The analysis of underreporting revealed that failing to account for unreported crashes leads to biased parameter estimates; models incorporating underreporting adjustments provided more accurate predictions of total crash frequencies. The probabilistic linkage of Utah hospital and crash data successfully identified matches, allowing reported injury severity to be compared with actual medical outcomes and highlighting discrepancies in crash-reporting accuracy.

The work is significant because it points to more effective analytical tools for highway safety research. By validating alternative statistical methods and data-integration techniques, the report supports the development of more accurate CMFs and safety performance functions. The findings suggest that integrating causal-inference frameworks and machine-learning algorithms can make evaluations of safety countermeasures more reliable. The emphasis on linking disparate data sources, such as hospital records and crash reports, also underscores the potential for a deeper understanding of crash contributing factors. The synthesis serves as a guide for the transportation-engineering community in adopting advanced quantitative methods, with the goal of improving the efficacy of low-cost safety improvements and the overall integrity of highway safety analysis.
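To make the idea of PS matching more concrete, the following is a minimal sketch, not the report's procedure. It assumes propensity scores have already been estimated (for example, by logistic regression on site characteristics), pairs each treated site with the nearest untreated site within a caliper, and then forms a simple ratio of crash totals as a rough CMF. The `Site` class, the function names, the caliper value, and all site data are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Site:
    site_id: str
    treated: bool       # True if the countermeasure (e.g. rumble strips) is installed
    propensity: float   # assumed pre-estimated probability of treatment
    crashes: int        # observed crash count over a common study period


def match_nearest(sites, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on propensity score, without replacement."""
    treated = [s for s in sites if s.treated]
    controls = [s for s in sites if not s.treated]
    pairs = []
    for t in sorted(treated, key=lambda s: s.propensity):
        if not controls:
            break
        best = min(controls, key=lambda c: abs(c.propensity - t.propensity))
        # Discard treated sites with no sufficiently similar control.
        if abs(best.propensity - t.propensity) <= caliper:
            pairs.append((t, best))
            controls.remove(best)
    return pairs


def matched_cmf(pairs):
    """Ratio of treated to matched-control crash totals (a simplified CMF estimate)."""
    treated_total = sum(t.crashes for t, _ in pairs)
    control_total = sum(c.crashes for _, c in pairs)
    if control_total == 0:
        raise ValueError("matched control sites have no crashes; ratio undefined")
    return treated_total / control_total


# Hypothetical data, invented for illustration only.
sites = [
    Site("T1", True, 0.62, 4),
    Site("T2", True, 0.35, 6),
    Site("T3", True, 0.90, 3),
    Site("C1", False, 0.60, 5),
    Site("C2", False, 0.33, 8),
    Site("C3", False, 0.10, 2),
    Site("C4", False, 0.70, 7),
]

pairs = match_nearest(sites)
cmf = matched_cmf(pairs)
print(len(pairs), round(cmf, 3))
```

In this toy data set, site T3 has no control within the caliper and is dropped, which illustrates how matching trades sample size for comparability. A CMF below 1 indicates fewer crashes at treated sites than at their matched controls. A real analysis would also need exposure adjustment and uncertainty estimates, which this sketch omits.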

Key finding

Advanced statistical methods, including causal inference, machine learning, and probabilistic data linkage, offer improved accuracy and insight for developing crash modification factors and analyzing road safety compared to traditional observational approaches.

Methodology

review

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via author_sweep_intake on 2026-05-28.

Stage      Outcome  Tool               Model                     Prompt   Attempts  Completed
discover   success  author_sweep       -                         -        2         2026-05-28
archive    success  canonical_url      -                         -        6         2026-06-06
extract    success  cached             -                         -        3         2026-06-10
clean      success  clean              -                         -        1         2026-06-07
chunk      success  chunk              -                         -        1         2026-06-07
embed      success  embed              Qwen/Qwen3-Embedding-8B   -        1         2026-06-07
enrich     skipped  -                  -                         -        4         2026-07-02
promote    success  -                  -                         -        1         2026-06-04
summarize  success  llm                qwen3.6-27b-prismaquant   summ-v5  2         2026-06-10
tag        success  vector_similarity  -                         -        15        2026-06-11
verify     partial  -                  -                         -        2         2026-06-10

Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified_with_issues.
