Estimating helmet wearing rates via a scalable, low-cost algorithm: a novel integration of deep learning and Google Street View

Li, Qingfeng; Wang, Xianglong; Bachani, Abdulgafoor M. · 2024 · DOAJ

DOI: 10.1186/s12889-024-19118-0

Summary

This study addresses the need for large-scale, standardized data on motorcycle helmet use to support evidence-based policymaking and intervention evaluation. Helmet use significantly reduces the risk of death and severe injury, yet enforcement remains suboptimal worldwide, and traditional observational methods for monitoring compliance are costly, difficult to scale, and prone to subjective bias. To overcome these limitations, the authors developed a scalable, low-cost algorithm that integrates deep learning with Google Street View imagery to estimate helmet-wearing rates.

The methodology uses two modules. In the first, images are acquired from Google Street View in Bandung, Indonesia, a site selected for its high motorcycle usage and traffic injury burden. These images are preprocessed with the YOLOv5 object detection algorithm, which filters out images without motorcycles and crops motorcycles from the background to reduce input complexity. In the second, a custom YOLO model is trained to detect three object classes: helmets, drivers (defined by the front wheel), and passengers (defined by the rear wheel). The training dataset consisted of 3,995 images containing 9,310 manually labeled instances. Annotations were performed via Amazon Mechanical Turk, with high inter-rater reliability, and subsequently refined by a co-author to ensure precise bounding boxes. The model was trained on an Nvidia GeForce RTX 3080 Ti GPU.

The algorithm demonstrated high accuracy in out-of-sample predictions. Across all three object classes, it achieved a precision of 0.927, a recall of 0.922, and a mean average precision at an intersection-over-union threshold of 0.5 (mAP50) of 0.956. By class, the model performed well in detecting drivers (precision 0.88, recall 0.959) and helmets (precision 0.975, recall 0.923). When applied to estimate helmet-wearing rates, the algorithm calculated a rate of 83%, compared to a ground truth of 92%. The primary source of error was the misclassification of other road users, such as pedestrians and cyclists, as drivers or passengers.

The authors conclude that this approach provides a robust, cost-effective tool for monitoring helmet compliance in any location with Street View coverage. Because it generates geospatial and temporal data, stakeholders can use it to identify high-risk areas, track the impact of policy interventions, and benchmark progress toward global road safety targets. The current model requires further fine-tuning for diverse global settings, and API costs may be a barrier for very large deployments, but the framework is a significant advance over traditional data collection methods and supports more accurate and timely public health surveillance.
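The step from per-image detections to a helmet-wearing rate can be sketched as follows. The summary does not give the authors' exact formula, so this is only an illustration under a stated assumption: the rate is taken as detected helmets divided by detected riders (drivers plus passengers). The names `Detection`, `helmet_wearing_rate`, and `min_confidence` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass(frozen=True)
class Detection:
    """One detected object; labels mirror the three classes in the summary."""
    label: str         # "helmet", "driver", or "passenger"
    confidence: float  # detector score in [0, 1]


def helmet_wearing_rate(
    images: Iterable[Sequence[Detection]],
    min_confidence: float = 0.5,  # hypothetical cut-off, not from the paper
) -> Optional[float]:
    """Assumed estimator: helmets / (drivers + passengers) over all images."""
    helmets = 0
    riders = 0
    for detections in images:
        # Ignore low-confidence detections before counting.
        kept = [d for d in detections if d.confidence >= min_confidence]
        helmets += sum(d.label == "helmet" for d in kept)
        riders += sum(d.label in ("driver", "passenger") for d in kept)
    if riders == 0:
        return None  # no riders detected, so the rate is undefined
    # Cap at 1.0 in case more helmets than riders are detected.
    return min(helmets / riders, 1.0)


# Toy input: three cropped images with made-up detections.
example_images = [
    [Detection("driver", 0.90), Detection("helmet", 0.95)],
    [Detection("driver", 0.80), Detection("passenger", 0.70),
     Detection("helmet", 0.90)],
    # A spurious "driver" (for example a cyclist) with no helmet detected.
    [Detection("driver", 0.60)],
]

rate_all = helmet_wearing_rate(example_images)
rate_strict = helmet_wearing_rate(example_images, min_confidence=0.65)
```

Under this assumed formula, every pedestrian or cyclist counted as a rider enlarges the denominator without adding a helmet, which pulls the estimate down. That is the same direction as the reported gap between the estimated 83% and the ground-truth 92%, though the sketch makes no claim about the size of that effect.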

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.

Stage	Outcome	Tool	Model	Prompt	Attempts	Completed
discover	success	DOAJ	—	—	1	2026-06-19
archive	success	unpaywall	—	—	1	2026-06-25
extract	success	cached	—	—	2	2026-06-26
clean	success	clean	—	—	1	2026-06-19
chunk	success	chunk	—	—	1	2026-06-19
embed	success	embed	Qwen/Qwen3-Embedding-8B	—	1	2026-06-19
promote	success	—	—	—	1	2026-06-19
summarize	success	llm	qwen3.6-27b-prismaquant	summ-v5	1	2026-06-26
tag	success	vector_similarity	—	—	6	2026-06-19
verify	success	—	—	—	1	2026-06-26

Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.

Topics

Ranked by relevance to this paper.

helmet protective

Information type

What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).

Empirical Findings: observational prevalence