Method for Establishing a Spatial Database of Traffic Signs with Machine Learning
DOI: 10.14710/mkts.v29i1.49928
archive: archived · pipeline: cataloged · verified
Summary
This study addresses the inefficiency and inaccuracy of manually collecting traffic sign data for spatial databases. Traffic signs are critical for road safety and behavior regulation, yet current database management relies on surveyors manually recording sign types and locations. The process is hindered by the complexity of Indonesian traffic sign classification, which comprises 154 sub-groups across four main categories (warning, prohibition, command, and instruction). The authors propose a machine learning-based image recognition method that automates the extraction of spatial data and attributes from geotagged photos, reducing reliance on manual classification.
The research used traffic signs on provincial roads in the Special Region of Yogyakarta. The methodology had three main stages: dataset processing, model training, and detection. The initial dataset comprised 5,086 geotagged images, which were filtered to 1,806 images covering three sign types (warning, prohibition, and command) with more than 50 samples per class. The authors developed a preprocessing pipeline in Python with the OpenCV library to assist in labeling and bounding box detection. The pipeline converted images from RGB to the YUV and HSV color spaces, applied histogram equalization to enhance contrast, and used HSV color filtering to segment signs based on specific pixel value thresholds. Post-processing involved bitwise operations, grayscale conversion, morphological operations (erosion and dilation), and thresholding to isolate contours.
The object detection model was trained with the YOLOv4 algorithm on data split into 80% training and 20% testing sets. The method achieved a Mean Average Precision (mAP) of 88.66%, with a precision of 0.72, a recall of 0.95, and an F1-Score of 0.82.
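As a quick consistency check on the reported metrics, the F1-Score can be recomputed as the harmonic mean of precision and recall. This is a minimal sketch; the function name `f1_score` is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the summary: precision 0.72, recall 0.95.
reported_f1 = f1_score(0.72, 0.95)
print(round(reported_f1, 2))  # 0.82, matching the reported F1-Score
```

The high recall paired with lower precision suggests the model misses few signs but produces some false detections.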
Testing was conducted on 484 images with an Intersection over Union (IoU) threshold of 0.5. Accuracy varied by sign category: for instance, "prohibition of parking" signs achieved 99.26% average precision, while "command to enter designated lane" signs achieved 61.07%. The final output was converted into GeoJSON and CSV formats, enabling the visualization of traffic sign locations within a Geographic Information System (GIS).
The study concludes that automated image recognition significantly improves the speed and efficiency of collecting georeferenced traffic sign data and removes the need for manual classification by surveyors. However, the authors note that accuracy depends heavily on the quality of the geotagged photos, which can suffer from poor lighting, low resolution, and physical noise such as graffiti or damage to signs. They recommend that future research improve detection accuracy by simulating photo capture variables such as camera angle and lighting conditions.
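The IoU threshold of 0.5 used in testing can be illustrated with a short sketch. It assumes axis-aligned boxes given as `(x_min, y_min, x_max, y_max)`; the function names and the `>=` comparison are illustrative choices, not details confirmed by the paper or by any specific YOLOv4 implementation.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(predicted, ground_truth, threshold=0.5):
    """Treat a detection as correct when its IoU reaches the threshold (assumed rule)."""
    return iou(predicted, ground_truth) >= threshold


# Hypothetical boxes: the prediction is shifted by half the box width.
print(is_match((0, 0, 10, 10), (5, 0, 15, 10)))  # False: IoU is 50 / 150
```

Under this criterion a detection with the right class but a loosely placed box still counts as a false positive, which is one way precision can fall below recall.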
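The conversion of detections into GeoJSON and CSV might look like the sketch below. The record fields (`lon`, `lat`, `sign_class`, `confidence`) and the sample coordinates are hypothetical, since the summary does not describe the authors' schema. The one fixed convention is that GeoJSON (RFC 7946) orders point coordinates as longitude, then latitude.

```python
import csv
import io
import json


def detections_to_geojson(detections):
    """Build a GeoJSON FeatureCollection with one Point feature per detection."""
    features = []
    for det in detections:
        features.append({
            "type": "Feature",
            # GeoJSON positions are [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [det["lon"], det["lat"]]},
            "properties": {
                "sign_class": det["sign_class"],
                "confidence": det["confidence"],
            },
        })
    return {"type": "FeatureCollection", "features": features}


def detections_to_csv(detections):
    """Serialize the same records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["lon", "lat", "sign_class", "confidence"])
    writer.writeheader()
    writer.writerows(detections)
    return buffer.getvalue()


# Made-up example record, not taken from the paper's dataset.
sample = [{"lon": 110.37, "lat": -7.80, "sign_class": "prohibition_parking", "confidence": 0.97}]
print(json.dumps(detections_to_geojson(sample), indent=2))
print(detections_to_csv(sample))
```

Either output can be loaded as a point layer in a GIS, with the detected class available as an attribute for styling or filtering.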
Provenance
The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.
| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | DOAJ | — | — | 1 | 2026-06-17 |
| archive | success | unpaywall | — | — | 1 | 2026-06-25 |
| extract | success | cached | — | — | 2 | 2026-06-25 |
| clean | success | clean | — | — | 1 | 2026-06-18 |
| chunk | success | chunk | — | — | 1 | 2026-06-18 |
| embed | success | embed | Qwen/Qwen3-Embedding-8B | — | 1 | 2026-06-18 |
| promote | success | — | — | — | 1 | 2026-06-17 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 1 | 2026-06-25 |
| tag | success | vector_similarity | — | — | 6 | 2026-06-18 |
| verify | success | — | — | — | 1 | 2026-06-26 |
Summary generated by qwen3.6-27b-prismaquant on 2026-06-25; verification: verified.
Topics
Ranked by relevance to this paper.