Traffic sign recognition with multi-scale Convolutional Networks

Sermanet, Pierre; LeCun, Yann · 2011 · OpenAlex-citations

DOI: 10.1109/ijcnn.2011.6033589

archive: archived · pipeline: cataloged · verified

Summary

This paper presents a method for traffic sign recognition using multi-scale Convolutional Networks (ConvNets), developed as an entry to the German Traffic Sign Recognition Benchmark (GTSRB) competition. The authors address the challenge of classifying traffic signs under real-world variability, such as viewpoint changes, lighting conditions, occlusions, and low resolution. Unlike traditional approaches that rely on hand-crafted features like HOG or SIFT, ConvNets learn hierarchical invariant features directly from data. The primary motivation is to develop a robust classifier that is both accurate and efficient enough for applications in driver assistance and automated driving.

The proposed architecture modifies the traditional ConvNet by feeding features from both the first and second stages of the network to the final classifier, rather than only from the last stage. This multi-scale approach lets the classifier combine high-level, invariant global shapes from the second stage with low-level, precise local motifs from the first stage. The network employs more sophisticated non-linearities, including rectified sigmoids and subtractive/divisive local normalization inspired by visual neuroscience.

Data preparation involved resizing images to 32x32 pixels, converting them to YUV color space, and applying global and local contrast normalization. To improve robustness, the training set was augmented with jittered samples, that is, copies randomly perturbed in position, scale, and rotation. The authors searched empirically for good architectures by evaluating randomly initialized networks before training the full system with supervised learning.

During the first phase of the GTSRB competition, the system achieved an accuracy of 98.97%, ranking second overall and surpassing human performance (98.81%). Post-competition experiments established a new record accuracy of 99.17%, achieved by increasing the classifier's capacity to a two-layer structure with 100 hidden units and by using grayscale images instead of color. Interestingly, a network using random features instead of trained filters still achieved a competitive 97.33% accuracy.

Analysis of the remaining errors indicated that while grayscale was generally more effective, some misclassifications could have been corrected with color information, particularly for signs with low contrast in intensity. Other errors stemmed from motion blur, low resolution, and physical degradation of signs.

The significance of this work lies in demonstrating that ConvNets can exceed human-level accuracy on traffic sign recognition without relying on hand-crafted features or temporal information. The findings highlight the effectiveness of multi-scale feature extraction and the surprising utility of grayscale inputs, likely due to the unreliability of raw color channels under varying lighting conditions. The paper suggests that future improvements could involve unsupervised pre-training, more diverse data augmentation, and ensemble methods combining colored and non-colored networks.
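
To make the multi-scale idea in the summary concrete, the following is a minimal NumPy sketch, not the authors' implementation. It shows a two-stage feature extractor whose first-stage output is branched off and concatenated with the second-stage output before classification. Everything specific here is an assumption chosen for illustration: the function names, the filter counts (8 and 16), the 5x5 kernels, the use of max pooling, the extra subsampling on the first-stage branch, and the reading of "rectified sigmoid" as tanh followed by an absolute value. The local normalization steps are omitted, and the filters are random rather than trained.

```python
import numpy as np

def conv_valid(x, filters):
    """Valid cross-correlation. x: (C, H, W), filters: (K, C, kh, kw)."""
    # windows has shape (C, H - kh + 1, W - kw + 1, kh, kw)
    windows = np.lib.stride_tricks.sliding_window_view(
        x, filters.shape[2:], axis=(1, 2)
    )
    # Sum over input channels and kernel positions for each output map k.
    return np.einsum("chwij,kcij->khw", windows, filters)

def rectify(x):
    """Assumed form of the 'rectified sigmoid': tanh, then absolute value."""
    return np.abs(np.tanh(x))

def max_pool(x, size):
    """Non-overlapping max pooling over (size x size) blocks."""
    c, h, w = x.shape
    h2, w2 = h // size, w // size
    x = x[:, : h2 * size, : w2 * size]  # drop any ragged border
    return x.reshape(c, h2, size, w2, size).max(axis=(2, 4))

def multi_scale_features(image, filters1, filters2):
    """Concatenate first-stage and second-stage features for the classifier."""
    stage1 = max_pool(rectify(conv_valid(image, filters1)), 2)
    stage2 = max_pool(rectify(conv_valid(stage1, filters2)), 2)
    # Branch: first-stage maps skip stage 2; the extra pooling here is an
    # illustrative assumption to keep the two branches similar in size.
    stage1_branch = max_pool(stage1, 2)
    return np.concatenate([stage1_branch.ravel(), stage2.ravel()])

# Hypothetical sizes: one 32x32 grayscale image, random (untrained) filters.
rng = np.random.default_rng(0)
image = rng.standard_normal((1, 32, 32))
filters1 = rng.standard_normal((8, 1, 5, 5)) * 0.1
filters2 = rng.standard_normal((16, 8, 5, 5)) * 0.1

features = multi_scale_features(image, filters1, filters2)
print(features.shape)  # (792,) = 8*7*7 first-stage + 16*5*5 second-stage
```

A single-scale network would pass only the 400 second-stage values to the classifier. In this sketch the classifier input also carries 392 first-stage values, which is the sense in which local detail and global shape are both made available to it.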

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged: what ran, with which tool and model, how many attempts it took, and when it last completed.

Stage	Outcome	Tool	Model	Prompt	Attempts	Completed
discover	success	OpenAlex-citations	—	—	1	2026-06-25
archive	success	semantic_scholar	—	—	6	2026-06-26
extract	success	cached	—	—	2	2026-06-26
clean	success	clean	—	—	1	2026-06-25
chunk	success	chunk	—	—	1	2026-06-25
embed	success	embed	Qwen/Qwen3-Embedding-8B	—	1	2026-06-25
promote	success	—	—	—	1	2026-06-25
summarize	success	llm	qwen3.6-27b-prismaquant	summ-v5	1	2026-06-26
tag	success	vector_similarity	—	—	6	2026-06-25
verify	success	—	—	—	1	2026-06-26

Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.

Topics

Ranked by relevance to this paper.

distraction detection algorithms