DistractNet: a deep convolutional neural network architecture for distracted driver classification

Nasri, Ismail; Karrouchi, Mohammed; Snoussi, Hajar; Kassmi, Kamal; Messaoudi, Abdelhafid · 2022 · Crossref

DOI: 10.11591/ijai.v11.i2.pp494-503

archive: archived · pipeline: cataloged · verified

Summary

This paper addresses distracted driving, a safety problem that contributes significantly to traffic accidents, injuries, and fatalities. The authors propose DistractNet, a deep convolutional neural network (CNN) architecture designed to detect and classify driver distraction states from visual data. The goal is a model that combines high classification accuracy with short training time and small model size, so that it can be deployed in real-time monitoring systems.

The research uses the State Farm Distracted Driver Detection dataset, comprising 22,424 RGB images in ten classes: safe driving, texting (left/right hand), talking on the phone (left/right hand), operating the radio, drinking, reaching behind, hair and makeup, and talking to passengers. The images were resized to 224×224, 227×227, or 299×299 pixels and split into 70% for training and 30% for testing. DistractNet has seven hidden layers, including convolutional and pooling layers for feature extraction, followed by fully connected and softmax layers for classification. The model was trained from scratch in MATLAB and compared against four pre-trained networks (ResNet-50, GoogLeNet, InceptionV3, and AlexNet) fine-tuned with transfer learning. Performance metrics were classification accuracy, training time, execution speed, and model size.

Experimental results show that DistractNet achieves an average accuracy of 99.32%, outperforming the pre-trained models, which ranged from 97.96% (GoogLeNet) to 98.44% (AlexNet). DistractNet was also more compact, with a model size of only 7.99 MB, far smaller than the largest competitor, AlexNet (629 MB). Training from scratch took approximately 93 minutes, and the trained model ran at 0.0299 seconds per classification. Analysis of the confusion matrix showed that DistractNet occasionally confused "reaching behind" with "talking to passenger(s)", which the summary attributes to similar head positions, and "talking (right hand)" with "hair and makeup." The study also confirmed that classification accuracy correlates positively with the volume of training data.

The significance of this work lies in a lightweight, highly accurate CNN model for distracted driver detection. By achieving higher accuracy than established pre-trained networks while requiring substantially less storage, DistractNet is a viable candidate for integration into embedded systems and vehicle electronic control units. The authors suggest that future work focus on real-time implementation and integration with vehicle networks to give distracted drivers immediate auditory or textual warnings, potentially reducing accident rates.
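The confusion-matrix analysis mentioned in the summary can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: the helper names (`confusion_matrix`, `accuracy`, `most_confused_pairs`), the class index order, and the toy labels are all assumptions made for illustration, and the numbers it produces are not results from the paper.

```python
# Minimal sketch (assumption: not the authors' code) of reading a confusion
# matrix to find overall accuracy and the class pairs confused most often.

# Hypothetical index order for the ten classes named in the summary.
CLASSES = [
    "safe driving",              # 0
    "texting (right hand)",      # 1
    "talking (right hand)",      # 2
    "texting (left hand)",       # 3
    "talking (left hand)",       # 4
    "operating the radio",       # 5
    "drinking",                  # 6
    "reaching behind",           # 7
    "hair and makeup",           # 8
    "talking to passenger(s)",   # 9
]


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def most_confused_pairs(matrix, top_k=3):
    """Return up to top_k (class_a, class_b, errors) tuples, where errors
    counts misclassifications in both directions between the two classes."""
    pairs = []
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            errors = matrix[i][j] + matrix[j][i]
            if errors:
                pairs.append((i, j, errors))
    pairs.sort(key=lambda item: (-item[2], item[0], item[1]))
    return pairs[:top_k]


# Toy labels invented for this example; they are NOT data from the paper.
y_true = [0, 0, 7, 7, 7, 9, 9, 9, 8, 8, 2, 2]
y_pred = [0, 0, 7, 9, 7, 9, 7, 9, 8, 2, 2, 2]

cm = confusion_matrix(y_true, y_pred, len(CLASSES))
print(f"accuracy: {accuracy(cm):.2%}")
for a, b, errors in most_confused_pairs(cm):
    print(f"{CLASSES[a]} <-> {CLASSES[b]}: {errors} error(s)")
```

Counting errors in both directions for each pair is a design choice that matches how the summary reports confusions as unordered pairs; a per-direction breakdown would simply read the individual off-diagonal cells.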

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.

| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | Crossref | | | 1 | 2026-06-24 |
| archive | success | canonical_url | | | 1 | 2026-06-26 |
| extract | success | cached | | | 2 | 2026-06-26 |
| clean | success | clean | | | 1 | 2026-06-25 |
| chunk | success | chunk | | | 1 | 2026-06-25 |
| embed | success | embed | Qwen/Qwen3-Embedding-8B | | 1 | 2026-06-25 |
| promote | success | | | | 1 | 2026-06-24 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 1 | 2026-06-26 |
| tag | success | vector_similarity | | | 6 | 2026-06-25 |
| verify | success | | | | 1 | 2026-06-26 |

Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.
