Driver Monitoring System Using Computer Vision for Real-Time Detection of Fatigue, Distraction and Emotion via Facial Landmarks and Deep Learning.

Zambrano, Tamia; Arias, Luis; Haro, Edgar; Santos, Victor; Trujillo-Guerrero, María · 2026 · PubMed Central (PMC)

DOI: 10.3390/s26030889

URL: https://pmc.ncbi.nlm.nih.gov/articles/PMC12899127/


Summary

Real-time driver-monitoring system combining a MobileNetV2 CNN trained on RAF-DB for emotion recognition with MediaPipe 468-point facial landmarks for fatigue and distraction detection via Eye Aspect Ratio (EAR), Mouth Aspect Ratio (MAR), gaze, and head pose. Tested with 27 participants in real and simulated driving environments. Distraction detection (head pose/gaze) reached 100% accuracy; eye-closure detection, 88.89%; yawning, 85.19%. Emotion recognition was strong for happiness (100%), anger/disgust (96.3%), and surprise (92.6%), but weak for sadness (66.7%) and failed for fear (0%) due to the subtlety of real-world expressions and dataset limitations.
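The Eye Aspect Ratio mentioned above is a standard landmark-based drowsiness signal (it is not specified in this summary exactly how the paper computes it; the sketch below uses the common six-point formulation, with p1/p4 as the horizontal eye corners and p2, p3 / p5, p6 as the upper/lower lid points):

```python
from math import dist

def eye_aspect_ratio(eye):
    """Compute EAR from six (x, y) eye landmarks ordered p1..p6.

    p1 and p4 are the horizontal corners; p2, p3 lie on the upper lid
    and p6, p5 on the lower lid. EAR drops toward 0 as the eye closes.
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = dist(p2, p6) + dist(p3, p5)   # sum of the two vertical gaps
    horizontal = dist(p1, p4)                # corner-to-corner width
    return vertical / (2.0 * horizontal)

# Illustrative landmarks (not from the paper): an open vs. a nearly closed eye
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
closed_eye = [(0, 0), (1, 0.1), (2, 0.1), (3, 0), (2, -0.1), (1, -0.1)]
```

MAR is computed analogously from mouth landmarks, with a sustained high value indicating a yawn.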

Key finding

Combining MediaPipe facial landmarks (EAR/MAR/head pose) with a MobileNetV2 emotion CNN achieved 100% distraction detection and ~85-89% drowsiness detection in real-time driving tests, though detection of subtle emotions (sadness, fear) remained poor.

Methodology

Real-time computer-vision pipeline: MediaPipe facial-landmark extraction feeding EAR/MAR thresholds and head-pose/gaze for fatigue and distraction; MobileNetV2 CNN trained on RAF-DB for seven-class emotion recognition. Evaluated in real and simulated driving conditions.

Sample size: 27 participants tested across real and simulated driving environments.
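Threshold-based fatigue detection of this kind typically fires only after the EAR stays below a cutoff for several consecutive frames, to avoid flagging normal blinks. The threshold and frame-count values below are illustrative assumptions, not the paper's reported parameters:

```python
EAR_THRESHOLD = 0.21   # assumed cutoff; the paper's exact value may differ
CONSEC_FRAMES = 15     # assumed frames of sustained closure before alerting

def detect_drowsiness(ear_series, threshold=EAR_THRESHOLD, consec=CONSEC_FRAMES):
    """Return frame indices at which a drowsiness alert first fires.

    An alert triggers when EAR has been below `threshold` for `consec`
    consecutive frames; an eye-opening resets the counter, so ordinary
    blinks (a few frames long) never trigger it.
    """
    alerts, run = [], 0
    for i, ear in enumerate(ear_series):
        if ear < threshold:
            run += 1
            if run == consec:
                alerts.append(i)
        else:
            run = 0
    return alerts

# Simulated per-frame EAR trace: open eyes, then a long closure, then open
trace = [0.30] * 5 + [0.10] * 20 + [0.30] * 3
```

The same consecutive-frame logic applies to MAR for yawn detection, with a higher-is-worse threshold instead.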

Quality score: 5 / 5

Topics