Advancing Dynamic Hand Gesture Recognition in Driving Scenarios with Synthetic Data
Summary
This paper addresses the challenge of creating diverse, high-quality datasets for dynamic hand gesture recognition in automotive environments. Training robust deep learning models for vehicle interaction requires large amounts of data, but collecting real-world gestures during dual-task driving scenarios is expensive, time-consuming, and prone to bias. Existing synthetic data approaches often focus on static gestures or lack realistic environmental variation. To overcome these limitations, the authors introduce SynthoGestures, a framework that generates synthetic dynamic hand gestures using Unreal Engine and 3D models, with the aim of improving model generalization and reducing reliance on extensive real-data collection.
The SynthoGestures framework automates the generation of gesture datasets by iterating through customizable parameters: camera types (RGB, infrared, depth), camera positions (e.g., top-view, behind the wheel), and gesture performance variations (speed, finger spacing, hand shape). The system simulates realistic sensor noise, such as depth camera artifacts and infrared Fresnel effects, to enhance data fidelity. Gestures are executed using inverse kinematics and spline-based movement paths to ensure natural motion.
The authors evaluated the framework using the NVIDIA Dynamic Hand Gesture Dataset and a state-of-the-art recognition model. They generated 600 synthetic gesture videos across six gesture classes with varied parameters. Experiments compared a baseline trained on real data only against models pre-trained on synthetic data and models trained simultaneously on mixed synthetic and real data, testing various ratios of synthetic to real data.
The results indicate that adding synthetic data improves gesture recognition accuracy. Models trained exclusively on synthetic data achieved lower accuracy, but those combining synthetic and real data showed substantial improvements. Specifically, a model trained simultaneously on equal proportions of synthetic and real data achieved an accuracy of 89.58%, compared to 79.86% for a model pre-trained on synthetic data and fine-tuned on real data. The real-data-only baseline served as the lower reference point in this comparison. Ablation studies on variation ranges showed that moderate variations in speed, position, and finger spacing gave the best performance, while extreme ranges could disrupt the features the model relies on for recognition. The framework generated natural-looking dynamic gestures that augmented real datasets effectively.
The significance of this work lies in providing a cost-effective, flexible tool for generating large-scale synthetic datasets tailored to automotive human-machine interfaces. By simulating diverse camera setups and gesture variations without additional hardware costs, SynthoGestures supports the development of more robust and generalizable gesture recognition systems. The findings suggest that synthetic data can replace part, and potentially all, of the real data in training pipelines, which helps mitigate overfitting and dataset bias. This approach shortens the development of in-vehicle interaction technologies by reducing the time and effort required for data collection while maintaining high recognition performance.
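The parameter iteration described in the summary can be pictured with a small sketch. This is not the SynthoGestures API, which this entry does not describe; `GestureConfig`, `build_configs`, the field names, and the ±20% variation range are all hypothetical, chosen only to show how a sweep over camera types, camera positions, and sampled gesture variations might be enumerated before rendering.

```python
import itertools
import random
from dataclasses import dataclass

# Hypothetical option lists, named after the parameters in the summary.
CAMERA_TYPES = ("rgb", "infrared", "depth")
CAMERA_POSITIONS = ("top_view", "behind_wheel")


@dataclass(frozen=True)
class GestureConfig:
    """One render job: which gesture, which camera, and how it is performed."""
    gesture_class: int
    camera_type: str
    camera_position: str
    speed_scale: float           # 1.0 = nominal gesture speed (assumed convention)
    finger_spacing_scale: float  # 1.0 = nominal finger spacing (assumed convention)


def build_configs(num_classes, samples_per_combo, variation=0.2, seed=0):
    """Enumerate every class/camera/position combination and sample
    performance variations uniformly within +/- `variation` of nominal."""
    rng = random.Random(seed)  # seeded so the sweep is reproducible
    configs = []
    for cls, cam, pos in itertools.product(
        range(num_classes), CAMERA_TYPES, CAMERA_POSITIONS
    ):
        for _ in range(samples_per_combo):
            configs.append(
                GestureConfig(
                    gesture_class=cls,
                    camera_type=cam,
                    camera_position=pos,
                    speed_scale=rng.uniform(1 - variation, 1 + variation),
                    finger_spacing_scale=rng.uniform(1 - variation, 1 + variation),
                )
            )
    return configs


configs = build_configs(num_classes=6, samples_per_combo=2)
print(len(configs))  # 6 classes * 3 camera types * 2 positions * 2 samples = 72
```

Keeping the variation range as a single argument mirrors the ablation finding in the summary: widening it is a one-line change, so moderate and extreme ranges can be compared without touching the rest of the sweep.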
Provenance
The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via openalex_abstract on 2026-05-08 (3 acquisition events logged).
| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | — | — | — | 1 | 2026-05-07 |
| archive | success | canonical_url | — | — | 10 | 2026-06-09 |
| extract | success | cached | — | — | 2 | 2026-06-09 |
| clean | success | clean | — | — | 1 | 2026-06-04 |
| chunk | success | chunk | — | — | 1 | 2026-06-04 |
| embed | success | embed | Qwen/Qwen3-Embedding-8B | — | 1 | 2026-06-04 |
| enrich | success | openalex | — | — | 2 | 2026-05-08 |
| promote | success | — | — | — | 1 | 2026-05-07 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 1 | 2026-06-09 |
| tag | success | vector_similarity | — | — | 15 | 2026-06-11 |
| verify | success | — | — | — | 1 | 2026-06-09 |
Summary generated by qwen3.6-27b-prismaquant on 2026-06-09; verification: verified.
Information type
What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).
- Methodological Resource: tool software