Teaching Multimodal Interaction in Cars to First-time Users

Marinissen, Thomas; Glimmann, Jonas; Bazilinskyy, Pavlo · 2026 · Crossref

DOI: 10.54941/ahfe1007155

Summary

This study investigates the effectiveness of proactive teaching methods for multimodal gaze and gesture interactions in SAE Level 5 automated vehicles. As automation increases, users engage in non-driving-related tasks, necessitating alternative interaction modalities like gaze and mid-air gestures that do not rely on physical reach or fixed postures. However, the complexity of these multimodal systems risks overwhelming users and reducing feature discoverability. The research addresses the need for effective onboarding by comparing three variations of proactive visual teaching pop-ups designed to instruct first-time users on how to operate these novel interfaces.

The experimental design involved 30 adult participants in a driving simulator setup simulating a luxury sedan interior with reclined seating. Participants were exposed to one of three teaching conditions while performing secondary tasks: Condition 1 (C1) used a small, side-placed pop-up requiring active user input to view details; Condition 2 (C2) used a larger, centrally placed pop-up also requiring user input; and Condition 3 (C3) used the largest pop-up, automatically displayed on the instrument cluster without requiring user input. The study measured notice rates, interaction rates, and user satisfaction using NASA-TLX, Acceptance, and Kano scales, alongside a custom questionnaire assessing visual properties and content clarity.

Results indicated that C3 was the most effective teaching method, yielding the highest notice rate (28 out of 30 participants) and the highest interaction rate (12 interactions). In contrast, C1 resulted in zero interactions, implying no learning occurred, while C2 yielded only one interaction. Statistical analysis revealed significant differences in visibility, size, and duration ratings between conditions, with C3 scoring highest on visibility. Although C3 and C2 were preferred over C1 in rankings, there were no significant differences in overall acceptance or satisfaction scores across the conditions, suggesting users valued the concept of proactive teaching regardless of the specific implementation. Additionally, the mute gesture was the most preferred interaction, while play and pinch gestures were least liked due to recognition difficulties and complex animations.

The findings conclude that proactive teaching of multimodal interactions is well-received by users and can significantly improve user experience in future automated vehicles. Effectiveness is driven by visual salience, size, and the reduction of user effort, with automatic, high-visibility presentations proving superior to those requiring active exploration. The study highlights that simple gestures paired with clear animations facilitate learning, while complex gestures hinder adoption. These results suggest that automotive interfaces should prioritize clear, visible, and minimally effortful proactive guidance to ensure users discover and correctly utilize new interaction modalities.

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.

Stage      Outcome  Tool               Model                    Prompt   Attempts  Completed
discover   success  Crossref           -                        -        1         2026-06-24
archive    success  canonical_url      -                        -        1         2026-06-26
extract    success  cached             -                        -        2         2026-06-26
clean      success  clean              -                        -        1         2026-06-25
chunk      success  chunk              -                        -        1         2026-06-25
embed      success  embed              Qwen/Qwen3-Embedding-8B  -        1         2026-06-25
promote    success  -                  -                        -        1         2026-06-24
summarize  success  llm                qwen3.6-27b-prismaquant  summ-v5  1         2026-06-26
tag        success  vector_similarity  -                        -        6         2026-06-25
verify     success  -                  -                        -        1         2026-06-26

Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.
