Studying Person-Specific Pointing and Gaze Behavior for Multimodal Referencing of Outside Objects from a Moving Vehicle
URL: http://arxiv.org/abs/2009.11195
Abstract
As driver assistance systems become more widespread and more sophisticated, the driver's role and their interactions with the vehicle change. Future systems will support more natural and intuitive interaction modalities, such as pointing and gaze, for selecting outside objects from a moving vehicle. We study person-specific differences in pointing and gaze behavior across different speeds and traffic conditions for multimodal referencing of outside objects.
Summary
Within-subject driving-simulator study (medium-fidelity OpenDS) of multimodal hand-pointing + eye-gaze referencing of buildings (PoIs) outside a moving vehicle. Participants drove a 40-min two-lane route at 60 km/h while pointing and looking at cued target buildings; PoI distance, distractor density, side (left vs right), and driving mode (autonomous vs manual) were manipulated within subjects. Pointing and gaze vectors were transformed into a common 1D cylindrical coordinate system and compared against ground-truth target vectors. The authors test five hypotheses on side-of-road, density, distance, driving-mode, and modality (gaze vs pointing) effects, and add a clustering analysis to support a person-specific, user-adaptive fusion approach.
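The reduction to a common 1D cylindrical coordinate system can be sketched as keeping only each vector's horizontal (azimuth) angle around the vertical axis and comparing it to the ground-truth angle toward the PoI. This is a minimal illustration, not the paper's exact implementation; the axis convention (x right, z forward) is an assumption.

```python
import numpy as np

def azimuth_deg(v):
    """Horizontal (yaw) angle of a 3D direction vector, in degrees.

    Projects the vector onto the horizontal plane (assumed axes:
    x = right, z = forward), i.e. the 1D cylindrical angle around
    the vertical axis.
    """
    x, _, z = v
    return np.degrees(np.arctan2(x, z))

def angular_error_deg(estimated, ground_truth):
    """Signed horizontal error between an estimated referencing vector
    (pointing or gaze) and the ground-truth vector to the target PoI."""
    err = azimuth_deg(estimated) - azimuth_deg(ground_truth)
    # Wrap into (-180, 180] so left/right errors keep their sign.
    return (err + 180.0) % 360.0 - 180.0

# Hypothetical example: a pointing ray 5 degrees left of a target
# that lies straight ahead of the vehicle.
target = np.array([0.0, 0.0, 1.0])
point = np.array([np.sin(np.radians(-5)), 0.0, np.cos(np.radians(-5))])
print(round(angular_error_deg(point, target), 1))  # -5.0
```

Keeping the sign (rather than taking the absolute error) is what makes side-of-road effects, such as the reported left-vs-right asymmetry, visible in the error distribution.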
Key finding
Pointing and gaze accuracy differ significantly by object side (better for right-side PoIs), distractor density, distance, and driving mode (autonomous vs manual); gaze is significantly more accurate than pointing, and individual differences are large enough to motivate a person-specific (clustering / modality-switching) fusion strategy rather than a global model.
Methodology
Within-subject counterbalanced experiment in a medium-fidelity driving simulator (OpenDS); each participant drove ~40 min at up to 60 km/h on a star-shaped two-lane route while performing 24 referencing trials per condition. Hand-pointing tracked by a Leap-Motion-style camera rig, eye gaze by an eye tracker mapped onto LCD-attached ArUco markers; PoI cued via tablet + auditory tone. Independent variables: PoI side, distance, distractor density, driving mode (manual vs autonomous). Dependent variables: pointing accuracy, gaze accuracy, pointing duration, glance phases (information / pointing / control glance). Analyses included repeated-measures ANOVA across hypotheses, k-means clustering on per-participant performance and timing features, and a 17-participant online pre-study (mean age 31.12, SD 15.47) to set PoI salience.
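The clustering step can be sketched as k-means over a per-participant feature vector of performance and timing measures. The minimal NumPy implementation and the feature values below are illustrative stand-ins, not the paper's data or exact pipeline.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Minimal k-means: assign each participant (row of X) to the nearest
    of k centers, then move each center to the mean of its members."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Squared Euclidean distance from every point to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Hypothetical per-participant features:
# [mean pointing error (deg), mean gaze error (deg), pointing duration (s)]
features = np.array([
    [4.0, 2.0, 1.2],
    [4.5, 2.2, 1.1],
    [9.0, 3.0, 2.5],
    [8.5, 3.2, 2.4],
])
labels, centers = kmeans(features, k=2)  # two behavior groups emerge
```

Participants falling into distinct clusters is what licenses the person-specific (rather than global) fusion model: each cluster, or each user, can get its own modality weighting.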
Sample size: 73 recruited; 39 retained after exclusions for technical failure (30), motion sickness (2), or improper task execution (2); plus 17-participant online pre-study for PoI salience.