Predicting Take-over Time for Autonomous Driving with Real-World Data: Robust Data Augmentation, Models, and Evaluation

Rangesh, Akshay; Deo, Nachiket; Greer, Ross; Gunaratne, Pujitha; Trivedi, Mohan M. · 2021 · arXiv (Cornell University)

DOI: 10.48550/arxiv.2107.12932

archive: archived · pipeline: cataloged · verified


Summary

This paper addresses the challenge of predicting Take-Over Time (TOT) in conditionally autonomous vehicles, a critical factor for ensuring safe control transitions between the vehicle and the human driver. The authors identify that existing TOT prediction models are limited by the scarcity of real-world data, as capturing actual takeover events is costly and time-consuming. Furthermore, prior research often relies on simulator data or "Wizard of Oz" setups, which may not reflect real-world driver behavior. The study aims to develop robust TOT prediction models using real-world data, introducing a novel data augmentation scheme to overcome dataset limitations and employing multimodal outputs to account for the uncertainty in driver reactions.

The researchers conducted a Controlled Data Study (CDS) using a Tesla Model S testbed equipped with driver-facing cameras. They collected 1,375 takeover events from 89 subjects who performed various secondary activities, such as texting, phone calls, and reading, while driving. For each event, they manually annotated the time required for the driver to place their eyes on the road, hands on the wheel, and foot on the pedal after a Take-Over Request (TOR). To address the limited dataset size, the authors proposed a data augmentation technique that generates synthetic training samples by shifting the TOR timestamp within the recorded event, creating intermediate states of driver readiness.

Feature extraction utilized Convolutional Neural Networks (CNNs) to analyze gaze, hand, and foot activity from video feeds. These features were then fed into Long Short-Term Memory (LSTM) models to predict TOT. The study evaluated several architectures, including a baseline LSTM, independent LSTMs for each body part, and models with multimodal outputs. The results demonstrated that models trained on the augmented dataset significantly outperformed those trained on the original raw data. The augmentation scheme effectively increased the number of training samples by an order of magnitude, improving model generalization. Ablation experiments revealed that incorporating hand activity features was particularly crucial for accurate TOT prediction, as hand movements often lag behind eye and foot movements. The multimodal LSTM model, which outputs a distribution over possible TOTs rather than a single value, provided more robust estimates by accounting for the inherent uncertainty in driver behavior. The study also found that more distracting secondary activities, such as texting and reading, resulted in longer takeover times, while activities like talking to a passenger had less impact.

The significance of this work lies in its contribution to the development of safer autonomous driving systems. By providing a method to generate sufficient training data from limited real-world recordings, the authors enable the training of more accurate and robust TOT prediction models. The proposed multimodal approach offers a nuanced understanding of driver readiness, allowing autonomous systems to make better-informed decisions about when to issue takeover requests or deploy active safety measures. This research bridges the gap between simulator-based studies and real-world deployment, offering a practical framework for modeling driver-vehicle interactions in complex, real-world scenarios.
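
The TOR-shifting augmentation described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the `TakeoverEvent` record, the `augment_by_tor_shift` function, the fixed `step` size, and the rule that each remaining time is reduced by the shift and clamped at zero are all assumptions made for illustration. The idea shown is only that moving the TOR later inside a recorded event yields a new sample in which the driver is already partway through the takeover.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TakeoverEvent:
    """Hypothetical record for one annotated takeover event (times in seconds)."""
    tor_time: float        # TOR timestamp within the recording
    eyes_on_road: float    # time after TOR until eyes are on the road
    hands_on_wheel: float  # time after TOR until hands are on the wheel
    foot_on_pedal: float   # time after TOR until foot is on the pedal


def augment_by_tor_shift(event: TakeoverEvent, step: float = 0.5) -> List[TakeoverEvent]:
    """Create synthetic samples by moving the TOR later in fixed steps.

    Assumption: shifting the TOR by `shift` seconds reduces each remaining
    time by `shift`, clamped at zero once that body part is already ready.
    Shifts stop before the slowest body part has finished.
    """
    longest = max(event.eyes_on_road, event.hands_on_wheel, event.foot_on_pedal)
    samples: List[TakeoverEvent] = []
    k = 1
    while k * step < longest:
        shift = k * step
        samples.append(
            TakeoverEvent(
                tor_time=event.tor_time + shift,
                eyes_on_road=max(event.eyes_on_road - shift, 0.0),
                hands_on_wheel=max(event.hands_on_wheel - shift, 0.0),
                foot_on_pedal=max(event.foot_on_pedal - shift, 0.0),
            )
        )
        k += 1
    return samples


if __name__ == "__main__":
    original = TakeoverEvent(tor_time=10.0, eyes_on_road=1.0,
                             hands_on_wheel=2.0, foot_on_pedal=1.5)
    for sample in augment_by_tor_shift(original, step=0.5):
        print(sample)
```

In this sketch one recorded event with a slowest response of 2.0 s and a 0.5 s step produces three synthetic samples; a smaller step produces more, which is one way a dataset could grow by roughly an order of magnitude. In a real pipeline the input feature window would also need to be re-cut so that it ends at the shifted TOR.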

Key finding

Models trained on an augmented dataset of real-world take-over events, utilizing multimodal driver behavior features, significantly outperform models trained on the original limited dataset in predicting take-over times.

Methodology

lab_experiment

Sample size: 89

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via author_sweep_intake on 2026-05-28.

Stage      Outcome  Tool               Model                    Prompt   Attempts  Completed
discover   success  author_sweep       -                        -        2         2026-05-28
archive    success  canonical_url      -                        -        1         2026-06-04
extract    success  cached             -                        -        3         2026-06-10
clean      success  clean              -                        -        1         2026-06-04
chunk      success  chunk              -                        -        1         2026-06-04
embed      success  embed              Qwen/Qwen3-Embedding-8B  -        1         2026-06-04
enrich     success  -                  -                        -        1         2026-05-28
promote    success  -                  -                        -        1         2026-06-04
summarize  success  llm                qwen3.6-27b-prismaquant  summ-v5  2         2026-06-10
tag        success  vector_similarity  -                        -        15        2026-06-11
verify     success  -                  -                        -        2         2026-06-10

Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified.
