Towards a Common Understanding of Driving Simulator Validity
archive: archived pipeline: cataloged verified
Summary
This paper addresses the lack of a common understanding of driving simulator validity within the Automotive User Interface (AutoUI) community. Although driving simulators are widely used for their controllability and safety, there is no consensus on when a simulator should be considered valid or how validity should be investigated. This ambiguity hinders the correct interpretation of findings and the comparison of results across studies. The authors aim to refine the definition of simulator validity and to provide a framework and recommendations for evaluating simulator setups against specific research questions, rather than treating validity as a global property of the hardware.
The authors conducted a literature-based discussion reviewing definitions of validity proposed over the past four decades. They synthesized these concepts into a framework that distinguishes physical validity (the correspondence of the simulated environment to the real world) from behavioral validity (the correspondence of driver behavior in the simulator to real-world behavior). Behavioral validity is further subdivided into absolute validity (numerical correspondence of data) and relative validity (directional correspondence of effects). The paper analyzes factors influencing these constructs, including simulator properties (e.g., motion systems, visualization hardware), individual factors (e.g., driving style, susceptibility to simulator sickness), and mediating factors such as perception and sense of presence. The authors also critique methodological issues in existing validation studies, particularly the misuse of null-hypothesis significance tests to claim validity from non-significant results.
The main findings highlight that validity is use-case dependent: a simulator valid for one research question may not be valid for another. The authors argue that behavioral validity is the key metric for most HCI research, while physical validity is more relevant for tasks such as tuning driving dynamics. They find that higher fidelity does not automatically yield better validity, because perceptual biases or cue mismatches can distort behavior. They also emphasize that statistical validity requires adequate power and the use of equivalence tests or Bayesian hypothesis tests to distinguish true equivalence from a lack of evidence due to low power. Interfering factors, such as simulator sickness and perceptual biases, can reduce validity regardless of the simulator's physical realism.
The significance of this work lies in its practical recommendations for the AutoUI community. The authors propose that researchers define the relevant effects and use cases before selecting a simulator, using standardized tables to map simulator properties to research requirements. They recommend adopting standards for describing simulator setups to improve comparability, and they suggest Bayesian methods or equivalence tests for validation studies. By shifting the focus from global simulator labels to context-specific validity assessments, the paper aims to foster a common understanding and more rigorous evaluation practices in driving simulation research.
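
The point about non-significant results can be illustrated with a small sketch. It is not taken from the paper: it is a minimal example of a two one-sided tests (TOST) equivalence procedure, one common form of the equivalence tests the summary mentions. It uses a normal approximation instead of the t distribution so that it runs on the Python standard library alone. The function names, the sample speeds, and the equivalence margin of 1 km/h are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def _diff_and_se(sim, real):
    """Difference in means and its standard error (unequal variances)."""
    diff = mean(sim) - mean(real)
    se = sqrt(variance(sim) / len(sim) + variance(real) / len(real))
    return diff, se


def difference_p_value(sim, real):
    """Two-sided test of H0: no difference (normal approximation)."""
    diff, se = _diff_and_se(sim, real)
    return 2.0 * (1.0 - NormalDist().cdf(abs(diff) / se))


def tost_equivalence(sim, real, margin, alpha=0.05):
    """TOST: claim equivalence only if the difference in means is shown
    to lie inside (-margin, +margin). Returns (p_value, equivalent)."""
    diff, se = _diff_and_se(sim, real)
    z = NormalDist()
    # H0a: diff <= -margin; rejected when diff is clearly above -margin.
    p_lower = 1.0 - z.cdf((diff + margin) / se)
    # H0b: diff >= +margin; rejected when diff is clearly below +margin.
    p_upper = z.cdf((diff - margin) / se)
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


# Hypothetical mean speeds in km/h (illustrative values, not study data).
noisy_sim = [40.0, 60.0, 45.0, 58.0]
noisy_real = [55.0, 42.0, 61.0, 44.0]
tight_sim = [50.1, 49.9, 50.2, 49.8, 50.0, 50.1, 49.9, 50.0]
tight_real = [50.0, 50.1, 49.9, 50.0, 50.2, 49.8, 50.1, 49.9]

for label, sim, real in [("noisy", noisy_sim, noisy_real),
                         ("tight", tight_sim, tight_real)]:
    p_diff = difference_p_value(sim, real)
    p_eq, equivalent = tost_equivalence(sim, real, margin=1.0)
    print(f"{label}: difference p={p_diff:.3f}, "
          f"TOST p={p_eq:.3f}, equivalent={equivalent}")
```

In both hypothetical data sets the difference test is non-significant, but only the low-variance set supports an equivalence claim. The small, noisy sample simply lacks the evidence either way, which is the distinction between true equivalence and low power described above. For real analyses with small samples, a t-based TOST and a margin justified by the research question would be the more appropriate choice.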
Provenance
The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.
| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | Crossref | — | — | 1 | 2026-06-06 |
| archive | success | semantic_scholar | — | — | 6 | 2026-06-09 |
| extract | success | cached | — | — | 2 | 2026-06-10 |
| clean | success | clean | — | — | 1 | 2026-06-09 |
| chunk | success | chunk | — | — | 1 | 2026-06-09 |
| embed | success | embed | Qwen/Qwen3-Embedding-8B | — | 1 | 2026-06-09 |
| enrich | success | semantic_scholar | — | — | 1 | 2026-06-09 |
| promote | success | — | — | — | 1 | 2026-06-06 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 1 | 2026-06-10 |
| tag | success | vector_similarity | — | — | 8 | 2026-06-11 |
| verify | partial | — | — | — | 1 | 2026-06-10 |
Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified_with_issues.
Information type
What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).
- Methodological Resource: validation psychometrics, tool software
- Theoretical Contribution: computational model