The Quality of Response Time Data Inference: A Blinded, Collaborative Assessment of the Validity of Cognitive Models
DOI: 10.3758/s13423-017-1417-2
Archive: archived. Pipeline: cataloged, verified.
Summary
This paper addresses the validity of inferences drawn from cognitive models of response time (RT) data, specifically evidence-accumulation models such as the diffusion model and the Linear Ballistic Accumulator (LBA) model. These models translate observed RTs and accuracy into latent psychological constructs: ease of processing, response caution, response bias, and non-decision time. The authors highlight a critical threat to validity, "researcher degrees of freedom": analysts face numerous arbitrary choices regarding model selection, estimation methods, and inference procedures, and these choices may bias conclusions. Previous validation studies provided mixed support for the convergent and discriminant validity of these models, but they often lacked blinding and tested only a single method. This study assesses how robust model-based inferences are to these analytical choices in a realistic, collaborative setting.

To test this, the authors conducted a blinded, collaborative assessment in which 17 teams of experts analyzed the same 14 two-condition data sets. The data were generated from a random dot motion task performed by 20 participants. The experimental design manipulated three factors: stimulus difficulty (easy vs. hard), response caution (speed vs. accuracy emphasis instructions), and response bias (balanced vs. skewed stimulus probabilities). Crucially, the teams were blind to the specific manipulations in each data set. They were tasked with inferring which psychological construct (ease, caution, bias, or non-decision time) differed between the two conditions, using their preferred models and analytical methods. This design allowed the authors to evaluate the validity of inferences across a wide range of currently popular analytical approaches, guarding against the bias that occurs when analysts know the expected results.
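To make the four constructs named above concrete, here is a minimal sketch that simulates a basic diffusion process with Euler steps and changes one parameter at a time. It is an illustration under stated assumptions, not the authors' procedure: the function names `simulate_diffusion` and `summarize`, the condition labels, and all parameter values are hypothetical choices made for this example.

```python
import math
import random
from statistics import mean


def simulate_diffusion(n_trials, drift, boundary, start_frac=0.5,
                       non_decision=0.3, noise_sd=0.1, dt=0.001, seed=0):
    """Simulate two-choice trials from a basic diffusion process.

    Returns a list of (hit_upper, rt_seconds) tuples. All defaults are
    illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    step_sd = noise_sd * math.sqrt(dt)  # noise added per Euler step
    trials = []
    for _ in range(n_trials):
        x = start_frac * boundary  # starting point encodes response bias
        t = 0.0
        while 0.0 < x < boundary:  # accumulate until a boundary is crossed
            x += drift * dt + step_sd * rng.gauss(0.0, 1.0)
            t += dt
        # Observed RT = decision time + non-decision time
        trials.append((x >= boundary, t + non_decision))
    return trials


def summarize(trials):
    """Proportion of upper-boundary responses and mean RT in seconds."""
    return {
        "p_upper": mean(1.0 if hit else 0.0 for hit, _ in trials),
        "mean_rt": mean(rt for _, rt in trials),
    }


# Each hypothetical condition changes one parameter relative to the baseline.
conditions = {
    "baseline": dict(drift=0.1, boundary=0.12),
    "easier stimulus": dict(drift=0.3, boundary=0.12),      # ease of processing
    "more caution": dict(drift=0.1, boundary=0.20),         # response caution
    "biased start": dict(drift=0.1, boundary=0.12, start_frac=0.7),  # bias
    "slower non-decision": dict(drift=0.1, boundary=0.12, non_decision=0.4),
}
results = {name: summarize(simulate_diffusion(800, **params))
           for name, params in conditions.items()}

for name, stats in results.items():
    print(f"{name:20s} p_upper={stats['p_upper']:.3f} "
          f"mean_rt={stats['mean_rt']:.3f}")
```

The simulation runs from parameters to data. The teams in the study faced the reverse problem: given only RTs and accuracy from two conditions, decide which parameter changed.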
The results demonstrated that while conclusions were generally similar across different methods, the "modeler’s degrees of freedom" did affect the specific inferences drawn. Notably, the study found that simpler analytical approaches and models yielded inferences that were as robust and accurate as those from more complex methods. The blinded nature of the study ensured that the validity assessment was not confounded by analysts tailoring their choices to match known outcomes. The findings suggest that the choice of model or estimation technique has less impact on the validity of high-level inferences than previously feared, provided the models are applied correctly.

The significance of this work lies in its recommendation for standardizing RT data analysis. The authors argue that cognitive models should become a typical analysis tool for response time data, particularly in standard experimental designs. They conclude that simpler models and procedures are often sufficient and may be preferable due to their robustness. The paper also outlines situations where more complicated models are necessary and discusses potential pitfalls in interpreting model outputs. By demonstrating that valid inferences can be drawn despite analytical variability, the study supports the utility of cognitive modeling while advocating for transparency and the use of simpler, well-validated methods to minimize researcher bias.
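The summary does not say which simpler procedures the teams used. As one illustration of what a simple procedure can look like, the sketch below implements the closed-form EZ-diffusion equations (Wagenmakers, van der Maas, & Grasman, 2007), which estimate drift rate, boundary separation, and non-decision time from three summary statistics. Choosing EZ-diffusion here is an assumption made for illustration; the function name `ez_diffusion` and the input values are hypothetical and are not data from the study. Note that this method assumes an unbiased starting point, so it cannot detect a change in response bias.

```python
import math


def ez_diffusion(prop_correct, rt_variance, rt_mean, s=0.1):
    """Closed-form EZ-diffusion estimates from summary statistics.

    prop_correct: proportion of correct responses (not 0, 0.5, or 1)
    rt_variance:  variance of correct RTs, in seconds squared
    rt_mean:      mean of correct RTs, in seconds
    s:            scaling parameter (0.1 by convention)
    Returns (drift, boundary, non_decision).
    """
    if not 0.0 < prop_correct < 1.0 or prop_correct == 0.5:
        raise ValueError("prop_correct must lie in (0, 1) and differ from 0.5")
    if rt_variance <= 0.0:
        raise ValueError("rt_variance must be positive")

    logit = math.log(prop_correct / (1.0 - prop_correct))
    x = logit * (logit * prop_correct ** 2 - logit * prop_correct
                 + prop_correct - 0.5) / rt_variance
    drift = math.copysign(1.0, prop_correct - 0.5) * s * x ** 0.25
    boundary = s ** 2 * logit / drift

    # Mean decision time implied by the estimated drift and boundary
    y = -drift * boundary / s ** 2
    mean_decision_time = (boundary / (2.0 * drift)) * \
        (1.0 - math.exp(y)) / (1.0 + math.exp(y))
    non_decision = rt_mean - mean_decision_time
    return drift, boundary, non_decision


# Illustrative summary statistics for one hypothetical condition.
drift, boundary, non_decision = ez_diffusion(
    prop_correct=0.802, rt_variance=0.112, rt_mean=0.723)
print(f"drift={drift:.4f} boundary={boundary:.4f} "
      f"non_decision={non_decision:.4f}")
```

Applying the function to each of two conditions and comparing the three estimates gives a rough indication of whether ease, caution, or non-decision time differs, without any iterative model fitting.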
Key finding
Simpler cognitive modeling approaches yielded inferences that were as robust and accurate as more complex methods, though researcher degrees of freedom did affect the specific conclusions drawn.
Methodology
dataset
Sample size: 20
Provenance
The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via author_sweep_intake on 2026-05-28.
| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | author_sweep | — | — | 2 | 2026-05-28 |
| archive | success | canonical_url | — | — | 1 | 2026-06-04 |
| extract | success | cached | — | — | 3 | 2026-06-10 |
| clean | success | clean | — | — | 1 | 2026-06-04 |
| chunk | success | chunk | — | — | 1 | 2026-06-04 |
| embed | success | embed | Qwen/Qwen3-Embedding-8B | — | 1 | 2026-06-04 |
| enrich | success | — | — | — | 1 | 2026-05-28 |
| promote | success | — | — | — | 1 | 2026-06-04 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 2 | 2026-06-10 |
| tag | success | vector_similarity | — | — | 15 | 2026-06-11 |
| verify | success | — | — | — | 2 | 2026-06-10 |
Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified.
Information type
What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).
- Empirical Findings: behavioral performance data
- Theoretical Contribution: computational model