Comparison of Different Response Time Outlier Exclusion Methods: A Simulation Study

Berger, Alexander; Kiefer, Markus · 2021 · OpenAlex

DOI: 10.3389/fpsyg.2021.675558

archive: archived · pipeline: cataloged · verified


Summary

This simulation study addresses the lack of consensus on how best to exclude response time (RT) outliers in cognitive psychology. Outlier exclusion is standard practice for improving signal-to-noise ratios, but the available methods rest on different assumptions, and it remains unclear which one best recovers the uncontaminated RT distribution without introducing statistical bias. The authors compared ten outlier exclusion methods against a baseline of no exclusion. They defined bias as the deviation in the proportion of significant statistical differences relative to valid, outlier-free data.

The simulation modeled RTs with Ex-Gaussian distributions, varying sample size (20–100 trials), mean, standard deviation, and exponential decay rate to support generalizability. For each of 101 possible population mean differences (0–100 ms), 5,000 pairs of samples were simulated. Outliers were introduced by replacing valid RTs, with the proportion of outliers ranging from 0% to 10%. Two outlier generation approaches were tested: one placed outliers at the distribution tails, and the other inserted outliers that overlapped with the genuine distribution. The ten exclusion methods included absolute cutoffs, relative cutoffs based on the mean ±2 or ±3 standard deviations, Tukey’s interquartile range method, quantile-based exclusions, the Median Absolute Deviation (MAD) method, and a transformation-based approach.

The results revealed substantial differences in bias among the exclusion methods. Some methods produced Type-I error rates high enough to make them unsuitable for use. Methods based on z-scores or standard deviations, by contrast, introduced only small biases and recovered the uncontaminated distribution relatively well. Applying no exclusion at all produced the largest absolute bias, confirming that outliers distort statistical outcomes. Outlier exclusion is therefore necessary, but the choice of method strongly affects the accuracy of statistical inference, with MAD and z-score thresholds performing better than the alternatives.

The work offers empirical guidance for researchers analyzing RT data. By quantifying the bias introduced by each exclusion technique, it challenges the notion that outlier exclusion is uniformly beneficial or uniformly harmful. It suggests that researchers avoid methods prone to high Type-I error rates and consider z-score or MAD-based approaches to minimize bias. These findings give mental chronometry a data-driven basis for selecting outlier exclusion strategies, which should improve the reliability and validity of findings in cognitive psychology research.
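
To make the exclusion rules named in the summary concrete, here is a minimal Python sketch of three of them (mean ± k SD, Tukey's interquartile range, and MAD) applied to simulated Ex-Gaussian RTs. It is an illustration, not the authors' code: the function names, the distribution parameters, the uniform 1500–3000 ms outlier range, and the thresholds (k = 3 for SD and MAD, 1.5 for Tukey) are all assumptions chosen for this example.

```python
import numpy as np

def simulate_exgaussian(rng, n, mu=500.0, sigma=50.0, tau=100.0):
    """Ex-Gaussian RT = Normal(mu, sigma) + Exponential(mean tau).
    The default parameter values are illustrative assumptions."""
    return rng.normal(mu, sigma, n) + rng.exponential(tau, n)

def contaminate(rng, rts, proportion, low=1500.0, high=3000.0):
    """Replace a proportion of valid RTs with slow outliers.
    Drawing outliers uniformly from [low, high] is an assumption."""
    out = rts.copy()
    k = int(round(proportion * len(out)))
    idx = rng.choice(len(out), size=k, replace=False)
    out[idx] = rng.uniform(low, high, k)
    return out

def keep_sd(rts, k=3.0):
    """Keep RTs within mean +/- k sample standard deviations."""
    z = (rts - rts.mean()) / rts.std(ddof=1)
    return rts[np.abs(z) <= k]

def keep_tukey(rts, k=1.5):
    """Keep RTs inside Tukey's fences: [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(rts, [25, 75])
    iqr = q3 - q1
    return rts[(rts >= q1 - k * iqr) & (rts <= q3 + k * iqr)]

def keep_mad(rts, k=3.0):
    """Keep RTs within k scaled MADs of the median.
    1.4826 is the conventional scaling for normally distributed data."""
    med = np.median(rts)
    mad = 1.4826 * np.median(np.abs(rts - med))
    return rts[np.abs(rts - med) <= k * mad]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    clean = simulate_exgaussian(rng, 100)
    dirty = contaminate(rng, clean, proportion=0.05)
    for name, rule in [("SD", keep_sd), ("Tukey", keep_tukey), ("MAD", keep_mad)]:
        print(name, "kept", len(rule(dirty)), "of", len(dirty))
```

One design point worth noting: the mean and standard deviation are themselves inflated by outliers, so an SD-based rule can fail to flag an extreme value in a small sample, whereas the median-based rules are less affected. Which rule yields the least bias is the empirical question the paper addresses; this sketch does not answer it.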

Key finding

Outlier exclusion methods based on z-scores or standard deviations introduced minimal bias compared to the large bias observed when no exclusion was applied, whereas some other methods produced unacceptably high Type-I error rates.
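
The bias measure behind this finding, the difference in the proportion of significant results between treated and outlier-free data, can be sketched as below. This is a rough illustration under stated assumptions, not the paper's procedure: the two-sample Welch statistic with a normal approximation to the p-value, the Ex-Gaussian parameters, the outlier range, the ±2 SD rule, and every function name are hypothetical choices made for this example.

```python
import math
import numpy as np

def exgauss(rng, n, mu, sigma=50.0, tau=100.0):
    """Ex-Gaussian sample; sigma and tau defaults are assumptions."""
    return rng.normal(mu, sigma, n) + rng.exponential(tau, n)

def add_outliers(rng, rts, proportion, low=1500.0, high=3000.0):
    """Replace a proportion of RTs with uniform slow outliers (assumed range)."""
    out = rts.copy()
    k = int(round(proportion * len(out)))
    idx = rng.choice(len(out), size=k, replace=False)
    out[idx] = rng.uniform(low, high, k)
    return out

def keep_sd(rts, k=2.0):
    """Keep RTs within mean +/- k sample standard deviations."""
    z = (rts - rts.mean()) / rts.std(ddof=1)
    return rts[np.abs(z) <= k]

def significant(a, b, alpha=0.05):
    """Welch statistic with a normal approximation to the two-sided
    p-value. The choice of test is an assumption of this sketch."""
    se = math.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    z = (a.mean() - b.mean()) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return p < alpha

def proportion_significant(rng, diff, n_pairs, n_trials,
                           outlier_prop=0.0, exclude=None):
    """Share of simulated sample pairs whose means differ significantly."""
    hits = 0
    for _ in range(n_pairs):
        a = exgauss(rng, n_trials, 500.0)
        b = exgauss(rng, n_trials, 500.0 + diff)
        if outlier_prop > 0:
            a = add_outliers(rng, a, outlier_prop)
            b = add_outliers(rng, b, outlier_prop)
        if exclude is not None:
            a, b = exclude(a), exclude(b)
        hits += significant(a, b)
    return hits / n_pairs

def bias(seed, diff, exclude, n_pairs=500, n_trials=50, outlier_prop=0.05):
    """Proportion significant after contamination and exclusion, minus the
    proportion significant for outlier-free data."""
    clean = proportion_significant(
        np.random.default_rng(seed), diff, n_pairs, n_trials)
    treated = proportion_significant(
        np.random.default_rng(seed), diff, n_pairs, n_trials,
        outlier_prop, exclude)
    return treated - clean

if __name__ == "__main__":
    print("bias, +/-2 SD rule:", bias(0, diff=50.0, exclude=keep_sd))
    print("bias, no exclusion:", bias(0, diff=50.0, exclude=None))
```

A value near zero means the treated data yield significant results about as often as outlier-free data would. At a true difference of zero, a positive value indicates inflated Type-I errors.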

Methodology

simulation_modeling

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via openalex_abstract on 2026-05-08 (3 acquisition events logged).

Stage	Outcome	Tool	Model	Prompt	Attempts	Completed
discover	success	—	—	—	1	2026-05-07
archive	success	canonical_url	—	—	3	2026-06-06
extract	success	cached	—	—	3	2026-06-10
clean	success	clean	—	—	1	2026-06-04
chunk	success	chunk	—	—	1	2026-06-04
embed	success	embed	Qwen/Qwen3-Embedding-8B	—	1	2026-06-04
enrich	success	openalex	—	—	3	2026-05-08
promote	success	—	—	—	1	2026-05-07
summarize	success	llm	qwen3.6-27b-prismaquant	summ-v5	2	2026-06-10
tag	success	vector_similarity	—	—	15	2026-06-11
verify	success	—	—	—	2	2026-06-10

Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified.

Topics

Ranked by relevance to this paper.

perception reaction time