Handbook of Human Performance Measures and Crew Requirements for Flight Deck Research

Rehmann, Albert J. · 1995 · ROSA P / Crew Systems Ergonomics/Human Systems Technology Information Analysis Center

archive: archived · pipeline: cataloged · verified

Summary

This handbook, commissioned by the Federal Aviation Administration (FAA) Technical Center and produced by the Crew System Ergonomics Information Analysis Center (CSERIAC), addresses the need for standardized human performance measures in flight deck research. The motivation is the increasing automation of modern aircraft, which has shifted the pilot's role from manual control to system supervision and monitoring. This shift makes crew behavior less observable and more cognitively demanding, so reliable evaluation tools are needed. The handbook has three primary objectives: to identify state-of-the-art measures of workload, situational awareness, and vigilance; to provide guidance on selecting appropriate measures for specific study classifications; and to establish criteria for pilot subject selection so that data generalize across government and industry partners.

The methodology involved an extensive literature search of databases such as DTIC and NTIS, focusing on the preceding decade of research, supplemented by consultations with subject matter experts from NASA, the Army, and various universities. The report reviews empirical assessment techniques (subjective, performance-based, and physiological) and evaluates them against nine measurement criteria, including reliability, validity, sensitivity, diagnosticity, intrusiveness, and implementation requirements. It categorizes workload measures into subjective rating scales (e.g., NASA-TLX, SWAT), primary task measures (e.g., control input activity, speed, accuracy), secondary task methods (e.g., embedded tasks, choice-reaction time), and physiological indicators (eye, brain, and heart-related metrics). The document also defines study classifications, including part-task, full-mission, and end-to-end simulations, and provides a matrix that matches appropriate measures to each classification. Additionally, it outlines guidelines for pilot subject characteristics, such as experience levels and crew composition, to make research data more representative.

Key findings highlight the trade-offs inherent in different measurement techniques. Subjective measures are low in cost and intrusiveness but lack diagnostic specificity, so they serve primarily as global screening tools. Primary task measures are non-intrusive but often lack sensitivity at moderate workload levels and require application-specific development. Secondary tasks offer higher sensitivity at low-to-moderate workload but can be intrusive; "embedded" secondary tasks are recommended to mitigate this issue in operational settings. Physiological measures provide objective data but require significant instrumentation, which may restrict their use in early development stages. The report emphasizes that no single measure satisfies all criteria and advocates a multi-measure approach. It also establishes a database of expert contacts and pilot selection criteria to support future FAA evaluations.

The significance of this work lies in its standardized framework for human factors research within the FAA. By defining clear guidelines for measure selection and subject characteristics, the handbook facilitates the translation of performance data between studies and across organizations. This standardization supports the evaluation of new technologies, such as Data Link communications and reconfigurable cockpit systems, and helps ensure that research outcomes are valid, reliable, and applicable to real-world flight operations. The report is a resource for human factors practitioners assessing the impact of automation on crew performance.
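
The summary names NASA-TLX as an example of a subjective rating scale. As a minimal sketch of how such a scale reduces to a single workload number, the Python below follows the commonly described weighted NASA-TLX scoring procedure: six subscales rated from 0 to 100, each weighted by how often it was chosen across 15 pairwise comparisons. This entry does not reproduce the handbook's own scoring instructions, so the function name, dictionary keys, and all ratings and weights here are illustrative assumptions, not values from the report.

```python
# Illustrative sketch only: the ratings and weights below are invented values,
# not data from the handbook. Names such as tlx_weighted_score are hypothetical.

SUBSCALES = (
    "mental_demand", "physical_demand", "temporal_demand",
    "performance", "effort", "frustration",
)


def tlx_weighted_score(ratings, weights):
    """Return the overall weighted workload score on a 0-100 scale.

    ratings: subscale name -> rating from 0 to 100
    weights: subscale name -> number of times the subscale was chosen in the
             15 pairwise comparisons (each 0-5, summing to 15)
    """
    if set(ratings) != set(SUBSCALES) or set(weights) != set(SUBSCALES):
        raise ValueError("ratings and weights must cover all six subscales")
    if sum(weights.values()) != 15:
        raise ValueError("weights must sum to 15 (one per pairwise comparison)")
    if any(not 0 <= r <= 100 for r in ratings.values()):
        raise ValueError("ratings must lie between 0 and 100")
    # Weighted mean: each rating counts once per comparison it "won".
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15


# Hypothetical ratings from one pilot after one simulated flight segment.
example_ratings = {
    "mental_demand": 70, "physical_demand": 20, "temporal_demand": 60,
    "performance": 40, "effort": 50, "frustration": 30,
}
# Hypothetical pairwise-comparison tallies for the same pilot.
example_weights = {
    "mental_demand": 5, "physical_demand": 0, "temporal_demand": 4,
    "performance": 2, "effort": 3, "frustration": 1,
}

score = tlx_weighted_score(example_ratings, example_weights)
print(round(score, 1))  # 56.7
```

A single global score like this illustrates the trade-off noted above: it is cheap to collect, but it does not by itself indicate which aspect of the task drove the workload.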

Key finding

The report establishes a comprehensive framework of measurement criteria and guidelines for assessing workload, situational awareness, and vigilance, with the aim of standardizing pilot performance evaluation in flight deck research.

Methodology

review

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via bulk_ingest_rosap on 2026-05-23 (6 acquisition events logged).

Stage	Outcome	Tool	Model	Prompt	Attempts	Completed
discover	success	rosap	—	—	2	2026-05-23
archive	success	—	—	—	1	2026-05-23
extract	success	cached	—	—	2	2026-06-10
clean	success	—	—	—	1	2026-06-01
chunk	success	—	—	—	1	2026-06-01
embed	success	—	—	—	1	2026-06-02
enrich	success	—	—	—	1	2026-05-23
promote	success	—	—	—	1	2026-05-23
summarize	success	llm	qwen3.6-27b-prismaquant	summ-v5	3	2026-06-10
tag	success	vector_similarity	—	—	19	2026-06-11
verify	success	—	—	—	2	2026-06-10

Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified.

Information type

What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).

Empirical Findings: self report data, physiological data
Methodological Resource: validation psychometrics