Analysis of Data on Air Force Personnel Collected at Lackland Air Force Base
Summary
This study analyzes data collected from Air Force personnel to identify variables that predict automobile accident frequency. The research was motivated by a 1967 report by the Personnel Research Laboratory at Lackland Air Force Base, which tested approximately 12,000 basic airmen and officer candidates. The initial study concluded that, aside from estimated miles driven, no variable was a practically useful predictor of accidents. However, the author, Frederick L. McGuire, hypothesized that the data could provide further insights into accident prediction, particularly for young male drivers, by distinguishing between "true" predictors (established before the fact) and "quasi" predictors (after-the-fact estimates).
The analysis focused on a subset of 2,961 enlisted airmen aged 17 to 20, excluding officer candidates to ensure homogeneity. The sample was randomly divided into validation and cross-validation groups. The criterion variable was the self-reported total number of lifetime accidents, categorized by frequency. The study examined biographical items, aptitude test scores, and driving history. Variables were classified as "true" predictors (e.g., age, home value, family income) or "quasi" predictors (e.g., estimated mileage, smoking habits). The primary goals were to determine which variables were significantly related to accident frequency, to distinguish their predictive timing, and to assess the potential for combined prediction models.
The results indicated that among "true" predictors, only the value of the parents' home survived cross-validation, showing a significant correlation of .100 with accident frequency. Higher home values correlated with higher accident rates, ranging from 44 accidents per 100 drivers for homes valued under $4,500 to 70 per 100 for homes valued over $20,000. When "quasi" predictors were added, the combined model achieved a cross-validation R of .22, with mileage, home value, mechanical aptitude, and electronics aptitude scores contributing significantly.
Formal driver education showed no significant correlation with accident frequency. However, smoking habits correlated significantly with accidents (r = .104 in cross-validation), with heavy smokers reporting higher accident rates than non-smokers. The study also found that driver education participants came from higher socio-economic backgrounds, which likely masked any potential safety benefits of the training.
The study concludes that identifying "risk groups" among young male drivers is feasible, with prediction correlations potentially reaching .30 or higher when combining data from other populations. The author recommends that future studies focus on retrospective gathering of "true" predictors, such as age, parental education, occupational category, home value, aptitude scores, and smoking habits. This approach allows for the classification of individuals into high, low, and median risk categories, facilitating more targeted highway safety programs and deeper causal analysis of accident-prone behavior. The findings suggest that socio-economic status and smoking habits are significant indicators of accident risk, while formal driver education does not serve as a reliable predictor in this homogeneous military population.
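The split-sample design described in the summary (a random division into validation and cross-validation groups, with a predictor retained only if its correlation holds up in both) can be illustrated with a minimal Python sketch. This is not the author's procedure or data: the function names, the synthetic records, and the `min_abs_r` cutoff are all assumptions made for illustration, and the cutoff stands in for whatever significance test the paper actually applied.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def split_sample(records, seed=0):
    """Randomly divide records into two halves (validation, cross-validation)."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def cross_validate_predictor(records, min_abs_r=0.05, seed=0):
    """Return (r_validation, r_cross_validation, survives).

    records: iterable of (predictor_value, accident_count) pairs.
    min_abs_r: hypothetical cutoff standing in for a significance test.
    A predictor "survives" here if both halves show a correlation of the
    same sign whose magnitude reaches the cutoff.
    """
    validation, cross_validation = split_sample(records, seed)
    r_val = pearson_r(*zip(*validation))
    r_cv = pearson_r(*zip(*cross_validation))
    survives = (
        r_val * r_cv > 0
        and abs(r_val) >= min_abs_r
        and abs(r_cv) >= min_abs_r
    )
    return r_val, r_cv, survives


def make_synthetic_records(n=2961, seed=1):
    """Fabricated (predictor, accident_count) pairs; NOT the study's data."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        predictor = rng.gauss(0.0, 1.0)
        # Weak positive dependence plus noise, floored at zero accidents.
        accidents = max(0, round(0.5 + 0.1 * predictor + rng.gauss(0.0, 0.8)))
        records.append((predictor, accidents))
    return records
```

Calling `cross_validate_predictor(make_synthetic_records())` returns the two half-sample correlations and a flag. With weak effects of the size reported here, the two halves can differ noticeably, which is the reason for requiring agreement across both groups before treating a variable as a predictor.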
Key finding
The value of the parents' home was the only significant true predictor of accident frequency among young airmen, with a cross-validated correlation of .10, while driver education showed no predictive value.
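As a rough guide to the size of these effects, and assuming the usual linear reading in which the squared correlation is the share of criterion variance accounted for, the coefficients quoted in this entry work out as follows. This arithmetic is an illustration added here, not a calculation taken from the paper.

```latex
\begin{aligned}
r = .10 &\;\Rightarrow\; r^2 = .01 \quad (\text{about } 1\% \text{ of variance}) \\
R = .22 &\;\Rightarrow\; R^2 \approx .048 \quad (\text{about } 5\%) \\
R = .30 &\;\Rightarrow\; R^2 = .09 \quad (9\%)
\end{aligned}
```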
Methodology
dataset
Sample size: 2961
Provenance
The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via bulk_ingest_rosap on 2026-05-23 (6 acquisition events logged).
| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | rosap | — | — | 2 | 2026-05-23 |
| archive | success | — | — | — | 1 | 2026-05-23 |
| extract | success | cached | — | — | 2 | 2026-06-10 |
| clean | success | — | — | — | 1 | 2026-06-01 |
| chunk | success | — | — | — | 1 | 2026-06-01 |
| embed | success | — | — | — | 1 | 2026-06-02 |
| enrich | success | — | — | — | 1 | 2026-05-23 |
| promote | success | — | — | — | 1 | 2026-05-23 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 3 | 2026-06-10 |
| tag | success | vector_similarity | — | — | 19 | 2026-06-11 |
| verify | success | — | — | — | 2 | 2026-06-10 |
Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified.
Information type
What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).
- Empirical Findings: crash risk outcomes, observational prevalence