Naturalistic Driving Data Baseline for Automated Driving System-Equipped Commercial Motor Vehicles
archive: archived | pipeline: cataloged | verified
Summary
This report establishes naturalistic driving data baselines to support the development and evaluation of Automated Driving Systems (ADS) for Class 8 commercial motor vehicles (CMVs). The research addresses the need for standardized metrics to assess ADS effectiveness, specifically by defining "lagging" baselines (crash and near-crash outcomes that ADS aims to mitigate) and "leading" baselines (human driver performance attributes that ADS aims to emulate or exceed). The study was conducted by the Virginia Tech Transportation Institute under sponsorship from the Federal Motor Carrier Safety Administration.
The methodology utilized two distinct naturalistic datasets. The lagging baseline was derived from 3.44 billion vehicle miles traveled (VMT), encompassing over 3,700 crashes. This event data was map-matched to ten specific highway Operational Design Domains (ODDs) within the United States. The leading baseline was constructed from 3.2 million miles of continuous driving data, matched to highway routes such as I-10 and I-75. This continuous data analyzed six maneuver types: speed behavior, longitudinal deceleration, following distance, lateral acceleration, lane deviation, and lane stability. These maneuvers were categorized across ten highway ODDs, seven speed limit categories, and seven lane-count categories. Additionally, a reference set of leading baseline performance was developed using Canadian highway data to support northern U.S. transit corridors.
The findings provide detailed rates for safety outcomes and driving behaviors. The lagging baseline reports crash and near-crash rates per 100 million VMT, broken down by crash severity, type, and contributing factors such as driver inattention, negative behaviors, and unexpected events. The leading baseline quantifies the frequency of specific maneuvers per 1,000 miles traveled. For instance, it details the distribution of speed behaviors relative to posted limits, the frequency of lateral and longitudinal acceleration events, and the prevalence of specific following distance headways and lane deviations. The report also includes a public-use data tool that allows users to query these event rates based on selectable parameters.
The significance of this work lies in providing non-proprietary, commonly shared vehicle maneuver attributes for ADS developers. These baselines offer a benchmark for defining appropriate criteria during the testing and deployment cycles of ADS-equipped CMVs. By establishing clear metrics for both safety outcomes and driving performance, the report facilitates the comparison of automated systems against human driver standards. The inclusion of Canadian data further extends the applicability of these baselines to cross-border traffic scenarios, supporting broader industry efforts to improve highway safety and efficiency through automation.
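The two rate units named in the summary (events per 100 million VMT for the lagging baseline, maneuvers per 1,000 miles for the leading baseline) are both exposure-normalized counts. The sketch below shows that arithmetic only. The function name `rate_per_exposure` is hypothetical, the crash count uses the summary's "over 3,700" figure as a lower bound, and the maneuver count of 6,400 is an invented placeholder, not a value from the report.

```python
def rate_per_exposure(event_count: float, miles: float, per_miles: float) -> float:
    """Return the number of events per `per_miles` miles of exposure.

    Hypothetical helper for illustration; not part of the report's data tool.
    """
    if miles <= 0:
        raise ValueError("miles must be positive")
    return event_count * per_miles / miles


# Lagging-style rate: crashes per 100 million VMT.
# 3,700 is a lower bound ("over 3,700 crashes"); 3.44 billion VMT is from the summary.
lagging_rate = rate_per_exposure(3_700, 3.44e9, 100e6)

# Leading-style rate: maneuvers per 1,000 miles.
# 6,400 is an invented placeholder count; 3.2 million miles is from the summary.
leading_rate = rate_per_exposure(6_400, 3.2e6, 1_000)

print(f"lagging: {lagging_rate:.1f} events per 100 million VMT")
print(f"leading: {leading_rate:.1f} maneuvers per 1,000 miles")
```

With these inputs the lagging figure comes out at roughly 107.6 crashes per 100 million VMT. That is an illustrative lower bound derived from the summary's totals, not a rate stated in the report, whose published rates are further broken down by ODD, severity, and crash type.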
Key finding
The study produced comprehensive lagging and leading performance baselines for commercial motor vehicles based on billions of miles of naturalistic driving data, providing essential reference metrics for automated driving system development.
Methodology
naturalistic
Provenance
The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via bulk_ingest_rosap on 2026-05-23 (6 acquisition events logged).
| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | rosap | — | — | 2 | 2026-05-23 |
| archive | success | — | — | — | 1 | 2026-05-23 |
| extract | success | cached | — | — | 2 | 2026-06-10 |
| clean | success | — | — | — | 1 | 2026-06-01 |
| chunk | success | — | — | — | 1 | 2026-06-01 |
| embed | success | — | — | — | 1 | 2026-06-02 |
| enrich | success | — | — | — | 1 | 2026-05-23 |
| promote | success | — | — | — | 1 | 2026-05-23 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 3 | 2026-06-10 |
| tag | success | vector_similarity | — | — | 19 | 2026-06-11 |
| verify | success | — | — | — | 2 | 2026-06-10 |
Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified.
Topics
Ranked by relevance to this paper.
- naturalistic crash near crash
- incidence prevalence
- exposure measurement
- traffic density
- urban rural setting
- lane positioning
Information type
What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).
- Empirical Findings: crash risk outcomes, observational prevalence
- Methodological Resource: dataset resource