Utilizing various data sources for surface transportation human factors research : workshop summary report, November 6-7, 2013
archive: archived pipeline: cataloged verified
Summary
This report summarizes a workshop convened by the Federal Highway Administration (FHWA) in November 2013 to address the integration of diverse data sources for surface transportation human factors research. The primary motivation was the increasing availability of varied datasets (from naturalistic driving studies and simulators to eye trackers and surveys) and the need to resolve inconsistencies among them to better understand human error, a leading cause of transportation injuries and fatalities. The workshop aimed to determine how researchers can best select, combine, and validate datasets to analyze driver and traveler behavior across three key interaction domains: driver-to-other road users, driver-to-infrastructure, and driver-to-vehicle.
The event featured presentations from experts on specific data collection methodologies and subsequent panel discussions on data integration strategies. Presentations covered naturalistic cycling data collection using instrumented bicycles equipped with cameras, inertial measurement units, and GPS; driver-pedestrian recognition behaviors at intersections; and the use of dedicated short-range communications (DSRC) for cooperative safety systems. For instance, researchers demonstrated how naturalistic cycling data could be combined with accident databases to analyze crash causation, while other studies evaluated the performance of DSRC systems in transmitting collision warnings between vehicles and pedestrians. The expert panel discussed the challenges of consolidating data from multiple sources, noting that datasets can be complementary, confirmatory, or contradictory. They proposed "bottom-up" approaches to identify causes of inconsistency and "top-down" multi-site studies to detect contradictions across different environments.
Key findings highlighted the potential and limitations of current data sources. Naturalistic cycling studies revealed that cyclists often exceed speed limits, particularly with electric bicycles, and that risk increases significantly near intersections with reduced visibility or surface issues. Research on driver-pedestrian interactions identified that driver avoidance behaviors correlate with predicted time lags, with yielding typically occurring when the time to conflict exceeds two seconds. DSRC experiments demonstrated effective data transmission rates between vehicles and pedestrians at intersections, supporting the viability of cooperative safety technologies. However, the panel noted that few existing datasets are comprehensive enough to link behavioral data directly with crash outcomes, and that integrating these sources remains in its infancy.
The workshop concluded with recommendations for future research to address these gaps. Participants prioritized the development of methodologies for using multiple data sources simultaneously, including studies that incorporate multiple sites, user types (e.g., pedestrians, bicyclists, motorists), and analysis methods. Specific research needs identified include evaluating the effectiveness of roadway signage, researching speed perception, improving safety for vulnerable road users, and assessing Intelligent Transportation System technologies. The report emphasizes the necessity of creating comprehensive datasets that link behavioral observations with crash data to resolve contradictions and enhance the understanding of risky behaviors, ultimately informing better infrastructure design and safety countermeasures.
Key finding
The workshop concluded that integrating multiple data sources and methods is essential for resolving contradictions in human factors research and developing comprehensive models linking driver behavior to crash outcomes.
Methodology
other
Provenance
The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via bulk_ingest_rosap on 2026-05-23 (44 acquisition events logged).
| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | rosap | — | — | 2 | 2026-05-23 |
| archive | success | — | — | — | 1 | 2026-05-23 |
| extract | success | cached | — | — | 2 | 2026-06-10 |
| clean | success | — | — | — | 1 | 2026-06-01 |
| chunk | success | — | — | — | 1 | 2026-06-01 |
| embed | success | — | — | — | 1 | 2026-06-02 |
| enrich | success | — | — | — | 1 | 2026-05-23 |
| promote | success | — | — | — | 1 | 2026-05-23 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 41 | 2026-06-10 |
| tag | success | vector_similarity | — | — | 19 | 2026-06-11 |
| verify | success | — | — | — | 2 | 2026-06-10 |
Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified.
Topics
Ranked by relevance to this paper.
- naturalistic crash near crash
- simulator validity fidelity
- induced exposure
- rail grade crossings
- crash reconstruction hf
- exposure measurement
Information type
What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).
- Empirical Findings: observational prevalence
- Methodological Resource: dataset resource, tool software