TrafficNet: An open naturalistic driving scenario library

Zhao, Ding; Guo, Yaohui; Jia, Yunhan · 2017 · OpenAlex-citations

DOI: 10.1109/itsc.2017.8317860

archive: archived · pipeline: cataloged · verified

Summary

This paper introduces TrafficNet, an open-source, web-based library designed to improve the usability of naturalistic driving data for researchers and vehicle engineers. The authors address a critical gap in intelligent transportation systems: while large-scale datasets from Naturalistic-Field Operational Tests (N-FOTs) exist, they are typically stored in chronological order as raw sensor logs. This format requires extensive post-processing and big data analytics expertise to extract specific driving scenarios, creating a barrier for practitioners. TrafficNet bridges this gap by preprocessing massive raw data into an organized, scenario-based dataset, thereby facilitating the development and testing of autonomous vehicle functionalities.

The study utilizes data from the Safety Pilot Model Deployment (SPMD) project, conducted by the University of Michigan Transportation Research Institute. This dataset includes multimodal traffic information from approximately 2,800 equipped vehicles, featuring sensors such as Mobileye cameras, radar, and Wireless Safety Units (WSU).

The authors developed a set of categorization algorithms implemented in MySQL scripts to extract six critical driving scenarios from the chronological logs: free flow, car-following, lane change, frontal cut-in, pedestrian crossing, and cyclist encounters. For each scenario, the system generates two tables: an Event table recording primary keys (device, trip, start/end times) and a Sequence table containing time-sequenced data. Specific algorithms were designed to handle sensor noise and detection logic; for instance, free flow is identified by the absence of front obstacles, while lane changes are deduced by analyzing lateral distance changes relative to lane boundaries over specific time windows. The resulting TrafficNet database contains a total of 565,291 labeled events.
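
As a rough illustration of the event extraction described above, the following Python sketch groups consecutive obstacle-free samples of a chronological log into free-flow events and emits the Event-table keys (device, trip, start/end times). It is not the authors' code: the paper's algorithms are MySQL scripts, and the `Sample` and `Event` field names, the boolean `front_obstacle` flag, and the lack of any sensor-noise handling are assumptions made for this example.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Iterable, List


@dataclass(frozen=True)
class Sample:
    """One row of a chronological log (hypothetical field names)."""
    device: int
    trip: int
    time: float           # seconds since the start of the trip
    front_obstacle: bool  # True if any obstacle was detected ahead


@dataclass(frozen=True)
class Event:
    """One row of an Event table: the keys that locate a scenario."""
    device: int
    trip: int
    start_time: float
    end_time: float


def extract_free_flow(samples: Iterable[Sample]) -> List[Event]:
    """Turn each maximal run of obstacle-free samples into one event."""
    ordered = sorted(samples, key=lambda s: (s.device, s.trip, s.time))
    events = []
    # groupby splits the ordered log into consecutive runs that share the
    # same device, trip, and obstacle flag.
    for (device, trip, obstacle), run in groupby(
        ordered, key=lambda s: (s.device, s.trip, s.front_obstacle)
    ):
        if obstacle:
            continue  # an obstacle ahead means this run is not free flow
        run = list(run)
        events.append(Event(device, trip, run[0].time, run[-1].time))
    return events


if __name__ == "__main__":
    log = [
        Sample(1, 7, 0.0, False),
        Sample(1, 7, 0.1, False),
        Sample(1, 7, 0.2, True),
        Sample(1, 7, 0.3, False),
        Sample(1, 7, 0.4, False),
    ]
    for event in extract_free_flow(log):
        print(event)
```

In this toy log the single obstacle sample at 0.2 s splits the trip into two free-flow events. A Sequence table would then hold the time-stamped samples that fall between each event's start and end times.
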
The distribution of these events is heavily skewed toward common driving behaviors, with free flow comprising the majority at 440,001 events, followed by car-following (104,849 events) and cut-in maneuvers (72,886 events). Less frequent but critical safety scenarios include lane changes (10,873 events), pedestrian crossings (26,412 events), and cyclist encounters (1,270 events). The authors provide statistical summaries and visualizations of event distributions, noting that free flow events are well-distributed across rural and urban areas, whereas cyclist events are sparse and concentrated in downtown areas. The system also includes mechanisms to filter false positives, such as removing pedestrian detections lasting less than 0.5 seconds.

The significance of TrafficNet lies in its accessibility and structured format, which lowers the computational and analytical barriers for utilizing naturalistic driving data. By providing both the scenario database and the source code for extraction algorithms, the authors aim to foster more sophisticated scenario-based categorization methods and accelerate the evaluation of automated vehicles. The paper concludes by acknowledging current limitations, such as potential mislabeling, and outlines future work to improve accuracy through automated and manual verification, expand the library with additional scenario types, and extend the analysis to other open databases.
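
The false-positive filter mentioned above (dropping pedestrian detections that last less than 0.5 seconds) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the paper's MySQL implementation: the `Detection` record and the function name are hypothetical, and only the 0.5-second threshold comes from the summary.

```python
from typing import Iterable, List, NamedTuple

MIN_DURATION_S = 0.5  # threshold quoted in the summary


class Detection(NamedTuple):
    """Start and end of one continuous pedestrian detection (hypothetical layout)."""
    start_time: float  # seconds
    end_time: float    # seconds


def drop_short_detections(
    detections: Iterable[Detection],
    min_duration: float = MIN_DURATION_S,
) -> List[Detection]:
    """Keep only detections that last at least min_duration seconds."""
    return [
        d for d in detections
        if d.end_time - d.start_time >= min_duration
    ]


if __name__ == "__main__":
    candidates = [
        Detection(0.0, 0.25),  # too brief, treated as a false positive
        Detection(2.0, 2.5),   # exactly at the threshold, kept
        Detection(4.0, 7.0),   # clearly long enough, kept
    ]
    print(drop_short_detections(candidates))
```

Treating a detection of exactly 0.5 seconds as valid is a choice made here; the summary only says that shorter detections are removed.
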

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.

Stage	Outcome	Tool	Model	Prompt	Attempts	Completed
discover	success	OpenAlex-citations	—	—	1	2026-06-18
archive	success	semantic_scholar	—	—	6	2026-06-25
extract	success	cached	—	—	2	2026-06-26
clean	success	clean	—	—	1	2026-06-18
chunk	success	chunk	—	—	1	2026-06-18
embed	success	embed	Qwen/Qwen3-Embedding-8B	—	1	2026-06-18
promote	success	—	—	—	1	2026-06-18
summarize	success	llm	qwen3.6-27b-prismaquant	summ-v5	1	2026-06-26
tag	success	vector_similarity	—	—	6	2026-06-18
verify	success	—	—	—	1	2026-06-26

Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.

Topics

Ranked by relevance to this paper.

naturalistic crash near crash

Information type

What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).

Methodological Resource: dataset resource, tool software