Interstate 80 Freeway Dataset : [fact sheet]
Summary
This fact sheet describes the Interstate 80 (I-80) Freeway Dataset, the first dataset collected under the Next Generation SIMulation (NGSIM) program. The data were collected primarily to support the development of microscopic driver behavior algorithms. Stakeholders identified real-world vehicle trajectory data as essential for researching these behaviors, noting that the NGSIM datasets were, at the time, the most detailed and accurate field data available for traffic microsimulation research.
The data were collected on April 13, 2005, on eastbound I-80 in Emeryville, California, in the San Francisco Bay Area. The study area was approximately 500 meters long and included six freeway lanes, one of which was a high-occupancy vehicle (HOV) lane, as well as an on-ramp. Seven synchronized digital video cameras mounted on a nearby 30-story building recorded the traffic. A customized software application called NG-VIDEO transcribed the video footage into vehicle trajectory data, giving the precise location of each vehicle every one-tenth of a second.
The full dataset comprises 45 minutes of data in three 15-minute periods (4:00–4:15 p.m., 5:00–5:15 p.m., and 5:15–5:30 p.m.), which capture the transition from uncongested to fully congested conditions during the peak period. Beyond trajectory data, the dataset includes CAD and GIS files, aerial photos, loop detector data, raw and processed video, signal timing settings, traffic sign information, weather data, and aggregate analysis reports.
The I-80 dataset was instrumental in developing and validating specific NGSIM algorithms, including Freeway Lane Selection, Cooperative/Forced Freeway Merge, and Oversaturated Freeway Flow. These algorithms describe core driver behaviors at a level of detail and accuracy that was previously unavailable. They can be incorporated into traffic microsimulation models, helping transportation practitioners make more reliable and valid decisions.
The dataset's significance extends beyond the initial algorithm development. It is freely available to the broader traffic simulation and engineering community, which uses it to examine topics such as congested freeway conditions, lane distribution, weaving areas, HOV lane usage, heavy vehicle movements, and advanced safety applications. Researchers also use the I-80 dataset to validate and calibrate existing algorithms and models, improving the overall reliability of traffic simulation tools.
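Because the trajectories give each vehicle's location at a fixed one-tenth-second interval, quantities such as speed can be estimated by differencing consecutive positions. The following is a minimal Python sketch of that idea, not the NGSIM processing method. The function name `speeds_from_positions` and the position values are hypothetical and are not taken from the NGSIM files.

```python
# Sampling rate implied by the fact sheet: one position every 0.1 s.
SAMPLE_RATE_HZ = 10

def speeds_from_positions(positions_m, rate_hz=SAMPLE_RATE_HZ):
    """Forward-difference speed (m/s) between consecutive position samples."""
    return [(b - a) * rate_hz for a, b in zip(positions_m, positions_m[1:])]

# Hypothetical longitudinal positions (meters) of one vehicle; not real NGSIM data.
positions = [0.0, 1.5, 3.0, 4.0, 5.0]
speeds = speeds_from_positions(positions)
print(speeds)  # [15.0, 15.0, 10.0, 10.0]
```

A forward difference is the simplest choice. Real trajectory data typically contains measurement noise, so some smoothing before differencing would likely be needed in practice.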
Key finding
The NGSIM I-80 dataset captures 45 minutes of vehicle trajectories across six freeway lanes recorded at one-tenth-second resolution from seven synchronized cameras.
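As a rough sanity check on these figures, the number of time steps implied by the stated periods and sampling interval can be worked out directly. The sketch below assumes the three periods listed in the summary, written in 24-hour form. The helper `period_seconds` is hypothetical, and the counts are time steps per period, not trajectory records (each vehicle in view contributes one record per time step).

```python
from datetime import datetime

SAMPLES_PER_SECOND = 10  # one position every one-tenth of a second
# The three 15-minute periods from the summary, in 24-hour form.
PERIODS = [("16:00", "16:15"), ("17:00", "17:15"), ("17:15", "17:30")]

def period_seconds(start, end, fmt="%H:%M"):
    """Length of a same-day period in whole seconds."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

steps_per_period = [period_seconds(s, e) * SAMPLES_PER_SECOND for s, e in PERIODS]
total_minutes = sum(period_seconds(s, e) for s, e in PERIODS) // 60

print(steps_per_period)  # [9000, 9000, 9000]
print(total_minutes)     # 45
```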
Methodology
dataset
Provenance
The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed. Discovered via bulk_ingest_rosap on 2026-05-23 (7 acquisition events logged).
| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | rosap | — | — | 2 | 2026-05-23 |
| archive | success | — | — | — | 1 | 2026-05-23 |
| extract | success | cached | — | — | 2 | 2026-06-10 |
| clean | success | — | — | — | 1 | 2026-06-01 |
| chunk | success | — | — | — | 1 | 2026-06-01 |
| embed | success | — | — | — | 1 | 2026-06-02 |
| enrich | success | — | — | — | 1 | 2026-05-23 |
| promote | success | — | — | — | 1 | 2026-05-23 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 3 | 2026-06-10 |
| tag | success | vector_similarity | — | — | 24 | 2026-06-11 |
| verify | success | — | — | — | 3 | 2026-06-10 |
Summary generated by qwen3.6-27b-prismaquant on 2026-06-10; verification: verified.
Information type
What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).
- Methodological Resource: dataset resource, tool software