ParkPredict: Motion and Intent Prediction of Vehicles in Parking Lots

Shen, Xu; Batkovic, Ivo; Govindarajan, Vijay; Falcone, Paolo; Darrell, Trevor; Borrelli, Francesco · 2020 · Crossref

DOI: 10.1109/iv47402.2020.9304795

archive: archived · pipeline: cataloged · verified


Summary

This paper addresses the challenge of predicting vehicle motion and driver intent in parking lots, an environment characterized by compact spaces, complex maneuvers, and limited structure compared to standard road networks. The authors aim to improve autonomous vehicle safety and efficiency by developing models that can accurately forecast human driver behavior in these unstructured domains.

To achieve this, they developed a simulation environment using the CARLA simulator and collected a dataset of 600 human parking demonstrations (forward and reverse) performed by 10 subjects. The dataset includes vehicle pose history, parking spot occupancy, and semantic bird's-eye view images. The study compares three prediction approaches: a physics-based Extended Kalman Filter (EKF) baseline, a multi-modal Long Short-Term Memory (LSTM) network, and a Convolutional Neural Network-LSTM (CNN-LSTM) model. The LSTM model processes pose history and occupancy data to predict intent and trajectory, while the CNN-LSTM incorporates semantic image features to capture environmental geometry. The models were evaluated using 5-fold cross-validation on intent classification accuracy and trajectory prediction error.

Results indicate that data-driven models significantly outperform the physics-based baseline. The LSTM and CNN-LSTM models achieved approximately 85% top-1 accuracy and nearly 100% top-3 accuracy in intent classification, whereas the EKF relied on heuristic distance-based estimates. For trajectory prediction, the neural networks provided superior long-term forecasts. Crucially, the study found that knowledge of the driver's intended parking spot substantially improved trajectory prediction accuracy, particularly for the CNN-LSTM model after 12 timesteps. Furthermore, multimodal predictions, which generate multiple trajectory candidates based on probable intents, reduced prediction errors compared to intent-agnostic models. The CNN-LSTM specifically demonstrated better obstacle awareness and handling of reverse maneuvers by leveraging semantic visual inputs.

The significance of this work lies in demonstrating that incorporating semantic environmental representations and intent estimation enhances motion prediction in complex, unstructured environments. The findings suggest that autonomous systems can benefit from hierarchical models that jointly estimate intent and trajectory, rather than relying solely on kinematic extrapolation. This approach allows for more nuanced and safer interactions with human drivers in confined spaces like parking lots, addressing key challenges in autonomous driving perception and planning.
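
The evaluation quantities named in the summary (top-k intent accuracy and trajectory prediction error, including the multimodal case) can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names, the toy scores, and the choice of mean Euclidean displacement as the error measure are assumptions made for illustration; the summary does not specify the paper's exact error definition or how it scores multiple trajectory candidates.

```python
import math


def top_k_accuracy(scores, true_spots, k):
    """Fraction of samples whose true spot index is among the k highest scores.

    scores: one list of per-spot scores for each sample (hypothetical model output).
    true_spots: the index of the spot the driver actually chose, per sample.
    """
    hits = 0
    for spot_scores, true_spot in zip(scores, true_spots):
        # Rank spot indices from highest to lowest score.
        ranked = sorted(range(len(spot_scores)),
                        key=lambda i: spot_scores[i], reverse=True)
        if true_spot in ranked[:k]:
            hits += 1
    return hits / len(true_spots)


def mean_displacement_error(pred, truth):
    """Mean Euclidean distance between predicted and true (x, y) points."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)


def best_candidate_error(candidates, truth):
    """Error of the closest candidate trajectory (assumed multimodal scoring)."""
    return min(mean_displacement_error(c, truth) for c in candidates)


# Toy data: three samples, three parking spots each.
scores = [[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.1, 0.3, 0.6]]
true_spots = [0, 0, 2]

# Toy trajectories: one ground truth and two intent-conditioned candidates.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
candidates = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
    [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)],
]

top1 = top_k_accuracy(scores, true_spots, 1)
top2 = top_k_accuracy(scores, true_spots, 2)
best_error = best_candidate_error(candidates, truth)
```

In this toy data the second sample ranks the true spot second, so it counts as a miss at k=1 and a hit at k=2. That is the same mechanism behind the gap between the reported top-1 and top-3 figures.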

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.

Stage      Outcome  Tool               Model                     Prompt   Attempts  Completed
discover   success  Crossref           -                         -        1         2026-06-25
archive    success  semantic_scholar   -                         -        6         2026-06-26
extract    success  cached             -                         -        2         2026-06-26
clean      success  clean              -                         -        1         2026-06-26
chunk      success  chunk              -                         -        1         2026-06-26
embed      success  embed              Qwen/Qwen3-Embedding-8B   -        1         2026-06-26
enrich     success  openalex           -                         -        1         2026-06-26
promote    success  -                  -                         -        1         2026-06-25
summarize  success  llm                qwen3.6-27b-prismaquant   summ-v5  1         2026-06-26
tag        success  vector_similarity  -                         -        6         2026-06-26
verify     success  -                  -                         -        1         2026-06-26

Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.
