Deep Reinforcement Learning for Predictive Longitudinal Control of Automated Vehicles
DOI: 10.1109/itsc.2018.8569977
archive: archived; pipeline: cataloged; verified
Summary
This paper addresses the challenge of longitudinal control for automated vehicles, specifically aiming to improve upon classical Proportional-Integral (PI) controllers and computationally expensive Nonlinear Model Predictive Control (NMPC) schemes. While PI controllers suffer from poor accuracy and comfort in the presence of disturbances like road grade changes, NMPC offers high accuracy but requires significant computational resources and precise model parameter identification. The authors propose a model-free Deep Reinforcement Learning (DRL) approach that incorporates advance knowledge of future speed references and road disturbances. A key contribution is the identification of a critical design parameter: the selection of advance knowledge signals during the training phase, which significantly impacts learning speed.
The authors develop a Predictive Reinforcement Learning Controller with Incorporated Advance Knowledge (PRLC-A) using the Deep Deterministic Policy Gradient (DDPG) algorithm. To enable predictive behavior, the state vector is augmented with future speed error trajectories and road grade information over a prediction horizon. To address the challenge of designing training trajectories, the authors propose using Amplitude Modulated Pseudo Random Binary Signals (APRBS) to excite the system across the state space, rather than training on specific real-world scenarios. The reward function is designed to minimize speed tracking error and penalize high control outputs.
The system was simulated using a discrete vehicle dynamics model implemented in Python with TensorFlow, incorporating engine and brake torque dynamics, rolling resistance, and aerodynamic drag. Experimental results demonstrate that training with APRBS signals yields considerably faster learning convergence compared to training on specific evaluation datasets.
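The three design ingredients described above (APRBS excitation, a state augmented with advance knowledge, and a tracking-plus-effort reward) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names, the hold-time and amplitude ranges, the definition of the future error as `v_ref[k+i] - v`, the quadratic reward form, and the weight `w_u` are all assumptions made here for the example.

```python
import random
from typing import List, Sequence


def aprbs(n_steps: int, min_hold: int, max_hold: int,
          low: float, high: float, seed: int = 0) -> List[float]:
    """Piecewise-constant signal with random hold times and random amplitudes."""
    rng = random.Random(seed)
    signal: List[float] = []
    while len(signal) < n_steps:
        hold = rng.randint(min_hold, max_hold)  # number of steps to hold the level
        level = rng.uniform(low, high)          # amplitude drawn for this segment
        signal.extend([level] * hold)
    return signal[:n_steps]


def augmented_state(v: float, v_ref: Sequence[float], grade: Sequence[float],
                    k: int, horizon: int) -> List[float]:
    """Current speed, then speed errors and road grades for steps k..k+horizon.

    Assumption: the future error is taken against the current speed v, and
    indices past the end of the trajectory are clamped to the last sample.
    """
    last = min(len(v_ref), len(grade)) - 1
    idx = [min(k + i, last) for i in range(horizon + 1)]
    errors = [v_ref[j] - v for j in idx]
    grades = [grade[j] for j in idx]
    return [v] + errors + grades


def reward(speed_error: float, control: float, w_u: float = 0.1) -> float:
    """Hypothetical quadratic reward: penalize tracking error and control effort."""
    return -(speed_error ** 2) - w_u * control ** 2


# Hypothetical training excitation: speed reference in m/s, grade in rad.
v_ref = aprbs(200, min_hold=10, max_hold=40, low=0.0, high=10.0, seed=1)
grade = aprbs(200, min_hold=20, max_hold=60, low=-0.05, high=0.05, seed=2)
state = augmented_state(v=5.0, v_ref=v_ref, grade=grade, k=0, horizon=20)
```

With a horizon of 20 steps the state holds one speed entry plus 21 error and 21 grade entries. Because the advance knowledge enters only as extra inputs to the policy network, a longer horizon widens the input layer rather than adding optimization work at run time.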
When evaluated on a real-world driving scenario in a parking garage, the PRLC-A controller achieved tracking performance close to the optimal solution of an NMPC controller. The DRL approach also offered substantial computational advantages: PRLC-A inference was between 30 and 70 times faster than the NMPC controller and remained insensitive to increases in prediction horizon length. For instance, with a prediction horizon of 20 steps, the PRLC-A required approximately 1.1 ms per cycle, whereas the NMPC required 81.1 ms.
The study concludes that DRL is a viable alternative to NMPC for predictive longitudinal control, offering near-optimal performance at significantly reduced computational cost. This makes it suitable for real-time applications in automated vehicles equipped with advance knowledge capabilities. However, the authors note challenges regarding high variance between training runs and the need for extensive training samples. Future work is directed toward investigating robustness against unmodeled disturbances, such as wind and varying vehicle mass, and exploring apprenticeship or imitation learning to stabilize training and improve performance.
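The timing figures quoted above for a 20-step horizon can be turned into a speedup ratio and checked against a control-cycle budget. The sketch below is only arithmetic on the two rounded numbers from the summary; the 20 ms cycle budget and the helper names are hypothetical and do not come from the paper.

```python
def speedup(t_baseline_ms: float, t_candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return t_baseline_ms / t_candidate_ms


def fits_budget(t_ms: float, cycle_ms: float) -> bool:
    """True if one controller evaluation finishes within one control cycle."""
    return t_ms <= cycle_ms


nmpc_ms = 81.1   # NMPC time per cycle at horizon 20 (from the summary)
prlc_ms = 1.1    # PRLC-A time per cycle at horizon 20 (from the summary)
cycle_ms = 20.0  # hypothetical control-cycle budget, an assumption

ratio = speedup(nmpc_ms, prlc_ms)
print(f"speedup: {ratio:.1f}x")
print("PRLC-A fits budget:", fits_budget(prlc_ms, cycle_ms))
print("NMPC fits budget:", fits_budget(nmpc_ms, cycle_ms))
```

Computed from these rounded values, the ratio comes out at roughly 74, near the upper end of the range reported above. Under the assumed 20 ms budget only the PRLC-A evaluation would complete within a single cycle.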
Provenance
The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged: what ran, with which tool and model, how many attempts it took, and when it last completed.
| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | Crossref | — | — | 1 | 2026-06-18 |
| archive | success | unpaywall | — | — | 2 | 2026-06-25 |
| extract | success | cached | — | — | 2 | 2026-06-26 |
| clean | success | clean | — | — | 1 | 2026-06-20 |
| chunk | success | chunk | — | — | 1 | 2026-06-20 |
| embed | success | embed | Qwen/Qwen3-Embedding-8B | — | 1 | 2026-06-20 |
| enrich | success | openalex | — | — | 1 | 2026-06-20 |
| promote | success | — | — | — | 1 | 2026-06-18 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 1 | 2026-06-26 |
| tag | success | vector_similarity | — | — | 6 | 2026-06-20 |
| verify | success | — | — | — | 1 | 2026-06-26 |
Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.
Information type
What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).
- Theoretical Contribution: computational model