RLPG: Reinforcement Learning Approach for Dynamic Intra-Platoon Gap Adaptation for Highway On-Ramp Merging
DOI: 10.1109/iros55552.2023.10341918
archive: archived · pipeline: cataloged · verified
Summary
This paper addresses the challenge of maintaining traffic efficiency during highway on-ramp merging when autonomous vehicle platoons are present. While platooning improves fuel efficiency and safety by maintaining small inter-vehicle gaps, these tight formations can block merging vehicles from entering the mainline, leading to significant speed differentials, traffic breakdowns, and potential accidents. Existing solutions rely on control-based methods, such as Model Predictive Control (MPC), which struggle with the computational complexity and dynamic nature of real-world traffic conditions. The authors propose RLPG, a reinforcement learning (RL) framework that dynamically adapts the intra-platoon gap for individual platoon members to facilitate smooth merging and maximize overall traffic flow.
The methodology formulates the gap adaptation problem as a Markov Decision Process (MDP). The system utilizes a Roadside Unit (RSU) to collect real-time traffic data, which constitutes the state space: traffic density and average speed on both the mainline and ramp, platoon length, and current intra-platoon gaps. The action space is continuous, allowing for precise gap adjustments within a range of 2 to 30 meters. The authors employ the Deep Deterministic Policy Gradient (DDPG) algorithm to train an actor-critic neural network. The actor network determines the optimal gap adjustments, while the critic network evaluates the impact on traffic flow. The reward function is designed to maximize traffic flow, providing positive rewards when average delay is below a congestion threshold and negative rewards otherwise. The model was implemented using Keras and TensorFlow and integrated with the SUMO traffic simulator via the TraCI interface.
Simulation results demonstrate the effectiveness of the RLPG approach. A motivational study revealed that large platoons (sizes 20–30) could reduce traffic flow by up to 56% due to prolonged interruptions of merging vehicles. The RL model showed reliable convergence during training. In extensive testing across various scenarios (varying merging traffic densities, merging lane lengths, and merging aggressiveness), the RL-based approach significantly improved traffic flow compared to static gap strategies. The dynamic adaptation allowed platoons to reconfigure into smaller, more permeable units or adjust spacing to accommodate merging vehicles, thereby mitigating the adverse effects of platooning on merging efficiency.
The significance of this work lies in introducing the first data-driven, machine learning-based approach for dynamic intra-platoon gap adaptation. By leveraging RL, the method effectively models complex, time-varying traffic dynamics that are difficult for traditional control strategies to handle. The findings suggest that adaptive gap control can preserve the benefits of platooning while preventing traffic breakdowns at critical merging zones. This contributes to the development of safer and more efficient autonomous transportation systems, particularly in mixed-traffic environments involving highway on-ramps.
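The MDP formulation described in the summary can be sketched roughly as follows. This is an illustrative outline only, not the authors' implementation: the names `RsuObservation`, `scale_action`, and `reward` are hypothetical, the tanh-style mapping from a raw actor output to the 2 to 30 meter range is a common DDPG convention assumed here, and the reward magnitudes of +1 and -1 are placeholders, since the summary states only the sign of the reward.

```python
from dataclasses import dataclass
from typing import List

# Gap bounds taken from the summary (continuous action range, in meters).
GAP_MIN_M = 2.0
GAP_MAX_M = 30.0


@dataclass
class RsuObservation:
    """Hypothetical container for the traffic data the RSU collects."""
    mainline_density: float
    mainline_avg_speed: float
    ramp_density: float
    ramp_avg_speed: float
    platoon_length: float          # units not specified in the summary
    gaps_m: List[float]            # current intra-platoon gaps

    def to_state(self) -> List[float]:
        # Flatten the observation into one state vector for the actor/critic.
        return [
            self.mainline_density,
            self.mainline_avg_speed,
            self.ramp_density,
            self.ramp_avg_speed,
            self.platoon_length,
            *self.gaps_m,
        ]


def scale_action(raw: float) -> float:
    """Map a raw actor output in [-1, 1] to a gap in [GAP_MIN_M, GAP_MAX_M].

    Assumption: the actor ends in a tanh-like activation; values outside
    [-1, 1] are clipped before scaling.
    """
    clipped = max(-1.0, min(1.0, raw))
    return GAP_MIN_M + (clipped + 1.0) * 0.5 * (GAP_MAX_M - GAP_MIN_M)


def reward(avg_delay_s: float, congestion_threshold_s: float) -> float:
    """Positive when average delay is below the threshold, negative otherwise.

    Assumption: the magnitudes (+1 / -1) are placeholders.
    """
    return 1.0 if avg_delay_s < congestion_threshold_s else -1.0
```

In a setup like this, a DDPG actor would consume `to_state()` and emit one raw value per adjustable gap, each passed through `scale_action` before being applied in the simulator.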
Provenance
The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.
| Stage | Outcome | Tool | Model | Prompt | Attempts | Completed |
|---|---|---|---|---|---|---|
| discover | success | OpenAlex-citations | — | — | 1 | 2026-06-20 |
| archive | success | semantic_scholar | — | — | 6 | 2026-06-26 |
| extract | success | cached | — | — | 2 | 2026-06-26 |
| clean | success | clean | — | — | 1 | 2026-06-20 |
| chunk | success | chunk | — | — | 1 | 2026-06-20 |
| embed | success | embed | Qwen/Qwen3-Embedding-8B | — | 1 | 2026-06-20 |
| promote | success | — | — | — | 1 | 2026-06-20 |
| summarize | success | llm | qwen3.6-27b-prismaquant | summ-v5 | 1 | 2026-06-26 |
| tag | success | vector_similarity | — | — | 6 | 2026-06-20 |
| verify | success | — | — | — | 1 | 2026-06-26 |
Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.
Topics
Ranked by relevance to this paper.