Decision making in dynamic and interactive environments based on cognitive hierarchy theory, Bayesian inference, and predictive control

Li, Sisi; Li, Nan; Girard, Anouck; Kolmanovsky, Ilya · 2019 · OpenAlex-citations

DOI: 10.48550/arxiv.1908.04005



Summary

This paper presents an integrated decision-making framework for autonomous agents operating in dynamic, interactive environments, specifically addressing the challenge of predicting and responding to human-driven counterparts. The authors argue that traditional equilibrium-based game theories assume perfect rationality, which is often unrealistic, while standard "level-k" cognitive models can lead to poor decisions if the agent’s assumption about the opponent’s cognitive level is incorrect. To address this, the study combines cognitive hierarchy (CH) theory, Bayesian inference, and receding-horizon optimal control. The CH framework allows an agent to model opponents as a mixture of different reasoning levels (level-0 through level-(k-1)), providing a more robust prediction of behavior than fixed level-k assumptions.

The methodology formulates the interaction as a two-player dynamic game within a partially observable Markov decision process (POMDP). The environment’s actions are modeled as stochastic disturbances based on recursive level-k policies, where each level optimally responds to the previous level using a softmax decision rule. Since the opponent’s specific cognitive level is unknown, the ego agent uses Bayesian inference to update its posterior belief about this level based on historical observations and actions. The decision strategy is determined via receding-horizon optimization, maximizing expected rewards over a planning horizon while satisfying probabilistic safety constraints (chance constraints) to ensure the system remains in safe states with high confidence. The optimization problem is transformed into a probability space and solved using nonlinear programming techniques that exploit gradient and Hessian information.

The framework is validated through simulations of an autonomous vehicle interacting with human-driven vehicles in three traffic scenarios: a four-way intersection, highway overtaking, and highway forced merging. The human drivers are modeled as level-1 or level-2 reasoners, consistent with experimental studies on human behavior. The simulations demonstrate that the autonomous agent successfully infers the human driver’s cognitive level and adapts its strategy accordingly. For instance, the agent adjusts its maneuvers to safely navigate interactions with both level-1 and level-2 opponents, maintaining safety constraints with a confidence level of 0.99.

The significance of this work lies in its ability to handle uncertainty in opponent behavior without relying on potentially incorrect fixed assumptions about cognitive levels. By integrating Bayesian inference with cognitive hierarchy theory, the proposed framework enables autonomous systems to strategically interact with humans in a more realistic and safe manner. This approach offers broader applicability than heuristic methods for level estimation and provides a rigorous mathematical foundation for safe, interactive decision-making in autonomous driving and other human-machine interaction domains.
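
The belief update over the opponent's cognitive level that the summary describes can be illustrated with a small sketch. This is not the authors' implementation: the action values, the temperature, the two-level candidate set, and the function names `softmax_policy`, `update_belief`, and `predict_actions` are all hypothetical, and the policies here ignore state for brevity, whereas the paper's level-k policies are state-dependent.

```python
import numpy as np

def softmax_policy(q_values, temperature=1.0):
    """Map action values to action probabilities with a softmax rule."""
    z = np.asarray(q_values, dtype=float) / temperature
    z = z - z.max()  # shift by the max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def update_belief(belief, policies, observed_action):
    """One Bayes step: posterior(k) is proportional to prior(k) * P(action | level k)."""
    likelihood = np.array([pi[observed_action] for pi in policies])
    posterior = np.asarray(belief, dtype=float) * likelihood
    return posterior / posterior.sum()

def predict_actions(belief, policies):
    """Belief-weighted mixture of the level policies, usable as a prediction."""
    return np.asarray(belief) @ np.vstack(policies)

# Hypothetical action values over three actions (assumed numbers, not from the paper).
q_level1 = [2.0, 0.0, 0.0]  # the level-1 model prefers action 0
q_level2 = [0.0, 0.0, 2.0]  # the level-2 model prefers action 2
policies = [softmax_policy(q_level1), softmax_policy(q_level2)]

belief = np.array([0.5, 0.5])  # uniform prior over {level-1, level-2}
for action in [2, 2, 2]:       # the opponent is observed choosing action 2 three times
    belief = update_belief(belief, policies, action)

prediction = predict_actions(belief, policies)
print("posterior over levels:", belief)
print("predicted action distribution:", prediction)
```

In this toy setting, each observation of action 2 shifts probability mass toward the level-2 model, and the belief-weighted mixture gives the kind of opponent prediction a planner could use inside a receding-horizon loop.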

Provenance

The full processing record for this entry. Every stage of this paper's journey through the pipeline is logged — what ran, with which tool and model, how many attempts it took, and when it last completed.

Stage	Outcome	Tool	Model	Prompt	Attempts	Completed
discover	success	OpenAlex-citations	—	—	1	2026-06-18
archive	success	openalex	—	—	5	2026-06-25
extract	success	cached	—	—	2	2026-06-26
clean	success	clean	—	—	1	2026-06-18
chunk	success	chunk	—	—	1	2026-06-18
embed	success	embed	Qwen/Qwen3-Embedding-8B	—	1	2026-06-18
promote	success	—	—	—	1	2026-06-18
summarize	success	llm	qwen3.6-27b-prismaquant	summ-v5	1	2026-06-26
tag	success	vector_similarity	—	—	6	2026-06-18
verify	success	—	—	—	1	2026-06-26

Summary generated by qwen3.6-27b-prismaquant on 2026-06-26; verification: verified.

Topics

Ranked by relevance to this paper.

anticipation

Information type

What kind of knowledge this paper contributes, grouped by family — independent of topic (what it is about) and method (how it was studied).

Theoretical Contribution: computational model