A policy network is trained to produce instantaneous decision variables for low-level online MPC, whose cost functions are modulated to latently encode the safety conditions of collision avoidance and driveable surface boundaries.
In this work, we propose a novel learning-based model predictive control (MPC) framework for motion planning and control in urban self-driving. In this framework, the instantaneous references and cost functions of the online MPC are learned from raw sensor data, without relying on any oracle or predicted traffic states. Driving safety conditions are latently encoded through a learnable instantaneous reference vector. Specifically, we implement a deep reinforcement learning (DRL) framework for policy search, in which practical and lightweight raw observations are processed to reason about the traffic and provide the online MPC with instantaneous references.
The proposed approach is validated in a high-fidelity simulator, where it adapts well to complex and dynamic traffic. Sim-to-real deployments are also conducted to evaluate the generalizability of the proposed framework in various real-world applications.
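To make the division of labor concrete, the following is a minimal sketch of the loop described above, not the implementation used in this work. Everything in it is an assumption made for illustration: the rule-based `stub_policy` stands in for the trained DRL policy network, the observation key `gap_ahead`, the two-dimensional state (lateral offset, speed), the lane offset of 3.5 m, the cost weights, and the coarse grid search that replaces a proper MPC solver.

```python
from itertools import product

DT = 0.1       # integration step in seconds (assumed)
HORIZON = 10   # number of MPC prediction steps (assumed)


def stub_policy(obs):
    """Hypothetical stand-in for the trained policy network.

    Maps an observation to an instantaneous reference (y_ref, v_ref)
    and cost weights (w_y, w_v). In the framework described above this
    mapping is learned with DRL; here it is a hand-written rule.
    """
    if obs["gap_ahead"] < 15.0:
        # Obstacle close ahead: move the reference to the adjacent lane
        # and lower the target speed, so avoidance is encoded in the cost.
        return {"y_ref": 3.5, "v_ref": 8.0, "w_y": 5.0, "w_v": 1.0}
    return {"y_ref": 0.0, "v_ref": 12.0, "w_y": 1.0, "w_v": 1.0}


def rollout_cost(state, control, ref):
    """Accumulate the tracking cost of one constant control over the horizon."""
    y, v = state
    u_y, a = control
    cost = 0.0
    for _ in range(HORIZON):
        y += u_y * DT   # lateral offset driven by lateral velocity
        v += a * DT     # speed driven by acceleration
        cost += ref["w_y"] * (y - ref["y_ref"]) ** 2
        cost += ref["w_v"] * (v - ref["v_ref"]) ** 2
        cost += 0.1 * (u_y ** 2 + a ** 2)  # control effort penalty
    return cost


def mpc_step(state, ref):
    """Pick the best control from a coarse grid (crude stand-in for a solver)."""
    lateral = [-2.0, -1.0, 0.0, 1.0, 2.0]
    accel = [-3.0, -1.5, 0.0, 1.5, 3.0]
    return min(product(lateral, accel),
               key=lambda u: rollout_cost(state, u, ref))


def drive(state, obs, steps=30):
    """Closed loop: query the policy, solve the MPC, apply the first control."""
    for _ in range(steps):
        ref = stub_policy(obs)
        u_y, a = mpc_step(state, ref)
        state = (state[0] + u_y * DT, state[1] + a * DT)
    return state
```

Calling `drive((0.0, 12.0), {"gap_ahead": 10.0})` moves the vehicle toward the shifted lateral reference while it slows down, whereas a large `gap_ahead` leaves it in its lane at cruising speed. The point of the sketch is only that the low-level controller never sees the obstacle directly: the policy changes the reference and weights, and the MPC cost does the rest.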
Collision avoidance.
Lane change.
Overtaking.
Perform agile overtaking.
Manifest high maneuverability when handling emergent collisions caused by traffic uncertainties.
Lane change in static traffic.
Interactive overtaking with a single human driver.
Interactive overtaking with multiple human drivers.
@article{wang2024learning,
title={Learning the References of Online Model Predictive Control for Urban Self-Driving},
author={Wang, Yubin and Peng, Zengqi and Xie, Yusen and Li, Yulin and Ghazzai, Hakim and Ma, Jun},
journal={arXiv preprint arXiv:2308.15808},
year={2024}
}