TRAJECTORY OPTIMIZATION OF ROBOTS VIA MODEL PREDICTIVE CONTROL AND REINFORCEMENT LEARNING
DOI:
https://doi.org/10.62985/j.huit_ojs.vol26.no2E.413
Keywords:
Model Predictive Control, Reinforcement Learning, Proximal Policy Optimization
Abstract
Trajectory optimization for industrial robots remains a critical challenge because of complex kinematic constraints, environmental disturbances, and the strict real-time requirements of modern automation systems. This study proposes a hybrid control framework that combines Model Predictive Control (MPC) with Reinforcement Learning (RL) to generate efficient and robust robot trajectories. MPC performs short-horizon optimization under explicit system constraints, which keeps the planned motion precise and feasible. An RL policy trained with Proximal Policy Optimization (PPO) is integrated to learn long-term adaptive behavior that compensates for uncertainties such as sensor noise and variable payload conditions. Combining MPC and RL improves motion accuracy, shortens response time, and reduces energy consumption. Simulation and experimental results validate the proposed approach and show notable performance improvements over conventional control strategies. Future work will focus on real-time embedded deployment of the framework for intelligent manufacturing applications.
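The division of labor described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the single-joint double-integrator model, the weights Q and R, the input limit U_MAX, the horizon length, and the use of projected gradient descent as the solver for the constrained short-horizon problem. PPO training is not shown. The function standin_policy is a hypothetical placeholder for a trained PPO actor and simply returns a constant that offsets an unmodelled payload disturbance.

```python
import numpy as np

DT, HORIZON, U_MAX = 0.05, 15, 5.0          # all values are illustrative assumptions
A = np.array([[1.0, DT], [0.0, 1.0]])       # single-joint double-integrator model
B = np.array([0.5 * DT**2, DT])
Q = np.diag([50.0, 1.0])                    # position / velocity tracking weights
R = 0.1                                     # control-effort weight


def prediction_matrices(horizon):
    """Stack the dynamics so that X = SX @ x0 + SU @ U over the horizon."""
    sx = np.zeros((2 * horizon, 2))
    su = np.zeros((2 * horizon, horizon))
    for k in range(horizon):
        sx[2 * k:2 * k + 2] = np.linalg.matrix_power(A, k + 1)
        for j in range(k + 1):
            su[2 * k:2 * k + 2, j] = np.linalg.matrix_power(A, k - j) @ B
    return sx, su


SX, SU = prediction_matrices(HORIZON)
QBAR = np.kron(np.eye(HORIZON), Q)
HESSIAN = 2.0 * (SU.T @ QBAR @ SU + R * np.eye(HORIZON))
STEP = 1.0 / np.linalg.eigvalsh(HESSIAN).max()   # 1/L step size for gradient descent


def mpc_action(x, x_ref, iters=200):
    """Short-horizon tracking problem with box input constraints,
    solved by projected gradient descent; returns only the first input."""
    target = np.tile(x_ref, HORIZON)
    u_seq = np.zeros(HORIZON)
    for _ in range(iters):
        error = SX @ x + SU @ u_seq - target
        grad = 2.0 * (SU.T @ QBAR @ error + R * u_seq)
        u_seq = np.clip(u_seq - STEP * grad, -U_MAX, U_MAX)  # project onto the box
    return u_seq[0]


def zero_policy(x, x_ref):
    """Baseline: no learned correction."""
    return 0.0


def standin_policy(x, x_ref):
    """Hypothetical stand-in for a trained PPO actor: a constant payload offset."""
    return 1.0


def simulate(policy, steps=200, payload_accel=-1.0):
    """Roll out the hybrid loop against a payload the MPC model does not know."""
    x, x_ref = np.zeros(2), np.array([0.5, 0.0])
    applied = []
    for _ in range(steps):
        # MPC input plus learned correction, clipped to the actuator limit.
        u = np.clip(mpc_action(x, x_ref) + policy(x, x_ref), -U_MAX, U_MAX)
        applied.append(u)
        x = A @ x + B * (u + payload_accel)
    return abs(x[0] - x_ref[0]), np.array(applied)


err_mpc_only, _ = simulate(zero_policy)
err_hybrid, inputs = simulate(standin_policy)
print(f"final position error, MPC only:       {err_mpc_only:.4f}")
print(f"final position error, MPC + residual: {err_hybrid:.4f}")
```

In this sketch the MPC layer alone settles with a steady offset because its model omits the payload, while the additive correction removes that offset. A learned policy would have to discover such a correction from interaction data rather than receive it as a constant.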