Abstract
Deep reinforcement learning (RL) often relies on simulators as abstract oracles to model interactions within complex environments. While differentiable simulators have recently emerged for multi-body robotic systems, they remain underutilized despite their potential to provide richer information. This underutilization, coupled with the high computational cost of exploration and exploitation in high-dimensional state spaces, limits the practical application of RL in the real world. We propose a method that integrates learning with differentiable simulators to make exploration and exploitation more efficient. Our approach learns value functions, state trajectories, and control policies from locally optimal runs of a model-based trajectory optimizer. The learned value function acts as a proxy that shortens the preview horizon, while the approximated state and control policies guide the trajectory optimization. We benchmark our algorithm on three classical control problems and a torque-controlled 7-degree-of-freedom robot manipulator arm, demonstrating faster convergence and a more efficient symbiotic relationship between learning and simulation for end-to-end training of complex, poly-articulated systems.
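As a rough sketch of the idea described above (the notation here is introduced for illustration and is not taken from the abstract), the trajectory optimizer can solve a truncated-horizon problem in which the learned value function $\hat{V}_\theta$ serves as the terminal cost:
\[
\min_{u_{0:H-1}} \; \sum_{t=0}^{H-1} \ell(x_t, u_t) \; + \; \hat{V}_\theta(x_H)
\quad \text{s.t.} \quad x_{t+1} = f(x_t, u_t),
\]
where $f$ denotes the differentiable simulator dynamics, $\ell$ the running cost, and $H$ a preview horizon much shorter than the full task horizon; under this reading, the learned state and control policies supply the initial guesses for $x_{1:H}$ and $u_{0:H-1}$ that warm-start the optimizer.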