Taming an autonomous surface vehicle for path following and collision avoidance using deep reinforcement learning

Abstract

In this article, we explore the feasibility of applying proximal policy optimization, a state-of-the-art deep reinforcement learning algorithm for continuous control tasks, to the dual-objective problem of controlling an underactuated autonomous surface vehicle to follow an a priori known path while avoiding collisions with non-moving obstacles along the way. The AI agent, which is equipped with multiple rangefinder sensors for obstacle detection, is trained and evaluated in a challenging, stochastically generated simulation environment based on the OpenAI Gym Python toolkit. Notably, the agent is provided with real-time insight into its own reward function, allowing it to dynamically adapt its guidance strategy. Depending on its strategy, which ranges from radical path adherence to radical obstacle avoidance, the trained agent achieves an episodic success rate close to 100%.

Language

English

Author(s)

Eivind Meyer
Haakon Robinson
Adil Rasheed
Omer San

Affiliation

SINTEF Digital / Mathematics and Cybernetics
Norwegian University of Science and Technology
Oklahoma State University

Year

2020

Published in

IEEE Access

Volume

Page(s)

41466 - 41481

DOI

https://doi.org/10.1109/access.2020.2976586

Read fulltext

https://hdl.handle.net/11250/2723515

