Abstract
We propose a multi-sensor fusion pipeline for multi-object tracking on autonomous surface vessels using lidar and camera data. Our approach follows the tracking-by-detection paradigm, leveraging the precision of lidar for accurate state estimation and camera data for robust association. To reduce false tracks arising from lidar returns, the method suppresses non-moving objects on the basis of optical flow. We compare the proposed pipeline against prior work, particularly with respect to the use of lidar and stereo cameras as depth modalities, and the comparison demonstrates its effectiveness in improving tracking performance.