Multi-Label Video Classification

This MSc thesis aims to develop a multi-label video classifier for underwater inspection videos.

Figure 1: Segments of video with assigned labels

With recent advances in remote sensing technologies for inspecting marine structures such as ships, the need for automated data annotation and analysis has become apparent. During underwater ship inspections, videos of the inspected ships are recorded. These videos serve as the basis for the inspection report. The larger the inspected object, the longer the recording, and manual review and processing of the collected videos becomes increasingly infeasible.

To create a video classification model, both spatial and temporal information need to be considered. Deep learning models that aim to learn spatio-temporal information will be explored. Inspection videos differ conceptually from the videos used in human activity recognition, because only static objects are considered. Hence, the benefit of utilizing temporal information needs to be revisited.
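As an illustration of what "multi-label" means for a video snippet, the following minimal sketch assumes a frame-level model that outputs one score (logit) per label for each frame. It averages the per-frame probabilities over time and assigns every label whose mean probability reaches a threshold. The label names, the threshold, and the function `classify_snippet` are hypothetical and chosen only for this example. Mean pooling is one simple way of combining frames and is not necessarily the approach the thesis will take.

```python
import math

# Hypothetical label set for illustration; the actual labels are not given here.
LABELS = ["anode", "propeller", "marine_growth"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def classify_snippet(frame_logits, threshold=0.5):
    """Return the set of labels assigned to one video snippet.

    frame_logits: list of per-frame score lists, one score per label.
    Each label is decided independently (multi-label), so a snippet
    can receive zero, one, or several labels.
    """
    if not frame_logits:
        return set()
    num_frames = len(frame_logits)
    assigned = set()
    for i, label in enumerate(LABELS):
        # Temporal mean pooling of the per-frame probabilities for this label.
        mean_prob = sum(sigmoid(frame[i]) for frame in frame_logits) / num_frames
        if mean_prob >= threshold:
            assigned.add(label)
    return assigned


# Three frames of made-up scores for the three hypothetical labels.
snippet = [
    [2.0, -3.0, 1.0],
    [1.5, -2.5, 0.5],
    [2.5, -3.5, -0.2],
]
print(sorted(classify_snippet(snippet)))  # ['anode', 'marine_growth']
```

Because the inspected objects are static, a simple frame-wise baseline of this kind is a natural reference point against which spatio-temporal models can be compared.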


  • Identify deep learning approaches and algorithms for video understanding and labelling of video snippets.
  • Train an initial model that will serve as a basis for further development.
  • Implement the framework on Azure.
  • Evaluate the approach by using a relevant use case pilot in a current SINTEF innovation project (LIACi).

Expected Results and Learning Outcome

  • Definition of an appropriate deep learning model.
  • Software prototype.
  • Evaluation of prototype with real business use cases.
  • Publication.

Recommended prerequisites

Basic knowledge of Machine Learning and Computer Vision techniques.

Apply by e-mail to the supervisor.