Using Graph Analytics for Maritime Route Prediction
Background
There are many approaches to predicting maritime traffic with statistical learning, depending on whether we operate on the full or a reduced feature space. On the full feature space, we could use sequence-to-sequence deep learning models or regression methods on vessel positions and/or velocities. We can also reduce the feature space by discretizing the problem: by gridding in space and/or time, by detecting waypoints, or by constructing an auxiliary data structure. For example, the fixed-grid problem could be attacked using reinforcement learning (it might end up resembling a game of “Battleships”).
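As a rough illustration of the spatial gridding idea, the sketch below maps a vessel track onto a fixed latitude/longitude grid and collapses consecutive positions that fall in the same cell. The function names, the `cell_deg` parameter, and the sample coordinates are made up for this example; a real project would likely choose the grid resolution and projection more carefully.

```python
import math

def to_cell(lat, lon, cell_deg=0.5):
    """Map a position (in degrees) to an integer (row, col) grid cell.

    cell_deg is a hypothetical grid resolution in degrees.
    """
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

def discretize(track, cell_deg=0.5):
    """Turn a list of (lat, lon) positions into the sequence of visited cells.

    Consecutive positions in the same cell are collapsed into one entry.
    """
    cells = []
    for lat, lon in track:
        cell = to_cell(lat, lon, cell_deg)
        if not cells or cells[-1] != cell:
            cells.append(cell)
    return cells

# Made-up positions, for illustration only.
track = [(60.1, 5.2), (60.2, 5.3), (60.6, 5.4)]
cells = discretize(track)
```

The first two positions fall in the same cell, so the track reduces to two cells. The forecasting problem then operates on these discrete cell sequences instead of raw coordinates.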
In this thesis, we will attempt to cast the problem into a graph data structure, so that the forecasting question essentially becomes “Which sequence of edges is this vessel likely to traverse?”
Outline of the Work
The work is two-fold. First, we construct a route network (represented as a graph) of the Norwegian seas from Automatic Identification System (AIS) data. Second, we train machine learning models that use this network to predict the likely routes of ships.
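To make the two steps concrete, here is a minimal sketch that assumes each voyage has already been reduced to a sequence of node labels (for example waypoints or grid cells). It counts the observed transitions to form a weighted directed graph and then predicts a route by greedily following the most frequently observed outgoing edge. This frequency-based baseline stands in for the machine learning models mentioned above and is not the method of the thesis; the function names and the toy voyages are invented for illustration.

```python
from collections import defaultdict

def build_route_graph(voyages):
    """Count directed transitions between consecutive nodes across voyages."""
    graph = defaultdict(lambda: defaultdict(int))
    for nodes in voyages:
        for src, dst in zip(nodes, nodes[1:]):
            graph[src][dst] += 1
    return graph

def predict_route(graph, start, max_steps=10):
    """Greedily follow the most frequently observed outgoing edge.

    Stops at a node with no outgoing edges, when the next node was
    already visited (to avoid cycles), or after max_steps edges.
    """
    route = [start]
    visited = {start}
    current = start
    for _ in range(max_steps):
        successors = graph.get(current)
        if not successors:
            break
        nxt = max(successors, key=successors.get)
        if nxt in visited:
            break
        route.append(nxt)
        visited.add(nxt)
        current = nxt
    return route

# Toy voyages over hypothetical waypoints A..E.
voyages = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"], ["A", "E"]]
graph = build_route_graph(voyages)
route = predict_route(graph, "A")
```

A learned model would replace the greedy rule, for instance by conditioning the choice of the next edge on vessel type, destination, or the path travelled so far.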
There are many challenges that allow for experimentation. For example, how can we best process AIS data to construct the route network? How do we deal with AIS data that may not fit in memory on a single machine? How do we deal with the multi-scale nature of the network? How do we deal with the temporal variability of the graph (if it exists)? How should we phrase and execute the problem of learning routes? Should we explore Graph Neural Networks?
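One possible way to approach the memory question is to stream the records and keep only a small running state. The sketch below assumes a CSV export with hypothetical column names `mmsi`, `timestamp`, `lat`, and `lon`, already sorted by time. It remembers only the last grid cell per vessel and a counter of cell-to-cell transitions, so the full data set never needs to be held in memory. The `cell_deg` resolution and the sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter

def count_transitions(rows, cell_deg=0.5):
    """Stream time-ordered AIS rows and count cell-to-cell transitions.

    rows: any iterable of dicts with 'mmsi', 'lat' and 'lon' keys
    (hypothetical column names). Only the last cell per vessel and the
    edge counter are kept in memory.
    """
    last_cell = {}     # mmsi -> last grid cell seen for that vessel
    edges = Counter()  # (from_cell, to_cell) -> number of observations
    for row in rows:
        cell = (int(float(row["lat"]) // cell_deg),
                int(float(row["lon"]) // cell_deg))
        prev = last_cell.get(row["mmsi"])
        if prev is not None and prev != cell:
            edges[(prev, cell)] += 1
        last_cell[row["mmsi"]] = cell
    return edges

# Made-up sample standing in for a large file on disk.
SAMPLE = """mmsi,timestamp,lat,lon
111,2024-01-01T00:00:00,60.1,5.2
222,2024-01-01T00:01:00,60.2,5.3
111,2024-01-01T00:10:00,60.6,5.4
222,2024-01-01T00:11:00,60.7,5.1
"""
edges = count_transitions(csv.DictReader(io.StringIO(SAMPLE)))
```

For a real file, the `io.StringIO` object would be replaced by an open file handle. If the data is not sorted by time, it would first need an external sort or a partitioning step by vessel.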
If you are more interested in data engineering (rather than data science), we might also explore how to best speed up generation, processing, storage, and loading/saving of such a graph.
Learning Outcome
You will become familiar with (a) graph data structures, (b) graph processing, (c) supervised and unsupervised machine learning, (d) AIS data, and (e) graph analytics.
Prerequisites
To hit the ground running, you should have a passing familiarity with (a) the Python data science stack, (b) supervised and unsupervised machine learning algorithms, (c) model evaluation, and (d) graphs and how they are represented.