Abstract
Laparoscopic surgery has become increasingly popular for the resection of malignant liver lesions because it is minimally invasive, allows faster patient recovery, and shortens hospital stays. Technological advancements in minimally invasive surgery, such as image-guided navigation and augmented reality, have significantly improved liver surgical planning and outcomes. Knowledge of the distribution of the liver vascular structures (hepatic and portal veins) and their relationship to tumors is critical for achieving adequate surgical margins while preserving healthy tissue. This ensures that the remaining liver has sufficient blood supply to maintain healthy function. Extracting these structures from medical images greatly assists surgeons, but doing so manually is labor-intensive. Deep learning (DL) based methods have shown tremendous potential for segmenting medical structures, with the key benefits of automation, reliability, and robustness. This PhD thesis investigates DL-based approaches to automatically segment the liver and its vessels for patient-specific model (PSM) generation, along with methods to enhance segmentation performance from multiple perspectives.
The thesis comprises five papers: a review and four technical contributions. First, we reviewed existing DL-based approaches for liver image analysis on multi-modality images. We studied the application of DL strategies and their influence on image analysis. We then identified gaps and potential research directions, such as learning strategies, image enhancement, and alternative architectures, that could enhance segmentation performance. Hyperparameter tuning in DL architectures offers possibilities to improve segmentation performance. Furthermore, generalizing the parameter selections can lead to faster model convergence and reduce the training time required to achieve optimal results. In paper three, we studied different combinations of hyperparameters on multiple datasets and network architectures. Task-specific optimal combinations for different architectures were proposed based on quantitative and statistical analysis. In paper four, we investigated the influence of various vessel enhancement filters on DL-based vessel segmentation. The paper proposed combining either the filtered outputs or the segmentations from models trained on differently enhanced images. The proposed methods significantly improved segmentation accuracy. In addition, the clinical applicability of the proposed methods was studied through qualitative analysis. Convolutional neural networks (CNNs) are highly effective at detecting features but are limited in determining the spatial relationships between those features. The thesis addressed this by exploring alternative semantic segmentation architectures based on the capsule network in paper two and vision transformers in paper five. Paper two examined whether SegCaps, a capsule network, could challenge CNNs in complex medical segmentation tasks and attempted to use EM-routing for segmentation. The paper showed that the capsule network can segment simple structures but struggles with complex tasks.
In contrast, in paper five, the vision transformer SwinUNETR significantly outperformed CNNs in segmenting the hepatic and portal veins. Furthermore, a multi-task learning strategy improved segmentation accuracy and sped up model convergence, reducing the training time compared with the baseline architecture. Overall, our work identified different strategies to optimize the performance of DL-based segmentation methods for extracting the structures relevant to pre-operative and intra-operative use in liver surgery.