Abstract
The purpose of the present study is to evaluate inter-operator and intra-operator variation in the manual segmentation of the hippocampus from high-resolution T1-weighted magnetic resonance (MR) images. The hippocampus was segmented in MR images of 118 epileptic and 25 non-epileptic patients (65 males, 78 females; median age of 36 years, mean age of 39 years), manually by three operators (M1, M2, M3) and automatically by three methods (FreeSurfer, LocalInfo, ABSS). Inter-operator variability was evaluated to determine how much the manual segmentations performed by one operator differ from those of another operator. To assess intra-operator variability, i.e., how much the manual segmentations done by one operator vary over time, the manual segmentations from each operator were compared to the automated segmentations. To this end, relative absolute volume difference (RAVD), volume asymmetry, Dice coefficient, precision, similarity, specificity, accuracy, negative predictive value (NPV), Hausdorff distance, root mean square (RMS), average symmetric surface distance (ASSD), and mean distance were calculated. The segmentation results of one operator were taken as the ground truth for evaluating the segmentation results of the other operators and of the automatic segmentation methods. Hausdorff distance and precision differed when the automated techniques were used as the test segmentation with the M3 segmentation as the ground truth, rather than with the M1 or M2 segmentation as the ground truth. The standard deviation of the performance measures tended to be higher when operator M3 was used as the ground truth and either operator M1 or M2 as the test segmentation. This variation in performance measures with M3 as the ground truth is indicative of inter-operator variation. When performance measures of the automated techniques were compared against the manual segmentations, standard deviations were larger with operator M2 as the ground truth than with operator M1.
This suggests that operator M2 exhibited larger intra-operator variation than operator M1. Among the automatic segmentation methods, ABSS was the most effective on many measures (RAVD, Dice coefficient, similarity, specificity, accuracy, NPV, Hausdorff distance, RMS, ASSD), while FreeSurfer and LocalInfo were more effective for precision, mean distance, and lateralization of epileptogenicity. Inter-operator error was likely due to the temporal separation of the segmentations; it may therefore be reduced by having all operators work in the laboratory at the same time and undergo the same training, although some inter-operator variability may be unavoidable. Intra-operator variation can likely be reduced with further training and supervision of the operators by a neuroradiologist with expertise in hippocampal anatomy. Future automated segmentation techniques may incorporate elements of both atlas-based (FreeSurfer and LocalInfo) and neural-network-based (ABSS) segmentation techniques for optimal performance.
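For readers unfamiliar with the overlap measures named above, the following is a minimal sketch of how several of them are commonly defined when a test segmentation is compared against a ground-truth segmentation. It is not the study's implementation: the function names, the representation of a segmentation as a set of voxel indices, and the toy data are assumptions made for illustration, and the Hausdorff distance is computed over all labelled voxels in voxel units rather than over extracted surfaces.

```python
from math import dist  # Euclidean distance, Python 3.8+


def overlap_metrics(test_seg, truth_seg, n_voxels):
    """Voxel-overlap measures for two non-empty segmentations.

    test_seg, truth_seg: sets of (i, j, k) voxel indices labelled hippocampus.
    n_voxels: total number of voxels in the image (hypothetical parameter,
    needed to count true negatives).
    """
    tp = len(test_seg & truth_seg)   # labelled in both
    fp = len(test_seg - truth_seg)   # labelled only in the test segmentation
    fn = len(truth_seg - test_seg)   # labelled only in the ground truth
    tn = n_voxels - tp - fp - fn     # labelled in neither
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "precision": tp / (tp + fp),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / n_voxels,
        "npv": tn / (tn + fn),
        # relative absolute volume difference, with volume as voxel count
        "ravd": abs(len(test_seg) - len(truth_seg)) / len(truth_seg),
    }


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two voxel sets (brute force)."""
    def directed(p, q):
        # largest distance from a point in p to its nearest point in q
        return max(min(dist(x, y) for y in q) for x in p)
    return max(directed(a, b), directed(b, a))


# Toy example on a 3 x 3 x 3 image (27 voxels); the data are made up.
truth_seg = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
test_seg = {(0, 0, 1), (0, 1, 1), (0, 0, 2)}
metrics = overlap_metrics(test_seg, truth_seg, n_voxels=27)
hd = hausdorff(test_seg, truth_seg)
```

Swapping which segmentation is treated as the ground truth changes precision, specificity, NPV, and RAVD but leaves the Dice coefficient and the symmetric Hausdorff distance unchanged, which is one reason results can depend on the operator chosen as reference.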
Citation Format: Benjamin Huber, Esmaeil Davoodi-Bojd, Hamid Soltanian-Zadeh. Identifying sources of variation in manual segmentation of hippocampus on from magnetic resonance images (MRI) [abstract]. In: Proceedings of the AACR Virtual Meeting: COVID-19 and Cancer; 2021 Feb 3-5. Philadelphia (PA): AACR; Clin Cancer Res 2021;27(6_Suppl):Abstract nr P03.