The authors have addressed all the review comments.
Substantial revision is required to address the presentation of the proposed method and experiments.
a. Clear and unambiguous, professional English used throughout.
The paper is generally well written and easy to follow, although some sentences are confusing and need revision:
Lines 252-253 (“The experiments were implemented using the PyTorch (v1.8.0) deep learning library with Tensorflow backend in Python”) are confusing, as PyTorch and TensorFlow are two different DL libraries and, as far as I know, TensorFlow cannot be used as a backend for PyTorch;
Lines 273-274 (“… the non-contrast and contrast-enhanced datasets in consecutive arrays were combined in another experiment”) mention “another experiment”; please make clear which experiment is referred to here.
b. Literature references, sufficient field background/context provided.
The paper introduces sufficient background on AAA segmentation. However, although coordinate information and transfer learning are the two main techniques of this paper, the related work section introduces them only briefly. As they were not originally proposed by the authors, it is important to provide more details on these two methods.
Also, in lines 156-157 (“ResNet is a kind of popular network that has been proven effective in medical data”), ResNet deserves a more detailed explanation, and a reference should be provided for the claim that it is effective on medical data.
c. Professional article structure, figures, tables. Raw data shared.
The training curves in Figure 5 are hard to read. Increasing the font size and the figure resolution is encouraged.
d. Self-contained with relevant results to hypotheses.
No comment.
e. Formal results should include clear definitions of all terms and theorems and detailed proofs.
No comment.
a. Original primary research within the Aims and Scope of the journal.
No comment.
b. Research question well defined, relevant & meaningful. It is stated how research fills an identified knowledge gap.
The paper conducts extensive experiments on the proposed coordinate integration methods. As positional encoding is a hot topic in computer vision, how does the proposed scheme compare to other positional encoding methods, for example the fixed positional encoding in [1] (which uses sinusoids to encode the position) or a learnable positional encoding?
[1] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł. and Polosukhin, I., 2017. Attention is all you need. Advances in neural information processing systems, 30.
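For concreteness, the fixed encoding in [1] can be sketched as below. This is only a minimal illustration of the formula PE(pos, 2i) = sin(pos / 10000^(2i/d_model)), PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)); the function name `sinusoidal_encoding` and the choice to encode a single scalar position (e.g. a voxel index along one axis) are my own assumptions, not something taken from the manuscript.

```python
import math


def sinusoidal_encoding(position, d_model):
    """Fixed sinusoidal encoding of one scalar position, following [1].

    Returns a list of length d_model in which even indices hold sines and
    odd indices hold cosines of the same angle.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    encoding = []
    for i in range(d_model // 2):
        # The wavelength grows geometrically with the pair index i.
        angle = position / (10000 ** (2 * i / d_model))
        encoding.append(math.sin(angle))
        encoding.append(math.cos(angle))
    return encoding
```

For a 3D volume, one possible (hypothetical) extension would be to encode each of the three voxel indices separately and concatenate the results as extra input channels.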
c. Rigorous investigation performed to a high technical & ethical standard.
No comment.
d. Methods described with sufficient detail & information to replicate.
I suggest changing line 233 to “the input data for CNN will be two or four”.
a. Impact and novelty not assessed. Meaningful replication encouraged where rationale & benefit to literature is clearly stated.
No comment.
b. All underlying data have been provided; they are robust, statistically sound, & controlled.
No comment.
c. Conclusions are well stated, linked to original research question & limited to supporting results.
c.1: Table 2 shows that integrating coordinate information as an additional input yields only a marginal improvement over the baseline UNet. For example, on the contrast-enhanced dataset the DSC and JSC improvement for UNet is less than 0.2%. This challenges the main contribution of this paper.
On the other hand, the coordinate information significantly improves the performance of DenseVoxNet, and the authors explain in lines 376-379 that coordinate information makes the network converge more easily. I would suggest investigating this direction further.
c.2: The authors state that the main limitation of CNN-based methods is the limited input size imposed by GPU memory constraints. The common practice in medical deep learning is patch-based inference: a 512x512x64 volume can be split into multiple smaller patches, e.g. 256x256x32, so that the original resolution is maintained. In this paper, the authors instead choose to scale down the image volume.
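As an illustration of the patch-based alternative, the sketch below computes the index ranges needed to tile a volume with fixed-size patches. It is a minimal example under my own assumptions: the helper name `patch_slices` is hypothetical, patches are non-overlapping unless a stride is given, and a final border-aligned patch is added when the stride does not divide the volume evenly.

```python
from itertools import product


def patch_slices(volume_shape, patch_shape, stride=None):
    """Return a list of slice tuples that tile the volume with patches.

    stride defaults to patch_shape, which gives non-overlapping patches.
    """
    stride = stride or patch_shape
    starts_per_axis = []
    for size, patch, step in zip(volume_shape, patch_shape, stride):
        if patch > size:
            raise ValueError("patch is larger than the volume")
        starts = list(range(0, size - patch + 1, step))
        # If the stride leaves a gap at the border, add one patch flush
        # with the end of the axis so every voxel is covered.
        if starts[-1] + patch < size:
            starts.append(size - patch)
        starts_per_axis.append(starts)
    return [
        tuple(slice(s, s + p) for s, p in zip(corner, patch_shape))
        for corner in product(*starts_per_axis)
    ]


# A 512x512x64 volume split into 256x256x32 patches gives 2 * 2 * 2 = 8 patches.
patches = patch_slices((512, 512, 64), (256, 256, 32))
```

Each slice tuple can index the full-resolution volume directly, so the network sees the original voxel spacing; predictions would then be written back into an output volume at the same slices.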
This paper proposes incorporating voxel position as an additional input for AAA segmentation. The paper needs more work on the related work section, including how the proposed coordinate encoding scheme compares to previous work, and why the scheme would be a good fit for the problem as well as for transfer learning.
The authors conduct extensive experiments on various network architectures. Nevertheless, the improvement over the baseline UNet is marginal. Also, the novelty of the proposed method is insufficient and most of the work is derivative. I suggest looking at more ways to incorporate positional information, e.g. injecting it in multiple layers or using a better encoding scheme.
This paper presents a new 3D AAA segmentation approach that incorporates coordinate information to improve the segmentation results. The authors have tested the proposed method on various network architectures, including UNet, AG-DSV-UNet, VNet, ResNetMed, and DenseVoxNet, and transfer learning from a network pre-trained on the pre-operative dataset to post-operative EVAR.
1. The authors do not explain how the coordinate information is obtained for test images, which is critical to applying the proposed method in real-world applications. In practice, the method will be used to predict on unseen images, and obtaining coordinate information for these images could be difficult.
** Please provide the related explanation in the paper.
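To make the request concrete: if the coordinate input is simply the normalized voxel index along each axis (this is an assumption on my part; the manuscript does not say), then it depends only on the volume shape and can be generated for any unseen image, as in the hypothetical sketch below. If the coordinates are instead defined relative to an anatomical landmark or a registration step, that procedure needs to be described.

```python
def axis_coordinates(size):
    """Normalized positions in [0, 1] for `size` voxels along one axis."""
    if size == 1:
        return [0.0]
    return [i / (size - 1) for i in range(size)]


def coordinate_channels(shape):
    """Build three coordinate channels for a (depth, height, width) volume.

    Each channel is a nested list with the same shape as the volume and
    stores the normalized z, y, or x index of every voxel. This assumes
    index-based coordinates, not the authors' actual scheme.
    """
    depth, height, width = shape
    zs, ys, xs = (axis_coordinates(n) for n in shape)
    z_channel = [[[zs[k]] * width for _ in range(height)] for k in range(depth)]
    y_channel = [[[ys[j]] * width for j in range(height)] for _ in range(depth)]
    x_channel = [[list(xs) for _ in range(height)] for _ in range(depth)]
    return z_channel, y_channel, x_channel
```

Stacked with the intensity volume, these three channels would give a four-channel input.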
2. Some recently published related works are neither cited nor compared against in the paper. For example:
[1] Wang, Yan, Florent Seguro, Evan Kao, Yue Zhang, Farshid Faraji, Chengcheng Zhu, Henrik Haraldsson, Michael Hope, David Saloner, and Jing Liu. "Segmentation of lumen and outer wall of abdominal aortic aneurysms from 3D black-blood MRI with a registration based geodesic active contour model." Medical image analysis 40 (2017): 1-10.
[2] Salvi, Anish, Ender Finol, and Prahlad G. Menon. "Convolutional Neural Network-based Segmentation of Abdominal Aortic Aneurysms." In 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), pp. 2629-2632. IEEE, 2021.
[3] Dziubich, Tomasz, Paweł Białas, Łukasz Znaniecki, Joanna Halman, and Jakub Brzeziński. "Abdominal aortic aneurysm segmentation from contrast-enhanced computed tomography angiography using deep convolutional networks." In ADBIS, TPDL and EDA 2020 Common Workshops and Doctoral Consortium, pp. 158-168. Springer, Cham, 2020.
** Please read and cite these papers accordingly in your paper.
1. No validation set is created or used in transfer learning, so it is unclear to me how the authors tune the model hyper-parameters in this case. Please describe this in detail in the paper.
2. Please provide a detailed definition of the Dice score, including how it is computed.
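For reference, the standard definition I would expect is DSC = 2|P ∩ G| / (|P| + |G|), where P is the predicted binary mask and G is the ground truth, with the related Jaccard index JSC = |P ∩ G| / |P ∪ G|. A minimal sketch follows; the function names and the convention of returning 1.0 for two empty masks are my own assumptions, and the authors' implementation may differ (for instance by adding a smoothing term or averaging per case).

```python
def _overlap_counts(prediction, ground_truth):
    """Return (intersection, |P| + |G|) for two flat binary masks."""
    if len(prediction) != len(ground_truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, g in zip(prediction, ground_truth) if p and g)
    total = sum(1 for p in prediction if p) + sum(1 for g in ground_truth if g)
    return intersection, total


def dice_score(prediction, ground_truth):
    """DSC = 2 * |P ∩ G| / (|P| + |G|)."""
    intersection, total = _overlap_counts(prediction, ground_truth)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


def jaccard_score(prediction, ground_truth):
    """JSC = |P ∩ G| / |P ∪ G|, with |P ∪ G| = |P| + |G| - |P ∩ G|."""
    intersection, total = _overlap_counts(prediction, ground_truth)
    union = total - intersection
    if union == 0:
        return 1.0  # same assumed convention as above
    return intersection / union
```

The paper should also state whether the reported scores are computed per volume and then averaged, or over all voxels of the test set at once, since the two can differ noticeably.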
No comment.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.