All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
Dear Author,
Your paper has been revised. It has been accepted for publication in PeerJ Computer Science. Thank you for your fine contribution.
[# PeerJ Staff Note - this decision was reviewed and approved by Mehmet Cunkas, a PeerJ Section Editor covering this Section #]
The paper is well written and clear. The introduction and background contextualize the study clearly. The references are up-to-date with recent research, and the structure conforms to PeerJ standards.
Furthermore, the authors have comprehensively addressed all the reviewers' comments. The revisions and additions to the manuscript are satisfactory and have significantly improved the clarity and rigor of the work.
The paper presents a highly relevant and timely contribution to the field of multimodal authentication. The work is technically rigorous, featuring an innovative fusion of facial and iris data using a sophisticated GAN-based architecture, and it commendably addresses important ethical considerations such as data anonymization. The methodology is exceptionally well-documented, ensuring full reproducibility through detailed descriptions of the GAN architectures, hyperparameters, and data preprocessing steps. The authors have further strengthened the manuscript by proactively addressing initial suggestions; they have now included a clear data availability statement for the specified datasets and enhanced the evaluation by incorporating robust metrics like the Detection Cost Function (DCF) and F1-score alongside the reported accuracy, FAR, FRR, and EER. These revisions have significantly elevated the quality and completeness of an already outstanding paper.
The work demonstrates significant replication value by building upon prior GAN-based authentication frameworks and adding considerable novelty through its innovative multimodal fusion approach and rigorous spoofing resistance tests. The conclusions are well-supported by promising results and are strengthened by the authors' forthright acknowledgment of limitations, particularly the computational complexity inherent to the GAN architecture, which may affect real-time deployment. The future directions are well-identified and appropriately focus on the critical next steps of optimizing computational efficiency and enhancing resilience against emerging threats, which will be vital for translating this research into real-world applications in high-impact domains such as access control systems, border security, and law enforcement.
The authors have further strengthened the manuscript by including a comparative analysis with non-GAN-based baseline methods to more fully contextualize the performance advantages of the proposed approach.
Overall, the authors have addressed all major comments properly, and the scientific quality of the manuscript has been enhanced.
**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
**Language Note:** When preparing your next revision, please ensure that your manuscript is reviewed either by a colleague who is proficient in English and familiar with the subject matter, or by a professional editing service. PeerJ offers language editing services; if you are interested, you may contact us at [email protected] for pricing details. Kindly include your manuscript number and title in your inquiry. – PeerJ Staff
(1) Some content is unclear in the Abstract. “Subsequently, the features train a generative adversarial network to produce synthesized biometric templates.” What is the role of the synthesized biometric templates?
(2) The figures are blurred and plotted unprofessionally.
(3) In the Discussion section, the authors merely state non-invertibility. Non-invertibility and correlation attacks should actually be analyzed and evaluated.
(4) Many state-of-the-art (SOTA) cancelable biometric methods are missing and should be introduced in the related work.
(5) Please insert the figures and tables directly into the text. It is difficult to read the text without the figures and tables.
(6) Some figures are unclear. Please provide as much information as possible, then the readers can get the details from each separate figure easily, even without the text. E.g., how to fuse binary features in Fig. 1?
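To illustrate the kind of detail the reviewer is asking for in Fig. 1, here is a purely illustrative sketch (not the authors' actual method) of two common ways binary biometric feature vectors can be fused: concatenation, which keeps all bits from both modalities, and bitwise XOR, which produces a template of the original length.

```python
def fuse_concat(iris_bits, face_bits):
    """Feature-level fusion by concatenation: keeps all bits from both modalities."""
    return iris_bits + face_bits

def fuse_xor(iris_bits, face_bits):
    """Bitwise XOR fusion: requires equal-length binary templates."""
    if len(iris_bits) != len(face_bits):
        raise ValueError("XOR fusion requires equal-length binary templates")
    return [a ^ b for a, b in zip(iris_bits, face_bits)]

iris = [1, 0, 1, 1]
face = [0, 0, 1, 0]
print(fuse_concat(iris, face))  # [1, 0, 1, 1, 0, 0, 1, 0]
print(fuse_xor(iris, face))     # [1, 0, 0, 1]
```

A figure caption that names the fusion rule at this level of precision would let readers follow the pipeline without consulting the text.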
The manuscript is composed in clear language. Nonetheless, a few specialized terms would benefit from brief definitions, particularly for readers not versed in GAN architectures or biometric fusion techniques. The introduction successfully situates the research and articulates the rationale for integrating multimodal biometrics with GANs. It would be strengthened by a sharper articulation of the novelty of the proposed keyless mechanism in relation to existing methods. The literature review is thorough; however, some citations, such as "Tarek 2021," are missing from the reference list. Confirm that every in-text citation is fully represented in the reference section.
While the three fusion schemes, Feature-Level, GAN-based, and Decision-Level, are articulated, additional clarity on the GAN architecture should be included: specifically, the number and types of layers, the choice of activation functions, and the exact training parameters, such as batch size. Leveraging the CASIA-V3-Internal, MMU1, ORL, and FERET datasets is justifiable, yet the manuscript must better defend the pairing of iris and face samples across these datasets. Addressing the origins and any intrinsic biases of these samples will strengthen the validity of cross-dataset fusion. Though the MATLAB implementation and the computing infrastructure are mentioned, the manuscript reports only EER, FAR, and FRR; a wider view would be attained by also reporting computational time and, if feasible, energy consumption, to provide a fuller picture of the fusion schemes' performance in practical settings.
The findings show that the system achieves both higher accuracy and greater security, attaining equal error rates as low as 0.0297%. The contrast with unimodal and previously published multimodal approaches is compelling, yet the authors should expand the discussion on why Decision-Level fusion yields better performance, perhaps by quantifying the contributions of individual modalities and the mode of integration. The authors convincingly argue that the framework is resilient to pre-image and correlation attacks, grounded in theoretical considerations; however, including a suite of simulated attack scenarios would provide the empirical evidence needed to corroborate these claims fully. While Algorithms 1 and 2 are thorough, a pseudocode layout would improve the flow and comprehension, allowing readers to quickly grasp the logic steps.
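As context for the reported 0.0297% EER, the following is a generic sketch (a standard threshold-sweep approach, not the authors' code) of how an EER can be estimated from genuine and impostor Hamming-distance distributions: FAR and FRR move in opposite directions as the decision threshold shifts, and the EER is taken where the two rates cross.

```python
def far_frr(genuine, impostor, threshold):
    """A match is accepted when distance <= threshold.
    FRR: fraction of genuine pairs rejected; FAR: fraction of impostor pairs accepted."""
    frr = sum(d > threshold for d in genuine) / len(genuine)
    far = sum(d <= threshold for d in impostor) / len(impostor)
    return far, frr

def estimate_eer(genuine, impostor, steps=1000):
    """Sweep thresholds over [0, 1]; return (eer, threshold) where |FAR - FRR| is smallest."""
    best = min(
        (abs(far - frr), (far + frr) / 2, t)
        for t in (i / steps for i in range(steps + 1))
        for far, frr in [far_frr(genuine, impostor, t)]
    )
    return best[1], best[2]

genuine = [0.10, 0.12, 0.15, 0.18, 0.20]   # Hamming distances, same-subject pairs
impostor = [0.40, 0.45, 0.48, 0.50, 0.55]  # Hamming distances, different-subject pairs
eer, thr = estimate_eer(genuine, impostor)
print(eer)  # 0.0 -- the toy distributions are fully separable
```

Reporting the threshold sweep alongside the single EER figure would also make the quantitative comparison between fusion schemes easier to reproduce.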
Comment 1: The figures show the results, yet higher resolution and more informative labeling would enhance them significantly. For example, in Figure 6 the axis could state "Hamming Distance Threshold" rather than the abbreviated "Hamming Distance" to provide clearer context.
Comment 2: The authors state that no human participants were involved, yet a brief statement on how the FERET dataset and any other datasets were obtained, anonymized, and used in compliance with ethical guidelines would strengthen the ethical transparency.
Comment 3: The conclusion identifies some limitations, yet a more detailed list of future directions would be valuable, such as specifically mentioning planned adversarial robustness evaluations and strategies for scaling the approach to significantly larger datasets in practical deployment.
Comment 4: The manuscript rightly notes the heavy computational burden of GAN training, yet it would benefit from a brief survey of current and emerging optimization techniques, such as pruned lightweight generator architectures, that could facilitate deployment in resource-constrained environments.
Strengths:
• Clarity & Language: The manuscript is written in clear, professional English with no major grammatical errors, apart from a few minor typing errors. However, some technical terms could benefit from brief definitions for broader accessibility.
• Introduction & Background: The introduction effectively contextualizes the research, clearly stating the motivation and gaps in the literature. References are relevant and up-to-date, though a few key recent studies could strengthen the background.
• The structure follows PeerJ standards, with distinct sections for Materials & Methods, Results, and Discussion.
Suggestions:
• Clarify acronyms (e.g., define "ECG" or "CNN" at first use for non-expert readers).
• Expand the background on GANs’ vulnerabilities in authentication, such as mode collapse, to justify the proposed improvements.
Strengths:
• Scope: The study fits PeerJ’s scope and addresses a timely gap: robust multimodal authentication.
• Technical Rigor: The fusion of facial and iris data using a multimodal fusion approach based on the standard GAN model is innovative. Ethical considerations (e.g., anonymization) are taken into account.
• Reproducibility: The Methods section details GAN architectures (e.g., discriminator layers), hyperparameters (learning rate, batch size), and data preprocessing.
Concerns/Suggestions:
• Dataset Availability: The authors should specify if datasets (CASIA, MMU1, ORL, and FERET) are publicly available or require permissions.
• Evaluation Metrics: While accuracy, FAR, FRR, and EER are reported, additional metrics would strengthen the evaluation, such as the Detection Cost Function (DCF), which combines FAR and FRR, and the F1-score.
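As an illustration of the suggested metrics, here is a minimal sketch using standard textbook definitions; the cost weights and prior in `detection_cost` are assumed defaults, not values from the manuscript.

```python
def detection_cost(far, frr, c_fa=1.0, c_fr=1.0, p_genuine=0.5):
    """Weighted detection cost: DCF = C_fr * P(genuine) * FRR + C_fa * P(impostor) * FAR."""
    return c_fr * p_genuine * frr + c_fa * (1.0 - p_genuine) * far

def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall over genuine (positive) matches."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

print(detection_cost(far=0.02, frr=0.04))  # ~0.03 with equal costs and priors
print(f1_score(tp=95, fp=2, fn=5))
```

Because the DCF lets the application weight false accepts against false rejects, it conveys security trade-offs that a single accuracy number cannot.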
Strengths:
• Replication Value: The work replicates prior GAN-based authentication works, but adds value via multimodal fusion and spoofing resistance tests.
• Conclusions: The results are promising and limitations are acknowledged, in particular the computational complexity introduced by the GAN-based transformation, which may impact real-time performance and make the approach less suitable for resource-constrained environments.
• Future Directions: Well identified. They should focus on optimizing computational efficiency and enhancing resilience against emerging threats.
• Emphasize Real-World Impact: For instance, "Access control systems, border security, and law enforcement."
Suggestions:
• Strengthen the Discussion: It would be beneficial to compare results to non-GAN-based baselines, if possible.
This paper presents a compelling and timely investigation into GAN-based multimodal authentication, addressing a significant challenge in modern security systems. The innovative fusion of iris and facial biometrics through generative adversarial networks demonstrates both technical sophistication and practical relevance. The research makes a valuable contribution to the field by advancing anti-spoofing capabilities while maintaining usability - a crucial balance in authentication systems.
The experimental design is rigorous and well-considered, employing appropriate evaluation metrics (EER, FAR, FRR) that properly reflect the security-critical nature of authentication tasks. The technical execution is particularly strong, with clear descriptions of the GAN architecture and training procedures that enhance reproducibility.
The manuscript is well-structured and clearly written, making complex concepts accessible without oversimplifying the technical content. The figures and tables effectively support the narrative. The literature review is comprehensive and properly contextualizes the work within current research trends.
The results are convincing and well-presented, with appropriate statistical validation and comparison to existing methods. The discussion thoughtfully addresses the strengths of the approach. The conclusion is well-supported by the evidence presented, addresses limitations, and suggests promising directions for future work.
Minor Suggestions for Enhancement
To further strengthen this already excellent work, the authors might consider expanding the discussion of computational efficiency, as this often proves crucial for real-world deployment of authentication systems. Additionally, a brief comparison with non-GAN-based multimodal approaches could help readers better appreciate the specific advantages of the chosen methodology. If space permits, including examples of synthetic data generation might provide a helpful visualization of the GAN's capabilities.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.