All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
The comments have been addressed.
[# PeerJ Staff Note - this decision was reviewed and approved by Vicente Alarcon-Aquino, a PeerJ Section Editor covering this Section #]
None
None
None
None
The authors have revised the manuscript in a clear and unambiguous manner.
No comments
It is now clear.
None
Spelling and grammar errors should also be corrected.
**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
**Language Note:** The Academic Editor has identified that the English language must be improved. PeerJ can provide language editing services - please contact us at copyediting@peerj.com for pricing (be sure to provide your manuscript number and title). Alternatively, you should make your own arrangements to improve the language quality and provide details in your response letter. – PeerJ Staff
The literature review is comprehensive and provides a historical context for the research. It discusses various related works and their approaches to spoof detection, highlighting the evolution of techniques over time.
The Results section presents the performance metrics of the proposed models and the final architecture. It is well-organized and provides a clear picture of the methodology's effectiveness.
Overall, the paper effectively compares the proposed methodology's performance with existing models in the literature, highlighting the approach's strengths.
The article is highly relevant to the field of biometric authentication, as spoofing attacks on speaker verification systems have become a significant concern. The authors do an excellent job of explaining the importance of their research in addressing these vulnerabilities.
The methodology is well-detailed and structured. It describes the feature extraction techniques (MFCCs and spectrograms) and the neural network architectures used in a clear and organized manner. The inclusion of figures and diagrams enhances the understanding of the proposed models.
1. The article discusses the challenges and countermeasures related to spoofing attacks in automatic speaker verification systems. While the problem is not entirely novel, the proposed methodology, combining Mel Frequency Cepstral Coefficients (MFCCs) and spectrograms with convolutional neural networks (CNNs), seems to be a unique approach.
2. Using the ASVspoof 2017 V2 database for evaluation adds credibility to the results.
Minor observation:
3. Clarifying the process of assembling the different models into the final architecture would further benefit the article. Explain how the outputs of the individual models are combined for classification; a diagram or flowchart of this process would enhance understanding.
Minor Suggestions:
These minor suggestions might further benefit the article and its readers, but they are not essential, as the paper is well structured and contains ample information.
1. More information about the specific hyperparameters used in training the neural networks, such as the learning rate or dropout rate, would enhance the reproducibility and scientific rigor of the work.
2. The article could benefit from a more explicit mention of the results achieved in terms of accuracy and Equal Error Rate (EER) to provide a more precise summary of the paper's contributions.
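For readers unfamiliar with the metric mentioned above, a minimal sketch of how an Equal Error Rate might be computed from detection scores (the scores, labels, and threshold sweep here are illustrative assumptions, not the authors' code or data):

```python
import numpy as np

def compute_eer(scores, labels):
    """Equal Error Rate: operating point where the false-accept rate
    (spoof accepted as genuine) equals the false-reject rate
    (genuine rejected). We sweep candidate thresholds and return the
    average of FAR and FRR at the closest crossing."""
    genuine = scores[labels == 1]
    spoof = scores[labels == 0]
    eer, best_gap = 1.0, np.inf
    for t in np.sort(np.unique(scores)):
        far = np.mean(spoof >= t)   # spoof samples scoring above threshold
        frr = np.mean(genuine < t)  # genuine samples scoring below threshold
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer

# Toy example with perfectly separated scores (hypothetical values)
scores = np.array([0.9, 0.8, 0.7, 0.3, 0.2, 0.1])
labels = np.array([1, 1, 1, 0, 0, 0])
print(compute_eer(scores, labels))  # → 0.0
```

Reporting EER alongside accuracy is common in the ASVspoof literature because it is insensitive to the choice of a single decision threshold.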
Overall, the text effectively explains the process of spectrogram generation and sets the stage for a more detailed discussion of the research findings regarding the use of linear and logarithmic spectrograms to distinguish between genuine and spoof audio.
I have a few questions:
The use of the ASVspoof 2017 V2 database is reasonable, especially for addressing replay attacks. However, some questions may arise regarding the representativeness of this dataset. For instance, how diverse is the dataset in terms of speaker characteristics, recording environments, and types of replay attacks?
The description of the dataset composition (Table 1) is helpful in understanding the size of each dataset for training, development, and evaluation. However, it would be valuable to know how the genuine and generated audio samples were selected or created. Were they randomly sampled or generated in a specific manner?
The addition of silence for shorter audios and truncation for longer audios is a common preprocessing step, as mentioned. It's relevant to understand whether these modifications impact the data's integrity and what implications they have for the proposed methodology's performance.
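The padding/truncation step described above can be sketched as follows (the fixed target length is an assumed illustrative value, not one taken from the paper):

```python
import numpy as np

def fix_length(audio, target_len):
    """Pad shorter audio with trailing silence (zeros) or truncate
    longer audio so every sample has the same fixed length."""
    if len(audio) < target_len:
        return np.concatenate([audio, np.zeros(target_len - len(audio))])
    return audio[:target_len]

target = 48000  # e.g. 3 s at 16 kHz (hypothetical value)
short = np.random.randn(30000)
long_ = np.random.randn(60000)
print(len(fix_length(short, target)), len(fix_length(long_, target)))  # → 48000 48000
```

A relevant integrity question is whether the appended silence dilutes discriminative content in short recordings, and whether truncation discards replay artifacts that occur late in longer recordings.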
The paper, as presented, provides a well-written foundation for research in the field of audio signal analysis, specifically focusing on distinguishing between genuine and spoof audio, with a clear methodology and data description. With the suggested clarifications and additional information as outlined in the previous review, the paper can be considered for publication.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.