Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.


Summary

  • The initial submission of this article was received on June 26th, 2025 and was peer-reviewed by 2 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on July 21st, 2025.
  • The first revision was submitted on August 19th, 2025 and was reviewed by 2 reviewers and the Academic Editor.
  • The article was Accepted by the Academic Editor on August 29th, 2025.

Version 0.2 (accepted)

Academic Editor

Accept

Congratulations on being accepted.

[# PeerJ Staff Note - this decision was reviewed and approved by Xiangjie Kong, a PeerJ Section Editor covering this Section #]

Reviewer 1

Basic reporting

N/A

Experimental design

N/A

Validity of the findings

N/A

Additional comments

N/A

Reviewer 2

Basic reporting

The paper is considerably improved compared with the previous version, and the author has clarified the issues raised earlier.

Experimental design

The experimental design is also improved. Although a proper structure is maintained, the technical quality of the work is still somewhat limited; the reporting of results and the discussion, however, are detailed and meticulous. If the other reviewers see fit, the manuscript could be a soft accept.

Validity of the findings

The findings are valid, and the addition of ANOVA, as suggested, provides stronger statistical validation.

Additional comments

The issues I raised earlier have all been addressed by the author, and the writing and flow of the entire manuscript have greatly improved. I have no further feedback at this point.

Version 0.1 (original submission)

Academic Editor

Major Revisions

The contribution is not clear enough. The small size of the dataset used and the impact of augmentation are not discussed. The analysis and discussion of the results are insufficient.

**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.

Reviewer 1

Basic reporting

See the attached file.

Experimental design

See the attached file.

Validity of the findings

See the attached file.

Additional comments

See the attached file.

Annotated reviews are not available for download in order to protect the identity of reviewers who chose to remain anonymous.

Reviewer 2

Basic reporting

The paper is well aligned with current trends in speech/music processing, especially deep learning in Music Information Retrieval (MIR) and computational ethnomusicology. The use of both CNN and LSTM architectures on Mel spectrograms is timely, and the comparison to classical models adds value. The introduction of the TuFoC dataset, tailored for classifying Turkish folk music by region, addresses a relatively under-explored subdomain, and the author states that it is the first dataset of its kind for Turkish regional folk song classification. The focus on preserving intangible cultural heritage makes this paper interdisciplinary and socially impactful.

Experimental design

1. Is it possible to include a map of the geographical areas from which the folk songs were collected? I believe this would help readers understand more about the songs' cultural background.
2. The dataset is relatively small. Although I am aware of the scarcity of folk song data, how would relying on augmented data help in a real scenario, where folk songs must be classified in real time on real, un-augmented data?
3. On page 2, line 69, "a CNN-based model levelWhile prior work, such as…" needs to be rectified. The same sentence appears in the next paragraph, on line 76.
4. On line 297 of page 6, "Each MP3 file was loaded using the librosa library with its native sampling rate preserved (sr=sr)…": is sr 22.05 kHz, or the sampling rate before downsampling, which would be more "native" than that of the downsampled MP3? (See the loading sketch after this list.)
5. I believe there is no quantitative comparison of the augmentation parameters (e.g., pitch shifts, time-stretch factors) used in GAD vs. AD. It is unclear whether the type of augmentation technique used matters here (see the augmentation calls in the sketch after this list).
6. The paper does not justify why model performance is much lower with AD than with GAD. Although some reasoning is given for the advantages of GAD, the limitations of AD and the cause of its inferior performance remain unclear.
7. There is no analysis of the spectral integrity of the augmented data and how it affects feature extraction. Since Figure 1 only shows Mel spectrograms of the five categories for OD, it would be helpful to also see AD and GAD spectrograms for the song categories.
8. I believe the statistical conclusions could be strengthened with ANOVA (a minimal sketch follows below).
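
To make the sampling-rate question in item 4 and the augmentation parameters in item 5 concrete, here is a minimal sketch of the two librosa loading modes and the classical augmentations in question. The file path and the shift/stretch values are illustrative assumptions, not the author's actual pipeline.

```python
import librosa

path = "example_folk_song.mp3"  # hypothetical file name

# sr=None keeps the file's native sampling rate (presumably what "sr=sr" means);
# sr=22050 would instead resample every file to 22.05 kHz on load.
y_native, sr_native = librosa.load(path, sr=None)
y_22k, sr_22k = librosa.load(path, sr=22050)

# Classical augmentations of the kind queried in item 5; the parameter
# values here are illustrative, not those used in the paper.
y_pitched = librosa.effects.pitch_shift(y_native, sr=sr_native, n_steps=2)  # up 2 semitones
y_stretched = librosa.effects.time_stretch(y_native, rate=1.1)              # 10% faster

# Mel spectrograms for the OD vs. AD comparison requested in item 7.
mel_od = librosa.feature.melspectrogram(y=y_native, sr=sr_native, n_mels=128)
mel_ad = librosa.feature.melspectrogram(y=y_pitched, sr=sr_native, n_mels=128)
```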
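
For item 8, a one-way ANOVA over per-fold accuracy scores could be run with scipy.stats.f_oneway; the score arrays below are placeholders, not results from the paper.

```python
from scipy.stats import f_oneway

# Placeholder per-fold accuracies for three training conditions
# (original data, classical augmentation, GAN-based augmentation).
acc_od = [0.71, 0.69, 0.73, 0.70, 0.72]
acc_ad = [0.75, 0.74, 0.77, 0.76, 0.75]
acc_gad = [0.83, 0.85, 0.82, 0.84, 0.86]

# One-way ANOVA: tests whether mean accuracy differs across the conditions.
f_stat, p_value = f_oneway(acc_od, acc_ad, acc_gad)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```

If the ANOVA is significant, a post-hoc test such as Tukey's HSD would identify which pairs of conditions actually differ.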

Validity of the findings

No comments

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.