Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.

Summary

  • The initial submission of this article was received on January 20th, 2023 and was peer-reviewed by 2 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on January 31st, 2023.
  • The first revision was submitted on February 22nd, 2023 and was reviewed by 1 reviewer and the Academic Editor.
  • The article was Accepted by the Academic Editor on February 28th, 2023.

Version 0.2 (accepted)

· Feb 28, 2023 · Academic Editor

Accept

The revisions are satisfactory and the manuscript is recommended for publication.

[# PeerJ Staff Note - this decision was reviewed and approved by Xiangjie Kong, a PeerJ Computer Science Section Editor covering this Section #]

Reviewer 2 ·

Basic reporting

no comment

Experimental design

no comment

Validity of the findings

no comment

Additional comments

Dear authors,
Thanks for revising the manuscript. No further questions.

Version 0.1 (original submission)

· Jan 31, 2023 · Academic Editor

Major Revisions

Based on the reviewers’ comments, you may resubmit the revised manuscript for further consideration. Please consider the reviewers’ comments carefully and submit a list of responses to the comments along with the revised manuscript.

Reviewer 1 ·

Basic reporting

The paper presents a study for forecasting air pollutants based on the dataset gathered from the region of Selangor. The investigation has used 4 machine learning and 2 deep learning techniques for prediction and claims that the results will help to build a holistic strategy to tackle air quality index.

The revised version of the paper has addressed several comments from the previous cycle; however, the authors should look into a few more aspects.

The authors have attempted to provide the motivation for the work, but it still does not answer two questions: who will benefit from this work and how, and how does this work differ from existing air quality prediction studies? This discussion should be added to Section 1.

Experimental design

Figure 7 suggests that the problem definition is extracted from the dataset. If the problem depends on the dataset, then one should be able to modify the dataset to solve it. Additionally, there is no section giving a crisp problem statement.

The performance of ML models is highly dependent on the quality of the dataset, yet details of data preprocessing are not provided. Fig. 7 suggests 5 ML models and 1 deep learning model, which contradicts the earlier claim of 4 machine learning and 2 deep learning techniques.

The authors have not provided the training and testing parameters for any of the models used in the study.

Validity of the findings

Figures 2-5 are not legible. The labels of the correlation figure are also not easy to read.

The authors have not explained why the projected population in Banting increases exponentially after 2020 in Fig. 11. They state that this has happened in the past, but no evidence is provided. The other stations show erratic bumps in the readings for multiple years.

Besides population, industrial and economic growth and governmental policies regarding greenhouse emissions also affect the air quality index. The authors should consider including these features as well.

Reviewer 2 ·

Basic reporting

no comment

Experimental design

1. The research question is not well defined. From my humble view and the code the authors provided, they are not trying to solve a time series forecasting problem. Instead, they are trying to predict the PM2.5 value from other variables collected at the same time point. The authors should explicitly state the input dimension, which I think is (number of variables) × 1 time point.
2. If the above statement is true, the usage of LSTM is flawed and meaningless. LSTM is designed for capturing the temporal dependency in the time series forecasting problem. If only one time step is used, a simple feed-forward deep neural network would be enough.
3. The authors fail to justify how their research helps to fill in the knowledge gap when the considered problem is meaningless. The problem only happens when the PM2.5 sensor is broken, while the other sensors work. The adopted machine learning models are all well-known techniques, too.
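
To make the distinction in points 1 and 2 concrete, here is a minimal sketch of the two input layouts. It is not the authors' code: the synthetic array, the helper names `single_step_inputs` and `windowed_inputs`, the choice of column 0 as the PM2.5 target, and the 24-step window are all assumptions made for illustration.

```python
import numpy as np


def single_step_inputs(data, target_col):
    """Same-time-point regression: predict the target from the other
    variables measured at the same time point (one time step per sample)."""
    X = np.delete(data, target_col, axis=1)   # drop the target column
    y = data[:, target_col]
    # Add a time axis of length 1, the layout a recurrent layer would see.
    return X[:, np.newaxis, :], y


def windowed_inputs(data, target_col, window):
    """Forecasting: predict the target at time t from all variables
    observed during the previous `window` time points."""
    n_samples = data.shape[0] - window
    X = np.stack([data[i:i + window] for i in range(n_samples)])
    y = data[window:, target_col]
    return X, y


# Synthetic stand-in for the station data: 100 time points, 6 variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 6))

X1, y1 = single_step_inputs(data, target_col=0)
Xw, yw = windowed_inputs(data, target_col=0, window=24)

print(X1.shape, y1.shape)  # (100, 1, 5) (100,)
print(Xw.shape, yw.shape)  # (76, 24, 6) (76,)
```

With the first layout the time axis has length 1, so a recurrent layer has no temporal dependency to learn and a feed-forward network on the flattened features would serve equally well. Only the second layout poses a forecasting problem of the kind LSTM is designed for.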

Validity of the findings

Meaningful replication is not possible when the code for LSTM is not provided and the LSTM result is not reliable.

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.