Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.


Summary

  • The initial submission of this article was received on October 27th, 2020 and was peer-reviewed by 2 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on December 15th, 2020.
  • The first revision was submitted on January 21st, 2021 and was reviewed by 2 reviewers and the Academic Editor.
  • A further revision was submitted on March 11th, 2021 and was reviewed by 3 reviewers and the Academic Editor.
  • A further revision was submitted on May 4th, 2021 and was reviewed by the Academic Editor.
  • The article was Accepted by the Academic Editor on May 12th, 2021.

Version 0.4 (accepted)

· May 12, 2021 · Academic Editor

Accept

The paper is well revised, and I am happy to accept it for publication. Congratulations!

Version 0.3

· Apr 2, 2021 · Academic Editor

Minor Revisions

The authors are requested to address a few more suggestions. If you are able to revise accordingly, I would be happy to accept the paper. Thank you.

Reviewer 2 ·

Basic reporting

no comment

Experimental design

The authors still use only a training set and a test set, with no validation set. The text does not make sufficiently clear that the results were not optimized specifically for this test set, which would mean the model is biased toward this particular test set.

The training hyperparameters should be optimized on a validation set, and the best parameters found on the validation set should then be used to compute the test metrics.
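To make this protocol concrete, here is a minimal sketch (illustrative Python with scikit-learn; the classifier and hyper-parameter grid are hypothetical, not the authors' model):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# Three-way split: the test set is held out and never used for tuning.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, random_state=0)

# Choose hyper-parameters on the validation set only.
best_score, best_C = -1.0, None
for C in [0.1, 1.0, 10.0]:
    score = SVC(C=C).fit(X_train, y_train).score(X_val, y_val)
    if score > best_score:
        best_score, best_C = score, C

# Compute the test metric once, with the hyper-parameters chosen above.
test_accuracy = SVC(C=best_C).fit(X_train, y_train).score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")
```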

Validity of the findings

The previous comment on validation sets must be clearly addressed by the authors. Using a validation set and then a test set is not the same as optimizing directly for test-set performance (which appears to be the case here).

Furthermore, always state on which set the performance is calculated. If the accuracy was computed on the test set, report it as "test accuracy". This must be abundantly clear to the reader.

Additional comments

no comment

Reviewer 3 ·

Basic reporting

Accept

Experimental design

The authors attended to the suggestions of the reviewers.

Validity of the findings

The authors attended to the suggestions of the reviewers.

Additional comments

The authors attended to the suggestions of the reviewers.

·

Basic reporting

The authors have written in simple, unambiguous, and competent language. The literature is well referenced and relevant, with an introduction and background that establish context. The paper discusses a new method for LP recognition that involves stacking two CNN networks. At the core of the convolution block, the convolution layer is followed by batch normalization and a non-linear activation layer.
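To illustrate the kind of block described above (a sketch in PyTorch; the channel counts and kernel size are hypothetical, not taken from the paper):

```python
import torch.nn as nn

def conv_block(in_channels: int, out_channels: int, kernel_size: int = 3) -> nn.Sequential:
    """Convolution -> batch normalization -> non-linear activation."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size, padding=kernel_size // 2),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
    )

# Two stacked blocks, in the spirit of a stacked-CNN design.
stacked = nn.Sequential(conv_block(1, 32), conv_block(32, 64))
```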

Experimental design

All of the research questions have been presented and answered in a clear and concise manner, and they are all important and significant in the context.

Validity of the findings

All validity concerns have been addressed; the findings are stable, statistically sound, and well controlled.

Additional comments

I congratulate the authors on their extensive data collection. Furthermore, the manuscript is written in plain, unambiguous language, and the comments have been satisfactorily addressed.

Version 0.2

· Feb 23, 2021 · Academic Editor

Minor Revisions

The reviewers have now commented on the paper. Based on their suggestions, I am happy to provide you with a decision: minor revisions.

[# PeerJ Staff Note: It is PeerJ policy that additional references suggested during the peer-review process should only be included if the authors are in agreement that they are relevant and useful #]

Reviewer 3 ·

Basic reporting

A deeper error analysis is required.

Experimental design

The authors need to discuss the evaluation procedure.

Validity of the findings

The authors can discuss the results for other performance metrics.

Additional comments

The literature review needs to be updated with some of the recent works. Discussing the following works would make the manuscript richer for the readers:

DevNet: An Efficient CNN Architecture for Handwritten Devanagari Character Recognition. Int. J. Pattern Recognit. Artif. Intell. 34(12): 2052009:1-2052009:20 (2020)
Character recognition based on non-linear multi-projection profiles measure. Frontiers Comput. Sci. 9(5): 678-690 (2015)
Relative Positioning of Stroke-Based Clustering: a New Approach to Online Handwritten Devanagari Character Recognition. Int. J. Image Graph. 12(2) (2012)
Artistic Multi-character Script Identification Using Iterative Isotropic Dilation Algorithm. RTIP2R (3) 2018: 49-62
Character Recognition Based on DTW-Radon. ICDAR 2011: 264-268
Spatial Similarity Based Stroke Number and Order Free Clustering. ICFHR 2010: 652-657
Dtw-Radon-Based Shape Descriptor for Pattern Recognition. Int. J. Pattern Recognit. Artif. Intell. 27(3) (2013)

·

Basic reporting

A new approach to LP identification by stacking two CNN networks is discussed in the paper. At the core of the convolution block, the convolution layer is followed by batch normalization and a non-linear activation layer.

Experimental design

The experimental section is well designed. The manuscript's strongest contributions are the proposed Full Depth CNN (FDCNN) model and the proposal of a new license plate data set called LPALIC. A strength of the paper is the selection of parameters and filters and the description of the training process. The manuscript also gives an overview of FDCNN's memory usage.
The drawback of the manuscript is that the results obtained on the test dataset during testing are missing and no validation dataset has been used, so it is unclear how fine-tuning of the model hyperparameters was performed and how the problem of overfitting was addressed. I think the authors should be able to update the manuscript before publication to make all these points clear and further illustrate and solidify their already good findings.

Validity of the findings

All outcomes are well defined and code is provided, but to ensure the validity of the findings, the points listed must be addressed. I believe the relevance of the results can be more readily asserted once these issues are discussed and explained.

Additional comments

The paper is clear, but there are a few points that require clarification:
1. Explain the methodology used to correctly recognize Arabic zero digits and letters written in continuous style.
2. Discuss the criteria by which a character is labeled as difficult or easy, as specified at line 342.
3. Clarify why one should prefer the proposed FDCNN model when the stacked method of Assiri (2019) has outperformed it.

Version 0.1 (original submission)

· Dec 15, 2020 · Academic Editor

Major Revisions

The manuscript needs a thorough revision. Thank you.

[# PeerJ Staff Note: Please ensure that all review comments are addressed in a rebuttal letter and any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.  It is a common mistake to address reviewer questions in the rebuttal letter but not in the revised manuscript. If a reviewer raised a question then your readers will probably have the same question so you should ensure that the manuscript can stand alone without the rebuttal letter.  Directions on how to prepare a rebuttal letter can be found at: https://peerj.com/benefits/academic-rebuttal-letters/ #]

[# PeerJ Staff Note: The review process has identified that the English language must be improved. PeerJ can provide language editing services - please contact us at copyediting@peerj.com for pricing (be sure to provide your manuscript number and title) #]

Reviewer 1 ·

Basic reporting

The paper presents a new approach to detecting LPs by stacking two CNN networks. It manages to obtain state-of-the-art results while using a network that is small in terms of the number of parameters.

Experimental design

The experimental section is clear and well designed.

Validity of the findings

The novelty of the approach is questionable, but the finding is interesting.

Additional comments

The paper presents another approach for LP recognition. The paper is clear, but there are several points that need further clarification.
1. VGG seems to obtain the same performance; explain why another network is needed.
2. Explain why the results differ across the various datasets on average (Table 11).
3. What is the point of including FashionMNIST?
4. There are no Latin digits; the ones you call Latin are Arabic, and the ones you call Arabic are Hindi.

Reviewer 2 ·

Basic reporting

Some revisions to the presentation of the text itself are necessary. In my general comments to the authors, I have pointed out some grammatical errors, typos, and unclear passages.
This seems to warrant a full revision of the text to ensure the promising results of the paper are not obscured from readers.

Experimental design

The manuscript describes a classification architecture for license plates from multiple countries, which is also tested on common character and digit recognition datasets. Additionally, it is tested on FashionMNIST, a popular small grayscale image dataset. To my understanding, the strongest contributions of the paper are the proposed Full Depth CNN (FDCNN) model and the proposal of a new license plate dataset called LPALIC. They seem to successfully address the knowledge gaps identified by the authors.

Regarding the FDCNN model, promising results are reported for all the studied datasets, including state-of-the-art results for MNIST when only stacked CNNs are considered, according to the authors. I believe that the methodology description and results must be entirely reproducible, especially since state-of-the-art results on widely used benchmarks are involved.

Because of this, I believe the paper needs revisions to allow for easier reproduction and verification of results, and also needs more justifications on certain choices in the methodology, as explained in my comments further below.

In the manuscript's current form, readers might not understand why many aspects of this FDCNN architecture were chosen, or how they were tested and validated. The absence of a validation set, for instance, should be clarified, as directly optimizing a model on the test set could potentially introduce biases. This is true even if the justifications are of an empirical nature. I believe the authors should be able to revise the manuscript to make all of these points clearer and further highlight and solidify their already strong results, before publication.

Validity of the findings

All results are well described, and the data and code are provided, but the points mentioned under experimental design and basic reporting must be addressed to ensure the validity of the results.
I believe that once those matters are addressed and clarified, the validity of the findings will be more easily asserted.

Additional comments

Line 36 – It would be interesting to additionally address or discuss these recent works, which also reportedly provide state-of-the-art accuracies for the same task of classifying MNIST:

Hirata and Takahashi; Ensemble Learning In CNN Augmented with Fully Connected Subnetworks
Byerly et al.; A Branching and Merging Convolutional Network with Homogeneous Filter Capsules
Assiri; Stochastic Optimization of Plain Convolutional Neural Networks with Simple methods
Kowsari et al.; RMDL: Random Multimodel Deep Learning for Classification

Line 84 – method was used

Line 100 – very little research was done

Line 100 – This paragraph starts with the idea that there are very few multi-language license plate datasets, but then cites works about license plate classification from what seems to be a considerable variety of countries and alphabet types. Please consider clarifying this paragraph by stating more explicitly how many datasets exist among the cited works, and all the different languages and character types used. This can create a clearer picture for the reader, and help to further highlight the new contributions of the paper.

Lines 101 to 104 – The grammar in this sentence is unclear. Please restructure.

Line 117 – Please revise the usage of “concerned” in this sentence.

Line 117 – Since the main contribution of the paper revolves around FDCNNs, it is important that the reader gets a clear understanding of other types of CNNs, what makes them different, and why the changes included in FDCNN are necessary. I feel that this information is currently lacking in the paper, and it is not entirely clear how the authors justified pursuing this particular approach, or why it is necessary, important, or better to reduce feature maps to 1x1 size before classification. Please include a more detailed discussion/justification.
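As a purely illustrative aside, "reducing feature maps to 1x1 before classification" can be sketched as follows (a hypothetical head in PyTorch using global average pooling; this is an assumption for illustration, not the paper's actual FDCNN architecture):

```python
import torch
import torch.nn as nn

# Hypothetical classification head: a final convolution block, then global
# average pooling collapses every feature map to 1x1 before the classifier.
head = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, padding=1),
    nn.BatchNorm2d(128),
    nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool2d(1),  # each of the 128 feature maps becomes 1x1
    nn.Flatten(),             # shape (N, 128)
    nn.Linear(128, 10),       # e.g., 10 digit classes
)

x = torch.randn(8, 64, 28, 28)  # a batch of 8 intermediate feature maps
print(head(x).shape)            # torch.Size([8, 10])
```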

Line 119 – Please restructure the sentence so FDCNN is not repeated twice in a row.

Line 158 – It would greatly help readers to have a table or something similar where all countries studied are listed, and what language/character types are used in license plates in that particular country.

Figures 1 and 2 – These figures show some interesting differences between datasets. Namely, Arabic alphabet LPs seem to vary much less in color. Could this possibly affect classification accuracies? There seems to be a marked imbalance in the number of Latin and Arabic images in the dataset. Would this warrant using additional metrics, besides accuracy, that are more sensitive to these imbalances? Please discuss this in the manuscript.
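To illustrate the kind of imbalance-aware reporting suggested here (a toy sketch with scikit-learn; the labels below are made up and not from the paper's data):

```python
from sklearn.metrics import accuracy_score, balanced_accuracy_score, f1_score

# Toy labels with a 90/10 class imbalance (illustrative only).
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 95 + [1] * 5  # misses half of the minority class

print("accuracy:         ", accuracy_score(y_true, y_pred))           # 0.95
print("balanced accuracy:", balanced_accuracy_score(y_true, y_pred))  # 0.75
print("macro F1:         ", f1_score(y_true, y_pred, average="macro"))
```

Plain accuracy stays high even though half of the minority class is misclassified, whereas balanced accuracy and macro F1 expose the gap.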

Line 187 – Please avoid qualitative/subjective qualifiers such as “modest”, used here. There are other examples of such adjectives in the manuscript. If necessary, compare directly to other values used in the literature, by choosing appropriate metrics for the comparison.

Line 205 – shrink

Line 250 – See comment about line 187. The same applies to the usage of “modest” in this line.

Line 251 – Please clarify the meaning of “needed iterations”. How is this defined?

Line 251 – Please specify the batch size and momentum used. The supplemental files seem to show that the mini-batch used had a size of 120. They also mention a LearnRateDropFactor of 0.9, whereas the paper seems to mention one of 0.5. It also seems that the standard momentum of MATLAB's 'sgdm' function is used, but this value is not stated explicitly in the paper. It would be useful to add it.
Since one of the results of the paper is a demonstrable improvement upon the state of the art for stacked CNNs, it would be valuable, for reproducibility's sake, to include these hyper-parameters in the methodology description.
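For illustration, these hyper-parameters could be stated as explicitly as in the following sketch (PyTorch rather than the authors' MATLAB code; the initial learning rate and drop schedule are placeholders, and momentum 0.9 is assumed to be MATLAB's 'sgdm' default):

```python
import torch

model = torch.nn.Linear(10, 10)  # stands in for the FDCNN weights

# Hyper-parameters discussed above: mini-batch size 120 (supplemental files),
# momentum 0.9 (assumed 'sgdm' default), learning-rate drop factor 0.9.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.9)

# The batch size would be set in the data loader, e.g.:
# loader = torch.utils.data.DataLoader(dataset, batch_size=120, shuffle=True)
```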

Line 251- Additionally, it would be interesting to discuss how these parameters were chosen. If there were preliminary tests, or heuristics, how did other attempts affect the results, and by how much?
It would seem that no validation set was used, so how were metrics chosen? Was the model adjusted to get the best result directly on the test set, without using a separate validation set? Couldn’t this bias the models to adjust particularly well only to the test set?

Table 11 – The UAE test set is larger than the training set. I feel this decision should be justified in the paragraph above the table. I could not see a justification for it as is.

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.