Review History


To increase transparency, PeerJ operates a system of 'optional signed reviews and history'. This takes two forms: (1) peer reviewers are encouraged, but not required, to provide their names (if they do so, then their profile page records the articles they have reviewed), and (2) authors are given the option of reproducing their entire peer review history alongside their published article (in which case the complete peer review process is provided, including revisions, rebuttal letters and editor decision letters).


Summary

  • The initial submission of this article was received on March 11th, 2015 and was peer-reviewed by 2 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on April 3rd, 2015.
  • The first revision was submitted on June 18th, 2015 and was reviewed by 1 reviewer and the Academic Editor.
  • The article was Accepted by the Academic Editor on July 6th, 2015.

Version 0.2 (accepted)

Academic Editor

Accept

Now that the authors have followed all the reviewers' recommendations, I think the paper can be accepted for publication.

Reviewer 2

Basic reporting

-

Experimental design

-

Validity of the findings

-

Comments for the author

The authors have addressed the majority of my concerns, so the paper could be accepted.

Version 0.1 (original submission)

Academic Editor

Major Revisions

Although the paper has been positively evaluated, one of the reviewers has serious concerns about it, so you should prepare a new version of the manuscript that addresses all of the reviewers' comments.

Reviewer 1

Basic reporting

The research issue is well presented and is a challenging one. The structure of the paper is also sound and easy to follow, although several sub-sections should be combined into larger ones.
The main contribution of the paper is the proposed architecture for providing ML capabilities in the browser. The main drawback of the presentation is the unclear separation of the client and server sides: it was somewhat difficult to figure out exactly which parts run on the server side and which on the client side.
For example:
- Is the Boss a data worker and/or a web worker?
- Are the Master Server and the Data Server on the same server machine?
- Is Hadoop used in this project? Where? On Master Server and/or Data Server?
- The paper looks more or less like a development guide for a software application. Many technical details are provided (e.g., JavaScript, MapReduce, Hadoop), yet a clear indication of how exactly MapReduce is used is not. Did the authors use an already existing implementation of MapReduce, such as Hadoop? And if so, how does it integrate with JavaScript?
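
To make the kind of clarification I am asking for concrete, a minimal sketch of a MapReduce-style step expressed purely with browser web workers might look as follows. This is a generic illustration only, not code from the manuscript under review; the file names and the toy squaring/summing task are my own assumptions.

// mapper.worker.ts -- hypothetical worker script (compiled with the "webworker" lib).
// Applies the "map" step to the chunk of numbers it receives and posts the result back.
onmessage = (event: MessageEvent<number[]>) => {
  const mapped = event.data.map((x) => x * x);   // toy map step: square each value
  postMessage(mapped);                           // partial result goes back to the page
};

// main.ts -- hypothetical coordinating code running in the page.
const chunks: number[][] = [[1, 2, 3], [4, 5, 6]];  // toy data split into chunks
let sum = 0;                                        // running "reduce" result
let pending = chunks.length;

for (const chunk of chunks) {
  const worker = new Worker("mapper.worker.js");    // one web worker per chunk
  worker.onmessage = (event: MessageEvent<number[]>) => {
    sum += event.data.reduce((a, b) => a + b, 0);   // reduce the mapped chunk
    worker.terminate();
    if (--pending === 0) {
      console.log("sum of squares =", sum);         // all chunks reduced
    }
  };
  worker.postMessage(chunk);                        // dispatch the map step
}

If, instead, Hadoop performs the map and reduce steps on the server, the paper should explain how the JavaScript clients exchange data with it.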

Experimental design

The experiments are fine although more accuracy and performance metrics may be required.

Validity of the findings

Yes, the findings are valid but may be a little preliminary. More structured experimental results may be needed.

Comments for the author

The paper needs two improvements:
- refine the presentation of the architectural decisions for better understanding;
- provide more structured experimental results (accuracy metrics and performance assessment).

Reviewer 2

Basic reporting

The paper is clear and readable. The idea is interesting, and I really enjoyed reading the manuscript.

In the introduction, the authors should clearly present the three objectives that are later mentioned in Section 2. Additionally, it is important to explicitly state the motivation and justification of the paper; in this sense, research questions are missing. The paper should be more focused on the scientific side, but it mostly covers technological aspects.

Please describe the organisation of the paper uniformly. For example, Section 3.2 is described but Section 3.1 is not even mentioned. Mentioning the main sections is more than enough.

Please avoid including explanatory text in captions (Figs. 1 to 3). Such text should be incorporated into the relevant section.

All acronyms should be expanded on first use (e.g., SGD).

Experimental design

The paper is not properly motivated and the experimental framework presented is weak. For example, how did the authors arrive at this approach? Which design choices were made? What limitations does their solution have, and how were they mitigated? Are other solutions (e.g., using services) viable in this case?

The technological contribution is notable and technically sound, but the scientific side should be developed more rigorously. Additionally, the technological choices should be fully justified as well, e.g., why Node.js?

A more detailed comparative framework, covering both alternative designs and previous solutions, would be required.

All the implementation and contribution seems to be focused on a given problem domain. What if the library needed to be scaled, integrated within another system, or only a single module reused? How would that be done? I am afraid it would not be that simple, and most of the current modules would have to be adapted or reimplemented, e.g., to add preprocessing capabilities, data and result handlers, new algorithms, etc. Who is in charge of extending the library? Can an external developer add new features, or must we depend on the releases provided by the development team? If it is considered extensible, please include evidence (case studies, a further discussion, etc.). All these aspects should be explained in detail so that the real contribution is clearer.

The experimental framework should be clearly defined. Important information needed to support the scalability of this approach is missing. What is the largest amount of data supported? Which technical/performance limitations have the authors found in their approach? Are there any limitations regarding data (type, size, etc.)?

Validity of the findings

The paper makes some strong assumptions without any substantial and precise scientific support. For example, in Section 2.1, the authors mention that “to make ML truly ubiquitous requires ML writing models and algorithms with web programming languages and using the browser as the engine”. This should be supported by references.

In fact, a major issue is that the authors often make use of subjective references to endorse their work and assumptions. This clearly lacks rigour. References to blog entries or other subjective articles should be avoided. Please cite peer-reviewed works and other precise sources of information instead. If assertions are not properly founded, they become speculation.

Another major issue is that the paper is not clear about what has really been done and what is left for future work. For example, the abstract seems to indicate that GPU capabilities are already provided (“MlitB offers [..] including: development of distributed learning algorithms, advancement of web GPU algorithms [..]”). Not until Section 2 do we learn whether this is really implemented or not. In fact, the last paragraph of Section 2.2 should be moved to Future Work.

Please, clearly differentiate the current contribution from future work.
The same happens with Objectives 2 and 3, described in Section 2, which are not properly developed later.

Section 2.3 explains how important it is to provide mechanisms that make reproducibility easier. I totally agree. However, it seems to be a wish-list item, because how to achieve it with the library is not properly explained in the paper. In this sense, the contribution of Dr. Antonio Ruiz and Dr. Jose Antonio Parejo (University of Seville, Spain) on reproducibility in the field of ML (they are building some sort of framework in this context) would likely be of interest.

A performance analysis is required so that we can really compare this approach and know whether it would be a viable choice. Citing external works about the language's performance is not enough. Please design and include a comparative performance study of your own proposal. This is especially important in the field of ML because of its increasing computational requirements.

In a real environment, how many users could it support? How would an increase in users affect performance?

Comments for the author

In general, my view on this work is very positive. The idea is really interesting and the work promising and challenging. However, I regret to say that, according to the criteria given to reviewers by PeerJ, the manuscript still seems immature and requires a significant rewriting effort. The experimental framework and its validation from a scientific perspective are weak.

Web workers and web sockets should be explained. A discussion of why they are the best choice would also be interesting (limitations, characteristics, etc.).

In Fig. 5, what does “Probability” mean? It is not a precise measure.

Figure 6 is unreadable.

I cannot see Section 5.3 as an opportunity. It should be presented as future work, right?

Section 7.2 does not provide a significant contribution. Please extend it for better clarity.

How are new developments incorporated into the server?

What would happen if the browser is suddenly closed?

Are there any domains to which this library is especially well suited?

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.