To increase transparency, PeerJ operates a system of 'optional signed reviews and history'. This takes two forms: (1) peer reviewers are encouraged, but not required, to provide their names (if they do so, then their profile page records the articles they have reviewed), and (2) authors are given the option of reproducing their entire peer review history alongside their published article (in which case the complete peer review process is provided, including revisions, rebuttal letters and editor decision letters).
I am writing to inform you that your manuscript, "Finding the optimal Bayesian network given a constraint graph", has been accepted for publication.
I would recommend publication of your paper after the remaining (very) minor revisions following the referees' comments.
The authors have properly addressed my concern about self-containedness.
Fig. 3 still contains the minor mistake, but it can easily be fixed without further review.
The data-sets have been described in more detail, so the authors have addressed my concern.
The authors have included a small real-world example (feature selection using naive Bayes) which reinforces the validity of their findings.
The authors did a good job in revising the manuscript and I suggest acceptance.
In the final version, please fix the following really minor issues:
* Figure 3 still has a caption mistake
* The Markov Blanket of a BN (defined on page) also includes the co-parents of a node's children.
* On page 8, line 307, grammar: probably "that" needs to be deleted.
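To illustrate the Markov blanket point above: the blanket of a node consists of its parents, its children, and the other parents of its children. The following is a minimal sketch only; the function name `markov_blanket` and the toy network are hypothetical and are not taken from the manuscript.

```python
def markov_blanket(parents, node):
    """Markov blanket of `node` in a DAG given as {child: set_of_parents}."""
    # Children are all nodes that list `node` among their parents.
    children = {c for c, ps in parents.items() if node in ps}
    # Co-parents are the other parents of those children.
    co_parents = set()
    for child in children:
        co_parents |= parents[child]
    blanket = set(parents.get(node, set())) | children | co_parents
    blanket.discard(node)  # a node is not part of its own blanket
    return blanket


# Hypothetical toy network: A -> C <- B, C -> D
toy = {"A": set(), "B": set(), "C": {"A", "B"}, "D": {"C"}}

print(markov_blanket(toy, "A"))  # contains B (co-parent via C) and C (child)
```

In this toy network, B is in the blanket of A only because both are parents of C, which is exactly the case the definition needs to cover.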
Your manuscript has been reviewed by two of our referees.
Comments from their reports appear below. When you resubmit your manuscript, please include a summary of the changes made and a brief response to all recommendations and criticisms.
The authors propose the notion of constraint graphs for the task of handling prior knowledge in a Bayesian network, and give indications of the efficiency gains over existing methods. They cite Perrier et al. 2014, in which the same task is accomplished by a different method. It should be noted that Perrier et al. state the complexity of their method explicitly. By contrast, this manuscript begins by calling the new method 'intuitive', and does not mention the complexity, except for the tables and some statements which implicitly say that it is polynomial (46 to 52).
In my view, for the work in this manuscript to be complete, the complexity issue should be addressed a bit more analytically. This should not be very difficult, as the method in this manuscript essentially relies on Tarjan's algorithm (139 to 143).
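As a reference point for such an analysis, Tarjan's algorithm finds the strongly connected components of a directed graph in time linear in the number of nodes plus edges. The sketch below is a generic textbook version, not the authors' implementation; the function name `tarjan_scc` and the example graph are hypothetical.

```python
def tarjan_scc(graph):
    """Strongly connected components of a directed graph {node: [successors]}.

    Each node and each edge is handled once, so the running time is
    O(|V| + |E|).
    """
    index = {}        # discovery order of each visited node
    lowlink = {}      # smallest index reachable from the node's subtree
    stack = []        # nodes of the components still being built
    on_stack = set()
    components = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:
            # v is the root of a component: pop it off the stack.
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            components.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return components


# Hypothetical graph: a and b point at each other, b points at c,
# and c has a self-loop.
example = {"a": ["b"], "b": ["a", "c"], "c": ["c"]}
sccs = tarjan_scc(example)
```

This version is recursive and so is bounded by Python's recursion limit; an iterative variant would be needed for very deep graphs.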
The tables in the results section give considerable evidence for the authors' claim. Stronger evidence would nevertheless be welcome; alternatively, the authors could verify the claim analytically.
* The article fulfills the criteria of PeerJ, except maybe self-containedness: I'd propose that the authors briefly review the commonly used scores for Bayesian network learning, i.e. MDL, BIC, BD, and mention their property that they decompose over the network's nodes.
* Also, I think the dynamic programming method by Malone et al., 2011 should be described in more detail. I see that any algorithm (even a greedy one) could be used, as long as it obeys the constraint graph. However, since the authors assume this particular algorithm, I think explaining it in more detail would make the paper more self-contained and understandable for the non-expert.
* Minor comment: Figure 3 doesn't have a sub-figure C, but the caption refers to it (and it actually should refer to B).
* Data-sets should be introduced.
In Section 4.1, stock market data is used, but the source of this data is not given.
In Section 4.2, the data set is not described at all (synthetic?).
In Section 4.3, the data seem to be random, except that certain variables are clamped to the same values; please describe this process in more detail.
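On the decomposability property raised in the first comment above, the general form can be written as follows, with BIC as one instance. The notation is an assumption chosen for illustration and is not taken from the manuscript: $G$ is the network structure, $D$ the data with $N$ samples, $\mathrm{Pa}_G(X_i)$ the parent set of variable $X_i$, $\hat{L}_i$ the maximized likelihood of $X_i$ given its parents, and $k_i$ the number of free parameters in the conditional distribution of $X_i$.

```latex
\mathrm{Score}(G : D) = \sum_{i=1}^{n} \mathrm{Score}\bigl(X_i, \mathrm{Pa}_G(X_i) : D\bigr)

\mathrm{BIC}(G : D) = \sum_{i=1}^{n} \left[ \log \hat{L}_i\bigl(X_i \mid \mathrm{Pa}_G(X_i) : D\bigr) - \frac{\log N}{2}\, k_i \right]
```

Because each term depends only on one variable and its parents, the best parent set for a variable can be scored independently of the rest of the structure.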
Findings are valid. Some more experiments on real data would be appropriate.
The paper is surely somewhat incremental, in light of Fan et al. and Malone et al. However, since novelty is not a primary issue for PeerJ, this should not be a concern here. I find the constraint graph elegant, easy to understand and use for non-experts (in probabilistic modeling), and that the approach interplays nicely with the "spirit" of Bayesian networks (combining expert knowledge and data).
Generally, it would be desirable to see some more experiments with real-world data.
It would also be interesting to see how large datasets can get (in terms of number of variables) when placing stringent but reasonable constraints on the structure.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.