Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.


Summary

  • The initial submission of this article was received on January 22nd, 2025 and was peer-reviewed by 3 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on March 17th, 2025.
  • The first revision was submitted on May 1st, 2025 and was reviewed by 1 reviewer and the Academic Editor.
  • The article was Accepted by the Academic Editor on June 3rd, 2025.

Version 0.2 (accepted)

· Jun 3, 2025 · Academic Editor

Accept

Thank you for addressing the comments and suggestions of the reviewers.

[# PeerJ Staff Note - this decision was reviewed and approved by Xiangjie Kong, a PeerJ Section Editor covering this Section #]

·

Basic reporting

The authors have considered all the comments and included the suggested minor revisions, so in my opinion the paper is ready for publication.

Experimental design

The authors have considered all the comments and included the suggested minor revisions, so in my opinion the paper is ready for publication.

Validity of the findings

The authors have considered all the comments and included the suggested minor revisions, so in my opinion the paper is ready for publication.

Additional comments

The authors have considered all the comments and included the suggested minor revisions, so in my opinion the paper is ready for publication.


Version 0.1 (original submission)

· Mar 17, 2025 · Academic Editor

Minor Revisions

There are only reasonably minor points to respond to. Please address them and resubmit.

[# PeerJ Staff Note: It is PeerJ policy that additional references suggested during the peer-review process should *only* be included if the authors are in agreement that they are relevant and useful #]

·

Basic reporting

The article is well-written, appropriate and contributes to the field of traceability of ML/DL models. In my opinion, there is little to improve here, as the language is clear and easy to follow, the references are sufficient and fairly complete, and the structure is also quite clean.

My only suggestion for the authors would be to consider whether this review, written a few years ago, would also be appropriate to include in their background on traceability tools.

Mora-Cantallops, M., Sánchez-Alonso, S., García-Barriocanal, E., & Sicilia, M.-A. (2021). Traceability for Trustworthy AI: A Review of Models and Tools. Big Data and Cognitive Computing, 5(2), 20. https://doi.org/10.3390/bdcc5020020


Experimental design

The research question is relevant and meaningful, and it aims to solve a problem that, although well known, remains unresolved today. The work is rigorous and the authors are clearly knowledgeable on this matter. No further comments.

Validity of the findings

The findings are clear and well stated. They are technically complete and there is a meaningful discussion. That said, I believe there is a point to be made in this final section: while it is relevant to validate the proposed tool and to aim future work at more complex workflows, there is also an argument for increasing the current adoption of these traceability tools. It may be more relevant at this point to find and/or propose ways to extend the use of such provenance tools to the community of DL researchers and developers, to whom, frankly, these tools are either unknown or perceived as undesired "extra work". So we should ensure not only that the tools exist, but also that they are used, and work to remove the "barriers" that end users/developers encounter in the current iterations of the tools.

Additional comments

Overall, it's been a pleasure to review this article. Please note that both my suggestions are entirely up to the authors.


Reviewer 2 ·

Basic reporting

On the whole, the paper is correctly written, the experiments support the conclusions, and the references are appropriately chosen, so this is a carefully developed work that deserves publication. As it is well written and well argued, I do not have many comments to make.

Experimental design

In Fig. 1, the DL workflow is presented. However, this workflow is far from complete, as it accounts only for tuning the model when the error on the training and/or validation data set is high. Other very important loops were not considered (a minimal illustrative sketch of such a decision loop follows the list below):
1. Training error high? => bigger model
2. Training error high? => train longer
3. Training error high? => new model architecture
4. Validation error high? => more data
5. Validation error high? => regularization
6. Validation error high? => new model architecture
7. Model evaluation: Is test error high? => make the training data more similar to test data
8. Model evaluation: Is test error high? => more data to train
9. Model evaluation: Is test error high? => new model architecture
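
To make the intent of these loops concrete, here is a minimal, hypothetical Python sketch that encodes them as a simple rule function. It is an illustration only and does not come from the paper under review; the threshold value and the recommendation strings are placeholders.

```python
def next_action(train_err: float, val_err: float, test_err: float,
                target: float = 0.05) -> str:
    """Return a coarse recommendation based on which error is still high.

    Illustrative only: thresholds and wording are hypothetical placeholders.
    """
    if train_err > target:
        # Loops 1-3: underfitting on the training set
        return "bigger model / train longer / new model architecture"
    if val_err > target:
        # Loops 4-6: overfitting to the training set
        return "more data / regularization / new model architecture"
    if test_err > target:
        # Loops 7-9: mismatch between training and test distributions
        return "make training data more similar to test data / more data / new model architecture"
    return "stop: all errors acceptable"


# Example: a high validation error suggests addressing overfitting.
print(next_action(train_err=0.02, val_err=0.12, test_err=0.15))
```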

In the "Overhead Analyses" chapter, some of the performances that were obtained are presented. It would be appropriate for you to show the system on which these performances were obtained - architecture, processor type, number of cores, working frequency, etc.

Validity of the findings

no comment

Additional comments

Small issues:

I propose completing the sentence "Deep Learning (DL) workflows consist of multiple interdependent steps …" as "Deep Learning (DL) workflows consist of multiple interdependent and repetitive steps …"

In Fig. 2, an arrow should be used to highlight the cycle.


Reviewer 3 ·

Basic reporting

no comment

Experimental design

The design is good, considers how users might use the system, and offers use cases and performance details.

Validity of the findings

The findings appear valid.

Additional comments

This paper presents DLProv, a provenance system for tracking details of deep learning training workflows. The system is notable for its compliance with earlier provenance standards. The authors provide rationale for the system, compare it to other systems, and demonstrate it on realistic workloads, along with performance data.

This paper is in very good shape and is ready for publication now. The paper makes a strong case for the new system, and is very thorough in its presentation. It is hard to find substantial gaps in the presentation.

The weakest aspect of the paper is the somewhat slow section 2, which presents a very wordy contextual background but does not cite many papers; the related work and a good number of citations appear in section 3. Section 2 could be considerably shortened or edited to make it more readable, for example by breaking up long paragraphs. The figures in sections 1 and 2 could also be enriched with more informative details. Another idea would be to introduce a compelling use case in section 2 and use it to drive the discussion of the background to your contributions.


All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.