All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
Thank you for submitting the revised version of your manuscript. After carefully reviewing the changes, I can confirm that all comments and suggestions have been appropriately addressed.
Based on this assessment, I am pleased to inform you that the manuscript is now ready for publication. Congratulations.
[# PeerJ Staff Note - this decision was reviewed and approved by Mike Climstein, a PeerJ Section Editor covering this Section #]
We thank the authors for the careful and thorough revisions that have clearly improved the manuscript. Nevertheless, some points still raise important doubts that must be clarified before the paper can be considered for publication. Please see the following comments:
First, the calculation and interpretation of the standard error of measurement (SEM) and minimal detectable change (MDC) remain problematic. At present, SEM and MDC are derived from comparisons between different methods (e.g., STS-IP vs STS-S), which does not follow COSMIN standards. These indices should be estimated from test–retest data within the same method under stable conditions.
As currently framed, the values reflect method agreement rather than true measurement error, and their use to interpret longitudinal changes is misleading. The authors should either remove these indices or reframe them explicitly as agreement metrics, emphasizing that they are not suitable for tracking change over time in individual patients.
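For illustration only, the following minimal sketch shows how SEM and MDC would be estimated from test–retest data within a single method, in line with COSMIN. All values and variable names below are hypothetical placeholders, not the authors' data:

import numpy as np

# Hypothetical test–retest scores from the SAME method (e.g., two
# synchronous tele-assessments under stable conditions); placeholder values.
test1 = np.array([18, 22, 15, 20, 17, 25, 19], dtype=float)
test2 = np.array([19, 21, 16, 22, 18, 26, 20], dtype=float)

# Pooled standard deviation across the two sessions.
sd_pooled = np.sqrt((np.var(test1, ddof=1) + np.var(test2, ddof=1)) / 2)

# Test–retest ICC estimated from the same data (placeholder value here).
icc = 0.85

sem = sd_pooled * np.sqrt(1 - icc)   # standard error of measurement
mdc95 = 1.96 * np.sqrt(2) * sem      # minimal detectable change (95% level)
print(f"SEM = {sem:.2f} repetitions, MDC95 = {mdc95:.2f} repetitions")

The essential point is that both test1 and test2 come from the same method under stable conditions; substituting scores from two different methods yields agreement statistics, not measurement error.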
The manuscript repeatedly states that synchronous and asynchronous tele-assessments were “validated.” While the synchronous modality demonstrates strong validity, the asynchronous modality shows only borderline reliability (ICC = 0.69) with relatively wide limits of agreement. The wording should therefore be tempered to reflect “good validity for the synchronous tele-assessment” and only “moderate evidence of validity for the asynchronous tele-assessment.” This distinction should be applied consistently in the Abstract, Results, Discussion, Title, and Conclusion.
Because all participants performed the in-person assessment first, the higher performance observed in remote tests may partly reflect practice effects rather than true modality differences. We request that the authors include a sensitivity analysis (for example, a mixed-effects model controlling for whether the first remote test was synchronous or asynchronous) to better quantify the magnitude of this potential bias.
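As a non-authoritative sketch of such a sensitivity analysis (the file name and column names below are hypothetical, assuming a long-format data set):

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per participant per assessment.
# Columns (all placeholder names): 'id', 'reps' (STS repetitions),
# 'modality' (in-person / synchronous / asynchronous), and
# 'first_remote' (which remote modality was performed first).
df = pd.read_csv("sts_long.csv")

# Random intercept per participant; fixed effects for modality and for
# the order of the remote tests, to estimate a potential practice effect.
model = smf.mixedlm("reps ~ modality + first_remote", df, groups=df["id"])
print(model.fit().summary())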
The data suggest proportional bias, yet this is not tested formally. We ask the authors to perform regression of the differences against the means, report the slope and its significance, and consider percentage-based limits of agreement (or log transformation) if heteroscedasticity is confirmed. This will provide a more accurate assessment of agreement.
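One possible implementation of this check (the paired values below are hypothetical placeholders):

import numpy as np
import statsmodels.api as sm

# Hypothetical paired scores (e.g., STS-IP vs STS-S); placeholder values.
in_person = np.array([18, 22, 15, 20, 17, 25, 19], dtype=float)
remote = np.array([20, 23, 17, 24, 18, 28, 22], dtype=float)

mean = (in_person + remote) / 2
diff = remote - in_person

# Regress the differences on the means: a significant slope indicates
# proportional bias.
fit = sm.OLS(diff, sm.add_constant(mean)).fit()
print(f"slope = {fit.params[1]:.3f}, p = {fit.pvalues[1]:.3f}")

# If heteroscedasticity is confirmed, percentage-based limits of agreement:
pct_diff = 100 * diff / mean
loa = pct_diff.mean() + np.array([-1.96, 1.96]) * pct_diff.std(ddof=1)
print(f"percentage LoA: {loa[0]:.1f}% to {loa[1]:.1f}%")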
Finally, the terminology used for effect sizes and correlations should be standardized. Expressions such as “perfect correlation” or “perfect effect size” are not appropriate in scientific reporting and should be replaced with conventional descriptors such as “very high” or “very large.”
Dear Authors,
Following the evaluation of your manuscript by two ad hoc reviewers, we are forwarding their comments below, which should be carefully considered in the event of a revised submission.
In addition to the reviewers’ comments, we would like to highlight the following points that we believe are relevant for improving the manuscript:
One of the reviewers recalculated the Intraclass Correlation Coefficient (ICC) between STS-IP and STS-S using the appropriate statistical model and the data provided by the authors. The resulting value (ICC = 0.73) differs from the one reported in the original version of the manuscript. We therefore strongly recommend that all statistical analyses be reviewed.
While the manuscript is generally well written and has potential to contribute to the field, the reviewers noted that the authors did not adequately discuss the main limitations and potential sources of bias in the reliability analysis.
We also emphasize that the most widely accepted methodological reference for reliability studies is the COSMIN initiative. We suggest that the authors consult the COSMIN guidelines and align both the terminology and methodological reporting of the manuscript accordingly.
If you choose to submit a revised version, please include a detailed response letter addressing each of the reviewers’ comments as well as the additional observations presented above.
Sincerely,
In terms of basic reporting, this article is sufficient and can contribute to the literature.
Participant definitions and measurement procedures are presented in a sufficiently clear and systematic manner. Measurement frequencies and methods are clear. However, the following adjustment should be made:
- Line 122: “five predefined video criteria” requires explanation. The criteria should be briefly stated or provided in an additional file.
Statistical analyses are accurate and comprehensive. Findings are presented clearly. However, some statements need to be written more clearly and understandably:
- The presentation of the Bland-Altman analyses reduces readability and should be simplified.
The discussion section is rich in literature comparisons, and the suggestions are applicable. However, the following corrections should be made linguistically:
- “suggesting the potential utility…” (line 289) should be written more clearly: “suggesting that self-reported values may be feasible and useful in remote settings.”
- The paragraph containing “differences may be attributed…” (line 321 and later) can be simplified by removing repetition.
• Clear, unambiguous, professional English language used throughout.
o The English language appears to be a literal translation from Brazilian Portuguese in some parts and would benefit from an English review. For instance, the sentence “COVID 19 is a viral and systemic infection that causes severe acute respiratory syndrome and can be aggravated by several risk factors” is an example of the inadequate use of “can” (which is typically used in the context of capacity). In this sentence, the appropriate verb would be “may” or “might” (indicating possibility or probability).
o Change the term “reproducibility” throughout the text, as it is not the most accurate term. COSMIN, an international measurement-property group, has standardized the terminology for measurement properties, and “reliability” is the property that describes what you assessed.
• Intro & background to show context.
o Introducing a disease by referencing its supposed geographic origin might contribute to stigmatization; it should be avoided and is discouraged by some editorial standards (https://www.nature.com/articles/d41586-020-01009-0, https://www.who.int/publications/i/item/WHO-HSE-FOS-15.1). I suggest removing the phrase “In 2019, in Wuhan - China, an infectious disease caused by SARS-CoV-2”.
o WHO named the condition in which symptoms persist after a COVID-19 infection “Post-COVID condition”; please change the term accordingly throughout the text.
o Although the authors provide a rationale for the use of the STS test, the introduction does not present what is already known about its reliability and validity in Post-COVID condition when assessed face-to-face. Moreover, the evidence on these properties in similar populations using tele-assessment is not sufficiently discussed. Including such information would enhance our understanding of whether the STS test is appropriate for remote assessment.
• Literature well referenced & relevant.
o As stated above, references are missing on what is already known about the measurement properties of the STS test, in both face-to-face and remote assessment.
• Structure conforms to PeerJ standards, discipline norm, or improved for clarity.
Yes
• Figures are relevant, high quality, well labelled & described.
Yes
• Raw data supplied (see PeerJ policy).
Yes
• Original primary research within Scope of the journal.
o The research is within the scope of the journal
• Research question well defined, relevant & meaningful.
o The research question is well defined, relevant and meaningful.
• It is stated how the research fills an identified knowledge gap.
o Yes, the rationale and hypothesis provide this information
• Rigorous investigation performed to a high technical & ethical standard.
o There is a limitation in this aspect. Since the face-to-face assessment was always performed first, there is no way to determine whether the measurement error is due to a learning effect or to the remote nature of the assessment.
o “38 individuals” is considered a fair sample; the sample-size calculation should be based on the ICC and on COSMIN sample-size requirements.
o There are missing data for the STS-S in one patient (28).
o I calculated the ICC (STS-IP vs STS-S) using the correct model and the data the authors provided, and the result is different (ICC = 0.73); therefore, I suggest that a statistician verify all analyses. A sketch of one possible computation is shown below.
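For reference, a minimal sketch of one way to compute this ICC in Python with the pingouin package, assuming a long-format table with hypothetical column names (an illustration of a two-way, absolute-agreement, single-measurement model, not necessarily the exact analysis the authors should report):

import pandas as pd
import pingouin as pg

# Hypothetical long-format data: one row per participant per method.
# Columns (placeholder names): 'id', 'method' (STS-IP or STS-S), 'reps'.
df = pd.read_csv("sts_pairs.csv")

icc = pg.intraclass_corr(data=df, targets="id", raters="method",
                         ratings="reps")
# 'ICC2' is the two-way random-effects, absolute-agreement,
# single-measurement model.
print(icc.set_index("Type").loc["ICC2", ["ICC", "CI95%"]])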
• Methods described with sufficient detail & information to replicate.
o The ICC model should be described, and the reliability threshold is considered ICC > 0.70 (COSMIN).
o Description of Ethics aspects is adequate
• Impact and novelty is not assessed. Meaningful replication encouraged where rationale & benefit to literature is clearly stated.
o “where females and males achieved 41 and 44 repetitions, respectively (Strassmann et al., 2013).”
Why not use the Brazilian reference values from https://doi.org/10.1016/j.apmr.2021.08.009?
• All underlying data have been provided; they are robust, statistically sound, & controlled.
o Methodological limitations of the study are lacking and should encompass the sample size (COSMIN requirements) and the learning effect, which was not considered when constructing the method (e.g., by randomizing which test would be performed first, followed by a regression analysis of the error, or by performing two tests in the face-to-face assessment; previous research has already described the learning effect: https://doi.org/10.1016/j.apmr.2021.08.009). Therefore, the SEM and MDC should not be used as values to interpret STS changes over time or after rehabilitation.
• Conclusions are well stated, linked to original research question & limited to supporting results.
o They are adequate
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.