I cannot see what is original in this paper with respect to what the authors have already presented in other papers.
In this paper we present a new set of tools for the validation of satellite soil moisture data that has the advantage of being open source. The validation tool is based on the SM2RAIN algorithm. Here we present the results of applying SM2RAIN to the CCI soil moisture products. Up to now, the CCI dataset has been evaluated only by using in-situ stations. This is the first time that CCI soil moisture products have been evaluated on a global scale.
Moreover, some conclusions are quite obvious, while others need more discussion.
For instance:
"Cloud computing facilities can be very beneficial for analyzing huge amount of data". Does anybody say the contrary? In my opinion this is quite obvious, and therefore useless.
Yes, the reviewer is right that this is quite obvious, but the increasing availability of EO data now makes this kind of platform necessary for carrying out a global analysis. For instance, even a powerful desktop machine can analyze Sentinel-1 data (with a spatial resolution up to 100 m) only for a very limited area. The same conclusion can be drawn for a coarser product with a very long data record, i.e. the CCI products. We changed the conclusion in order to underline this aspect. Please check the text at lines 121-123: “Cloud computing facilities can be very beneficial for analyzing huge amount of data and they are becoming a fundamental environment for these kind of analysis, due to the increasing volume of EO data”.
"Python® has proven to be very useful for validation and big data analysis procedure implementation". Who says that this is not true? A different question is whether it is the optimal choice for analysing big data, but this has not been investigated in this work.
We changed the conclusion accordingly. Please check the text at lines 124-126: “A Python® validation and big data analysis tool is presented. The validation tool will be exported in other open source languages in order to test their capabilities and to find out the best software structure”.
"SM2RAIN algorithm can be used for estimating rainfall and for assess the quality of SM dataset". This conclusion has to be better explained because it is not evident to me from the text. Why is the quality assessed? Which indices are used for such an assessment?
Soil moisture (SM) and rainfall are strongly linked: when it rains, SM reaches higher values. By inverting this relationship, one can obtain a rainfall estimate from the SM variations; the more accurate and reliable the variations are, the more reliable the rainfall estimate will be. Then, by comparing the estimated rainfall with a benchmark, one can assess the quality of SM products.
We added more details to the text. Please check lines 62-65: “The main idea is that the perfect SM product can record all the variation in SM condition due to rainfall. By inverting the relationship, one can obtain a rainfall estimate and then assess the quality of the SM product by comparing the estimated rainfall with a benchmark.”
and at lines 127-130: “SM2RAIN algorithm can be used for estimating rainfall and for assess the quality of SM dataset, due to the relationship between rainfall and soil wetness conditions. An assessment carried on via SM2RAIN does not need long observed SM records, which can be hardly obtained on a global scale for more than 30 years;”
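To make the inversion idea concrete, the following is a minimal Python sketch and not the implementation used in the paper. It assumes the simplified water-balance form often quoted for SM2RAIN, p(t) ≈ Z* · ds/dt + a · s(t)^b, where s is relative soil moisture between 0 and 1. The function name `sm2rain_sketch`, the parameter values, and the clipping of negative estimates to zero are illustrative assumptions; in practice the parameters would be calibrated.

```python
import numpy as np


def sm2rain_sketch(sm, dt=1.0, z_star=80.0, a=5.0, b=3.0):
    """Estimate rainfall from a relative soil moisture series (values in 0..1).

    Assumed form: p = z_star * ds/dt + a * s**b.
    z_star, a and b are hypothetical, uncalibrated values.
    Returns one rainfall value per interval between consecutive observations.
    """
    sm = np.asarray(sm, dtype=float)
    ds_dt = np.diff(sm) / dt            # soil moisture change per time step
    s_mid = 0.5 * (sm[1:] + sm[:-1])    # mean saturation over each interval
    rain = z_star * ds_dt + a * s_mid ** b
    # Simplification of this sketch: drying intervals yield no rainfall.
    return np.clip(rain, 0.0, None)


# A sharp rise in soil moisture maps to a large rainfall estimate,
# while a drying interval maps to zero.
estimated = sm2rain_sketch([0.20, 0.20, 0.50, 0.45])
```

A noisy SM product produces spurious variations and therefore spurious rainfall, which is why the agreement between the estimated rainfall and a benchmark reflects the quality of the SM product.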
The performance scores used for assessing the quality of the SM products are the correlation coefficient (R) and the Root Mean Square Error (RMSE), as described at lines 79-80.
"During the analysis period, the “combined” rainfall outperforms the “active” and the “passive”": Yes, this is true, but the value of the R coefficient is still low. What, then, are the conclusions?
We added more details to the Results and Conclusions section. Please check the text at lines 105-109: “Table 1 summarizes the results obtained for the considered datasets in terms of median R and RMSE by considering the two different analysis grids. It is worth to underline that the median performances do not reach extraordinary values due to the presence of areas where the satellite retrieval is highly impacted, as discussed above. In this framework, the obtained results can be considered very satisfactory.”
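As an illustration of how the median scores could be obtained, the sketch below computes R and RMSE for each grid cell and then takes the median over cells. It is a hedged example, not the code of the validation tool: the function names `r_and_rmse` and `median_scores`, the array layout (one row per grid cell, one column per time step), and the handling of missing values are assumptions.

```python
import numpy as np


def r_and_rmse(estimated, benchmark):
    """Pearson correlation (R) and RMSE between two rainfall series."""
    est = np.asarray(estimated, dtype=float)
    ref = np.asarray(benchmark, dtype=float)
    valid = np.isfinite(est) & np.isfinite(ref)   # skip gaps in either series
    est, ref = est[valid], ref[valid]
    if est.size < 2:
        return np.nan, np.nan                     # not enough data to score
    r = np.corrcoef(est, ref)[0, 1]
    rmse = np.sqrt(np.mean((est - ref) ** 2))
    return r, rmse


def median_scores(est_grid, ref_grid):
    """Median R and RMSE over grid cells (arrays shaped cells x time)."""
    scores = np.array([r_and_rmse(e, b) for e, b in zip(est_grid, ref_grid)])
    return np.nanmedian(scores[:, 0]), np.nanmedian(scores[:, 1])


# Three hypothetical grid cells with four time steps each.
est_grid = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 2, 3, 4]]
ref_grid = [[2, 3, 4, 5], [1, 2, 3, 4], [1, 2, 3, 4]]
median_r, median_rmse = median_scores(est_grid, ref_grid)
```

Because the median is taken over all cells, areas where the satellite retrieval is strongly degraded pull the summary value down even when many cells score well.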