Downscaling MODIS land surface temperature to 90 m using random forest regression to assess transferability

Department of Computer Science and Engineering, Birla Institute of Technology, Mesra, Ranchi, Jharkhand, India
Department of Remote Sensing, Birla Institute of Technology, Mesra, Ranchi, Jharkhand, India
DOI
10.7717/peerj-cs.3246
Subject Areas
Artificial Intelligence, Data Mining and Machine Learning, Data Science, Spatial and Geographic Information Systems
Keywords
Spatiotemporal trade-off, Satellite sensors, Downscaling, Random forest regression, Land surface temperature, Spatial resolution, XGBoost, CatBoost
Copyright
© 2025 Das et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
Cite this article
Das PK, Mukherjee I, Prasad P, Pushkar S. 2025. Downscaling MODIS land surface temperature to 90 m using random forest regression to assess transferability. PeerJ Computer Science 11:e3246

Abstract

The observable spatiotemporal trade-offs among different satellite sensors necessitate the downscaling of optical and thermal data to identify Earth’s surface features and analyze temperature variations at finer scales. This study employs a random forest (RF) regression model to downscale Moderate Resolution Imaging Spectroradiometer (MODIS) land surface temperature (LST) data from a 1-km spatial resolution to 90 m. The model is trained on a primary region and applied to a spatially distinct secondary region to evaluate its transferability. Six acquisition dates, each representing a different month, were selected for downscaling. For contextual performance assessment, eXtreme Gradient Boosting (XGBoost) and Categorical Boosting (CatBoost) models were also trained and evaluated using the same setup. In the primary region, the RF model showed minimal root mean square error (RMSE) deviation across all acquisition dates, underscoring the importance of incorporating multiple variables. Comparable performance was observed with XGBoost and CatBoost in the primary region. In the secondary region, RF demonstrated strong generalizability, with the lowest RMSEs recorded in November (0.85) and February (0.96), and R2 values of 0.65 or higher across all months. In contrast, XGBoost and CatBoost exhibited inconsistencies in the secondary region, with CatBoost failing notably in June, further highlighting RF’s robustness in transferability. The study’s outcomes suggest that RF regression is a promising and effective method for downscaling coarse-resolution LST data in similar geo-environmental conditions.

Introduction

Land surface temperature (LST) is a fundamental parameter in the energy exchange processes between the Earth’s surface and atmosphere, impacting studies in climatology, hydrology, agriculture, and urban heat island (UHI) dynamics (Voogt & Oke, 2003; Anderson et al., 2007; Tomlinson et al., 2011; Patel et al., 2012; Li et al., 2013). However, coarser spatial resolution LST data from satellite-borne sensors, such as Moderate Resolution Imaging Spectroradiometer (MODIS) and other thermal infrared sensors (advanced very high resolution radiometer (AVHRR), visible infrared imaging radiometer suite (VIIRS), sea and land surface temperature radiometer (SLSTR), etc.), are unsuitable for detailed LST mapping over large areas (Weng, Fu & Gao, 2014). Therefore, it is necessary to employ downscaling techniques to derive finer-resolution LST at frequent temporal points for more accurate and detailed environmental and climatic analyses.

The initial LST downscaling methodologies used simple statistical models, primarily exploiting the linear relationship between vegetation indices and surface temperature to enhance spatial resolution. Agam et al. (2007) developed the thermal image sharpening (TsHARP) method that exploits the relationship between radiometric surface temperature and vegetation index (VI) to sharpen MODIS LST. Liu & Pu (2008) proposed an iterative technique for enhancing the spatial resolution of advanced spaceborne thermal emission and reflection radiometer (ASTER) thermal infrared (TIR) radiance using simple linear regression with high-resolution land cover data. However, these single-factor linear regression models were poorly suited to highly heterogeneous regions.

Subsequent research evolved from the simple statistical approaches to a diverse array of machine learning models to better capture the complex interactions between LST and its contributing factors and address heterogeneity. Successful implementations include the use of multilinear regression (MLR) (Dominguez et al., 2011; Zakšek & Oštir, 2012), support vector machines (SVM) (Ghosh & Joshi, 2014; Ebrahimy & Azadbakht, 2019), powerful ensemble methods such as random forest (RF) (Hutengs & Vohland, 2016; Njuki, Mannaerts & Su, 2020; Ebrahimy et al., 2021; Ouyang et al., 2022), eXtreme Gradient Boosting (XGBoost) (Xu et al., 2021; Tu et al., 2022), and artificial neural networks (ANNs) (Bindhu, Narasimhan & Sudheer, 2013; Pu, 2021). These studies have effectively contributed to establishing the superior predictive efficacy of machine learning models.

The application of neural networks in LST downscaling has transitioned from the foundational ANNs to more sophisticated architectures such as convolutional neural networks (CNNs) (Yin et al., 2021) and geographically weighted neural network regression, exploiting vegetation indices and digital elevation models (DEMs) (Liang et al., 2023). While these studies handled spatial heterogeneity, their effectiveness hinged on the quality and representativeness of training data, particularly in areas with complex landscapes.

Recent advancements have addressed the limitations of machine-learning models by integrating various physical parameters to improve spatial adaptability. One exemplification is the multi-factor geographically weighted machine learning (MFGWML) model for downscaling Landsat 8 LST from 30 to 10 m using Sentinel-2A imagery (Xu et al., 2021).

While deep-learning models can be effective in downscaling, their complex nature requires extensive training datasets (Zhu et al., 2017), and the computational overhead can limit their practicality over large areas. The black-box nature of deep neural networks is a further impediment, complicating interpretation of the relationships between LST and its contributing factors (Samek, Wiegand & Müller, 2017). In contrast, ensembles such as RF are computationally less expensive and provide greater control in balancing the bias-variance trade-off through extensive parameter optimization. They also provide inherent feature-importance rankings, offering valuable insights into the relative influence of different variables on LST (Belgiu & Drăguţ, 2016).

The primary aim of the current study is to downscale MODIS LST in forest and agriculture-dominated regions of India using RF regression. Moreover, the applied model is validated in another geographic region to assess its applicability and transferability, reflecting the study’s novelty. Integrating RF regression with feature selection for downscaling allows for modeling non-linear interactions and spatial heterogeneity, minimizing biases, and boosting accuracy. The methodology addresses spatial averaging biases using finer resolution Landsat 9 LST as the reference dataset, providing more detailed spatial information for more accurate spatial patterns and variability mapping.

Study area

This study is based on two regions of interest in Jharkhand, India, here referred to as primary and secondary (Fig. 1). The primary region, with a geographical extent of 85° 15′ to 85° 45′ East and 22° 59′ to 23° 25′ North, is used for the development of the LST downscaling methodology. This region is situated on the Ranchi plateau, the most extensive part of the Chota Nagpur Plateau. The region is an almost square patch of approximately 2,500 km2, partly overlapping the city of Ranchi and including the surrounding agricultural and forested areas, ensuring heterogeneity for model development. Its topography is characterized by plateaued, hilly upland terrain with an elevation above mean sea level ranging from 151 to 780 m. The mean elevation is 416 m, indicating that a considerable proportion of the region lies at higher altitudes. The dominant soil types are red and yellow soils (classified primarily as Alfisols and Inceptisols), formed from the underlying granite and gneiss bedrock, and are typically acidic with low to medium fertility (https://www.sameti.org/). The natural vegetation consists of tropical moist and dry deciduous forests, with sal (Shorea robusta), mahua (Madhuca longifolia), and palash (Butea monosperma) being the primary tree species (https://fsi.nic.in/forest-report-2021).

Figure 1: Study regions: (A) primary region where the RF, XGBoost and CatBoost models were trained, and (B) secondary region where these models were applied.

The secondary region is also an almost square patch of approximately 2,500 km2, with a geographical extent spanning 85°30′ to 85°59′ East and 22°30′ to 22°56′ North. It is a lower plateau with a more varied terrain of rolling plains and isolated hills, and is used to evaluate the predictive capability of the model trained on the primary region's data. While it shares similar primary land cover types (forest land, cropland, and permanent waterbodies), it has a broader elevation range (87–863 m) and a lower mean elevation (242 m) than the primary region. Both study regions experience humid subtropical to tropical wet and dry climates, with temperatures soaring up to 45 °C in summer and dropping below 5 °C in winter. The southwest monsoon brings substantial rainfall, with an average annual precipitation ranging from 1,200 to 1,400 mm (Gupta, Banerjee & Gupta, 2021; Mahato et al., 2021). The two regions were selected for their similar physical characteristics.

Datasets and methodology

The proposed methodology for downscaling MODIS LST incorporates a systematic workflow that combines recursive feature elimination with cross-validation (RFECV) to select relevant features, an ensemble machine learning model (RF), and multi-source data integration to enhance the spatial resolution of LST data acquired over six different dates (Table 1), representing six different months (January, February, May, June, November, and December). MODIS and Landsat 9 data were chosen from the same dates across the six months to minimize bias in the dataset, except for 2022.11.12. Because no MODIS data were available over the regions of interest on 2022.11.12, data from the subsequent day were used instead. This substitution is supported by the minimal meteorological variability observed in the parameters obtained from the ERA5-Land reanalysis dataset (https://doi.org/10.24381/cds.e2161bac) between the two dates, ensuring that it does not introduce significant inconsistencies into the analyses. All product identifiers of the different datasets used in this study conform to the nomenclature established in the Earth Engine Data Catalog.

Table 1:
Acquisition dates and product IDs.
Month Landsat 9 OLI/TIRS-2 Date Landsat 9 Product ID MODIS Date MOD09GA & MOD11A1 Product IDs
January 2022.01.28 LANDSAT/LC09/C02/T1/LC09_140044_20220128 2022.01.28 MODIS/061/MOD09GA/2022_01_28
MODIS/061/MOD11A1/2022_01_28
February 2022.02.13 LANDSAT/LC09/C02/T1/LC09_140044_20220213 2022.02.13 MODIS/061/MOD09GA/2022_02_13
MODIS/061/MOD11A1/2022_02_13
May 2022.05.20 LANDSAT/LC09/C02/T1/LC09_140044_20220520 2022.05.20 MODIS/061/MOD09GA/2022_05_20
MODIS/061/MOD11A1/2022_05_20
June 2022.06.05 LANDSAT/LC09/C02/T1/LC09_140044_20220605 2022.06.05 MODIS/061/MOD09GA/2022_06_05
MODIS/061/MOD11A1/2022_06_05
November 2022.11.12 LANDSAT/LC09/C02/T1/LC09_140044_20221112 2022.11.13 MODIS/061/MOD09GA/2022_11_13
MODIS/061/MOD11A1/2022_11_13
December 2022.12.14 LANDSAT/LC09/C02/T1/LC09_140044_20221214 2022.12.14 MODIS/061/MOD09GA/2022_12_14
MODIS/061/MOD11A1/2022_12_14
DOI: 10.7717/peerj-cs.3246/table-1

The independent variables in this study were derived from multiple satellite sensors. A summary of these datasets and their sources (sensors) is provided in Table 2. The satellite images from all the acquisitions were cloud-free; hence, no cloud masking was performed. “No data” values were absent in all the images, avoiding the need for imputation or pixel removal. Appropriate rescaling factors were used to convert the raw pixels into physical quantities (reflectance and temperature values). Normalization or standardization was not performed, as these are not prerequisites for an RF implementation: each split within the decision trees is decided by simple threshold comparisons, which are invariant to monotonic rescaling of individual variables.

Table 2:
Summary of the datasets used in the study.
Dataset Sensor Platform Spatial resolution Temporal resolution Swath
Land Surface Temperature (LST) Moderate Resolution Imaging Spectroradiometer (MODIS) Terra 1 km Daily 2,330 km
Thermal Infrared Sensor-2 (TIRS-2) Landsat 9 100 m (resampled to 30 m) 16 days 185 km
Surface Reflectance MODIS Terra 500 m Daily 2,330 km
Digital Elevation Model (DEM) Panchromatic Cartosat-1 30 m 5 days (targeted revisit) 26–30 km (stereoscopic)
DOI: 10.7717/peerj-cs.3246/table-2

MODIS spectral bands and LST

The MOD09GA.061 dataset (https://doi.org/10.5067/MODIS/MOD09GA.061) was used to select six spectral bands, namely blue, green, red, near infrared (NIR), shortwave infrared 1 (SWIR1), and shortwave infrared 2 (SWIR2). The MOD11A1.061 dataset (https://doi.org/10.5067/MODIS/MOD11A1.061) was used to derive the LST. MODIS LST products store raw data as scaled integers to optimize storage efficiency and maintain precision. A scaling factor of 0.02 is applied to convert these integer values into physically meaningful LST values in Kelvin (K). Finally, temperatures in degrees Celsius (°C) are derived by subtracting 273.15 from the Kelvin results.
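This two-step conversion can be expressed in a few lines. The sketch below is illustrative rather than the study's exact code (the function name and sample array are assumptions), and it presumes the raw MOD11A1 integers are already loaded as a NumPy array:

```python
import numpy as np

def mod11a1_to_celsius(raw_dn: np.ndarray) -> np.ndarray:
    """Convert raw MOD11A1 LST integers to degrees Celsius."""
    kelvin = raw_dn.astype(np.float64) * 0.02  # scale factor -> Kelvin
    return kelvin - 273.15                     # Kelvin -> degrees Celsius

# A raw DN of 15000 corresponds to 300 K, i.e., 26.85 degC
print(mod11a1_to_celsius(np.array([15000])))
```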

MODIS vegetation indices

The MOD09GA.061 dataset was used to derive the following vegetation indices:

Normalized difference vegetation index

$NDVI = \frac{NIR - Red}{NIR + Red}$

Normalized difference water index

$NDWI = \frac{Green - NIR}{Green + NIR}$

Modified normalized difference water index

$MNDWI = \frac{Green - SWIR1}{Green + SWIR1}$

Normalized difference built-up index

$NDBI = \frac{SWIR1 - NIR}{SWIR1 + NIR}$

Soil-adjusted vegetation index

$SAVI = \frac{(NIR - Red)(1 + L)}{NIR + Red + L}$

Here, L is the soil brightness correction factor.

Vegetation indices are integral to LST downscaling as they provide essential information about vegetation cover, influencing surface temperatures (Wang et al., 2021; Pandey et al., 2023).
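For illustration, the five indices above can be computed directly from the reflectance bands. A minimal sketch follows, assuming the bands are NumPy arrays already rescaled to surface reflectance; the soil brightness correction factor L = 0.5 is a common default and an assumption here, as the study does not state the value used:

```python
import numpy as np

def spectral_indices(green, red, nir, swir1, L=0.5):
    """Vegetation/water/built-up indices from surface reflectance bands."""
    eps = 1e-10  # guards against division by zero over dark pixels
    ndvi = (nir - red) / (nir + red + eps)
    ndwi = (green - nir) / (green + nir + eps)
    mndwi = (green - swir1) / (green + swir1 + eps)
    ndbi = (swir1 - nir) / (swir1 + nir + eps)
    savi = (nir - red) * (1.0 + L) / (nir + red + L + eps)
    return ndvi, ndwi, mndwi, ndbi, savi
```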

For digital elevation model data, two scene IDs, P5_PAN_CS_N22_000_E085_000_30m and P5_PAN_CS_N23_000_E085_000_30m, from the PAN sensor of the CartoSat-1 satellite (https://bhoonidhi.nrsc.gov.in/bhoonidhi/index.html) were mosaicked to encompass the regions of interest.

Landsat 9 LST retrieval

The NDVI threshold-based emissivity method, one of several approaches for retrieving Landsat 9 LST, was adopted in this study.

The raw sensor-specific quantized and calibrated digital numbers (DNs) stored in Landsat imagery (https://earthexplorer.usgs.gov/) are converted into physically quantifiable radiance units.

$L_\lambda = (M_L \times Q_{cal}) + A_L$

Here, $M_L$ and $A_L$ are the band-specific multiplicative (gain) and additive (offset) rescaling factors, respectively, provided in the imagery’s metadata file, and $Q_{cal}$ denotes the quantized and calibrated pixel values (DNs).

The inverse of Planck’s function is applied to convert radiance into brightness temperature.

$T_b = \frac{K_2}{\ln\left(\frac{K_1}{L_\lambda} + 1\right)}$

Here, $K_1$ and $K_2$ are band-specific thermal conversion constants extracted from the metadata, and $L_\lambda$ is the spectral radiance in $\mathrm{W\,m^{-2}\,sr^{-1}\,\mu m^{-1}}$.

To adjust the brightness temperature for actual surfaces, each pixel’s surface emittance of thermal radiation is considered. The emissivity depends on surface composition, as vegetation, soil, rock, water, and urban infrastructure emit radiation differently. To quantify this, NDVI is computed. The following equation calculates the proportion of vegetation, $P_v$, within a pixel, representing the extent of green vegetation cover to quantify surface heterogeneity, using NDVI extremes for “complete vegetation” ($NDVI_v$) and “bare soil” ($NDVI_s$).

$P_v = \left(\frac{NDVI - NDVI_s}{NDVI_v - NDVI_s}\right)^2$

Higher vegetation cover has higher emissivity, while bare soil or rock has lower emissivity. Using $P_v$ and known emissivity values for dense vegetation and bare soil, land surface emissivity (LSE) for each pixel is dynamically calculated using Eq. (9).

$\varepsilon = \begin{cases} \varepsilon_{s\lambda}, & NDVI < NDVI_s \\ \varepsilon_{v\lambda} P_v + \varepsilon_{s\lambda}(1 - P_v) + C, & NDVI_s \le NDVI \le NDVI_v \\ \varepsilon_{v\lambda}, & NDVI > NDVI_v \end{cases}$

Here, $\varepsilon_{s\lambda}$ and $\varepsilon_{v\lambda}$ are the emissivities of bare soil and pure vegetation, respectively, and $C$ represents surface roughness, taken as a constant.

Empirical NDVI-based emissivity relationships have been used extensively in the literature. These methods ensure that LSE is not treated as a constant but is spatially dynamic and tailored to the surface conditions (Sobrino, Jiménez-Muñoz & Paolini, 2004; Momeni & Saradjian, 2007; Sobrino et al., 2008; Li et al., 2013). For a smooth transition, emissivity was modeled as ε = 0.004Pv + 0.986 in this study.
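A minimal sketch of this emissivity step is given below. The bare-soil and full-vegetation NDVI thresholds (0.2 and 0.5) are illustrative placeholders rather than the study's stated values, while the linear model ε = 0.004Pv + 0.986 is the one adopted above:

```python
import numpy as np

def land_surface_emissivity(ndvi, ndvi_s=0.2, ndvi_v=0.5):
    """Per-pixel emissivity from NDVI via the proportion of vegetation."""
    # Proportion of vegetation; clipping handles the bare-soil and
    # full-vegetation branches of Eq. (9) implicitly under the linear model
    pv = np.clip((ndvi - ndvi_s) / (ndvi_v - ndvi_s), 0.0, 1.0) ** 2
    # Smooth linear emissivity model adopted in this study
    return 0.004 * pv + 0.986
```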

With LSE determined, the at-sensor brightness temperature is converted to LST by accounting for the reduced recorded thermal radiance caused by the surface’s deviation from ideal blackbody behavior.

$LST = \frac{T_b}{1 + \left(\frac{\lambda T_b}{\rho}\right)\ln(\varepsilon)}$

Here, $T_b$ is the brightness temperature, $\lambda$ is the center wavelength of the TIR band (band 10) in $\mu m$, $\rho = 1.4388 \times 10^{-2}\,\mathrm{m\,K}$ is derived from Eq. (11), and $\varepsilon$ is the LSE.

$\rho = \frac{hc}{\sigma}$

Here, $h$ is Planck’s constant ($6.626 \times 10^{-34}\,\mathrm{J\,s}$), $\sigma$ is the Boltzmann constant ($1.38 \times 10^{-23}\,\mathrm{J\,K^{-1}}$), and $c$ is the velocity of light ($2.998 \times 10^{8}\,\mathrm{m\,s^{-1}}$).
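Taken together, the equations above define a short retrieval chain from DNs to LST. The sketch below assumes the metadata constants (ml, al, k1, k2) have been read from the scene's metadata file; the band-10 center wavelength of 10.895 µm is an assumed illustrative value, not taken from the study:

```python
import numpy as np

H = 6.626e-34     # Planck's constant, J s
C = 2.998e8       # speed of light, m s^-1
SIGMA = 1.38e-23  # Boltzmann constant, J K^-1
RHO = H * C / SIGMA  # ~1.4388e-2 m K

def landsat9_lst_celsius(dn, ml, al, k1, k2, emissivity, wavelength_um=10.895):
    """DN -> radiance -> brightness temperature -> LST in degrees Celsius."""
    radiance = ml * dn + al                # TOA spectral radiance
    tb = k2 / np.log(k1 / radiance + 1.0)  # brightness temperature, K
    lam = wavelength_um * 1e-6             # micrometres -> metres
    lst_kelvin = tb / (1.0 + (lam * tb / RHO) * np.log(emissivity))
    return lst_kelvin - 273.15
```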

Model construction using RF

The RF model is a powerful ensemble machine-learning method introduced by Breiman (2001). It is a low-bias, low-variance model, as it tends to generalize well and perform consistently across training and test datasets. RF effectively handles large datasets that exhibit complex relationships. It is robust to noise and outliers and avoids overfitting by averaging the outputs of many decorrelated trees (Prasad et al., 2020; Njuki, Mannaerts & Su, 2020). The decorrelation is achieved through bootstrapping and selecting a random subset of features at each split. Its interpretability through feature importance is a significant advantage.

There are different feature selection methods, namely Boruta, RFE, and multicollinearity methods, which have been applied in various studies (Prasad et al., 2023; Abdo et al., 2024). This study initially considered thirteen independent features for the primary region. To mitigate the curse of dimensionality and prevent overfitting, an RFECV method using gradient boosting was employed, selecting eight relevant independent features to enhance the model’s generalization capability (Barzani et al., 2024). The eight features comprise four spectral bands (blue, near infrared (NIR), shortwave infrared (SWIR)1, and SWIR2), two vegetation indices (MNDWI and SAVI), MODIS LST, and the digital elevation model (DEM) processed from two CartoSat-1 scenes as the only topographic feature. Finer-resolution (90 m) Landsat 9-retrieved LST was used as the target variable. The 90-m scale was chosen to stay as close as possible to the thermal band’s native resolution. The Landsat LST data was processed and directly exported at a 90-m scale in Google Earth Engine (GEE). All the MODIS variables and the DEM were resampled to 90 m using bilinear interpolation for spatial congruence with the target variable.
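A minimal sketch of the RFECV step with a gradient-boosting estimator follows; the synthetic arrays stand in for the real stack of thirteen candidate predictors and the 90-m Landsat 9 LST target, and the scoring choice is an assumption:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.feature_selection import RFECV

# Illustrative stand-in for the real raster stack: 13 candidate predictors
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 13))
y = 3 * X[:, 0] + X[:, 1] - X[:, 2] + rng.normal(scale=0.1, size=1000)

selector = RFECV(
    estimator=GradientBoostingRegressor(random_state=42),
    step=1,  # eliminate one feature per round
    cv=5,    # 5-fold cross-validation
    scoring="neg_root_mean_squared_error",
)
selector.fit(X, y)
print(selector.n_features_)  # number of features retained
print(selector.support_)     # boolean mask of retained features
```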

The model (Fig. 2) was built using the optimal parameters obtained after randomly exploring 50 combinations from a grid of values of five parameters (Table 3), namely n_estimators, criterion, max_depth, min_samples_leaf, and max_features. The search used 5-fold cross-validation via the RandomizedSearchCV method of the open-source scikit-learn Python library to minimize the bias-variance trade-off, improving the model’s predictive accuracy and robustness (Bergstra & Bengio, 2012). The minimum value in the set of values for min_samples_leaf was deliberately set at 10 to restrain the decision trees from growing to their deepest level, avoiding the risk of fitting any potential outliers. The parameter n_jobs was set to −1 to involve all the CPU cores for parallelization. The RandomizedSearchCV method was employed to reduce computational time while exploring diverse combinations. All the computing was done on an AMD Ryzen 5 5600H processor (3.30 GHz) with 16 GB RAM (3,200 MHz) in a Windows 11 environment.

Figure 2: Flow diagram of the downscaling methodology.

Table 3:
Grid of parameters and optimal combination for RF.
Scikit-learn parameter Description Values Optimal value
n_estimators The number of decision trees [50, 70, 100] 100
criterion The function to measure the quality of a split [‘squared_error’, ‘friedman_mse’] ‘friedman_mse’
max_depth The maximum depth of a decision tree [25, 30, 35, None] None
min_samples_leaf The minimum number of samples required to be at a leaf node [10, 15, 20] 10
max_features The subset of features considered for each split [‘sqrt’, ‘log2’, None] None
DOI: 10.7717/peerj-cs.3246/table-3
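Using the grid in Table 3, the split and search can be sketched as follows (continuing the synthetic X and y from the RFECV sketch; the random_state values are illustrative assumptions):

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# 7:3 split of the primary-region samples (X, y as in the RFECV sketch)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

param_grid = {
    "n_estimators": [50, 70, 100],
    "criterion": ["squared_error", "friedman_mse"],
    "max_depth": [25, 30, 35, None],
    "min_samples_leaf": [10, 15, 20],
    "max_features": ["sqrt", "log2", None],
}
search = RandomizedSearchCV(
    RandomForestRegressor(n_jobs=-1, random_state=42),
    param_distributions=param_grid,
    n_iter=50,  # 50 randomly sampled combinations
    cv=5,       # 5-fold cross-validation
    n_jobs=-1,  # use all CPU cores
)
search.fit(X_train, y_train)
best_rf = search.best_estimator_
```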

To evaluate the model’s generalization capability and avoid overfitting, the dataset was partitioned into training and test subsets in a 7:3 ratio, ensuring the model was validated on unseen data before applying it to the secondary region. Once trained, the RF model was used to predict the LST for the secondary region, enhancing the spatial resolution of the MODIS LST. The following metrics (Eqs. (12)–(14)) were used to evaluate the downscaling performance of the RF regression model for both the primary and secondary regions. Since the model was trained on 70% of the data from the primary region, the remaining 30% was used to obtain an unbiased estimate of its predictive performance. The trained model was then applied to the secondary region to assess whether the performance observed on the primary region’s test set could be reproduced.

$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|LST_{a,i} - LST_{p,i}\right|$

$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(LST_{a,i} - LST_{p,i}\right)^2}$

Here, $LST_{a,i}$ and $LST_{p,i}$ are the actual and predicted LST values for the $i$-th pixel, respectively, and $n$ is the number of pixels.

$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(LST_{a,i} - LST_{p,i}\right)^2}{\sum_{i=1}^{n}\left(LST_{a,i} - \overline{LST_a}\right)^2}$

Here, $\overline{LST_a}$ is the mean of the actual LST values, and $n$ is the number of pixels.
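These metrics map directly onto scikit-learn utilities. The sketch below continues the names from the previous sketches (X_test, y_test, and best_rf) and is illustrative rather than the study's exact evaluation script:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def evaluate(lst_actual, lst_pred):
    """MAE, RMSE, and R2 as defined in Eqs. (12)-(14)."""
    mae = mean_absolute_error(lst_actual, lst_pred)
    rmse = np.sqrt(mean_squared_error(lst_actual, lst_pred))
    r2 = r2_score(lst_actual, lst_pred)
    return mae, rmse, r2

# Held-out 30% of the primary region; the same call applies to the
# secondary-region samples once the model is transferred
print(evaluate(y_test, best_rf.predict(X_test)))
```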

Results

Feature importance by RF

Determining the importance of features is essential in LST, natural resources, and hazard mapping (Prasad et al., 2022, 2024, 2025). The feature importance plots for the primary region (Fig. 3) assign the greatest weight to three factors, namely SWIR1, MODIS LST, and DEM. MODIS LST contributed 68.89% and 63.04% in January and February, respectively, to the model’s predictions. In these months, SWIR1 was the next highest contributor, influencing the predictions by 13.46% and 16.93%, respectively. The influence of MODIS LST in May and June rose to 84.77% and 93.10%, respectively, dwarfing all other features. A contrasting dynamic was observed in December, where the importance of SWIR1 dominated all other features, influencing 58.02% of the model’s predictions. There are notable spikes in the importance of the other factors, namely NIR, blue, SWIR2, SAVI, MNDWI, and DEM, in November compared to January, February, May, and June. Overall, the most important feature for spatial downscaling was MODIS LST, followed by SWIR1, DEM, SWIR2, and MNDWI; the blue band, SAVI, and NIR band were among the less important factors.

Figure 3: RF importance scores across all acquisition dates.

Downscaling performance

The predictions on the test data across all the months aligned closely with the actual LST values (Table 4). The mean temperatures were consistent, with trifling deviations in February and December. The model performed exceptionally well in May and June, as evidenced by the root mean square error (RMSE) and R2 values. In June, the RMSE was the lowest and the R2 value was the highest, explaining 96% of the variability around the 1:1 ideal line. The spread of LST values, reflecting the level of information acquired (Mukherjee, Joshi & Garg, 2017; Maithani et al., 2022), was closest to the actual variance in June: the model predicted a variance of 6.83 against the actual 7.25, and the minimum and maximum predicted LST values of 27.10 °C and 40.30 °C most closely resembled the actual LST range. The predictions were weaker in February and December, with lower variances. The R2 was relatively lower (0.76) in November and December, indicating the model’s inability to fully capture the variability. The test predictions also revealed a general tendency to compress the maximum and marginally elevate the minimum temperatures, affecting the variability. Substantial compression of the temperature ranges can be observed in February and November, with deviations of 3.56 °C and 3.50 °C. The model’s tendency to underestimate high temperatures is offset by its simultaneous overestimation of low temperatures, keeping the mean temperatures consistent with the actual values. This predictive behavior is a fundamental characteristic of RF regression and is attributed to the model’s averaging of the outputs from all the decision trees.

Table 4:
Prediction assessment by the RF model from the primary region’s test set.
Month Date LST Min Max Mean Variance MAE RMSE R2
January 2022.01.28 Actual 12.11 24.03 17.76 1.64 0.41 0.56 0.81
Predicted 12.51 22.52 17.76 1.23
February 2022.02.13 Actual 16.53 30.74 23.11 2.70 0.57 0.76 0.79
Predicted 17.37 28.02 23.12 1.98
May 2022.05.20 Actual 27.29 44.19 37.58 4.74 0.48 0.65 0.91
Predicted 27.41 41.99 37.58 4.15
June 2022.06.05 Actual 26.60 41.30 33.68 7.25 0.39 0.54 0.96
Predicted 27.10 40.30 33.68 6.83
November 2022.11.12 Actual 20.61 34.39 25.53 1.94 0.52 0.68 0.76
Predicted 20.83 31.11 25.53 1.33
December 2022.12.14 Actual 18.03 32.30 23.24 2.14 0.55 0.72 0.76
Predicted 18.34 28.95 23.25 1.47
DOI: 10.7717/peerj-cs.3246/table-4

The scatterplots revealed a substantial concentration of predictions against the actual LST values on the 1:1 ideal line (Fig. 4). The model underestimated the higher temperatures in all the months except May and June. Predictions in May and June showed little scatter, with only a small number affected by a high error magnitude. Overall, the magnitude of errors was relatively greater at the edges of the actual temperature spectrum across all the months.

Figure 4: RF scatterplots of the predicted refined LST (y-axis) vs. actual Landsat 9 LST (x-axis) data at 90 m for the primary region’s test set; the yellow dashed diagonal line represents the 1:1 ideal line.

The error histograms exhibited the model’s unbiased predictability, as the errors were symmetrically distributed around zero for all the months (Fig. 5). The bell-shaped curves suggest a normal distribution of errors, with most errors accumulating within the range of −2 to +2 °C. A high percentage of errors has a ±1 °C offset from the ideal value of 0 across all the months. The narrower distributions in May and June, particularly June, highlighted the model’s strong performance with fewer errors and more accurate predictions.

Figure 5: RF error histograms from the predicted refined LST across all acquisition dates for the primary region’s test set; the x-axis represents the prediction error in °C, and the y-axis represents the percentage of total pixels.

Model transferability

The trained RF model was applied to the secondary region to assess its transferability. It showed a moderate drop in performance, reflecting the challenges of applying models to unseen spatial contexts. The secondary region analyses yielded R2 values between 0.65 and 0.76, with RMSE values ranging from 0.85 °C to 1.18 °C (Table 5). The predictions in January had an RMSE of 1.01 and an R2 of 0.76. The variability was underrepresented in the predicted temperatures, with the predicted variance (2.87) lower than the actual variance (4.22). A similar pattern was observed in February, with an RMSE of 0.96 and an R2 of 0.74. The predicted mean LST (22.34 °C) closely matched the actual mean (22.36 °C). Contrary to the test results, the model’s performance declined considerably in May and June in the secondary region, with RMSEs of 1.18 and 1.01, respectively. The lower R2 values indicated a weaker alignment of predictions against the actual values. With an RMSE of 0.85 and an R2 value of 0.74, the November predictions captured trends to a large extent and maintained variability. In December, the model yielded an RMSE of 1.12 and an R2 of 0.73. The predicted variance (3.10) in this month was significantly lower than the actual variance (4.74), reflecting limitations in capturing local variability. The model’s performance in simulating the temperature ranges varied seasonally within the secondary region. While the model depicted fair agreement with the observed temperature ranges during the peak warm months (especially May), it consistently underestimated the temperature variations in the transitional months (January, February, November, and December), mostly estimating narrower ranges than observed.

Table 5:
Prediction assessment of the RF model from the secondary region.
Month Date LST Min Max Mean Variance MAE RMSE R2
January 28.01.2022 Actual 11.28 27.65 18.72 4.22 0.82 1.01 0.76
Predicted 12.49 20.84 18.17 2.87
February 13.02.2022 Actual 15.31 31.24 22.36 3.57 0.73 0.96 0.74
Predicted 17.34 26.66 22.34 3.13
May 20.05.2022 Actual 29.13 43.05 36.46 4.60 0.92 1.18 0.69
Predicted 28.20 41.63 36.63 4.36
June 05.06.2022 Actual 24.70 37.28 30.05 2.92 0.81 1.01 0.65
Predicted 27.79 35.83 30.50 2.15
November 12.11.2022 Actual 19.49 35.77 25.49 2.76 0.66 0.85 0.74
Predicted 21.00 29.74 25.51 2.44
December 14.12.2022 Actual 16.70 31.51 23.29 4.74 0.86 1.12 0.73
Predicted 18.36 27.09 23.08 3.10
DOI: 10.7717/peerj-cs.3246/table-5

The secondary region scatterplots (Fig. 6) showed greater dispersion in some predictions, suggesting weaker correlations in all the months, notably May and June. There is a consistent underestimation of the higher temperature ranges in all the months. While most predictions are tightly clustered along the 1:1 ideal line, indicating smaller errors, larger errors tend to cluster at the extremes of the target range, suggesting the model’s sensitivity in predicting extreme values. Since the model was trained in the primary region, the clusters of incorrectly predicted points in these plots indicate that the secondary region exhibits spatial features that were never represented in the training data. Despite variations in R2 values, the model maintained consistent performance throughout the six temporal points, never dropping below 0.65, suggesting fundamental reliability.

Figure 6: RF scatterplots of the predicted refined LST (y-axis) vs. actual Landsat 9 LST (x-axis) data at 90 m for the secondary region; the yellow dashed diagonal line represents the 1:1 ideal line.

All six error histograms (Fig. 7) show a single, sharply peaked (high kurtosis) distribution around zero, especially in February, November, and December, indicating that most predictions align with the actual values. A slight offset in all the plots, evident from the elongated right tails of the distributions, further confirms the underestimation in the predictions. The tails beyond ±2 °C in all the months confirm the existence of larger errors, but the rapid drop-off in frequency indicates they are relatively rare compared to the strong concentration near zero.

Figure 7: RF error histograms from the predicted refined LST across all acquisition dates for the secondary region; the x-axis represents the prediction error in °C, and the y-axis represents the percentage of total pixels.

The secondary region’s actual and predicted 90-m LST maps are presented in Fig. 8. The temperature spectrum is represented in a color gradient from blue to red, with blue indicating lower temperatures and red indicating higher temperatures. A visual comparative analysis of the maps depicts considerable similarity, making the RF model suitable for downscaling.

Figure 8: Actual vs. predicted 90 m LST maps of the secondary region from the RF model.

The extensive sparse white areas in the per-pixel RMSE maps (Fig. 9) indicate close alignment with the actual LST values, with deviations of ≤1 °C. Most predictions fall within 2 °C of the actual values, and only a small number of pixels exceed an RMSE of 3 °C.

Figure 9: Per-pixel RMSE maps of the secondary region from the RF model; the white sparse areas have RMSE values less than 1 °C.

Comparative evaluation of RF against XGBoost and CatBoost

To contextualize the performance of the RF model, a comparative analysis was conducted using XGBoost and categorical boosting (CatBoost). These models were trained and evaluated using the same temporal points, feature sets, datasets, and validation procedures to ensure a fair comparison of their downscaling capabilities.

The XGBoost model, introduced by Chen & Guestrin (2016), mitigates the common overfitting issue in traditional boosting models by incorporating regularization techniques. These penalties help control model complexity, preventing overfitting. The model’s objective function is optimized using a second-order Taylor expansion for faster convergence.

In the context of regression, CatBoost, introduced by Prokhorenkova et al. (2018), is a powerful gradient boosting algorithm that predicts a continuous target variable by minimizing the default RMSE loss function. CatBoost trains each tree on a sequence of randomly shuffled datasets, ensuring that the error for a particular data point is evaluated by a model that has not been trained on that specific data point. This leads to more robust models that generalize better to unseen data.

To ensure optimal performance, hyperparameter tuning was employed for both algorithms using a grid search strategy similar to the RF-based methodology. The number of different hyperparameter combinations (n_iter) was set to 50, keeping it consistent with the RF model. Tables 6 and 7 specify the relevant parameters that were tuned with the grid of values and the optimal combinations.

Table 6:
Grid of parameters and optimal combination for XGBoost.
Scikit-learn parameter Description Values Optimal value
n_estimators The total number of trees to build [100, 200, 500, 1,000] 500
max_depth The deepest a single tree can grow [3, 5, 7, 9] 9
learning_rate How much to shrink the contribution of each new tree [0.01, 0.05, 0.1, 0.2] 0.1
subsample The fraction of training data (rows) used for each tree [0.7, 0.8, 0.9] 0.8
colsample_bytree The fraction of features (columns) used for each tree [0.8, 0.9, 1] 1
gamma The minimum loss reduction needed to split a node further [0, 0.1, 0.3, 0.5, 1] 0.1
min_child_weight The minimum amount of data required in a new node [1, 2, 3, 5, 10] 3
reg_lambda The L2 regularization strength to prevent overfitting [0.1, 1, 5, 10, 20] 10
DOI: 10.7717/peerj-cs.3246/table-6
Table 7:
Grid of parameters and optimal combination for CatBoost.
Scikit-learn parameter Description Values Optimal value
iterations The maximum number of decision trees to build [100, 200, 300, 500, 1,000] 300
depth The maximum depth of each tree [6, 8, 10, 12] 12
learning_rate This scales the contribution of each tree [0.01, 0.03, 0.05, 0.1] 0.1
l2_leaf_reg The L2 regularization coefficient [1, 3, 5, 10] 1
subsample The fraction of training data (rows) sampled for each tree [0.6, 0.8, 0.9, 1] 1
rsm The proportion of features (columns) randomly selected for each tree [0.7, 0.8, 0.9, 1] 1
bagging_temperature Controls the intensity of bagging [0, 0.5, 1, 2] 0
DOI: 10.7717/peerj-cs.3246/table-7
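As an illustration of how the Table 6 grid maps onto the same 50-iteration randomized search used for RF (the CatBoost search is analogous with the Table 7 grid), the sketch below continues the X_train and y_train names from the earlier sketches; the random_state is an illustrative assumption:

```python
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBRegressor

xgb_grid = {
    "n_estimators": [100, 200, 500, 1000],
    "max_depth": [3, 5, 7, 9],
    "learning_rate": [0.01, 0.05, 0.1, 0.2],
    "subsample": [0.7, 0.8, 0.9],
    "colsample_bytree": [0.8, 0.9, 1],
    "gamma": [0, 0.1, 0.3, 0.5, 1],
    "min_child_weight": [1, 2, 3, 5, 10],
    "reg_lambda": [0.1, 1, 5, 10, 20],
}
search = RandomizedSearchCV(
    XGBRegressor(n_jobs=-1, random_state=42),
    param_distributions=xgb_grid,
    n_iter=50,  # match the RF search budget
    cv=5,
    n_jobs=-1,
)
search.fit(X_train, y_train)  # primary-region training samples
best_xgb = search.best_estimator_
```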

The performance statistics for both primary and secondary regions are specified in Tables 8 and 9. The XGBoost model produced a performance profile highly similar to the RF model on the primary region’s test set (Table 8). Across all the months, the metrics for both models were almost identical, suggesting that XGBoost is an equally viable model for this dataset, offering a similar predictive advantage. However, in the secondary region, it consistently underperformed compared to RF (Table 9). XGBoost registered a higher RMSE and a lower R2 value than the RF model across all six temporal points (Fig. 10). This suggests that the XGBoost model was more closely fitted to the primary region’s data and therefore less generalizable than RF.

Table 8:
Prediction assessment by the XGBoost and CatBoost models from the primary region’s test set.
eXtreme gradient boosting Categorical boosting
Month Date LST Min Max Mean Variance MAE RMSE R2 Min Max Mean Variance MAE RMSE R2
January 2022.01.28 Actual 12.11 24.03 17.76 1.64 0.42 0.56 0.81 12.11 24.03 17.76 1.64 0.46 0.61 0.77
Predicted 12.03 23.89 17.76 1.26 12.51 23.75 17.76 1.20
February 2022.02.13 Actual 16.53 30.74 23.11 2.70 0.57 0.75 0.79 16.53 30.74 23.11 2.70 0.63 0.81 0.75
Predicted 17.05 29.53 23.12 2.04 17.32 29.59 23.12 1.94
May 2022.05.20 Actual 27.29 44.19 37.58 4.74 0.49 0.66 0.91 27.29 44.19 37.58 4.74 0.57 0.75 0.88
Predicted 26.81 42.07 37.58 4.19 26.95 41.91 37.58 4.06
June 2022.06.05 Actual 26.60 41.30 33.68 7.25 0.39 0.53 0.96 26.60 41.30 33.68 7.25 0.44 0.59 0.95
Predicted 26.33 40.75 33.68 6.88 25.93 40.29 33.67 6.81
November 2022.11.12 Actual 20.61 34.39 25.53 1.94 0.51 0.67 0.77 20.61 34.39 25.53 1.94 0.57 0.74 0.72
Predicted 20.68 32.93 25.53 1.38 20.88 32.94 25.53 1.29
December 2022.12.14 Actual 18.03 32.30 23.24 2.14 0.55 0.72 0.76 18.03 32.30 23.24 2.14 0.61 0.79 0.71
Predicted 18.05 31.30 23.25 1.52 18.09 31.89 23.25 1.42
DOI: 10.7717/peerj-cs.3246/table-8
Table 9:
Prediction assessment by the XGBoost and CatBoost models from the secondary region.
eXtreme gradient boosting Categorical boosting
Month Date LST Min Max Mean Variance MAE RMSE R2 Min Max Mean Variance MAE RMSE R2
January 2022.01.28 Actual 11.28 27.65 18.72 4.22 0.86 1.06 0.73 11.28 27.65 18.72 4.22 0.87 1.07 0.73
Predicted 11.97 21.40 18.20 2.59 11.99 21.43 18.13 2.64
February 2022.02.13 Actual 15.31 31.24 22.36 3.57 0.76 0.99 0.73 15.31 31.24 22.36 3.57 0.72 0.95 0.75
Predicted 17.06 27.02 22.47 3.05 17.14 26 22.44 2.88
May 2022.05.20 Actual 29.13 43.05 36.46 4.60 0.93 1.21 0.68 29.13 43.05 36.46 4.60 0.90 1.17 0.70
Predicted 28.51 41.76 36.62 4.09 30.19 41.95 36.61 4.16
June 2022.06.05 Actual 24.70 37.28 30.05 2.92 0.89 1.10 0.59 24.70 37.28 30.05 2.92 1.15 1.41 0.32
Predicted 27.54 36.39 30.59 1.97 27.55 36.31 30.9 1.52
November 2022.11.12 Actual 19.49 35.77 25.49 2.76 0.68 0.86 0.73 19.49 35.77 25.49 2.76 0.65 0.84 0.74
Predicted 21.01 30.78 25.59 2.30 20.52 30.27 25.53 2.28
December 2022.12.14 Actual 16.70 31.51 23.29 4.74 0.92 1.17 0.71 16.70 31.51 23.29 4.74 0.93 1.18 0.70
Predicted 17.81 26.99 23.09 2.76 17.96 27.11 23.04 2.67
DOI: 10.7717/peerj-cs.3246/table-9

Figure 10: XGBoost scatterplots of the predicted refined LST (y-axis) vs. actual Landsat 9 LST (x-axis) data at 90 m for the secondary region; the yellow dashed diagonal line represents the 1:1 ideal line.

As illustrated in Fig. 11, the residual distribution for the XGBoost model in the secondary region is broadly comparable to the RF model. However, a key distinction appears in the November and December plots, where the RF model exhibits elongated right tails. These tails signify large-magnitude errors that are notably absent in the XGBoost results. Despite this, the RF model achieved a lower overall RMSE because of its higher volume of extremely accurate (near-zero error) predictions, effectively outweighing the statistical penalty from the infrequent outliers.

Figure 11: XGBoost error histograms from the predicted refined LST across all acquisition dates for the secondary region; the x-axis represents the prediction error in °C, and the y-axis represents the percentage of total pixels.

In contrast, the CatBoost model consistently lagged behind the RF model across all the evaluated months and metrics on the primary region’s test set (Table 8). CatBoost produced higher mean absolute error (MAE) and RMSE values and lower R2 scores than RF in all the months. In May, CatBoost’s RMSE and R2 were 0.75 and 0.88, respectively, both notably worse than RF’s RMSE of 0.65 and R2 of 0.91. This pattern of weaker performance was consistent, as observed in December, where CatBoost’s RMSE and R2 were 0.79 and 0.71, respectively, compared to RF’s more accurate RMSE (0.72) and R2 (0.76). In the secondary region, the CatBoost model managed to match or even slightly outperform RF in a few months, e.g., in February, when its RMSE was 0.95 (Table 9) compared to RF’s 0.96 (Table 5), and in November, when its RMSE was 0.84 compared to RF’s 0.85. But it experienced a pronounced failure in June, when its R2 declined to just 0.32 (Fig. 12), whereas the RF model maintained an R2 of 0.65 (Fig. 6). This extreme inconsistency makes CatBoost unstable and therefore unsuitable as a generalized downscaling model based on this study.

Figure 12: CatBoost scatterplots of the predicted refined LST (y-axis) vs. actual Landsat 9 LST (x-axis) data at 90 m for the secondary region; the yellow dashed diagonal line represents the 1:1 ideal line.

The CatBoost residual plots (Fig. 13) closely resemble those of XGBoost in January and May, while wider distributions can be observed for CatBoost in June and December. Though infrequent, CatBoost residuals greater than 5 °C in magnitude, extending up to 10 °C, can be observed in November. Overall, the CatBoost residual plots show wider distributions in all the months, contrasting with the RF plots, which depict a strong concentration of residuals within ±2 °C.

Figure 13: CatBoost error histograms from the predicted refined LST across all acquisition dates for the secondary region; the x-axis represents the prediction error in °C, and the y-axis represents the percentage of total pixels.

Discussion

The accurate downscaling of LST data is essential for applications in agriculture, forestry, and climate change studies. To this end, the primary aim of the present study is to downscale MODIS LST from 1,000 m to 90 m in similar geo-environmental conditions using an RF regression model. In this regard, the same retrieval dates of Landsat 9 and MODIS LST were considered for validation across different months. The study identifies key predictors affecting LST downscaling and assesses the model’s performance in the primary and secondary regions.

The feature importance analysis revealed that MODIS LST, SWIR1, and DEM consistently ranked as the most influential factors across all the temporal points. In contrast, SAVI, the blue band, and the NIR band ranked as the less important variables (Fig. 3). In the study by Njuki, Mannaerts & Su (2020), the vegetation indices (NDVI, enhanced vegetation index (EVI), SAVI, and fractional vegetation cover (FVC)) and DEM were found to be the most important variables, while the reflectance bands had little role in spatial downscaling of LST. The addition of MODIS LST as an independent variable in the present study improved the model performance significantly, while the reflectance bands (except SWIR1) were less influential in both studies. This pattern aligns with broader applications of RF regression in LST downscaling, as shown by Hutengs & Vohland (2016), who improved accuracy by up to 19% over the TsHARP method, and by Ebrahimy et al. (2021), who adapted the same approach to various Köppen climate zones. These studies underscore that including terrain and vegetation-related variables (and, more generally, relevant environmental predictors) is crucial for accurate temperature predictions. The feature importance pattern underlines the shifting influence of environmental factors across different months.

The overall downscaling results validate the effectiveness of the RF regression model in both primary and secondary regions. A notable finding emerged in the cross-regional validation across different months: in November and December, the model demonstrated more consistent performance across both regions than in May and June. This consistency can be attributed to the notably narrower temperature ranges in November and December compared with May and June, reflecting seasonal variations in climate patterns. The secondary region’s lower R2 and higher RMSE values stemmed from the region’s distinct climatic, topographic, and land-use conditions. In January and December, higher variability in the actual temperatures compared to the predictions highlighted the model’s limited capability in capturing regional heterogeneity. The spatially explicit RMSE patterns point to the potential benefits of localized model calibration. The overall transferability results make the RF model suitable for LST applications in similar environments.

The comparative analysis of RF with XGBoost and CatBoost was crucial for situating its predictive capability within the broader landscape of machine learning models for LST downscaling. These models were specifically chosen to balance the trade-off between simpler machine learning approaches, which often fail to capture complex relationships in the data, and deep learning frameworks, which require large training datasets and substantial computation power. While all the models, notably RF and XGBoost, performed comparably on the primary region’s test set, their limitations became evident in the new setting, with a marked decline in performance for XGBoost and, more severely, for CatBoost, specifically in May and June. The degraded performance can be attributed to the elevation differences between the primary and secondary regions. The distribution and characteristics of land cover types across varying elevations affect vegetation indices like SAVI and their relationship with LST (Wang et al., 2025; Song et al., 2025). Further, May and June are critical periods for vegetation growth and phenological changes (e.g., green-up and peak biomass) in many regions. The relationship between LST and vegetation indices can be highly dynamic and vary significantly with seasonal changes (Sun & Kafatos, 2007; Karnieli et al., 2010; Ganguly et al., 2010; Gao & Zhang, 2021). A notable spatial disparity in temperature ranges was observed during May and June. While the primary region exhibited temperature ranges of 13.92 °C and 12.58 °C for May and June, respectively, the secondary region recorded significantly wider ranges of 16.90 °C and 14.70 °C for the same months, marking them as the broadest temperature variations across all assessed periods in that region. This forces the models to extrapolate beyond the variability they learned during training. November and December generally have lower solar radiation, cooler temperatures, and more stable atmospheric conditions, with vegetation undergoing steady changes, resulting in more stable LST-feature relationships. Consequently, this environmental consistency enables the models to better generalize the learned relationships and produce better downscaling results.

The downscaling results indicate that month-specific calibration is essential for capturing seasonal variations in environmental and atmospheric conditions, thereby improving model performance. Additionally, regional adaptation is particularly crucial for months like May and June, where significant variations in climate, topography, and land-use patterns necessitate localized adjustments. Incorporating additional environmental parameters during model training could further enhance accuracy and robustness. These insights underscore the importance of spatiotemporal customization in modeling and highlight the potential benefits of a more comprehensive approach with a broader range of environmental factors. The methodology and results of this research will be helpful in sustainable agricultural practices and forest management in regions with similar geo-environmental conditions.

Conclusions

This study demonstrates the efficacy of RF regression in downscaling MODIS LST, emphasizing the importance of integrating multiple variables to achieve high predictive accuracy. The model’s performance was validated in the primary region and successfully applied to a spatially distinct secondary region, exhibiting its transferability. The RF model was also tested against XGBoost and CatBoost for contextual integrity and robustness.

Though the present work produced highly accurate downscaled images of LST in both primary and secondary regions using the RF regression model, it has limitations that suggest several avenues for future research. These include: employing hybrid machine learning models and advanced deep learning architectures to enhance spatial downscaling of LST; integrating more diverse environmental datasets into the existing models, such as soil moisture, atmospheric parameters, etc., to improve predictive accuracy; assessing the model’s performance across different geographic regions with unique climatic and land cover conditions to enhance generalizability; developing an automated framework for real-time LST downscaling to support climate monitoring, agriculture, and UHI studies; and validating the model with uncrewed aerial vehicle (UAV) and field observation datasets. These future advancements will make machine learning and deep learning-based downscaling a more robust and scalable solution for high-resolution thermal mapping.

Supplemental Information

Python notebook files for processing satellite data and exporting as GeoTIFF files.

DOI: 10.7717/peerj-cs.3246/supp-1