State-space modeling of the dynamics of temporal plant cover using visually determined class data

A lot of vegetation-related data have been collected as an ordered plant cover class that can be determined visually. However, they are difficult to analyze numerically as they are in an ordinal scale and have uncertainty in their classification. Here, I constructed a state-space model to estimate unobserved plant cover proportions (ranging from zero to one) from such cover class data. The model assumed that the data were measured longitudinally, so that the autocorrelations in the time-series could be utilized to estimate the unobserved cover proportion. The model also assumed that the quadrats where the data were collected were arranged sequentially, so that the spatial autocorrelations also could be utilized to estimate the proportion. Assuming a beta distribution as the probability distribution of the cover proportion, the model was implemented with a regularized incomplete beta function, which is the cumulative density function of the beta distribution. A simulated dataset and real datasets, with one-dimensional spatial structure and longitudinal survey, were fit to the model, and the parameters were estimated using the Markov chain Monte Carlo method. Then, the validity was examined using posterior predictive checks. As a result of the fitting, the Markov chain successfully converged to the stationary distribution, and the posterior predictive checks did not show large discrepancies. For the simulated dataset, the estimated values were close to the values used for the data generation. The estimated values for the real datasets also seemed to be reasonable. These results suggest that the proposed state-space model was able to successfully estimate the unobserved cover proportion. The present model is applicable to similar types of plant cover class data, and has the possibility to be expanded, for example, to incorporate a two-dimensional spatial structure and/or zero-inflation.


INTRODUCTION
There is a vast amount of historical data regarding plant abundance that were recorded as plant abundances in an ordered cover class, e.g., the Braun-Blanquet classification (Podani, 2006;Irvine & Rodhouse, 2010;Damgaard, 2014), much of which was determined visually. In many cases, such data are difficult to treat numerically; they are typically recorded in an ''ordinal scale'' so that standard arithmetic operations, such as addition or subtraction, are not applicable (Dale, 1989;Podani, 2006). In addition, the uncertainty derived from the visual classification of such data tends to be ignored in analyses.
However, attempts to estimate unobserved ''true'' plant cover (the proportion in a unit area) from the ordered class data have been developed along with progress in statistical methods in the field of ecology (Irvine & Rodhouse, 2010;Damgaard, 2014;Herpigny & Gosselin, 2015;Irvine, Rodhouse & Keren, 2016;Irvine et al., 2019;Damgaard & Irvine, 2019). Ordered class data are typically modeled using ordered logit (cumulative logit) models, but the interpretation of the models has been known to be rather complicated (Herpigny & Gosselin, 2015;Irvine, Rodhouse & Keren, 2016).
However, some attempts have been made to model the plant cover proportion, assuming that this proportion follows the beta distribution (Chen et al., 2008;Irvine & Rodhouse, 2010;Damgaard, 2014;Irvine, Rodhouse & Keren, 2016;Irvine et al., 2019;Damgaard & Irvine, 2019). For example, Damgaard (2014) modeled the plant cover class as determined visually using the incomplete beta function based on the beta distributions of the plant cover. Herpigny & Gosselin (2015) incorporated zero-inflation, accounting for the excess zeros in the class data, into the model. Irvine et al. (2019) have proposed a Bayesian hierarchical framework accounting for true and false zeros in the class data as well as misclassification of the classes. Damgaard & Irvine (2019) comprehensively discussed this subject.
In recent decades, state-space models have been applied to many subjects in ecology, such as population dynamics (Clark & Bjørnstad, 2004;Damgaard, 2012;Iijima, Nagaike & Honda, 2013), metapopulation dynamics (Harrison, Hanski & Ovaskainen, 2011), and tree growth (Shimatani & Kubota, 2011;Hiura, Go & Iijima, 2019). The state-space model consists of two types of sub-model, the observation model and the system model; the former describes the relationships that exist between the observed data and unobserved systems, and the latter describes the processes in the unobserved latent system, such as the temporal changes of the system. Notably, this class of models has a hierarchical structure and can explicitly describe the observation processes and the latent system processes separately (Kéry & Schaub, 2011;Irvine et al., 2019). In addition, by using state-space modeling, latent states can be estimated even if there are missing observations (Durbin & Koopman, 2012).
The state-space model has also been used for dealing with time-series pin-point cover data (Damgaard, 2012). However, the cover class data treated in the present study typically have less information than pin-point cover data. Few studies have applied state-space modeling to cover class data, but if the class data were collected longitudinally, we would be able to utilize the information; i.e., the value of the latent state at a survey occasion should be similar to those at temporally adjacent occasions. In addition, if the class data were surveyed in quadrats that are arranged sequentially, we could also utilize information from the spatial autocorrelation.
In this study, a state-space model was constructed to estimate the unobserved proportion of plant cover from ordered class data using the incomplete beta function, combining information from temporal and spatial autocorrelations. This type of model would help to utilize visually determined plant cover data with temporal and spatial autocorrelation.

Model basis
The beta distribution has been used to describe statistical variations in plant cover, because the distribution has a boundary from zero to one, and because it can describe various shapes (Chen et al., 2008;Irvine & Rodhouse, 2010;Eskelson et al., 2011;Damgaard, 2012;Damgaard, 2013;Damgaard, 2014;Herpigny & Gosselin, 2015;Wright et al., 2017;Takarabe & Iijima, 2019). In this approach, the proportion of cover p (0 < p < 1) is assumed to follow the beta distribution: where α (> 0) and β (> 0) are the parameters. Another parameterization using the mean of the proportion µ (0 < µ < 1) as a parameter is available (Irvine et al., 2019), where φ(> 0) and δ (0 < δ < 1) are the parameters that control the dispersion (φ = (1 − δ)/δ). The parameter δ is also defined as the intra-quadrat correlation of the plant distribution (Damgaard, 2012;Damgaard, 2013;Damgaard, 2014), and it can be regarded as representing the uncertainty of the observation of the cover proportion when it is rather small (Fig. 1). In the case of µ = 0.5, the distribution stays unimodal when δ is smaller than 1/3. In contrast, when δ becomes larger, the distribution tends to become bimodal (zero and one), or unimodal at zero or one (depending on µ). In the parameterization set using δ, the variance was given as δµ(1 − µ). The probability that p falls between x 0 and x 1 (0 < x 0 ,x 1 < 1, and x 0 < x 1 ) can be described as follows: where B(x,α,β) is the cumulative density function of the beta distribution, which is identical to the regularized incomplete beta function I x (α,β). Note that B(0,α,β) = 0 and B(1,α,β) = 1. Figure 2 shows the relationship between the mean proportion (µ) and the proportion of the beta distribution with a given µ and δ that is classified into each class. When the value of δ is small, the realized value of the cover class would directly correspond to the mean value. In contrast, the larger δ becomes, classes other than those corresponding to the mean tend to be chosen more frequently. For simplicity, this formulation does not explicitly account for the measurement or the detection process, though Irvine et al. (2019) have developed a model that explicitly incorporates a process in which the non-detection of plants and misidentification of the cover class may occur.

State-space model
Observation model. Assume that surveys on plant cover were conducted N T times in N Q quadrats. In the present study, quadrats were assumed to be arranged on a line. The cover class, Y , was defined as six classes corresponding with the proportion of plant cover as follows: In reality, Y would be typically determined with visual measurements. In this study, estimating the cover proportion of a particular species was the primary purpose rather than the presence/absence of the species. Thus, for simplicity, the model did not distinguish the absence of the species (or more precisely, the absence of the detection of the species) from the smallest plant cover class. When the plant species richness of the area is the study purpose, both states should be modeled separately. In those cases, incorporating zero-inflation (Herpigny & Gosselin, 2015) and the correction of false-negative errors (Chen et al., 2009;Chen et al., 2013) into the model is required. Irvine et al. (2019) have explicitly modeled this observation process to estimate latent cover proportions and detection errors.
The mean proportion of plant cover µ t ,q was defined by incorporating the latent state θ t at time t ∈ {1,2,...,N T }, where r t ,q denotes the spatial random effect incorporating spatial autocorrelation.
System model. The latent state θ t at time t denotes the states related to the mean proportion of plant cover, and the expected mean proportion of plant cover at time t for the overall plots, φ t , is given as The transition of the latent state θ t was defined using second-order differences with normal error as follows: where σ T denotes the standard deviations. This formulation is equivalent to the secondorder autoregressive (AR(2)) model, Priors of the latent states at time t ∈ {1,2} were defined as weakly informative (Gelman, Simpson & Betancourt, 2017) so that they would be effective for model identifiability but not strongly restrict the range of the posterior distributions; they should be wide enough for the logit-scaled parameters (in case θ = 5, φ is 0.99), The spatial random effect r t ,q at time t of quadrat q was defined as follows: where σ R denotes the standard deviation among the spatial random effects. The value of the random effect r t ,q was assumed to be affected by those of the adjacent quadrats. This formulation was equivalent to a process model of a state-space model with a first-order difference in the state changes. Then, the values were updated so that their sum should be zero for each survey time to avoid affecting the overall intercept and the identifiability of the model.
Priors for standard deviation parameters σ R and σ T were defined as weakly informative for the same reason as for θ 1 and θ 2 , as follows: σ ∼ HalfNormal(0,2.5 2 ).

Generation of simulated data
Assume that there were N T = 10 quadrats that settled sequentially, and plant cover classes were surveyed for N Q = 15 times in each quadrat. A simulated dataset was generated according to this assumption. In the simulated data, the parameter θ t , which denotes the latent state at time t , was generated following the relationship below: θ 1 = −6 θ t ∼ Normal(θ t −1 + 0.3,0.5 2 ) for t ∈ {2,3,...,N T }.
The latent state θ t (t ∈ {2,3,...,N T }) was randomly generated following the above normal distribution. Note that the first-order difference was used in this data generation, for simplicity, while the second-order difference was adopted in the model defined above. The spatial random effects r q (1 ∈ {2,3,...,N Q }) were also generated randomly, with the assumption of following the above normal distribution. In this simulation, the spatial random effects were assumed to be invariant through time.
Proportions of plant cover p were generated according to the model defined in the previous subsection: Then, the plant cover classes were generated with an uncertainty δ. In this simulation, the value of δ was set to 0.05. The parameter δ can be defined as the intra-quadrat correlation of plant distribution in the pin-point cover data (Damgaard, 2012), but, in this simulated data, it was decoupled from the parameter σ Q , which control the similarity between neighboring quadrats.

Fitting to the model
The generated data were fit to the Bayesian state-space model defined in the above subsection, and the posterior distributions of each parameter were estimated using the Markov chain Monte Carlo (MCMC) method. The model was implemented using Stan version 2.21.0 (Carpenter et al., 2017) with the re-parameterization of the model for the stability and efficiency of the Hamiltonian Monte Carlo algorithm, which was adopted in the Stan software. The Stan model code is also available at the GitHub repository. Posterior samples were drawn from 1,000 iterations after 1,000 warm-up (burn-in) iterations from each of 4 chains, and the posterior distributions of the parameters were estimated. Then, posterior predictive checks were conducted to evaluate the fitting to the model using the 'bayesplot' package (Gabry et al., 2019) in the statistical software R version 3.6.2 (R Core Team, 2019). In the posterior predictive check, the data drawn from the posterior predictive distribution that was calculated under the model were compared to the observed data using the rootogram (Kleiber & Zeileis, 2016), which plots the expected values under the model on the histogram of the observed data with a square-root-scaled Y -axis. If there are considerable discrepancies between them, it indicates that the model poorly explains the observed data.

Application to real data
Real data to be fitted to the model were taken from long-term vegetation monitoring following a catastrophic windthrow (Itô et al., 2018). The data were collected during the period of 1957 to 2017 in the headwater region of the Ishikari River, Hokkaido, northern Japan. Six plots (No. 27,30,34,35,46 and 54) were settled in the region in 1955. Quadrats sized two meters × two meters were settled sequentially, and the number was 15-25 for each plot. The visually determined cover classes were recorded for species that occurred in each quadrat. The classes used in the surveys were as follows: + (proportion: 0-0.01, excluding zero), 1 (0.01-0.1), 2 (0.1-0.25), 3 (0.25-0.5), 4 (0.5-0.75), and 5 (0.75-1). Species that were not detected (i.e., the cover was 0) did not appear in the dataset. However, in the analysis, the notation was changed to be identical to the simulated data shown above for the sake of simplicity in numerical treatments so that the absence (more precisely, non-detection) was added to class 1 (0-0.01, including zero). The dataset is also available at the GitHub repository since it was published under the license CC BY 4.0.
From this dataset, cover classes of a species of dwarf bamboo, Sasa senanensis, in the shrub layer of six plots were used as the real data to be fit to the Bayesian state-space model. The data of the species had a wide variation in the cover class measurements and were suitable for model evaluation. The plots had 19-25 quadrats, and the survey was conducted 20 times (in 1957-1968, 1972, 1976, 1980, 1984, 1988, 2002, 2009, and 2017). Though the measurements were not conducted in all years during the period , the latent state could be estimated using the state-space model (Durbin & Koopman, 2012). Figure 4 shows the changes in cover classes.
The posterior distributions were estimated using the MCMC method. Stan was also used for the estimation, and the posterior samples were drawn from 2,000 iterations after 2,000 warm-up (burn-in) iterations from each of 4 chains. Then, the posterior predictive checks were conducted using the 'bayesplot' package.

Simulated data
Gelman-Rubin statistics,R, (Gelman & Rubin, 1992;Brooks & Gelman, 1998) were smaller than 1.1 for all the parameters, suggesting that the Markov chain successfully converged to the stationary distribution. Table 1 shows the summary of the posteriors for the parameters δ, σ T , and σ R . The posterior means (and 95% credible intervals) of these parameters were estimated as 0.06 (0.03-0.09) for δ, 0.34 (0.08-0.62) for σ R , and 0.75 (0.32-1.43) for σ T ( Table 1). The values used for the data generation were 0.05, 0.5, and 0.5, respectively. Figure 5 shows the overall cover proportion (φ = logit −1 (θ )) calculated from the posterior median (the red line) and the 95% credible intervals (the red region) as well as the cover classes in the simulated data (black dots) and the cover proportion averaged for each time (black curve). The posterior predictive check showed no conflicts between the observed value and the predicted distribution for each time. Figure 6 shows the result at time 15.

Real datâ
R values were smaller than 1.1 for all parameters, and the Markov chain seemed to converge to the stationary distribution. Table 2 shows a summary of the posteriors of the parameters δ, σ T , and σ R for each plot. The posterior means of the uncertainty or intra-plot correlation parameter δ were estimated as 0.12-0.31, those of the spatial variability parameter σ R were 0.67-3.06, and those of the temporal variability parameter σ T were 0.04-0.63 (Table 2). Figure 7 shows the estimated overall cover proportions for each year for each plot. Figure 8 shows the results of the posterior predictive checks in 2017 for each plot. The posterior predictive checks showed little conflict between the observed values and the predicted distribution for each year except for a few of the predicted values such as that of class 5 of plot 30 and that of class 4 of plot 54.

DISCUSSION
For the simulated data, the posterior means (and medians) for the three major parameters did not differ much from the values that were used in the data generation (Table 1). The  estimated curve of the plant cover proportion was similar to that which generated the simulated data (Fig. 5). The result of the posterior predictive check (Fig. 6) also suggests little discrepancy between the fitted model and the simulated dataset. The differences between the posterior means and the original values in parameters σ R and σ T may be at least partially due to variations in the randomly generated data. However, the slightly smaller value of σ R may be attributable to small variations of cover classes among quadrats  in the first several surveys (Fig. 3). Over the period, the small value of θ t overwhelmed the value of r t . In addition, the small variations in the period would affect the narrow credible intervals of the posteriors (Fig. 5). Also, the assumption of the second-order differences in the system model rather than the first-order differences used in the data generation may have affected the difference in σ T . For the real data, the estimated curves of the overall plant cover proportion seem to be reasonable when comparing the measured cover classes for each plot (Fig. 7). Most of the posterior predictive checks (Fig. 8) also suggest little conflict between the model and the real datasets. Owing to the state-space modeling, the overall proportions could be estimated including the unobserved years, but the credible intervals were considerably wider for some plots (plots 46 and 54), especially in later years without surveys. This is mainly because of lack of information due to the sparse survey intervals.
The posterior mean of σ R was largest (3.06) in plot 27 (Table 2), although the range did not seem strongly affected by the prior (HalfNormal(0,2.5)). This is likely due to the somewhat more considerable variation in the measurements among the adjacent quadrats (Fig. 4). On the other hand, the value was the smallest in plot 36 (Table 2), reflecting the small variations among quadrats.
The estimated posterior means of δ ranged from 0.12 to 0.78 (Table 2), and these values suggest that the uncertainty or intra-plot correlation was somewhat large (Fig. 2). In particular, the value was largest in plot 30 (0.78), where most of the observed values were zero (Fig. 8). This is reasonable because the intra-quadrat correlation of the plant cover should be large when most values are zero. In this situation, it may be adequate to interpret the value of δ as the intra-quadrat correlation rather than uncertainty. On the other hand, the posterior mean of σ R , the scale parameter of spatial variation in the logit-scale, was not so small (1.06) compared to the simulated data (0.34). This may be because the overall mean of the cover proportion in the logit-scale was so small that σ R weakly affected the likelihood in this case.
If the intra-and inter-quadrat distributions are related, the inter-quadrat variation within a year also may correlate with the intra-quadrat correlation. An integrated evaluation of the intra-and inter-quadrat correlation may enable us to evaluate the spatial distribution pattern of a target plant at various scales.
The state-space modeling seems to have successfully estimated the changes in the latent states in the years that the surveys were not conducted. These results suggest that the present model is applicable to this type of plant cover class data.
Though the model proposed in this study is rather simple, more elaborate models can be constructed. For example, the one-dimensional structure of the present model can be expanded to two dimensions. To incorporate a two-dimensional spatial autocorrelation, conditional autoregressive (CAR) models can be utilized, and they are available in Stan (Joseph, 2016;Morris et al., 2019) Another possible expansion is to incorporate zeroinflation. Herpigny & Gosselin (2015) and Irvine et al. (2019) have already provided modeling of plant cover classes with zero-inflation. When incorporating this, false-negative errors should be considered (Chen et al., 2009;Chen et al., 2013;Irvine et al., 2019). In addition, misclassification of the cover classes should be explicitly incorporated (Irvine et al., 2019).

CONCLUSION
State-space modeling for plant cover class data can successfully estimate the unobserved cover proportion by utilizing spatial and temporal autocorrelations that are contained within the data. The present model can be applicable to similar types of plant cover class data, and then can be expanded to deal with two-dimensional field data, or to incorporate zero-inflation and misclassification of the cover classes.