Effects of number of parallel runs and frequency of bias-strength replacement in generalized ensemble molecular dynamics simulations

Takuya Shimato; Kota Kasahara; Junichi Higo; Takuya Takahashi

doi:10.7717/peerj-pchem.4

Takuya Shimato¹, Kota Kasahara², Junichi Higo³, Takuya Takahashi²

¹ Graduate School of Life Sciences, Ritsumeikan University, Kusatsu, Shiga, Japan

² College of Life Sciences, Ritsumeikan University, Kusatsu, Shiga, Japan

³ Graduate School of Simulation Studies, University of Hyogo, Kobe, Hyogo, Japan

Published: 2019-10-15
Accepted: 2019-09-24
Received: 2019-07-08

Academic Editor: Johannes Margraf

Subject Areas: Theoretical and Computational Chemistry
Keywords: Generalized ensemble method, Molecular dynamics, Molecular simulation, Multicanonical ensemble, Enhanced conformational sampling

Copyright: © 2019 Shimato et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Physical Chemistry) and either DOI or URL of the article must be cited.

Cite this article: Shimato T, Kasahara K, Higo J, Takahashi T. 2019. Effects of number of parallel runs and frequency of bias-strength replacement in generalized ensemble molecular dynamics simulations. PeerJ Physical Chemistry 1:e4 https://doi.org/10.7717/peerj-pchem.4

The authors have chosen to make the review history of this article public.

Abstract

Background

The generalized ensemble approach with the molecular dynamics (MD) method has been widely utilized. This approach usually has two features. (i) A bias potential, whose strength is replaced during a simulation, is applied. (ii) Sampling can be performed by many parallel runs of simulations. Although the frequency of the bias-strength replacement and the number of parallel runs can be adjusted, the effects of these settings on the resultant ensemble remain unclear.

Method

In this study, we performed multicanonical MD simulations for a foldable mini-protein (Trp-cage) and two unstructured peptides (8- and 20-residue poly-glutamic acids) with various settings.

Results

As a result, running many short simulations yielded robust results for the Trp-cage model. Regarding the frequency of the bias-potential replacement, although using a high frequency enhanced the traversals in the potential energy space, it did not promote conformational changes in all the systems.

Introduction

In the past several decades, the molecular dynamics (MD) method has been widely applied to investigate the microscopic behavior of molecular systems. Although advances in high-performance computing technology have extended the timescale that is reachable by MD simulations (Salomon-Ferrer et al., 2013; Shaw et al., 2014; Abraham et al., 2015), there is still a large gap from experimental measurements. In particular, it is not straightforward to characterize the free-energy landscape (FEL) of a complex molecular system, because the characteristics of conformational ensembles obtained via canonical MD simulations largely depend on the initial conditions. To solve this problem, the generalized ensemble (GE) approach has been extensively developed and applied to the MD method. The GE approach enhances the conformational sampling through two main features. First, in many GE methods, the conformational sampling can be performed with many parallel runs of simulations in a coupled or independent manner. For example, the replica-exchange MD (REMD) method (Sugita & Okamoto, 1999) involves performing many simulations of the same system, i.e., replicas, with different temperatures. The replicas with adjacent temperatures are coupled by exchanging their temperatures via Monte Carlo trials. On the other hand, the multicanonical MD (McMD) method (Nakajima, Nakamura & Kidera, 1997) can be performed by multiple independent runs, and a resultant ensemble is obtained by concatenating the trajectories of these runs (Ikebe et al., 2010). Second, the GE approach generates a non-Boltzmann distribution by applying a bias potential, e.g., heating/cooling the entire system or a part of the system, scaling the potential energies, and applying spring potentials to parts of the system. These biases enhance the conformational changes of molecules and prevent the molecular system from being trapped at local minima in the FEL. During a simulation, the strength of the bias is frequently replaced, and the system alternates between different bias conditions. After simulations, a canonical ensemble can be obtained by reweighting each snapshot in the sampled conformational ensemble (Souaille & Roux, 2001; Shirts & Chodera, 2008).

To use these two features, users must adjust some settings. First, the number of runs is an adjustable parameter. In the case of the REMD method, using a larger number of replicas allows wider overlaps of the energy distributions between adjacent replicas and results in a higher acceptance probability. However, increasing the number of runs proportionally increases the computational costs. Users must choose the optimal balance between the number of runs and the length of each run according to the available computational resources. Previously, Ikebe et al. (2010) reported that an increase in the number of independent runs of McMD yields efficient exploration of a wider area of the conformational space. However, the balance between the number of runs and the length of each run has not been discussed. Second, the frequency of the bias-strength replacement is also adjustable. In the REMD method, the frequency of replica-exchange trials must be specified by users. Other methods using a continuous bias strength, e.g., McMD and adaptive umbrella sampling (AUS), can control the frequency of bias-strength replacement by using the virtual-system coupling scheme (Higo, Umezawa & Nakamura, 2013; Higo et al., 2015), as described later. It has been reported that the frequency of the bias-strength replacement affects the resultant ensembles for the REMD method (Periole & Mark, 2007; Sindhikara, Meng & Roitberg, 2008; Rosta & Hummer, 2009; Sindhikara, Emerson & Roitberg, 2010; Jani, Sonavane & Joshi, 2014; Iwai, Kasahara & Takahashi, 2018). Although higher frequencies enhance the traversals in the temperature space, they are suspected to be a source of artifacts. Although the effects of these settings have been examined, these studies were mainly based on simple model peptides with helix–coil transitions. The effects of the settings in more practical cases, e.g., a protein folding–unfolding transition, are not fully understood. More importantly, the relationship between these effects and the complexity of molecular systems, e.g., the number of degrees of freedom and the ruggedness of the FEL, remains to be revealed.

In this study, we aim to elucidate the effects of the number of runs and the bias-replacing frequency for the GE method on the resultant conformational ensembles of molecular models including a foldable mini-protein and disordered model peptides. We utilized the trivial-trajectory parallelized virtual-system coupled McMD (TTP-V-McMD) method (Ikebe et al., 2010; Higo, Umezawa & Nakamura, 2013), which is a variant of the McMD method, for simulating the three molecular models with an explicit solvent: (i) Trp-cage, (ii) 8-residue poly-glutamic acid (PGA8), and (iii) 20-residue poly-glutamic acid (PGA20). We chose these models as test cases to examine the simulation conditions because they are sufficiently small for elucidating their conformational ensembles within a practical computational time in addition to the fact that their structural properties have been well studied thus far. Trp-cage, which is a mini-protein consisting of 20 amino acids, has been widely studied as a prominent model of protein folding (Ahmed et al., 2005; Hudáky et al., 2007; Hałabis et al., 2012). Poly-glutamic acids have been used as model peptides to characterize the conformational properties of polypeptides (Clarke et al., 1999; Kimura et al., 2002; Finke et al., 2007; Donten & Hamm, 2013; Ogasawara et al., 2018). We analyzed their FELs under various parameter settings to provide a guide for adjusting these parameters for the GE methods. The questions to be answered are as follows: (1) Which condition is more efficient: many short simulations or a small number of long-term simulations? (2) Which is better: frequent or less frequent replacement of the bias strength? Moreover, we discuss the relationship between the relaxation of the energy and that of the protein conformation. While the McMD method enhances the relaxation in the energy space, it is not guaranteed to enhance the relaxation in the conformational space. 
We analyzed these two relaxation processes using the McMD trajectories calculated with the various settings.

Materials and Methods

We calculated the FELs of the three explicitly solvated molecular models: Trp-cage, PGA8, and PGA20, by using the TTP-V-McMD method with various settings. The theory of McMD, virtual-system coupled McMD (V-McMD), and trivial-trajectory parallelization (TTP) is briefly presented in the following subsections. Then, the simulation protocol applied in this study is described.

Multicanonical MD

The McMD method efficiently explores the conformational space of a molecular system, by applying a biasing energy term. The Hamiltonian H of the system is

$$H = K + E_{mc}, \tag{1}$$

where K and E_mc denote the kinetic energy and multicanonical energy, respectively. E_mc is defined as follows:

$$E_{mc} = E + R T \ln P_{c}(E, T), \tag{2}$$

where E is the potential energy, and the second term corresponds to the bias potential. R is the gas constant, and P_c(E, T) denotes the canonical distribution at the temperature T:

$$P_{c}(E, T) = \frac{n(E) \exp\left(-\frac{E}{R T}\right)}{Z_{c}(T)}, \tag{3}$$

where n(E) denotes the density of states, and Z_c(T) is the partition function of the canonical distribution at the temperature T. With this definition, the potential energy distribution of an ensemble obtained from the McMD, or the multicanonical distribution, P_mc(E), becomes uniform:

$$P_{mc}(E) = \frac{n(E) \exp\left(-\frac{E_{mc}}{R T}\right)}{Z_{mc}(T)} = \frac{n(E) \exp\left(-\frac{E}{R T}\right)}{P_{c}(E, T)\, Z_{mc}(T)} = \frac{Z_{c}(T)}{Z_{mc}(T)} = \mathrm{const.} \tag{4}$$

As a result, the McMD method performs a random walk in the potential energy space and generates a uniform distribution of potential energy in a resultant ensemble. After a multicanonical ensemble is obtained, a canonical ensemble at any temperature in a sampled energy range can be generated by reweighting the probability of existence of each snapshot.
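The reweighting step can be illustrated with a minimal sketch. It assumes that ln P_c(E, T) from Eq. (2) is available for every snapshot; the function name `canonical_weights`, its arguments, and the choice of kcal/mol units are illustrative and are not taken from the original study. Because snapshots are sampled with probability proportional to n(E) exp(−E/RT)/P_c(E, T), the weight for a target temperature T′ is proportional to P_c(E, T) exp(E/RT − E/RT′).

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def canonical_weights(energies, ln_p_c_ref, t_ref, t_target):
    """Normalized weights that turn McMD snapshots into a canonical ensemble.

    energies   : potential energy E of each snapshot (kcal/mol)
    ln_p_c_ref : ln P_c(E, t_ref) used in the bias of Eq. (2), per snapshot
    t_ref      : temperature T at which the bias was defined (K)
    t_target   : temperature of the desired canonical ensemble (K)
    """
    beta_ref = 1.0 / (R_KCAL * t_ref)
    beta_target = 1.0 / (R_KCAL * t_target)
    # ln w = ln P_c(E, T_ref) + E / (R T_ref) - E / (R T_target)
    ln_w = [lp + (beta_ref - beta_target) * e
            for e, lp in zip(energies, ln_p_c_ref)]
    shift = max(ln_w)  # subtract the maximum for numerical stability
    w = [math.exp(x - shift) for x in ln_w]
    total = sum(w)
    return [x / total for x in w]
```

When `t_target` equals `t_ref`, the weights reduce to the normalized P_c(E, T) values, which is the expected result for a flat multicanonical energy distribution.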

Equations (3) and (4) include an analytical form of n(E), which is usually unknown a priori. Therefore, n(E) is approximated as a parametric function, e.g., the polynomial function, and its parameters are estimated by iterations of McMD simulations to make P_mc(E) near-uniform. In the ith iteration, the bias potential is calculated using Eq. (2) with the canonical distribution obtained from the (i–1)th iteration, i.e., $P_{c}^{i-1}(E, T)$. As the result of the ith iteration, we obtain $P_{mc}^{i}(E)$. $P_{c}^{i}(E, T)$ can be calculated as

$$P_{c}^{i}(E, T) = P_{mc}^{i}(E)\, P_{c}^{i-1}(E, T). \tag{5}$$

See Higo et al. (2012) for details.
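The update in Eq. (5) can be sketched on a discrete energy grid. This is a simplified illustration that works in log space with a plain histogram; the actual procedure fits a parametric (e.g., polynomial) function, which is not reproduced here. The function name `update_ln_p_c` and the treatment of unsampled bins are assumptions made for this example.

```python
import math


def update_ln_p_c(ln_p_c_prev, counts):
    """Apply Eq. (5) per energy bin: ln P_c^i = ln P_mc^i + ln P_c^(i-1).

    ln_p_c_prev : ln P_c^(i-1)(E, T) for each energy bin
    counts      : histogram of E sampled in the ith iteration (same bins)
    Returns the renormalized ln P_c^i; unsampled bins are returned as None.
    """
    total = sum(counts)
    ln_p_c = []
    for lp, c in zip(ln_p_c_prev, counts):
        if c > 0:
            ln_p_c.append(lp + math.log(c / total))
        else:
            ln_p_c.append(None)  # no samples, so no new information
    sampled = [x for x in ln_p_c if x is not None]
    shift = max(sampled)
    # log-sum-exp normalization so that the sampled bins sum to one
    ln_norm = shift + math.log(sum(math.exp(x - shift) for x in sampled))
    return [None if x is None else x - ln_norm for x in ln_p_c]
```

If the sampled histogram is already flat, the estimate is returned unchanged, which corresponds to convergence of the iterations.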

Virtual-system coupled McMD

Virtual-system coupled McMD (V-McMD) introduces a virtual system, which interacts with the molecular system, and the multicanonical ensemble is calculated for the entire system consisting of these two subsystems (Higo, Umezawa & Nakamura, 2013). In practice, this method can be roughly interpreted as a combination of McMD and the simulated tempering method. The simulated tempering method replaces the system temperature with the Metropolis criterion and performs a canonical simulation until the next replacement trial. On the other hand, in V-McMD, the potential energy space is split into several regions (Fig. S1), and the molecular system is confined to one of these regions. At a fixed time interval (t_VST), the region confining the molecular system can be replaced. The state variable governing which region confines the molecular system is called the “virtual state,” and the system defined by the virtual state is called the “virtual system.” The energy range of each virtual state is defined so that it overlaps with those of the adjacent virtual states. When the molecular system has the potential energy E_k in the overlapped region of the ith and (i + 1)th virtual states, the state transition between these two virtual states can occur. Because this transition does not change the atomic coordinates or potential energy, the Metropolis criterion of this state transition is always satisfied. The time interval of virtual-state transitions (t_VST) is a free parameter that must be chosen by users. See Higo, Umezawa & Nakamura (2013) for details.
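The virtual-state transition described above can be sketched as follows. This is only an illustration of the overlap rule; the function name, the representation of the energy ranges, and the random choice between two eligible neighbors are assumptions of this example and are not taken from the myPresto/omegagene implementation.

```python
import random


def attempt_virtual_transition(energy, state, ranges, rng=random):
    """Return the virtual state after one transition attempt.

    energy : current potential energy of the molecular system
    state  : index of the current virtual state
    ranges : list of (e_low, e_high) per virtual state; neighbors overlap
    rng    : object providing choice(), e.g. a random.Random instance
    """
    # A neighbor is eligible only if the current energy lies inside its range,
    # i.e., the system sits in the overlap between the two virtual states.
    candidates = [j for j in (state - 1, state + 1)
                  if 0 <= j < len(ranges)
                  and ranges[j][0] <= energy <= ranges[j][1]]
    if not candidates:
        return state  # outside any overlap: the virtual state is kept
    # Coordinates and energy are unchanged, so the move is always accepted.
    return rng.choice(candidates)
```

In an MD loop, such a function would be called once every t_VST. Assuming a 2 fs time step, t_VST = 0.002, 0.2, and 20 ps would correspond to an attempt every 1, 100, and 10,000 steps, respectively.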

Trivial-trajectory parallelization

According to the theory of TTP, trajectories of multiple independent McMD runs with the same molecular system and different initial conditions can be treated as a single trajectory of an McMD simulation by concatenating the trajectories in an arbitrary order. This theory requires the condition that the initial coordinates of each run are sampled from the multicanonical distribution. Because the initial coordinates of production runs can be obtained from the near-uniform potential energy distribution generated by iterative simulations, it is expected that this condition holds. The McMD method with the TTP theory, which is called the TTP–McMD method, can be considered as a hybrid Monte Carlo sampler, by assuming that the system transitions from the last snapshot of the ith run (the microscopic state m_il) to the first snapshot of the jth run (the microscopic state m_jf) via a Monte Carlo step (Fig. S2). See Ikebe et al. (2010) for details.

Simulation protocol

We studied the three molecular systems, which are Trp-cage, PGA8, and PGA20 in an explicitly solvated cubic periodic boundary cell, by using the TTP-V-McMD method. Random coil structures of Trp-cage, PGA8, and PGA20 were constructed using the Modeller software (Webb & Sali, 2016) without any template. The termini of the PGAs were capped with acetyl and N-methyl groups, and the termini of the Trp-cage were not capped. Each of these molecular models was placed into a cubic box filled with water molecules; the numbers of water molecules were 5,097, 2,879, and 3,800 for Trp-cage, PGA8, and PGA20, respectively. In addition, a Cl^– ion was added to the Trp-cage model to cancel the net charge of the system. The net charge of the PGA models was zero because all the Glu residues were protonated.

The system was relaxed by using the GROMACS software (Pronk et al., 2013). Energy minimizations were successively applied using the steepest descent and conjugate gradient methods. Then, an MD simulation under a constant-pressure ensemble with the Berendsen barostat was performed for 1 ns. In the first half of the simulation, gradual heating from 10 to 300 K was applied. In the simulation, the positions of the heavy atoms of the Trp-cage were restrained, the bond lengths were not constrained, and the integration time step (Δt) was 0.5 fs. Subsequently, an additional constant-pressure relaxation was applied for 1 ns with Δt = 2.0 fs, and the covalent bonds to hydrogen atoms were constrained using the LINCS method (Hess et al., 1997; Hess, 2008). The final configuration of each model was used for the TTP-V-McMD simulations. The cell dimensions of these configurations were 54.0378, 44.6116, and 49.1174 Å for Trp-cage, PGA8, and PGA20, respectively.

For each model, the following steps were performed using our MD simulation program, which is called myPresto/omegagene and is tailored for GE simulations (Kasahara et al., 2016). The protein conformation was randomized with a constant-temperature simulation at 800 K. By using 30 snapshots taken from a trajectory with an interval of 300 ps, 30 independent runs were simulated with a gradual decrease in the temperature from 629 to 296 K to estimate the density of states. Successively, the TTP-V-McMD simulations were iteratively performed while updating the estimation of the density of states (Higo, Umezawa & Nakamura, 2013). A total of 84 production runs were performed (N_run = 84) for each of three different interval times for the virtual-state transitions (t_VST), i.e., the interval times for bias-potential replacement: t_VST = 0.002, 0.2, and 20 ps. The simulation length of each run (t_run) was 50 ns, except for the Trp-cage model with t_VST = 0.2 ps, for which t_run was 200 ns. In total, 50.4 μs of trajectories were simulated as production runs. The virtual system was divided into seven states that cover the energy range corresponding to the canonical distribution from 296 to 629 K. The velocity scaling method (Berendsen et al., 1984) was applied to maintain the system temperature.
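The total production time quoted above can be checked from the stated settings: three models and three t_VST values give nine conditions of 84 runs each, eight of which use t_run = 50 ns and one of which (Trp-cage with t_VST = 0.2 ps) uses t_run = 200 ns.

$$8 \times (84 \times 50\ \mathrm{ns}) + 1 \times (84 \times 200\ \mathrm{ns}) = 33.6\ \mu\mathrm{s} + 16.8\ \mu\mathrm{s} = 50.4\ \mu\mathrm{s}$$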

For the potential parameters, the AMBER ff99SB-ILDN force field (Lindorff-Larsen et al., 2010), the ion parameter presented by Joung & Cheatham (2008), and the TIP3P water model (Jorgensen et al., 1983) were applied. The electrostatic potential was calculated using the zero-multipole summation method, which is a non-Ewald scheme (Fukuda, 2013; Fukuda, Kamiya & Nakamura, 2014). The zero-dipole condition with the damping factor α = 0 was used (Fukuda, Yonezawa & Nakamura, 2011; Fukuda et al., 2012).

Comparison of simulated ensembles among different settings

On the basis of the trajectories obtained from the TTP-V-McMD production runs, the effects of the simulation conditions, i.e., the time interval for bias-strength replacement (t_VST), the number of independent runs (N_run), and the simulation time of each run (t_run), were assessed.

For the Trp-cage model, we analyzed the FEL for various conformational ensembles on the basis of the two structural parameters: the root-mean-square deviation (RMSD) of Cα atoms from the native conformation (PDB ID: 1L2Y, model 1), which is denoted as RMSD_native, and the radius of gyration (R_g). The FEL is visualized as the map of the potential of mean force (PMF) on the plane defined by these two parameters. We defined the reference ensemble as the ensemble calculated for the conditions of t_run = 200 ns, N_run = 84, and t_VST = 0.2 ps, because it is expected to have the highest reliability owing to its abundance of samples (it comprises a total of 16.8 μs of simulations). The FELs obtained under various conditions were compared with the reference FEL in terms of the Pearson correlation coefficient of the PMF (PCC_PMF). To calculate the PCC_PMF for a pair of FELs, bins without samples in either of the two FELs were ignored. In addition, the probability of the existence of the native conformations in each ensemble (P_native) was measured to characterize each ensemble. The native conformations are defined as the conformations with RMSD_native ≤ 2.0 Å.
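A minimal sketch of the PCC_PMF calculation is given below. It assumes that each snapshot is described by a point (RMSD_native, R_g) and a canonical weight; the bin width, the use of a single bin width for both axes, and the function names are illustrative choices, since the binning used in the study is not specified here.

```python
import math
from collections import defaultdict

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def pmf_map(points, weights, bin_width, temperature=300.0):
    """PMF (kcal/mol) per populated 2D bin, shifted so the minimum is zero."""
    hist = defaultdict(float)
    for (x, y), w in zip(points, weights):
        key = (math.floor(x / bin_width), math.floor(y / bin_width))
        hist[key] += w
    total = sum(hist.values())
    pmf = {b: -R_KCAL * temperature * math.log(p / total)
           for b, p in hist.items() if p > 0.0}
    lowest = min(pmf.values())
    return {b: v - lowest for b, v in pmf.items()}


def pcc_pmf(pmf_a, pmf_b):
    """Pearson correlation of two PMF maps over bins populated in both."""
    shared = sorted(set(pmf_a) & set(pmf_b))
    if len(shared) < 2:
        raise ValueError("need at least two shared bins")
    a = [pmf_a[b] for b in shared]
    c = [pmf_b[b] for b in shared]
    mean_a = sum(a) / len(a)
    mean_c = sum(c) / len(c)
    cov = sum((x - mean_a) * (y - mean_c) for x, y in zip(a, c))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_c = sum((y - mean_c) ** 2 for y in c)
    return cov / math.sqrt(var_a * var_c)
```

Bins that are empty in either map never enter `shared`, which mirrors the rule of ignoring bins without samples in one of the two FELs.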

For the PGA models, the FELs were analyzed using principal component analysis (PCA) based on the Cα–Cα distances (28 and 190 dimensions for PGA8 and PGA20, respectively). The PCAs were performed using aggregations of trajectories with all the three t_VST conditions for each model. For each t_VST condition, the ensemble calculated from the entire trajectory (t_run = 50 ns and N_run = 84) was considered as the reference ensemble. The FELs were compared with regard to PCC_PMF, similar to the Trp-cage case.
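The dimensionality quoted above follows from the number of residue pairs: 8 × 7 / 2 = 28 and 20 × 19 / 2 = 190. The sketch below illustrates one way to build such distance features and project them onto principal components with NumPy. It is an unweighted PCA written for illustration; whether and how the snapshots were weighted in the study is not stated here, and the function names are hypothetical.

```python
import numpy as np


def ca_distance_features(coords):
    """Pairwise C-alpha distances for each frame.

    coords : array of shape (n_frames, n_residues, 3)
    Returns an array of shape (n_frames, n_residues * (n_residues - 1) / 2).
    """
    coords = np.asarray(coords, dtype=float)
    i, j = np.triu_indices(coords.shape[1], k=1)  # all residue pairs i < j
    return np.linalg.norm(coords[:, i, :] - coords[:, j, :], axis=-1)


def pca_project(features, n_components=2):
    """Project features onto the leading principal components."""
    features = np.asarray(features, dtype=float)
    centered = features - features.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:n_components]
    return centered @ eigvec[:, order], eigval[order]
```

Fitting the components on the pooled trajectories of all t_VST conditions, as described above, places every ensemble on the same axes so that their FELs can be compared bin by bin.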

To assess the effects of N_run and t_run, PCC_PMF (and P_native for the Trp-cage model) were calculated for ensembles with subsets of the reference trajectories. Because there are many possibilities to pick N_run runs from 84 runs and t_run-length trajectories from the entire set of trajectories, we analyzed them by using the bootstrap approach. We constructed an ensemble by taking a random sampling of N_run runs from 84 runs with replacement and repeated it 100 times. The statistics over the 100 ensembles were taken as the result for this N_run setting. This process was repeatedly performed for N_run = 1, 2, …, 84. For the case of t_run, the trajectories were split into 5-ns bins, and an ensemble was constructed by taking a random sampling of t_run/5 bins with replacement. We confirmed that the results of the bootstrap analyses with 100 and 200 samples were consistent (Fig. S3).
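The bootstrap over runs can be sketched as follows. The data layout (one tuple of native weight and total weight per run) and the function names are assumptions made for this example; any statistic that can be computed from a list of runs, such as PCC_PMF, could be passed in the same way.

```python
import random
import statistics


def p_native(runs):
    """Pooled native fraction; each run is (native_weight, total_weight)."""
    return sum(r[0] for r in runs) / sum(r[1] for r in runs)


def bootstrap_over_runs(per_run_data, n_run, statistic, n_boot=100, seed=0):
    """Resample n_run runs with replacement n_boot times.

    Returns (mean, standard deviation, minimum, maximum) of the statistic.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_boot):
        sample = rng.choices(per_run_data, k=n_run)  # with replacement
        results.append(statistic(sample))
    return (statistics.mean(results), statistics.stdev(results),
            min(results), max(results))
```

The same function can be applied to the t_run analysis by passing the 5-ns trajectory bins as `per_run_data` and setting `n_run` to t_run/5.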

The sampling efficiency was measured in terms of the frequency of traversals between low- and high-energy regions, which were defined as the ranges (E_min, E_low) and (E_high, E_max), respectively. Here, E_min and E_max denote the minimum and maximum potential energies in all the trajectories, respectively, and E_low and E_high are defined as follows.

$$E_{low} = E_{min} + X (E_{max} - E_{min}), \tag{6}$$

$$E_{high} = E_{max} - X (E_{max} - E_{min}). \tag{7}$$

X is an arbitrary parameter in the range of 0–0.5. We assessed X = 0.2 and 0.3. The traversal frequency of the potential energy (F_trv^E) was calculated as the number of traversals between the two energy regions per 1.0 ns. The traversal frequencies of RMSD_native and R_g (F_trv^RMSD and F_trv^Rg, respectively) were also analyzed.
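A minimal sketch of the traversal count based on Eqs. (6) and (7) is shown below. It treats every one-way passage from one region to the other as a single traversal, which is an assumption of this example; the function name and arguments are likewise illustrative.

```python
def traversal_frequency(values, duration_ns, v_min, v_max, x=0.3):
    """Number of traversals between the low and high regions per ns.

    values      : time series of the quantity (E, RMSD_native, or R_g)
    duration_ns : length of the time series in ns
    v_min, v_max: minimum and maximum of the quantity over all trajectories
    x           : fraction of the full range assigned to each end region
    """
    v_low = v_min + x * (v_max - v_min)   # Eq. (6)
    v_high = v_max - x * (v_max - v_min)  # Eq. (7)
    last_region = None
    traversals = 0
    for v in values:
        if v <= v_low:
            region = "low"
        elif v >= v_high:
            region = "high"
        else:
            continue  # intermediate values do not change the last region
        if last_region is not None and region != last_region:
            traversals += 1
        last_region = region
    return traversals / duration_ns
```

A smaller X narrows both end regions, so fewer excursions qualify as traversals; this is consistent with the lower frequencies reported for X = 0.2 than for X = 0.3.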

Results

In the first part of this section, the results of the Trp-cage model are described. The reference ensemble is characterized in the subsection, “FEL of folding–unfolding equilibrium of Trp-cage.” Next, the effects of the parameters t_run, N_run, and their balances are discussed in the successive subsections: “Effects of simulation time for each run,” “Effects of number of independent runs,” and “Balance between simulation time and number of runs,” respectively. Subsequently, the effects of the other parameter t_VST are discussed in the subsection, “Effects of frequency of bias-strength replacement.” Additionally, the following subsection, “Effects of system complexity” describes the results of the PGA8 and PGA20 models and compares them with those of the Trp-cage model.

FEL of folding–unfolding equilibrium of Trp-cage

For the Trp-cage model, we performed 34 iterations of TTP-V-McMD simulations while updating the estimation of the density of states, n(E), and obtained a near-uniform energy distribution (Fig. S4). On the basis of this estimation, we performed production runs with N_run = 84, t_run = 200 ns, and t_VST = 0.2 ps. This is called the reference setting hereinafter. The resultant canonical ensemble reweighted at 300 K is referred to as the reference ensemble.

The FEL of the reference ensemble projected on the RMSD_native–R_g plane is shown in Fig. 1A. The most stable basin corresponds to the native structure consisting of an α-helix at the N-terminus, a 3₁₀-helix at the middle, and a loop region at the C-terminus (the secondary structural elements were assigned by using the DSSP software; Kabsch & Sander, 1983). For example, the RMSD_native of one of the most probable structures in this basin was 0.994 Å (Fig. 1B). The energy barrier (approximately 3.3 kcal/mol) was observed at RMSD_native ≈ 3 Å in a low-R_g regime. Around this barrier, the 3₁₀-helix at the middle of the peptide chain was partially deformed; this deformation can be the first step of an unfolding process (Fig. 1C). The details of the unfolding pathway are not discussed in this paper. The second basin was widely spread around RMSD_native = 4–7 Å and R_g = 7–9 Å. This corresponds to the unfolded state, and examples of the unfolded structures taken from this basin are shown in Figs. 1D and 1E. The difference in the PMF between the bottoms of the first and second stable basins was 1.014 kcal/mol, and the population of the native conformations (P_native) was 22.37%. The landscape is qualitatively similar to that calculated using the REMD method reported by another group (Day, Paschek & Garcia, 2010). Our TTP-V-McMD simulation successfully identified the native structure as the most stable basin in the energy landscape, by using the reference setting.

Figure 1: FEL calculated by the reference ensemble of Trp-cage.
(A) FEL based on the RMSD_native–R_g plane. The color gradation indicates the PMF. (B) Snapshot taken from the first basin (blue) superimposed on the experimentally solved structure (gray; PDB ID: 1L2Y, model 1). (C) Examples of snapshots near the first basin. The structures colored dark cyan and light cyan correspond to the positions C1 and C2 marked in (A), respectively. (D and E) Examples of unfolded structures in the second basin. The positions of each snapshot on the FEL are marked in (A).

DOI: 10.7717/peerj-pchem.4/fig-1

Effects of simulation time for each run

The FELs of the Trp-cage model were drawn for a variety of t_run values under the condition of N_run = 84 and compared with the reference FEL. The FELs based on the trajectories of 0–25, 0–50, and 0–100 ns are shown in Figs. 2A–2C, respectively. The overall geometries of these FELs were qualitatively similar to the reference (Fig. 1A); their PCC_PMF values were 0.936, 0.936, and 0.994, respectively. The bootstrap statistics of PCC_PMF for each t_run value are summarized in Fig. 2D. For t_run = 200 ns, the bootstrap average and the standard deviation (SD) of PCC_PMF were 0.990 and 0.007, respectively. Even in the worst case among 100 randomly generated ensembles with t_run = 200 ns, PCC_PMF was 0.966. From this condition, a decrease in t_run yielded a slow decay of PCC_PMF, and PCC_PMF reached 0.9 at t_run ≈ 30 ns, which corresponds to 15% of the samples in the reference. Further decreasing t_run resulted in a steep decrease of PCC_PMF. Along with the decrease of the bootstrap average of PCC_PMF, the SD was increased. This means that an insufficient simulation time causes a loss of robustness of the results.

Figure 2: FELs of Trp-cage for various t_run values with N_run = 84.
(A–C) FELs based on the trajectories of 0–25 ns (A), 0–50 ns (B), and 0–100 ns (C). (D) Bootstrap statistics of PCC_PMF. The solid line indicates the average, and the dashed lines indicate the average ± SD. The dotted lines indicate the maximum and minimum values among 100 randomly generated ensembles in each condition. (E) Statistics of P_native shown in the same scheme as (D).

DOI: 10.7717/peerj-pchem.4/fig-2

In contrast to the decay of PCC_PMF with decreasing t_run, the balance between the folded and unfolded states (P_native) was almost constant regardless of t_run (Fig. 2E); the bootstrap average of P_native for t_run = 5–200 ns was in the range of 0.220 to 0.225. However, the SD of P_native decreased as t_run increased; the SDs of P_native at t_run = 5, 50, and 200 ns were 0.07, 0.02, and 0.008, respectively. The loss of robustness due to an insufficient simulation time thus appears not only in the similarity of the entire FEL but also in the estimated stability of the native fold.

Effects of number of independent runs

As in the previous subsection, the effects of the reduction of N_run on the FELs were assessed under the condition of t_run = 200 ns. Examples of FELs with N_run = 10, 21, and 42 are shown in Figs. 3A–3C, respectively; the PCC_PMF values were 0.637, 0.939, and 0.993, respectively. Although the positions and wideness of the basins were similar to the reference, the FELs with a smaller N_run were smoother and lacked small bumps on the landscapes. The bootstrap statistics of PCC_PMF for various N_run values (Fig. 3D) were similar to those for t_run (Fig. 2D). The quantity of the samples required for PCC_PMF ≥ 0.9 was approximately one-fourth of the reference (N_run ≈ 21). The average (and the SD) of PCC_PMF at N_run = 21 and 42 were 0.906 (0.07) and 0.956 (0.04), respectively. Larger N_run values are needed to obtain robust results.

Figure 3: Characteristics of FELs of Trp-cage for smaller N_run values with t_run = 200 ns.
(A–C) Examples of FELs with N_run = 10 (A), N_run = 21 (B), and N_run = 42 (C). (D and E) Bootstrap statistics of PCC_PMF (D) and P_native (E). See also the legend of Fig. 2.

DOI: 10.7717/peerj-pchem.4/fig-3

Regarding P_native, the influence of the reduction of N_run (Fig. 3E) differed from that of the reduction of t_run (Fig. 2E). A lower N_run resulted in the underestimation of the population of native conformations. P_native reached a plateau for N_run ≥ 21. A certain number of runs was needed to obtain robust results; with only a small number of trajectories, t_run = 200 ns was too short to reach equilibrium for this system.

Balance between simulation time and number of runs

The evaluation for various t_run values with N_run = 84 runs (Fig. 2) and that for various N_run values with t_run = 200 ns (Fig. 3) indicate that reducing t_run produced better results than reducing N_run if the cumulative simulation time (N_run × t_run) was the same. Figure 4 shows direct comparisons of the results, indicating that high-N_run conditions resulted in a higher PCC_PMF and more similar values of P_native to the reference, with lower SDs, than long-t_run conditions. In particular, the qualitative difference between the two strategies is shown by the mean of P_native. Reducing N_run resulted in the significant underestimation of the fold stability, but reducing t_run did not.

Figure 4: Direct comparison between reducing t_run with the fixed-N_run condition (blue line) and reducing N_run with the fixed-t_run condition (red line) for the Trp-cage model.
The vertical axes indicate (A) the average of PCC_PMF, (B) the SD of PCC_PMF, (C) the average of P_native, and (D) the SD of P_native. The horizontal axis indicates the accumulated simulation length (N_run × t_run).

DOI: 10.7717/peerj-pchem.4/fig-4

In addition, we performed bootstrap analyses for all the combinations of 40 t_run settings (5, 10, 15, …, 200 ns) and 21 N_run settings (4, 8, 12, …, 84). The average values of PCC_PMF and P_native in all the conditions are presented in Fig. 5 and Fig. S5. The PCC_PMF was proportional to log(N_run × t_run). While the trend of P_native is ambiguous, the use of a larger number of samples resulted in a higher P_native. In the case where only a small amount of data was available, a lower ratio of t_run/N_run (purple plots in Fig. 5) yielded better results.

Figure 5: Distribution of (A) the average of PCC_PMF and (B) P_native along the logarithm of the accumulated simulation length for various combinations of N_run and t_run extracted from the trajectories of the Trp-cage model.
The color of each plot indicates the log-ratio of t_run to N_run relative to the reference, defined as log[(t_run/N_run)/(200/84)]. This value is greater than 0 for conditions with a higher ratio of t_run to N_run than the reference.

DOI: 10.7717/peerj-pchem.4/fig-5

Effects of frequency of bias-strength replacement

The parameter t_VST controls the frequency of the bias-strength switching in the TTP-V-McMD method. We investigated the effects of this parameter by comparing the TTP-V-McMD simulations of the Trp-cage model under the three conditions—t_VST = 0.002, 0.2, and 20 ps—with t_run = 50 ns for N_run = 84.

Table 1 summarizes the frequency of traversals between high- and low-potential energy regimes (F_trv^E), as defined in Eqs. (6) and (7) with X = 0.3 and 0.2, as well as the frequencies of traversals in the RMSD_native (F_trv^RMSD) and R_g (F_trv^Rg) spaces. The simulations with a shorter t_VST resulted in faster traversals in the potential energy space, indicating that with a shorter t_VST, a wider potential energy range can be sampled in a shorter time. However, faster traversal in the potential energy space does not ensure faster transition of the protein conformation. For both X = 0.2 and 0.3, although the setting of t_VST = 0.002 ps yielded the highest F_trv^E, this condition did not yield higher F_trv^RMSD or F_trv^Rg than the longer t_VST settings. This result indicates that the relaxation of the conformation requires a longer time than that of the potential energy. If a strong bias is applied and the system reaches a high-potential energy state, it can return to low-energy states before the conformation changes. Therefore, a moderate speed for traversals in the potential energy space is ideal for efficient conformational sampling. In the case of X = 0.2, t_VST = 0.2 ps exhibited the most frequent conformational changes.
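
To make the notion of a traversal frequency concrete, the following sketch counts transitions between a low regime and a high regime of a time series. It is one plausible reading and not a reproduction of Eqs. (6) and (7): it assumes that the low and high regimes are the bottom and top fractions X of the observed range, and that every passage from one regime to the other counts as one traversal. The function name `traversal_frequency` and its arguments are hypothetical.

```python
import numpy as np

def traversal_frequency(series, x_frac, dt_ns, lo=None, hi=None):
    """Count low<->high regime changes per nanosecond.

    Assumption: the low regime is the bottom x_frac of the range and the
    high regime is the top x_frac; values in between are ignored.
    """
    series = np.asarray(series, dtype=float)
    lo = series.min() if lo is None else lo
    hi = series.max() if hi is None else hi
    low_edge = lo + x_frac * (hi - lo)
    high_edge = hi - x_frac * (hi - lo)

    state = 0   # 0: no regime visited yet, -1: last in low, +1: last in high
    count = 0
    for v in series:
        if v <= low_edge:
            new = -1
        elif v >= high_edge:
            new = 1
        else:
            continue            # intermediate values do not change the state
        if state != 0 and new != state:
            count += 1          # the system reached the opposite regime
        state = new
    return count / (len(series) * dt_ns)

# Toy series sampled every 1 ns: low, middle, high, middle, ..., low
freq = traversal_frequency([0, 0.5, 1, 0.5, 0.6, 0.4, 0], x_frac=0.3, dt_ns=1.0)
```

Under this reading, the same function can be applied to the potential energy, RMSD_native, or R_g series of each run, and the per-run values can then be averaged as in Table 1.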

Table 1:

Average values (and the standard errors) of the traversal frequencies over 84 runs for the Trp-cage model.

t_VST (ps)	0.002	0.2	20
X		0.3
F_trv^E (ns⁻¹)	1.63 (0.06)	1.45 (0.04)	1.02 (0.04)
F_trv^RMSD (ns⁻¹)	0.057 (0.006)	0.060 (0.004)	0.062 (0.006)
F_trv^Rg (ns⁻¹)	0.040 (0.005)	0.044 (0.003)	0.050 (0.006)
X		0.2
F_trv^E (ns⁻¹)	0.70 (0.03)	0.62 (0.02)	0.46 (0.02)
F_trv^RMSD (ns⁻¹)	0.005 (0.001)	0.011 (0.001)	0.006 (0.002)
F_trv^Rg (ns⁻¹)	0.008 (0.002)	0.012 (0.001)	0.007 (0.002)

DOI: 10.7717/peerj-pchem.4/table-1

In addition, the resultant ensembles were slightly affected by the setting of t_VST. We analyzed P_native for ensembles of various t_run values with N_run = 84 using the bootstrap method (Fig. S6). The results for all three t_VST values showed similar trends, i.e., near-constant average values and the gradual decay of the SD with the increase of t_run. While t_VST = 0.2 ps showed a smaller P_native than the other two t_VST settings, the difference was smaller than the SD. On the other hand, higher SD values were observed in the following order: t_VST = 0.2 > 20 > 0.002 ps. This is consistent with the order of F_trv^RMSD and F_trv^Rg (Table 1). The result indicates that more frequent traversals between high- and low-RMSD_native conformations make it possible to explore a wider region of the conformational space; thus, the population of the native conformation decreases, and the SD increases.

Regarding the PCC_PMF computed against the reference setting (t_VST = 0.2 ps, N_run = 84, and t_run = 200 ns), the average PCC_PMF values at t_run = 50 ns differed among the t_VST settings (Fig. S6). This indicates that changing t_VST yields subtle differences in the resultant ensemble. Regarding the balance between t_run and N_run, the trends were similar for all the settings of t_VST (Fig. S7).

Effects of system complexity: comparison with the PGA models

We performed the same analyses for the molecular models of PGA8 and PGA20. In contrast to Trp-cage, these peptides did not exhibit a particular fold. The FELs of both PGA8 and PGA20 were unimodal distributions, the basins of which consisted of a variety of collapsed conformations (Fig. 6 for t_VST = 0.2 ps). The ensembles included short secondary structural elements but they were unstable. Although the shape of the small bumps in the basins differed depending on the simulation conditions, the overall geometries of the FELs were similar (Fig. S8 for t_VST = 0.002 ps and 20 ps).

Figure 6: FELs calculated by ensembles of (A) PGA8 and (B) PGA20 using t_run = 50 ns and N_run = 84 with t_VST = 0.2 ps.
(C–E) Examples of snapshots in the basins marked in (A) and (B).

DOI: 10.7717/peerj-pchem.4/fig-6

Regarding the balance between t_run and N_run, Fig. 7 shows the bootstrap averages of PCC_PMF between the ensemble calculated from the full-length trajectories (t_run = 50 ns and N_run = 84) and those calculated from the reduced trajectories. No clear differences were found between the PCC_PMF curve with reduced t_run and that with reduced N_run for either PGA8 or PGA20 (Fig. 7 for t_VST = 0.2 ps; Fig. S9 for the other conditions). A small number of long simulations exhibited efficiency similar to that of many short simulations. In addition, no significant differences were found between the results of PGA8 and PGA20. It is noteworthy that the conformational space of PGA20 is considerably wider than that of PGA8 and similar to that of Trp-cage, because the conformational space volume of polypeptides is determined primarily by their length. Therefore, we concluded that the effects of the balance between t_run and N_run are determined by the complexity of the FEL (e.g., the existence of a free-energy barrier) rather than the conformational space volume. An increase in the number of runs is more effective for a system with a more complex FEL.

Figure 7: Direct comparisons between reducing t_run with fixed-N_run (blue line) and reducing N_run with fixed-t_run (red line) for (A and B) the PGA8 and (C and D) PGA20 systems.
The vertical axes indicate (A and C) the bootstrap average of PCC_PMF, and (B and D) the SD of PCC_PMF. The horizontal axis indicates the accumulated simulation length (N_run × t_run). The results of t_VST = 0.2 ps are presented. See also Fig. S9 for the other t_VST conditions.

DOI: 10.7717/peerj-pchem.4/fig-7

For the PGA models, the frequencies of traversals in the potential energy and R_g spaces (F_trv^E and F_trv^Rg, respectively) are summarized in Table 2. Both the PGA8 and PGA20 models yielded trends similar to those of the Trp-cage model (Table 1). Although frequent replacements of bias-potential strength enhanced the traversals in the potential energy space, they did not enhance the conformational changes in terms of R_g. This implies that the conformational changes are much slower than the potential energy changes even if no free-energy barrier exists in the landscape. However, in contrast to the Trp-cage case, the drawback of frequent replacement, that is, slow traversals in the conformational space, is unclear in the case of PGA20.

Table 2:

Average values (and standard errors) of the traversal frequencies over 84 runs for PGA models.

Model	PGA8			PGA20
t_VST (ps)	0.002	0.2	20	0.002	0.2	20
X	0.3			0.3
F_trv^E (ns⁻¹)	2.91 (0.03)	2.70 (0.03)	0.86 (0.02)	1.05 (0.04)	0.99 (0.03)	0.46 (0.02)
F_trv^Rg (ns⁻¹)	0.44 (0.02)	0.47 (0.02)	0.47 (0.02)	0.045 (0.005)	0.044 (0.005)	0.049 (0.006)
X	0.2			0.2
F_trv^E (ns⁻¹)	1.58 (0.02)	1.53 (0.02)	0.50 (0.01)	0.4 (0.02)	0.41 (0.01)	0.22 (0.01)
F_trv^Rg (ns⁻¹)	0.117 (0.007)	0.147 (0.007)	0.146 (0.007)	0.011 (0.002)	0.013 (0.002)	0.015 (0.003)

DOI: 10.7717/peerj-pchem.4/table-2

Discussion

We examined the performance of the TTP-V-McMD method with regard to two adjustable settings: (i) the balance between the number of runs (N_run) and the simulation length in each run (t_run) and (ii) the frequency of the bias-strength replacement (t_VST). For (i), in the Trp-cage model, which includes a folding–unfolding transition, we found higher robustness for the conditions with a larger number of runs than for those with longer simulations. In particular, the probability of the existence of native conformations in a resultant ensemble (P_native) was more sensitive to the condition than the overall similarity of the FEL. However, for the PGAs, which have no free-energy barrier in their FELs, the balance between the number and length of simulations had no significant effect. Therefore, the optimal balance depended on the molecular system, and the complexity of the FEL was a key feature rather than the number of degrees of freedom. In any case, increasing the number of simulations is recommended because it is not worse than increasing the length of each run. This result is practically useful because performing many parallel runs is easier than executing a single long simulation. While the result obtained here encourages performing many short runs, it requires the condition that the initial structures of the production runs are uniformly sampled from the multicanonical ensemble, whose energy distribution is uniform (Ikebe et al., 2010). As our protocol samples the initial structures of the production runs from the previous iteration of the McMD, it is expected that this condition holds.

For (ii), whereas higher frequencies of bias-strength replacement enhance the sampling of a wider range of potential energy, they do not ensure the enhancement of the sampling of a wider range of conformations. This means that the enhancement of the sampling along one variable (e.g., potential energy or temperature) does not ensure the enhancement of the sampling along another variable (e.g., RMSD and R_g). With rapid traversals in the energy space, the system sometimes reaches a high energy and returns to the low-energy regime before the conformation changes, regardless of whether a free-energy barrier exists in the FEL. A moderate frequency is needed to maximize the performance for any molecular system.

The findings that we obtained by applying the TTP-V-McMD method provide insight into the characteristics of many other GE methods. (i) For GE methods that involve running independent parallel simulations, e.g., simulated tempering and AUS, performing many short runs can be more effective than increasing the length of each run. For GE methods where parallel runs are coupled, e.g., the REMD method, this conclusion should not be simply applied. For example, an increase in the number of runs in the REMD method results in larger overlaps of the distributions of neighboring replicas, along with an increase in the acceptance probability of replica-exchange trials. Our previous evaluation of the REMD method showed that a larger number of replicas does not always yield better results (Iwai, Kasahara & Takahashi, 2018). The number of runs should be adjusted independently of the coupling condition of the parallel runs; for example, the number of runs in a REMD simulation could be increased by performing two or more independent REMD simulations with different initial conformations, and aggregating the resultant ensembles. (ii) Regarding the frequency of the bias-strength replacement, the conclusion that the interval should be long enough to relax the conformation could be transferred to other GE methods. For the REMD methods, the effects of the interval for replica-exchange trials have been reported; while some studies recommended shorter intervals (Sindhikara, Meng & Roitberg, 2008; Sindhikara, Emerson & Roitberg, 2010), the side effects of highly frequent exchange trials have also been reported and were consistent with our results (Periole & Mark, 2007; Iwai, Kasahara & Takahashi, 2018).

Conclusions

In this study, the effects of two parameters of GE methods, i.e., (i) the balance between the number of runs (N_run) and the simulation length in each run (t_run) and (ii) the frequency of the bias-strength switching (t_VST), were extensively examined using all-atom explicit-solvent models of three polypeptides: a foldable mini-protein and two disordered peptides. We suggest a guide to adjust the settings for general molecular systems and GE methods. (i) Increasing the number of runs should be prioritized over increasing the simulation length. (ii) Highly frequent replacements of the bias potentials may yield side effects because conformational relaxation is slower than potential energy relaxation. The time interval for replacement should be longer than or equal to 0.2 ps.

Supplemental Information

Schematic illustration of the V-McMD method.

(A) A virtual state is defined as a range of potential energies. The neighboring virtual states overlap in the potential energy space. This example shows a virtual system consisting of five virtual states, from v₁ to v₅. (B) A time course of potential energy in a V-McMD simulation. (C) A time course of the virtual state in the same trajectory as (B). Virtual-state transitions occur at a certain time interval, t_VST.

DOI: 10.7717/peerj-pchem.4/supp-1

Schematic illustration of the TTP method.

(A) Several V-McMD simulations are performed with the same molecular system but different initial conditions. They generate different trajectories. (B) The TTP method concatenates these trajectories in an arbitrary order. The jumps at the concatenated points (triangles) are considered to be Monte Carlo steps.

DOI: 10.7717/peerj-pchem.4/supp-2

Effects of the number of bootstrap samples.

The bootstrap averages of PCC_PMF using 100 (red) and 200 (cyan) samples for various t_run are shown. The red curve is the same as the average shown in Fig. 2D.

DOI: 10.7717/peerj-pchem.4/supp-3

Potential energy distribution of the multicanonical ensemble of the Trp-cage model.

The vertical axis indicates the natural logarithm of the probability of snapshots with the corresponding potential energies. The horizontal axis is the potential energy. The solid line is the multicanonical distribution generated by the reference ensemble, and the dashed lines are the canonical distributions at 300 K (left) and 600 K (right).

DOI: 10.7717/peerj-pchem.4/supp-4

Results of bootstrap analyses for combinations of various N_run and t_run.

The color gradations indicate (A) the average of PCC_PMF, (B) the SD of PCC_PMF, (C) the average of P_native, and (D) the SD of P_native.

DOI: 10.7717/peerj-pchem.4/supp-5

The effects of t_VST on (A) the average of PCC_PMF, (B) the SD of PCC_PMF, (C) the average of P_native, and (D) the SD of P_native for the Trp-cage model.

The horizontal axis indicates t_run. The blue, red, and orange lines indicate t_VST = 0.002, 0.2, and 20 ps, respectively.

DOI: 10.7717/peerj-pchem.4/supp-6

Summary of the bootstrap averages (A, B, C, G, H, I) and SDs (D, E, F, J, K, L) of PCC_PMF (A, B, C, D, E, F) and P_native (G, H, I, J, K, L) for all the conditions of the Trp-cage model.

The horizontal and vertical axes indicate N_run and t_run, respectively.

DOI: 10.7717/peerj-pchem.4/supp-7

FELs calculated by ensembles of PGAs using t_run = 50 ns and N_run = 84.

(A) The FEL of PGA8 with t_VST = 0.002 ps. (B) The FEL of PGA8 with t_VST = 20 ps. (C) The FEL of PGA20 with t_VST = 0.002 ps. (D) The FEL of PGA20 with t_VST = 20 ps. See also Fig. 6 in the main text for t_VST = 0.2 ps.

DOI: 10.7717/peerj-pchem.4/supp-8

Direct comparisons between reducing t_run with the fixed-N_run condition (blue line) and reducing N_run with the fixed-t_run condition (red line) for (A, B, C, D) the PGA8 and (E, F, G, H) PGA20.

The vertical axes indicate (A, C, E, G) the average of PCC_PMF and (B, D, F, H) the SD of PCC_PMF. The horizontal axis indicates the accumulated simulation length (N_run × t_run). The t_VST values were (A, B, E, F) 0.002 ps and (C, D, G, H) 20 ps. See also Fig. 7 in the main text for the other t_VST condition.

DOI: 10.7717/peerj-pchem.4/supp-9

[1] Abraham MJ, Murtola T, Schulz R, Páll S, Smith JC, Hess B, Lindahl E. 2015. GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1–2:19-25

[2] Ahmed Z, Beta IA, Mikhonin AV, Asher SA. 2005. UV−resonance Raman thermal unfolding study of Trp-cage shows that it is not a simple two-state miniprotein. Journal of the American Chemical Society 127(31):10943-10950

[3] Berendsen HJC, Postma JPM, Van Gunsteren WF, DiNola A, Haak JR. 1984. Molecular dynamics with coupling to an external bath. Journal of Chemical Physics 81(8):3684-3690

[4] Clarke DT, Doig AJ, Stapley BJ, Jones GR. 1999. The α-helix folds on the millisecond time scale. Proceedings of the National Academy of Sciences of the United States of America 96(13):7232-7237

[5] Day R, Paschek D, Garcia AE. 2010. Microsecond simulations of the folding/unfolding thermodynamics of the Trp-cage miniprotein. Proteins: Structure, Function, and Bioinformatics 78:1889-1899

[6] Donten ML, Hamm P. 2013. pH-jump induced α-helix folding of poly-l-glutamic acid. Chemical Physics 422:124-130

[7] Finke JM, Jennings PA, Lee JC, Onuchic JN, Winkler JR. 2007. Equilibrium unfolding of the poly(glutamic acid)20 helix. Biopolymers 86(3):193-211

[8] Fukuda I. 2013. Zero-multipole summation method for efficiently estimating electrostatic interactions in molecular system. Journal of Chemical Physics 139(17):174107

[9] Fukuda I, Kamiya N, Nakamura H. 2014. The zero-multipole summation method for estimating electrostatic interactions in molecular dynamics: analysis of the accuracy and application to liquid systems. Journal of Chemical Physics 140(19):194307

[10] Fukuda I, Kamiya N, Yonezawa Y, Nakamura H. 2012. Simple and accurate scheme to compute electrostatic interaction: zero-dipole summation technique for molecular system and application to bulk water. Journal of Chemical Physics 137(5):054314

[11] Fukuda I, Yonezawa Y, Nakamura H. 2011. Molecular dynamics scheme for precise estimation of electrostatic interaction via zero-dipole summation principle. Journal of Chemical Physics 134(16):164107

[12] Hałabis A, Żmudzińska W, Liwo A, Ołdziej S. 2012. Conformational dynamics of the Trp-cage miniprotein at its folding temperature. Journal of Physical Chemistry B 116(23):6898-6907

[13] Hess B. 2008. P-LINCS: a parallel linear constraint solver for molecular simulation. Journal of Chemical Theory and Computation 4(1):116-122

[14] Hess B, Bekker H, Berendsen HJC, Fraaije JGEM. 1997. LINCS: a linear constraint solver for molecular simulations. Journal of Computational Chemistry 18:1463-1472

[15] Higo J, Dasgupta B, Mashimo T, Kasahara K, Fukunishi Y, Nakamura H. 2015. Virtual-system-coupled adaptive umbrella sampling to compute free-energy landscape for flexible molecular docking. Journal of Computational Chemistry 36(20):1489-1501

[16] Higo J, Ikebe J, Kamiya N, Nakamura H. 2012. Enhanced and effective conformational sampling of protein molecular systems for their free energy landscapes. Biophysical Reviews 4(1):27-44

[17] Higo J, Umezawa K, Nakamura H. 2013. A virtual-system coupled multicanonical molecular dynamics simulation: principles and applications to free-energy landscape of protein–protein interaction with an all-atom model in explicit solvent. Journal of Chemical Physics 138(18):184106

[18] Hudáky P, Stráner P, Farkas V, Váradi G, Gábor T, Perczel A. 2007. Cooperation between a salt bridge and the hydrophobic core triggers fold stabilization in a Trp-cage miniprotein. Biochemistry 47(3):1007-1016

[19] Ikebe J, Umezawa K, Kamiya N, Sugihara T, Yonezawa Y, Takano Y, Nakamura H, Higo J. 2010. Theory for trivial trajectory parallelization of multicanonical molecular dynamics and application to a polypeptide in water. Journal of Computational Chemistry 32(7):1286-1297

[20] Iwai R, Kasahara K, Takahashi T. 2018. Influence of various parameters in the replica-exchange molecular dynamics method: number of replicas, replica-exchange frequency, and thermostat coupling time constant. Biophysics and Physicobiology 15:165-172

[21] Jani V, Sonavane UB, Joshi R. 2014. REMD and umbrella sampling simulations to probe the energy barrier of the folding pathways of engrailed homeodomain. Journal of Molecular Modeling 20(6):2283

[22] Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. 1983. Comparison of simple potential functions for simulating liquid water. Journal of Chemical Physics 79(2):926-935

[23] Joung IS, Cheatham TE. 2008. Determination of alkali and halide monovalent ion parameters for use in explicitly solvated biomolecular simulations. Journal of Physical Chemistry B 112(30):9020-9041

[24] Kabsch W, Sander C. 1983. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22(12):2577-2637

[25] Kasahara K, Ma B, Goto K, Dasgupta B, Higo J, Fukuda I, Mashimo T, Akiyama Y, Nakamura H. 2016. myPresto/omegagene: a GPU-accelerated molecular dynamics simulator tailored for enhanced conformational sampling methods with a non-Ewald electrostatic scheme. Biophysics and Physicobiology 13:209-216

[26] Kimura T, Takahashi S, Akiyama S, Uzawa T, Ishimori K, Morishima I. 2002. Direct observation of the multistep helix formation of poly-l-glutamic acids. Journal of the American Chemical Society 124(39):11596-11597

[27] Lindorff-Larsen K, Piana S, Palmo K, Maragakis P, Klepeis JL, Dror RO, Shaw DE. 2010. Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins: Structure, Function, and Bioinformatics 78:1950-1958

[28] Nakajima N, Nakamura H, Kidera A. 1997. Multicanonical ensemble generated by molecular dynamics simulation for enhanced conformational sampling of peptides. Journal of Physical Chemistry B 101(5):817-824

[29] Ogasawara N, Kasahara K, Iwai R, Takahashi T. 2018. Unfolding of α-helical 20-residue poly-glutamic acid analyzed by multiple runs of canonical molecular dynamics simulations. PeerJ 6(1–2):e4769

[30] Periole X, Mark AE. 2007. Convergence and sampling efficiency in replica exchange simulations of peptide folding in explicit solvent. Journal of Chemical Physics 126(1):14903

[31] Pronk S, Páll S, Schulz R, Larsson P, Bjelkmar P, Apostolov R, Shirts MR, Smith JC, Kasson PM, Van der Spoel D, Hess B, Lindahl E. 2013. GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit. Bioinformatics 29(7):845-854

[32] Rosta E, Hummer G. 2009. Error and efficiency of replica exchange molecular dynamics simulations. Journal of Chemical Physics 131(16):165102

[33] Salomon-Ferrer R, Götz AW, Poole D, Le Grand S, Walker RC. 2013. Routine microsecond molecular dynamics simulations with AMBER on GPUs. 2. Explicit solvent particle mesh Ewald. Journal of Chemical Theory and Computation 9(9):3878-3888

[34] Shaw DE, Grossman JP, Bank JA, Batson B, Butts JA, Chao JC, Deneroff MM, Dror RO, Even A, Fenton CH, Forte A, Gagliardo J, Gill G, Greskamp B, Ho CR, Ierardi DJ, Iserovich L, Kuskin JS, Larson RH, Layman T, Lee L-S, Lerer AK, Li C, Killebrew D, Mackenzie KM, Mok SY-H, Moraes MA, Mueller R, Nociolo LJ, Peticolas JL, Quan T, Ramot D, Salmon JK, Scarpazza DP, Ben Schafer U, Siddique N, Snyder CW, Spengler J, Tang PTP, Theobald M, Toma H, Towles B, Vitale B, Wang SC, Young C. 2014. Anton 2: raising the bar for performance and programmability in a special-purpose molecular dynamics supercomputer.

[35] Shirts MR, Chodera JD. 2008. Statistically optimal analysis of samples from multiple equilibrium states. Journal of Chemical Physics 129(12):124105

[36] Sindhikara DJ, Emerson DJ, Roitberg AE. 2010. Exchange often and properly in replica exchange molecular dynamics. Journal of Chemical Theory and Computation 6(9):2804-2808