A software for parameter optimization with Differential Evolution Entirely Parallel method
Academic Editor: Sandra Gesing
Subject Areas: Computational Biology, Distributed and Parallel Computing, Optimization Theory and Computation
Keywords: Differential Evolution, Parameter optimization, Mathematical modeling, Parallelization, Bioinformatics, Open source software
Copyright: © 2016 Kozlov et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
Cite this article: Kozlov et al. (2016) A software for parameter optimization with Differential Evolution Entirely Parallel method. PeerJ Computer Science 2:e74 https://doi.org/10.7717/peerj-cs.74
Abstract
Summary. The Differential Evolution Entirely Parallel (DEEP) package is software for finding unknown real and integer parameters in dynamical models of biological processes by minimizing one or even several objective functions that measure the deviation of the model solution from data. Numerical solutions provided by the most efficient global optimization methods are often problem-specific and cannot be easily adapted to other tasks. In contrast, DEEP allows a user to describe both the mathematical model and the objective function in any programming language, such as R, Octave or Python. Being implemented in C, DEEP demonstrates performance as good as that of the top three methods from the CEC2014 (Competition on Evolutionary Computation) benchmark and was successfully applied to several biological problems.
Availability. The DEEP method is open source and free software distributed under the terms of the GPL licence, version 3. The sources are available at http://deepmethod.sourceforge.net/ and binary packages for Fedora GNU/Linux are provided for the RPM package manager at https://build.opensuse.org/project/repositories/home:mackoel:compbio.
Introduction
The estimation of dynamical model parameters by minimizing the discrepancy between the model solution and a set of observed data is among the most important, widely studied problems in applied mathematics, known as the inverse problem of mathematical modeling (Mendes & Kell, 1998; Moles, Mendes & Banga, 2003). Numerical strategies for the solution of an inverse problem often involve optimization methods. Many global and local, stochastic and deterministic optimization techniques, including nature-inspired ones, have been developed and implemented in a wide range of free, open source and commercial software packages.
Mathematical modeling, one of the primary tools of computational systems biology, provides new insights into the mechanisms that control biological systems. It becomes very attractive to experimentalists due to the predictive abilities of carefully selected models.
Researchers benefit from the ability of a model to predict in silico the consequences of a biological experiment that was not used for training. The properties of the model are determined by the structure of the mathematical description and by the values of the unknown constants and control parameters that represent the coefficients of the underlying biochemical reactions. These unknowns are to be found as the best-suited solution to an inverse problem of mathematical modeling, i.e., by fitting the model output to experimental observations. The parameter set must be reliable, and different types of data must be considered. The development of reliable and easy-to-use algorithms and programs for the solution of the inverse problem remains a challenging task due to the diversity and high computational complexity of biomedical applications, as well as the necessity to treat large sets of heterogeneous data.
In systems biology, a commonly used global optimization algorithm is parallel Simulated Annealing (SA) (Chu, Deng & Reinitz, 1999). This method requires considerable CPU time, but is capable of eventually finding the global extremum and runs efficiently in parallel computations. However, the wide range of methods called Genetic Algorithms (GA) was developed later and successfully applied to biological problems (Spirov & Kazansky, 2002). Modern evolutionary algorithms such as Evolution Strategies (ES) or Differential Evolution (DE) can outperform other methods in the estimation of parameters of several biological models (Fomekong-Nanfack, Kaandorp & Blom, 2007; Fomekong-Nanfack, 2009; Suleimenov, 2013). The general challenge in the efficient implementation of global optimization methods is that they depend on problem-specific assumptions and thus cannot be easily adapted to other problems. For example, in SA both the final result and the computational time depend on the so-called cooling schedule, the success of GA optimization strongly depends on the selected mutation, recombination and selection rules, and evolutionary algorithms heavily rely on the algorithmic parameters which define the model of evolution. Many metaheuristic approaches to parameter estimation in biology currently exist. For example, enhanced Scatter Search (Egea, Martí & Banga, 2010), implemented in the MEIGOR (Metaheuristics for systems biology and bioinformatics global optimization) package for the R statistical language, was reported to outperform state-of-the-art methods (Egea et al., 2014). This method can provide high-quality solutions for integer and real parameters; however, it is computationally expensive.
We developed DEEP, a software package that implements the Differential Evolution Entirely Parallel (DEEP) method introduced recently (Kozlov & Samsonov, 2011). The rationale behind the design of this programme was to provide open source software with performance comparable to competitive packages, and to allow a user to implement both the mathematical model and the comparison of the solution with experimental data in any software package or programming language, such as R, Octave or Python.
Problem Statement
The DEEP method was developed to solve the inverse problem of mathematical modeling. For a given mathematical model with parameters $q \in R^{K}$, where K is the number of parameters, and observable data Y, we seek the vector $\hat{q}$:

(1) $\hat{q} = \mathrm{argmin}\, F(q, Y),$

where F is a measure of the deviation of the model prediction from the observable data. Additional constraints may be imposed:

(2) $h_j(q) = 0, \quad j = 1, \dots, NH,$

(3) $g_m(q) \le 0, \quad m = 1, \dots, NG,$

(4) $q_k^L < q_k < q_k^U, \quad k \in I \subset \{1, \dots, K\},$

where NH and NG are the numbers of constraints in the form of equalities and inequalities, respectively, and I denotes the set of indices of parameters subject to box constraints. Several objective functions can be combined:

(5) $F(q, Y) = \bigodot_{i=1}^{NF} F_i(q, Y),$

where ⊙ denotes one of the aggregation methods: summation or maximization.
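The aggregation in Eq. (5) can be sketched in a few lines of Python; the function below is purely illustrative and is not part of the DEEP distribution (all names are ours).

```python
# Illustrative sketch of Eq. (5): aggregate several objective functions F_i
# by summation or maximization. Not part of the DEEP API.

def combine_objectives(objectives, q, data, method="sum"):
    """objectives: list of callables F_i(q, data); returns F(q, data)."""
    values = [f(q, data) for f in objectives]
    if method == "sum":
        return sum(values)
    if method == "max":
        return max(values)
    raise ValueError("unknown aggregation method: " + method)
```

For instance, with $F_1(q) = (q_1 - 1)^2$ and $F_2(q) = |q_2|$, summation and maximization produce different combined landscapes over the same pair of objectives.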
Differential Evolution Entirely Parallel Method
The main algorithm
Differential Evolution (DE) was proposed in Storn & Price (1995) as an effective stochastic method for function minimization. DE deals with a set (population) of randomly generated parameter vectors (individuals). The population at each iteration is referred to as a generation; the population size NP is fixed.
The DEEP method (Kozlov & Samsonov, 2011) incorporates two enhancements found in the literature, as well as some elaborated modifications. These enhancements are the "trigonometric mutation" rule proposed in Fan & Lampinen (2003), used to take into account the value of the objective function for each individual at the recombination step, and the adaptive scheme for the selection of internal parameters based on the control of population diversity developed in Zaharie (2002). These enhancements were selected to make the algorithm more suitable for biological problems, which involve large data sets and imperfectly defined objective functions.
The key concept behind the DEEP method is the age of an individual, defined as the number of generations during which the individual survived without any change of its parameter vector. The failure to update the parameter values indicates convergence to a local minimum. To avoid this, we substitute the Ψ oldest individuals with the same number of best ones after a predefined number of iterations Θ.
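The aging-based substitution can be illustrated as follows; this Python fragment is a schematic rendering of the rule, not the package's C code (all names are ours).

```python
# Illustrative sketch of the age-based substitution: every Theta generations,
# the psi oldest individuals are replaced by copies of the psi best ones.

def substitute_oldest(population, ages, fitness, psi):
    """population: list of parameter vectors; ages[i]: generations survived
    without change; fitness[i]: objective value (lower is better)."""
    oldest = sorted(range(len(population)), key=lambda i: -ages[i])[:psi]
    best = sorted(range(len(population)), key=lambda i: fitness[i])[:psi]
    for old_i, best_i in zip(oldest, best):
        population[old_i] = list(population[best_i])  # copy the best vector
        ages[old_i] = 0  # the replaced individual starts a new life
    return population
```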
To enhance the method further, we incorporated an optional Scatter Search step (Egea, Martí & Banga, 2010) that is performed every Ξ iterations, where Ξ is a control parameter. The specified number Ψ of best individuals is used to produce NP offspring.
To increase the reliability of the DEEP method, we implemented a new selection rule, described in detail in Kozlov et al. (2013), in which several different objective functions are considered in order to accept or reject an offspring for the new generation. This feature permits combining different types of experimental observations and a priori information in one and the same fitting procedure.
The operation of the method is described in the Algorithm 1 insertion. The execution starts with the Initialization block, in which a set of NP parameter vectors $v^{g,0}$ of length K is randomly assigned values satisfying the given constraints $q_k^L < q_k < q_k^U$, k = 1, …, K.
The main loop of the algorithm consists of the Recombination, Evaluation, Selection and Adaptation blocks that are detailed below. Calculations are terminated when the objective function variation becomes less than a predefined value during several consecutive steps, or when the maximal number of generations G_{max} is exceeded. The objective function F is calculated for several vectors in parallel.
The population size NP, the frequencies Θ and Ξ of the substitution of old population members and of the scatter search step, together with the number Ψ of substituted individuals and the maximal number of iterations G_{max}, are the main control parameters of the method.
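For orientation, the Recombination–Evaluation–Selection skeleton underlying this loop can be condensed into a minimal, self-contained classic DE (rand/1/bin) sketch. DEEP layers trigonometric mutation, aging, scatter search and diversity adaptation on top of this skeleton, so the fragment below is a simplification for illustration, not the DEEP algorithm itself.

```python
# Minimal classic DE (rand/1/bin) loop: recombination, evaluation,
# greedy one-to-one selection. A didactic skeleton, not DEEP's C code.
import random

def de_minimize(f, bounds, np_size=30, s=0.7, p=0.9, g_max=200, seed=1):
    rng = random.Random(seed)
    k = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(np_size)]
    cost = [f(v) for v in pop]
    for _ in range(g_max):
        for g in range(np_size):
            # three distinct random members, all different from g
            a, b, c = rng.sample([i for i in range(np_size) if i != g], 3)
            trial = [pop[a][j] + s * (pop[b][j] - pop[c][j])
                     if rng.random() < p else pop[g][j]
                     for j in range(k)]
            fc = f(trial)
            if fc < cost[g]:  # greedy selection: keep the better vector
                pop[g], cost[g] = trial, fc
    i_best = min(range(np_size), key=lambda i: cost[i])
    return pop[i_best], cost[i_best]
```

On a smooth test function such as the 3-D sphere, this skeleton converges to the optimum within a few thousand evaluations.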
Constraints handling
Using the DEEP method, one can solve both unconstrained and constrained optimization problems. Upper and lower bounds must be defined for each parameter for initialization. Constraints may be imposed in the form of inequalities or equalities for a subset of parameters or their combinations. Constraints in forms (2) and (3) can be reduced by maximization or summation:

$H(q, Y) = \bigodot_{j=1}^{NH} H_j(q, Y); \quad G(q, Y) = \bigodot_{m=1}^{NG} G_m(q, Y),$

where $H_j$ and $G_m$ are the violations of the corresponding constraints $h_j$ and $g_m$, and ⊙ denotes one of the aggregation methods, summation or maximization. The aggregated violations are then added to the objective function:

$F(q, Y) = F(q, Y) + H(q, Y) + G(q, Y).$
However, the box constraints in form (4), which may be imposed on a subset I of parameters $q_k^g$, k ∈ I, are to be transformed. To do this, let us introduce the new parameters $u_k$:

$q_k = \alpha_k + \beta_k \sin(u_k) \quad \mathrm{or} \quad q_k = \alpha_k + \beta_k \tanh(u_k),$

where

$\alpha_k = \left(q_k^U + q_k^L\right)/2; \quad \beta_k = \left(q_k^U - q_k^L\right)/2.$
Consequently, DEEP is applied to determine the unconstrained parameters $u_k$. The impact of different algorithmic parameters on method convergence was discussed in Kozlov & Samsonov (2011).
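A minimal sketch of the sine variant of this transform, for a scalar parameter (function name is ours): the optimizer works on the unconstrained $u_k$, while the model always receives a $q_k$ inside its box.

```python
# Sketch of the sine transform for box constraints: maps an unconstrained
# u onto the interval (q_low, q_high). Illustrative, not the DEEP API.
import math

def to_constrained(u, q_low, q_high):
    alpha = (q_high + q_low) / 2.0  # interval midpoint
    beta = (q_high - q_low) / 2.0   # interval half-width
    return alpha + beta * math.sin(u)
```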
Recombination strategy
The recombination step is demonstrated in the Algorithm 2 insertion. Let $v^{i,0}$, i = 1, …, NP denote a set of randomly generated parameter vectors for the initial generation j = 0, $v^{i,j} = \{v^{i,j}(k)\} = (q_k^i)_{k=1,\dots,K}$, where K is the number of parameters q and NP is the fixed size of the set. The first trial vector for index g is calculated by

$v_1^{g,j+1} = v^{a,j} + S \circ (v^{b,j} - v^{c,j}),$

where $v^{a,j}$, $v^{b,j}$ and $v^{c,j}$ are different members of the current generation j with randomly chosen indices a, b and c, S is a vector of scaling constants, one per parameter, and ∘ denotes element-wise multiplication.
The second, optional trial vector is calculated using the "trigonometric mutation" rule (Fan & Lampinen, 2003):

$v_2^{g,j+1} = \frac{v^{a,j} + v^{b,j} + v^{c,j}}{3} + (\phi_b - \phi_a)(v^{a,j} - v^{b,j}) + (\phi_c - \phi_b)(v^{b,j} - v^{c,j}) + (\phi_a - \phi_c)(v^{c,j} - v^{a,j}),$

where $\phi_\bullet = F(v^{\bullet,j})/\phi^{*}$, $\bullet = a, b, c$, $\phi^{*} = F(v^{a,j}) + F(v^{b,j}) + F(v^{c,j})$, and F(x) is the main objective function to be minimized.
The combined trial vector in the case of the binomial recombination type is defined as follows:

(6) $v^{g,j+1}(k) = \begin{cases} v_1^{g,j+1}(k), & \text{if } r_k < p_k, \\ v_2^{g,j+1}(k), & \text{else if } r_k < 1 - p_k, \\ v^{g,j}(k), & \text{otherwise,} \end{cases}$

where $r_k = U(0, 1)$ is a random number uniformly distributed between 0 and 1 and p is the vector of crossover probabilities for all parameters. In the case of the exponential recombination type, the first trial vector $v_1^{g,j+1}$ is used for consecutive parameters k until the random number becomes bigger than p.
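The binomial rule of Eq. (6) can be sketched as follows; the fragment is illustrative (names are ours) and omits the exponential variant.

```python
# Sketch of the binomial combination of Eq. (6): choose each component from
# trial vector v1, trial vector v2 or the parent, driven by r_k and p_k.
import random

def binomial_crossover(parent, v1, v2, p, rng=random):
    child = []
    for k in range(len(parent)):
        r = rng.random()          # r_k = U(0, 1)
        if r < p[k]:
            child.append(v1[k])   # first trial vector
        elif r < 1 - p[k]:
            child.append(v2[k])   # trigonometric trial vector
        else:
            child.append(parent[k])
    return child
```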
Scatter search step
Let us consider $v^{g,j}$, g = 1, …, Ψ as the best members of the current generation j, sorted according to the value of the objective function F such that $F(v^{1,j}) < F(v^{2,j}) < \dots < F(v^{\Psi,j})$, as described in Egea, Martí & Banga (2010). Each vector $v^{b,j}$ is to be combined with the rest of the vectors $v^{a,j}$, ∀a ∈ {1, 2, …, Ψ}, a ≠ b. Two new points within the search space are defined:

$c_1 = v^{b,j} - d(1 + \alpha\beta); \quad c_2 = v^{b,j} + d(1 - \alpha\beta); \quad d = \frac{v^{a,j} - v^{b,j}}{2},$

where

$\alpha = \begin{cases} 1 & \text{if } b < a, \\ -1 & \text{if } b > a, \end{cases} \quad \beta = \frac{|a - b| - 1}{\Psi - 2}.$

Then the offspring is created according to the formula

$v^{b,j+1} = c_1 + (c_2 - c_1) \circ r,$

where $r = \{r_k\}$, $r_k = U(0, 1)$, k = 1, …, K, are random numbers uniformly distributed between 0 and 1.
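A schematic Python rendering of this combination step for two sorted members with 1-based ranks b and a (function and variable names are ours, not the package's):

```python
# Sketch of the scatter-search combination of two sorted population members
# v_b (rank b) and v_a (rank a), following the c1/c2 formulas above.
import random

def scatter_offspring(vb, va, b, a, psi, rng=random):
    alpha = 1.0 if b < a else -1.0
    beta = (abs(a - b) - 1.0) / (psi - 2.0)
    off = []
    for vbk, vak in zip(vb, va):
        d = (vak - vbk) / 2.0
        c1 = vbk - d * (1 + alpha * beta)
        c2 = vbk + d * (1 - alpha * beta)
        off.append(c1 + (c2 - c1) * rng.random())  # uniform point in [c1, c2)
    return off
```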
Selection rule
In order to increase the robustness of the procedure, we implemented the following selection rule for DE, described in detail in Kozlov et al. (2013) (see the Algorithm 3 insertion). Briefly, several different objective functions are used to decide whether an offspring will be selected for the new generation. First, the main objective function is checked. The offspring replaces its parent if the value of this function for the offspring's parameter set is less than that for the parental one. In the opposite case, the additional objective functions are considered. The offspring replaces its parent if the value of any other objective function is better and a randomly drawn value is less than the predefined parameter for this function.
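The rule can be sketched as follows; the function is a schematic reading of the verbal description above (names and the exact order of checks are ours), not the C implementation.

```python
# Sketch of the multi-objective selection rule: the main objective decides
# first; otherwise any improved additional objective may accept the
# offspring with its predefined probability.
import random

def accept_offspring(f_parent, f_child, extra_parent, extra_child,
                     accept_prob, rng=random):
    """f_*: main objective values; extra_*: lists of additional objective
    values; accept_prob[i]: predefined acceptance parameter for objective i."""
    if f_child < f_parent:
        return True  # main objective improved
    for fp, fc, prob in zip(extra_parent, extra_child, accept_prob):
        if fc < fp and rng.random() < prob:
            return True  # an additional objective improved
    return False
```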
Preserving population diversity
The original DE algorithm is highly dependent on its internal parameters, as reported by other authors, see, for example, Gaemperle, Mueller & Koumoutsakos (2002). An efficient adaptive scheme for the selection of the internal parameters $S_k$ and $p_k$ based on the control of population diversity was proposed in Zaharie (2002). Let us consider the variation of parameter k in the current generation:

$\mathrm{var}_k = \frac{1}{NP}\sum_{i=1}^{NP}\left(q_k^i - \frac{1}{NP}\sum_{l=1}^{NP} q_k^l\right)^2,$

where k = 1, …, K. For the next generation, the scaling constant is calculated by

$S_k = \begin{cases} \sqrt{\dfrac{NP(\rho_k - 1) + p_k(2 - p_k)}{2\, NP\, p_k}}, & NP(\rho_k - 1) + p_k(2 - p_k) \ge 0, \\ S_{\inf}, & NP(\rho_k - 1) + p_k(2 - p_k) < 0, \end{cases}$

or, alternatively, the crossover probability is adapted as

$p_k = \begin{cases} -\left(NP\, S_k^2 - 1\right) + \sqrt{\left(NP\, S_k^2 - 1\right)^2 - NP\left(1 - \rho_k\right)}, & \rho_k \ge 1, \\ p_{\inf}, & \rho_k < 1, \end{cases}$

where $S_{\inf} = 1/\sqrt{NP}$, $p_{\inf} = 0$, $\rho_k = \gamma\left(\mathrm{var}_k^{\mathrm{previous}}/\mathrm{var}_k\right)$, and γ is a new constant that controls the decrease of the variability of parameters in the course of the iteration process.
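The first branch of the adaptation, recomputing $S_k$ from a desired variance ratio $\rho_k$ and the current $p_k$, can be sketched as follows (illustrative names, not the package's API):

```python
# Sketch of the Zaharie-style update of the scaling constant S_k from the
# variance ratio rho_k, crossover probability p_k and population size NP.
import math

def adapt_scale(rho, p, np_size):
    s_inf = 1.0 / math.sqrt(np_size)  # fallback lower value
    disc = np_size * (rho - 1.0) + p * (2.0 - p)
    if disc >= 0:
        return math.sqrt(disc / (2.0 * np_size * p))
    return s_inf
```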
Mixed integerreal problems
DE operates on floating point parameters, while many real problems contain integer parameters, e.g., indices of some kind. Two algorithms for parameter conversion from real to integer values are implemented in the DEEP method, as described in Kozlov et al. (2013). The first method rounds a real value to the nearest integer. The second procedure consists of the following steps:

1. The values are sorted in ascending order.

2. The index of the parameter in the floating point array becomes the value of the parameter in the integer array.
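One possible reading of the two steps above, in which each floating point value is replaced by its rank after ascending sort, can be sketched as:

```python
# Sketch of the rank-based real-to-integer conversion: each value is
# replaced by its position (rank) in the ascending sort of all values.
# One interpretation of the procedure, not the package's exact code.

def ranks_as_integers(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, i in enumerate(order):
        ranks[i] = rank
    return ranks
```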
Parallelization of objective function calculation
DEEP can be effectively parallelized due to the independent evaluation of each population member. Various models of evolutionary algorithm parallelization have been developed, such as master-slave, island, cellular or hybrid versions (Tasoulis et al., 2004).
The approach implemented in DEEP (see the Algorithm 4 insertion) utilizes the multicore architecture of modern CPUs and employs a pool of worker threads with an asynchronous queue of tasks to evaluate the individual solutions in parallel. The calculation of the objective function for each trial vector, using the command supplied by the user, is pushed to the asynchronous queue and starts as soon as there is an available thread in the pool. This approach is similar to the "guided" schedule in OpenMP but gives more flexibility and control. The command output is automatically recognized according to the specified format. All threads started in the current iteration must finish before the next one starts.
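The same pool-of-workers pattern can be sketched with the Python standard library; this mirrors the structure of the GLib-based implementation but is not its code.

```python
# Sketch of pooled evaluation of one generation: objective evaluations are
# submitted to a worker pool and all results are collected before the next
# iteration, matching the per-generation barrier described above.
from concurrent.futures import ThreadPoolExecutor

def evaluate_generation(objective, trial_vectors, max_workers=4):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map preserves the order of trial vectors; leaving the `with`
        # block joins all workers, i.e., the generation barrier
        return list(pool.map(objective, trial_vectors))
```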
Implementation
DEEP is implemented in the C programming language as a console application and employs interfaces from the GLib project (https://developer.gnome.org/glib/), e.g., the Thread Pool API. The architecture allows a user to utilize any programming language or computer system, such as R, Octave or Python, to implement both the mathematical model and the comparison of the solution with experimental data.
Control parameters
All control parameters are specified in a single input file as key-value pairs in INI format, supplied to the DEEP executable on the command line. The control parameters are arranged into three groups described below.
The Mathematical Model section specifies the number of parameters, the lower and upper parameter bounds, and the software needed to run the model. A possibility is provided to indicate parameters that are to be kept unchanged.
The Objective Function section defines the aggregation methods for constraints and multiple objectives. The type of each function, i.e., main or additional objective, equality or inequality constraint, is denoted by a special keyword. Ranks and weights are also given here.
The Method Settings section allows the user to tune the settings, namely the population size, the stopping criterion, the offspring generation strategy, the number Ψ of oldest individuals to be substituted in the next generation, the maximal number of worker threads and the seed for the random number generator. Two options for offspring generation are provided, namely the selection of the best individual or "trigonometric mutation." The stopping criterion can limit the convergence rate, the absolute or relative value of the objective function, the number of generations or the wall-clock time. By default, the initial population is generated randomly within the given limits; however, it is also possible to define one initial guess and generate the individuals in a specified vicinity of it.
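A hypothetical control file illustrating the three groups might look as follows; the section and key names here are invented for illustration and are not the package's documented keys.

```ini
; Hypothetical sketch of a DEEP input file; actual section and key names
; in the distributed package may differ.
[model]
command = python3 my_model.py
parameters = 3
lower_bounds = 0.0 0.0 0.0
upper_bounds = 10.0 10.0 10.0

[objective]
type = main

[method]
population_size = 200
max_generations = 1499
oldest_substituted = 40
threads = 4
seed = 42
```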
Programming interfaces
The DEEP method can be used as a static or dynamic shared object and embedded in another software package. Application programming interfaces (APIs) can be used to connect with existing code implementing the mathematical model and objective function. This approach is often preferred in academic and industrial applications where a high-level modeling language is not sufficient or the computation time should be reduced.
Results
Method testing on benchmark functions
To evaluate the performance of DEEP we used three simple multimodal test functions of dimension D = 30 from the Competition on Real Parameter Single Objective Optimization 2014 (CEC2014) test suite (Liang, Qu & Suganthan, 2014), namely:
Shifted and Rotated Griewank's Function.

$H(x) = h\left(M\left(\frac{600(x - o_H)}{100}\right)\right) + 700; \quad h(x) = \sum_{i=1}^{D}\frac{x_i^2}{4000} - \prod_{i=1}^{D}\cos\left(\frac{x_i}{\sqrt{i}}\right) + 1.$
Shifted Rastrigin's Function.

$R(x) = r\left(\frac{5.12(x - o_r)}{100}\right) + 800; \quad r(x) = \sum_{i=1}^{D}\left(x_i^2 - 10\cos(2\pi x_i) + 10\right).$
Shifted Schwefel's Function.

$S(x) = s\left(\frac{1000(x - o_s)}{100}\right) + 1000; \quad s(x) = 418.9829 \times D - \sum_{i=1}^{D} g(z_i),$

where $z_i = x_i + 4.209687462275036 \times 10^2$ and

$g(z_i) = \begin{cases} z_i \sin\left(|z_i|^{1/2}\right), & \text{if } |z_i| \le 500, \\ \left(500 - \mathrm{mod}(z_i, 500)\right)\sin\left(\sqrt{|500 - \mathrm{mod}(z_i, 500)|}\right) - \dfrac{(z_i - 500)^2}{10000\,D}, & \text{if } z_i > 500, \\ \left(\mathrm{mod}(|z_i|, 500) - 500\right)\sin\left(\sqrt{|\mathrm{mod}(|z_i|, 500) - 500|}\right) - \dfrac{(z_i + 500)^2}{10000\,D}, & \text{if } z_i < -500, \end{cases}$

and the global optimum is shifted to $o_i = [o_{i1}, o_{i2}, \dots, o_{iD}]^T$ and rotated using the rotation matrix $M_i$.
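As a sanity check, the unshifted, unbiased Rastrigin core r(x) used above is easy to implement and has the known optimum r(0) = 0:

```python
# The Rastrigin core r(x) from the benchmark above, without the CEC2014
# shift o_r and bias 800; useful for quickly sanity-checking an optimizer.
import math

def rastrigin(x):
    return sum(xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) + 10.0
               for xi in x)
```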
For each function, 51 runs were performed with identical settings and a random initial population. The maximal allowed number of function evaluations was set to 3 × 10^5. Other DEEP settings were NP = 200, G_{max} = 1,499 and Ψ = 40. The measured error was the difference between the known optimal value and the obtained solution.
Following the methodology described in Tanabe & Fukunaga (2014), we used the Wilcoxon rank-sum test with significance level p < 0.05 to compare the evaluation results over 51 runs with the results of the top three methods from CEC2014 (Liang, Qu & Suganthan, 2014), taken from the CEC2014 report:

1. Covariance Matrix Learning and Searching Preference (CMLP) (Chen et al., 2014),

2. Success-History Based Parameter Adaptation for Differential Evolution (L-SHADE) (Tanabe & Fukunaga, 2014),

3. United Multi-Operator Evolutionary Algorithms (UMOEAs) (Elsayed et al., 2014).
The numbers of benchmark functions, out of the three tested, on which DEEP performed better (+), worse (−) or without significant difference (≈) are presented in Table 1. DEEP demonstrated the same or better performance.
DEEP vs        CMLP   L-SHADE   UMOEAs
− (worse)        0        0        0
≈ (no sig.)      3        3        1
+ (better)       0        0        2
Method testing on a reduced model of gene regulation
To demonstrate how DEEP works in applications, we developed a realistic benchmark problem based on a real biological model of the gap gene regulatory network (Kozlov et al., 2015b). The model provides a dynamical description of the gap gene regulatory system using detailed DNA-based information, as well as spatial TF concentration data at varying time points. The gap gene regulatory network controls segment determination in the early Drosophila embryo (Akam, 1987; Jaeger, 2011; Surkova et al., 2008).
The state variables of this model are the concentrations of mRNAs and proteins encoded by the four gap genes hb, Kr, gt and kni. The model implements the thermodynamic approach in the form proposed in He et al. (2010) to calculate the expression of a target gene at the RNA level. This level is proportional to the gene activation level, also called the promoter occupancy, and is determined by the concentrations of eight transcription factors Hb, Kr, Gt, Kni, Bcd, Tll, Cad and Hkb:

(7) $E_i^a(t) = \frac{Z_{ON,i}^a(t)}{Z_{ON,i}^a(t) + Z_{OFF,i}^a(t)},$

where $Z_{ON,i}^a(t)$ and $Z_{OFF,i}^a(t)$ are the statistical weights of the enhancer with the basal transcriptional complex bound and unbound, respectively.
Two sets of reaction–diffusion differential equations for mRNA concentrations $u_i^a(t)$ and protein concentrations $v_i^a(t)$ describe the dynamics of the system (Reinitz & Sharp, 1995; Jaeger et al., 2004; Kozlov et al., 2012):

(8) $du_i^a/dt = R_u^a E_i^a(t) + D_u^a(n)\left[(u_{i-1}^a - u_i^a) + (u_{i+1}^a - u_i^a)\right] - \lambda_u^a u_i^a,$

(9) $dv_i^a/dt = R_v^a u_i^a(t - \tau_v^a) + D_v^a(n)\left[(v_{i-1}^a - v_i^a) + (v_{i+1}^a - v_i^a)\right] - \lambda_v^a v_i^a,$

where n is the cleavage cycle number, $R_v^a$ and $R_u^a$ are the maximum synthesis rates, $D_v^a$ and $D_u^a$ are the diffusion coefficients (which also smooth the resulting model output), and $\lambda_v^a$ and $\lambda_u^a$ are the decay rates for the protein and mRNA of gene a. The model spans the time period of cleavage cycles 13 and 14A (c13 and c14, respectively) and the interval of the AP axis from 35% to 92% (58 nuclei) of embryo length. The number of nuclei along the AP axis is doubled when going from c13 to c14. The model is fitted to data on gap protein concentrations from the FlyEx database (Pisarev et al., 2008) and mRNA concentrations from SuperFly (Cicin-Sain et al., 2015).
To fit the model we used two objective functions: the residual sum of squared differences (RSS) between the model output and data, and the weighted Pattern Generation Potential (wPGP) proposed in Samee & Sinha (2013):

$\mathrm{RSS}(x, y) = \sum_{\forall g,n,t:\exists y_n^g(t)} \left(x_n^g(t) - y_n^g(t)\right)^2, \quad \mathrm{wPGP}(x, y) = \frac{1 + \left(\mathrm{penalty}(x, y) - \mathrm{reward}(x, y)\right)}{2},$

where g, n and t are the gene, nucleus and time point, respectively, and

$\mathrm{reward}(x, y) = \frac{\sum_i y_i \cdot \min(y_i, x_i)}{\sum_i y_i \cdot y_i}, \quad \mathrm{penalty}(x, y) = \frac{\sum_i (y_{\max} - y_i) \cdot \max(x_i - y_i, 0)}{\sum_i (y_{\max} - y_i)^2},$

where $x_i$ and $y_i$ are, respectively, the predicted and experimentally observed expression in nucleus i, and $y_{\max}$ is the maximum level of experimentally observed expression. Consequently, the combined objective function is defined by

(10) $F(q, Y) = 2 \times 10^{-7} \cdot \mathrm{RSS}(v(q), V) + 1.5 \times 10^{-7} \cdot \mathrm{RSS}(u(q), U) + \mathrm{wPGP}(v(q), V) + 0.6 \cdot \mathrm{wPGP}(u(q), U) + 10^{-8} \cdot \mathrm{Penalty}(q),$

where Y = {V, U} contains the data for u and v, the function Penalty limits the growth of the regulatory parameters, and the weights were obtained experimentally.
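The wPGP score defined above can be sketched directly in Python; this assumes, following Samee & Sinha (2013), that the penalty denominator is $\sum_i (y_{\max} - y_i)^2$ (the function name is ours).

```python
# Sketch of the weighted Pattern Generation Potential: x and y are the
# predicted and observed expression profiles over nuclei. Assumes the
# penalty denominator sum((y_max - y_i)^2), as in Samee & Sinha (2013).

def wpgp(x, y):
    y_max = max(y)
    reward = (sum(yi * min(yi, xi) for xi, yi in zip(x, y))
              / sum(yi * yi for yi in y))
    pen_den = sum((y_max - yi) ** 2 for yi in y)
    penalty = (sum((y_max - yi) * max(xi - yi, 0.0) for xi, yi in zip(x, y))
               / pen_den) if pen_den > 0 else 0.0
    return (1.0 + (penalty - reward)) / 2.0
```

A perfect prediction (x = y) gives reward 1 and penalty 0, i.e., a wPGP of 0, while a flat zero prediction scores 0.5.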
We simplified the original computationally expensive model (Kozlov et al., 2015b) to use it as a benchmark as follows. First, we reduced the number of nuclei from 58 to 10 and considered only one target gene, with the DNA sequence from kni. Consequently, the number of parameters was reduced to 34, two of which are of integer type. Biologically feasible box constraints in form (4) are imposed on 28 parameters. Next, we fitted this reduced model to the coarsened data and used the obtained solution and model parameters as synthetic data for the benchmark. Thus, the exact parameters of the benchmark optimization problem are known.
To compare DEEP and MEIGOR (Egea et al., 2014), we ran both methods under the same conditions and recorded the final value of the objective function (10), the final parameters and the number of function evaluations. We considered those solutions for which the final function value was less than 0.005, which corresponds to parameters close to the known exact values. Welch's two-sample t-test demonstrated that DEEP used fewer objective function evaluations than MEIGOR with p < 0.005 (see Fig. 1).
Real applications
The DEEP software was successfully applied to explain the dramatic decrease in gap gene expression in the early Drosophila embryo caused by a null mutation in the Kr gene. Figure 2A presents the topology of the regulatory network inferred by fitting the dynamical model of gap gene expression with 44 parameters to wild type and Kr mutant data simultaneously (Kozlov et al., 2012). Other DEEP applications include the different problems described in Ivanisenko et al. (2014) and Nuriddinov et al. (2013).
Recently, DEEP was used in the online ancestry prediction tool reAdmix, which can identify the biogeographic origins of highly mixed individuals (Kozlov et al., 2015a). reAdmix is available at http://chcb.saban-chla.usc.edu/reAdmix/.
Two applications are discussed below in detail.
Subgenomic Hepatitis C virus replicon replication model
The hepatitis C virus (HCV) causes hazardous liver diseases, frequently leading to cirrhosis and hepatocellular carcinoma. No fully effective anti-HCV therapy is available to date. The design of effective anti-HCV medicine is a challenging task due to the ability of the hepatitis C virus to rapidly acquire drug resistance. Cells containing the HCV subgenomic replicon are widely used for experimental studies of the HCV genome replication mechanisms and for the in vitro testing of tentative medicines. The HCV NS3/4A protease is essential for viral replication and therefore has been one of the most attractive targets for the development of specific antiviral agents for HCV.
We used the new algorithm and software package to determine 18 parameters (kinetic reaction constants) of the mathematical model of subgenomic hepatitis C virus (HCV) replicon replication in Huh7 cells in the presence of an HCV NS3 protease inhibitor; see Ivanisenko et al. (2013).
The experimental data include kinetic curves of viral RNA suppression at various concentrations of the VX-950 and BILN-2061 inhibitors (Lin et al., 2004; Lin et al., 2006). We sought the set of parameters that minimizes three criteria. The main criterion (RSS) is the residual sum of squared differences between the model output and data. Additional criteria 2 (F_2) and 3 (F_3) penalize the deviation of the time to steady state and of the number of viral vesicles at the steady state, respectively.
The combined criterion was defined as follows: (11)${F}_{\mathrm{combined}}=\mathrm{RSS}+0.1\cdot {F}_{2}+0.1\cdot {F}_{3}$ where the weights were obtained experimentally. The dependence of the best value of the combined criterion (11) in the population on the generation number for 10 runs is plotted in Fig. 3A. The objective function is evaluated once for each member of the generation, the size of which was set to 200.
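As an illustration only (function and variable names here are hypothetical, and the weights 0.1 are the experimentally obtained values from Eq. (11)), the combined criterion can be sketched in Python, one of the languages DEEP supports for user-defined objective functions:

```python
import numpy as np

def rss(model_output, data):
    # Residual sum of squared differences between model solution and data
    r = np.asarray(model_output, dtype=float) - np.asarray(data, dtype=float)
    return float(np.sum(r ** 2))

def combined_criterion(model_output, data, f2, f3, w2=0.1, w3=0.1):
    # Weighted sum of the main criterion and the two penalties, as in Eq. (11)
    return rss(model_output, data) + w2 * f2 + w3 * f3
```

Here `f2` and `f3` stand for the already-computed penalties on the time to steady state and on the number of viral vesicles; their exact form is model-specific.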
Plots of the criteria in the close vicinity of the optimal values of two parameters from the set are shown in Figs. 3B and 3C. Although the criteria do not attain their minima at one and the same point, the algorithm produces a reliable approximation of the optimum.
The comparison of the model output with the experimental dependencies of the viral RNA suppression rate on inhibitor concentration is shown in Figs. 3D and 3E. Notably, the model correctly reproduces the experimental kinetics of viral RNA suppression.
The predictive power of the model was estimated using experimental data on the dependence of the viral RNA suppression rate on increasing concentrations of the SCH503034 (Malcolm et al., 2006) and ITMN191 (Seiwert et al., 2008) inhibitors. These data were not used for parameter estimation. As can be seen in Figs. 3F and 3G, the model correctly reproduces the experimental observations and thus can be used for in silico studies.
Sequence-based model of the gap gene regulatory network
Recently, the DEEP method was successfully applied to recover 68 parameters of the DNA sequence-based model (7)–(8) of the regulatory network of four gap genes (hb, Kr, gt, and kni) and eight transcription factors: Hb, Kr, Gt, Kni, Bcd, Tll, Cad and Hkb (Kozlov et al., 2015b). The trained model provides a tool to estimate the importance of each TF binding site for the model output (see Fig. 2B). We showed that functionally important sites are not exclusively located in cis-regulatory elements and that sites with low regulatory weight are important for the model output (Kozlov et al., 2014).
The evolution of the three objective functions during one optimization run is shown in Fig. 2C. Note that the wPGP and Penalty functions do not decline monotonically or simultaneously. In the first few steps these functions reach their maximal values while RSS falls sharply, which corresponds to the adaptation of the control parameters of the algorithm and the substitution of old parameter sets with good ones. Then wPGP starts to decay and Penalty fluctuates at a high level, while RSS decays at approximately the same rate as wPGP. As Penalty depends only on the regulatory parameters, its behaviour at this stage illustrates that it prevents the process from being trapped in a local minimum with extreme parameter values. During the second half of the optimization process, Penalty reaches its final low level and remains almost constant until convergence, while RSS and wPGP exhibit a modest growth and then converge. This illustrates the ability of DEEP to balance several objective functions. The model output at this stage does not change much, as indicated by RSS, although the absolute values of the regulatory parameters are fine-tuned.
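The role of Penalty can be illustrated with a toy bound-type penalty on the regulatory parameters. The form below is hypothetical and only conveys the idea of discouraging extreme parameter values; the actual function used with the gap gene model differs:

```python
import numpy as np

def bound_penalty(reg_params, limit=10.0):
    # Hypothetical penalty: zero inside [-limit, limit] and growing
    # quadratically once a regulatory parameter leaves that interval,
    # discouraging the extreme values mentioned in the text.
    x = np.abs(np.asarray(reg_params, dtype=float))
    excess = np.maximum(x - limit, 0.0)
    return float(np.sum(excess ** 2))
```

A penalty of this shape does not affect the fit while parameters stay in a plausible range, but steers the search away from local minima with extreme values.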
Conclusions
The parallelization of the objective function calculation implemented in the DEEP method considerably reduces the computational time. Several members of the current generation are evaluated in parallel, which, in our experience with the sequence-based model of the gap gene regulatory network, resulted in a 24-fold speedup on a 24-core computational node (Intel Xeon 5670, Joint Supercomputer Center of the Russian Academy of Sciences, Moscow). The calculation of 24 objective functions in parallel threads took approximately the same 20 s as one sequential evaluation, and the optimization runs converged in 14 h after approximately 60,000 function evaluations.
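This scheme can be sketched as follows (a minimal illustration with hypothetical names; DEEP itself is implemented in C, and the placeholder objective stands in for an expensive model-fit evaluation; in pure Python a real speedup additionally requires the objective to release the GIL, e.g. by calling a native solver):

```python
from concurrent.futures import ThreadPoolExecutor

def objective(params):
    # Placeholder standing in for an expensive model-fit evaluation
    return sum(p * p for p in params)

def evaluate_generation(population, workers=24):
    # Evaluate all members of the current generation concurrently,
    # mirroring DEEP's parallel objective-function evaluation
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(objective, population))
```

When every evaluation has roughly uniform cost and the worker count matches the core count, the wall time per generation approaches that of a single evaluation, which is the effect behind the 24-fold speedup reported above.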
To sum up, we have developed both a method and software that demonstrate high performance on test functions and on the biological problems of finding parameters in dynamical models of biological processes by minimizing one or even several objective functions that measure the deviation of the model solution from the data.