PeerJ Computer Science: Optimization Theory and Computation
https://peerj.com/articles/index.atom?journal=cs&subject=10800
Optimization Theory and Computation articles published in PeerJ Computer Science

Adaptive divergence for rapid adversarial optimization
https://peerj.com/articles/cs-274
Published: 2020-05-18
Maxim Borisyak, Tatiana Gaintseva, Andrey Ustyuzhanin
Adversarial Optimization provides a reliable, practical way to match two implicitly defined distributions, one of which is typically represented by a sample of real data and the other by a parameterized generator. Matching is achieved by minimizing a divergence between the distributions, and estimating that divergence involves a secondary optimization task, which typically requires training a model to discriminate between them. The choice of model involves a trade-off: high-capacity models provide good estimates of the divergence but generally require large sample sizes to be properly trained, whereas low-capacity models need fewer samples for training but may produce biased estimates. The computational cost of Adversarial Optimization becomes significant when sampling from the generator is expensive; one practical example of such a setting is fine-tuning the parameters of complex computer simulations. In this work, we introduce a novel family of divergences that enables faster optimization convergence, measured by the number of samples drawn from the generator. Varying the capacity of the underlying discriminator model during optimization leads to a significant speed-up: the proposed divergence family uses low-capacity models to compare distant distributions (typically at early optimization steps) and gradually grows the capacity as the distributions become closer to each other, which significantly accelerates the initial stages of optimization. This acceleration is demonstrated on two fine-tuning problems involving the Pythia event generator and two of the most popular black-box optimization algorithms: Bayesian Optimization and Variational Optimization. Experiments show that, given the same budget, adaptive divergences yield results up to an order of magnitude closer to the optimum than the Jensen-Shannon divergence. While we consider physics-related simulations, adaptive divergences can be applied to any stochastic simulation.
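As a toy illustration of this capacity schedule, the sketch below estimates a Jensen-Shannon-style divergence between two one-dimensional samples using a histogram "discriminator" whose bin count stands in for model capacity; the bin counts and switching threshold are illustrative assumptions, not the discriminator models used in the paper.

```python
import numpy as np

def js_estimate(real, gen, bins):
    """Histogram-based divergence estimate of a given capacity (bin count).
    Coarser histograms give cheaper but more biased estimates; finer ones
    need more samples to be reliable."""
    lo = min(real.min(), gen.min())
    hi = max(real.max(), gen.max())
    edges = np.linspace(lo, hi, bins + 1)
    p = np.histogram(real, edges)[0] / len(real)
    q = np.histogram(gen, edges)[0] / len(gen)
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0          # m > 0 wherever a > 0, so the ratio is safe
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def adaptive_divergence(real, gen, capacities=(4, 16, 64), threshold=0.1):
    """Use the cheapest discriminator that is still informative: low capacity
    while the distributions are clearly distinct, higher capacity once the
    low-capacity estimate drops below the threshold."""
    for bins in capacities:
        d = js_estimate(real, gen, bins)
        if d >= threshold:            # still far apart: cheap model suffices
            return d, bins
    return d, capacities[-1]          # close distributions: full capacity
```

With distant distributions the 4-bin model already reports a large divergence, so no samples are "wasted" on training a fine-grained model; for nearly matched distributions the scheme falls through to the highest capacity.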
A new non-monotonic infeasible simplex-type algorithm for Linear Programming
https://peerj.com/articles/cs-265
Published: 2020-03-30
Charalampos P. Triantafyllidis, Nikolaos Samaras
This paper presents a new simplex-type algorithm for Linear Programming with two main characteristics: (i) the algorithm computes basic solutions that are neither primal nor dual feasible, nor monotonically improving, and (ii) the sequence of these basic solutions is connected with a sequence of monotonically improving interior points to construct a feasible direction at each iteration. We compare the proposed algorithm with the state-of-the-art commercial CPLEX and Gurobi primal-simplex optimizers on a collection of 93 well-known benchmarks. The results are promising, showing that the new algorithm is competitive with the state-of-the-art solvers in the total number of iterations required to converge.
An evolutionary decomposition-based multi-objective feature selection for multi-label classification
https://peerj.com/articles/cs-261
Published: 2020-03-02
Azam Asilian Bidgoli, Hossein Ebrahimpour-Komleh, Shahryar Rahnamayan
Data classification is a fundamental task in data mining. Within this field, the classification of multi-labeled data has received considerable attention in recent years. In such problems, each data entity can simultaneously belong to several categories. Multi-label classification is important because of many recent real-world applications in which each entity has more than one label. To improve the performance of multi-label classification, feature selection plays an important role. It involves identifying and removing irrelevant and redundant features that unnecessarily enlarge the search space of the classification problem. However, classification may fail with an extreme decrease in the number of relevant features. Thus, minimizing the number of features and maximizing the classification accuracy are two desirable but conflicting objectives in multi-label feature selection. In this article, we introduce a multi-objective optimization algorithm customized for selecting the features of multi-label data. The proposed algorithm is an enhanced variant of a decomposition-based multi-objective optimization approach, in which the multi-label feature selection problem is divided into single-objective subproblems that can be solved simultaneously using an evolutionary algorithm. This approach accelerates the optimization process and finds more diverse feature subsets. The proposed method benefits from a local search operator to find better solutions for each subproblem. We also define a pool of genetic operators to generate new feature subsets from the previous generation. To evaluate the performance of the proposed algorithm, we compare it with two other multi-objective feature selection approaches on eight real-world benchmark datasets that are commonly used for multi-label classification. The reported results for multi-objective evaluation measures, such as the hypervolume indicator and set coverage, show an improvement over the compared methods. Moreover, the proposed method achieves better classification accuracy with fewer features than state-of-the-art methods.
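The decomposition idea can be sketched as follows: each weight vector turns the two objectives (maximize accuracy, minimize feature count) into one scalar subproblem, and all subproblems evolve in parallel. The toy accuracy function, weight grid, and lone bit-flip operator below are illustrative placeholders for the paper's classifier-based evaluation, local search, and genetic-operator pool.

```python
import numpy as np

rng = np.random.default_rng(0)

def accuracy(mask):
    """Toy stand-in for classifier accuracy: features 0-4 are 'relevant',
    the rest only add a small noise penalty (a real run would train and
    evaluate a multi-label classifier on the selected features)."""
    return mask[:5].sum() / 5 - 0.02 * mask[5:].sum()

def decomposed_feature_selection(n_features=20, n_weights=5, generations=50):
    """Weighted-sum decomposition: one bitmask (feature subset) per weight
    vector, improved greedily with bit-flip mutation."""
    weights = np.linspace(0.1, 0.9, n_weights)
    pop = rng.integers(0, 2, size=(n_weights, n_features))

    def scalar_fitness(mask, w):
        # w trades off accuracy against the (normalized) feature count
        return w * accuracy(mask) - (1 - w) * mask.sum() / n_features

    for _ in range(generations):
        for i, w in enumerate(weights):
            child = pop[i].copy()
            child[rng.integers(0, n_features)] ^= 1   # bit-flip mutation
            if scalar_fitness(child, w) > scalar_fitness(pop[i], w):
                pop[i] = child                        # greedy replacement
    return pop
```

Each row of the returned population approximates the best subset for one accuracy-vs-size trade-off, so the rows together sketch a Pareto front.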
Intrinsic RGB and multispectral images recovery by independent quadratic programming
https://peerj.com/articles/cs-256
Published: 2020-02-10
Alexandre Krebs, Yannick Benezeth, Franck Marzani
This work introduces a method to estimate reflectance, shading, and specularity from a single image. Reflectance, shading, and specularity are intrinsic images derived from the dichromatic model, and their estimation has many applications in computer vision, such as shape recovery, specularity removal, segmentation, and classification. The proposed method recovers the dichromatic model parameters through two independent quadratic programming steps. Compared with the state of the art in this domain, our approach has the advantage of splitting a complex inverse problem into two parallelizable optimization steps that are easy to solve and require no learning. The proposed method extends a previous algorithm, rewritten here to be numerically more stable, to achieve better quantitative and qualitative results, and to handle multispectral images. The method is assessed qualitatively and quantitatively on standard RGB and multispectral datasets.
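Each such step reduces to a small quadratic program. As a hedged illustration, the sketch below solves the generic nonnegative least-squares form min ||Ax - b||^2 s.t. x >= 0 with projected gradient descent; the paper's actual variables and constraints come from the dichromatic model and are not reproduced here.

```python
import numpy as np

def qp_nonneg(A, b, steps=500):
    """Projected-gradient solver for  min ||Ax - b||^2  s.t.  x >= 0.
    This is a generic stand-in for one quadratic-programming step:
    gradient descent with a fixed step size, followed by projection
    onto the nonnegative orthant."""
    L = np.linalg.norm(A, 2) ** 2          # spectral norm^2: step-size bound
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        grad = A.T @ (A @ x - b)           # (half the) least-squares gradient
        x = np.maximum(0.0, x - grad / L)  # descent step + projection
    return x
```

Because the two programs in the paper are independent, two such solves can run in parallel, which is exactly the parallelizability the abstract highlights.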
HACSim: an R package to estimate intraspecific sample sizes for genetic diversity assessment using haplotype accumulation curves
https://peerj.com/articles/cs-243
Published: 2020-01-06
Jarrett D. Phillips, Steven H. French, Robert H. Hanner, Daniel J. Gillis
Assessing levels of standing genetic variation within species requires robust sampling for the purpose of accurate specimen identification using molecular techniques such as DNA barcoding; however, statistical estimators of what constitutes a robust sample are currently lacking. Such estimates are needed because most species are currently represented by only one or a few sequences in existing databases, which can safely be assumed to be undersampled. Unfortunately, sample sizes of 5–10 specimens per species, typical of DNA barcoding studies, are often insufficient to adequately capture within-species genetic diversity. Here, we introduce a novel iterative extrapolation algorithm for simulating haplotype accumulation curves, called HACSim (Haplotype Accumulation Curve Simulator), that can be employed to calculate the sample sizes likely needed to observe the full range of DNA barcode haplotype variation that exists for a species. Using uniform and non-uniform haplotype frequency distributions, the notion of sampling sufficiency (the sample size at which sampling accuracy is maximized and above which no new sampling information is likely to be gained) can be gleaned. HACSim can be employed in two primary ways to estimate specimen sample sizes: (1) to simulate haplotype sampling in hypothetical species, and (2) to simulate haplotype sampling in real species mined from public reference sequence databases such as the Barcode of Life Data Systems (BOLD) or GenBank, for any genomic marker of interest. While our algorithm is globally convergent, runtime is heavily dependent on initial sample sizes and on the skewness of the corresponding haplotype frequency distribution.
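HACSim itself is an R package; the Python sketch below only illustrates the underlying idea of a haplotype accumulation curve, estimating by Monte Carlo simulation the smallest sample size at which a target fraction of a species' haplotypes has, on average, been observed. The frequency vector, target fraction, and replicate count are illustrative choices, not HACSim defaults.

```python
import numpy as np

rng = np.random.default_rng(1)

def haplotype_accumulation(freqs, max_n=500, target=0.95, reps=200):
    """Monte Carlo haplotype accumulation curve: repeatedly draw specimens
    from the haplotype frequency distribution `freqs`, track how many
    distinct haplotypes have appeared after each draw, and report the
    smallest sample size whose mean recovered fraction meets `target`."""
    H = len(freqs)
    seen = np.zeros(max_n)
    for _ in range(reps):
        draws = rng.choice(H, size=max_n, p=freqs)
        observed = set()
        for n, h in enumerate(draws):
            observed.add(h)
            seen[n] += len(observed)
    curve = seen / (reps * H)                 # mean fraction recovered
    n_sufficient = int(np.argmax(curve >= target)) + 1
    return curve, n_sufficient
```

Skewed frequency vectors (one common haplotype, many rare ones) push `n_sufficient` up sharply, mirroring the abstract's note that runtime and required sample size depend heavily on the skewness of the distribution.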
A unified approach for cluster-wise and general noise rejection approaches for k-means clustering
https://peerj.com/articles/cs-238
Published: 2019-11-18
Seiki Ubukata
Hard C-means (HCM; k-means) is one of the most widely used partitive clustering techniques. However, HCM is strongly affected by noise objects and cannot represent cluster overlap. To reduce the influence of noise objects, some noise rejection approaches, including general noise rejection (GNR) and cluster-wise noise rejection (CNR), reject objects distant from cluster centers. Generalized rough C-means (GRCM), drawing on rough set theory, can deal with positive, negative, and boundary belonging of objects to clusters. GRCM realizes cluster overlap through a linear-function-threshold-based object-cluster assignment. In this study, as a unified approach to GNR and CNR in HCM, we propose linear function threshold-based C-means (LiFTCM) by relaxing GRCM. We show that the linear-function-threshold-based assignment in LiFTCM includes GNR, CNR, and their combinations, as well as the rough assignment of GRCM. The classification boundary is visualized so that the characteristics of LiFTCM under various parameter settings are clarified. Numerical experiments demonstrate that the combinations of rough clustering with GNR and CNR realized by LiFTCM yield satisfactory results.
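The GNR special case can be sketched as hard C-means with a distance threshold: objects farther than `d_max` from every center are labeled noise and excluded from center updates. This toy version (with a naive deterministic initialization and a fixed constant threshold) covers only one corner of the LiFTCM family described above.

```python
import numpy as np

def hcm_noise_rejection(X, k, d_max, iters=20):
    """Hard C-means with general noise rejection (GNR): objects farther
    than d_max from every cluster center are labeled noise (-1) and
    excluded from the center update. A sketch of one special case of
    the linear-threshold family, not the full LiFTCM."""
    centers = X[:k].astype(float).copy()     # naive deterministic init
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)            # nearest-center assignment
        labels[d.min(axis=1) > d_max] = -1   # reject distant objects
        for c in range(k):
            members = X[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return centers, labels
```

Replacing the constant threshold with a linear function of the distances, applied per cluster rather than globally, is the direction in which LiFTCM generalizes this sketch.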
Randomized routing of virtual machines in IaaS data centers
https://peerj.com/articles/cs-211
Published: 2019-09-02
Hadi Khani, Hamed Khanmirza
Cloud computing technology has been a game changer in recent years. Cloud providers promise cost-effective, on-demand computing resources for their users, running users' workloads as virtual machines (VMs) in large-scale data centers consisting of a few thousand physical servers. Cloud data centers face highly dynamic workloads that vary over time, with many short tasks that demand quick resource-management decisions. These data centers are large in scale, and workload behavior is unpredictable. Each incoming VM must be assigned to a suitable physical machine (PM) in order to balance power consumption against quality of service. The scale and agility of cloud computing data centers are unprecedented, so previous approaches fall short. We propose an analytical model for cloud data centers in which the number of PMs is large. In particular, we focus on assigning VMs to PMs regardless of their current load. For exponentially distributed VM arrivals with generally distributed sojourn times, we calculate the mean power consumption. We then show that the minimum power consumption under a quality-of-service constraint is achieved by randomized assignment of incoming VMs to PMs. Extensive simulations support the validity of our analytical model.
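The load-oblivious policy is easy to simulate. The sketch below draws Poisson VM arrivals with generally distributed sojourn times (lognormal here, an arbitrary choice for illustration) and places each VM on a uniformly random PM, returning the per-PM load from which a power model could then be evaluated; the parameter values are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_random_routing(n_pms=100, n_vms=5000, mean_sojourn=10.0):
    """Sketch of load-oblivious randomized routing: each arriving VM is
    placed on a uniformly random PM, ignoring current load. Returns the
    number of VMs still active on each PM at the end of the arrival
    stream (power models typically grow with per-PM load)."""
    arrivals = np.cumsum(rng.exponential(1.0, n_vms))   # Poisson arrivals
    sojourns = rng.lognormal(np.log(mean_sojourn), 1.0, n_vms)  # general dist.
    departures = arrivals + sojourns
    pms = rng.integers(0, n_pms, n_vms)                 # uniform random PM
    t = arrivals[-1]                                    # observation instant
    active = departures > t                             # VMs still running
    return np.bincount(pms[active], minlength=n_pms)
```

For a large `n_pms`, the resulting loads concentrate around the offered load per PM, which is the regime in which the paper's analytical model operates.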
Matching individual attributes with task types in collaborative citizen science
https://peerj.com/articles/cs-209
Published: 2019-07-29
Shinnosuke Nakayama, Marina Torre, Oded Nov, Maurizio Porfiri
In citizen science, participants’ productivity is imperative to project success. We investigate the feasibility of a collaborative approach to citizen science in which productivity is enhanced by capitalizing on the diversity of individual attributes among participants. Specifically, we explore the possibility of enhancing productivity by integrating multiple individual attributes to inform which task should be assigned to which individual. To that end, we collect data in an online citizen science project composed of two task types: (i) filtering images of interest from an image repository within a limited time, and (ii) placing tags on the objects in the filtered images with unlimited time. The first task is assigned to those with more experience playing action video games, and the second task to those with higher intrinsic motivation to participate. While each attribute alone has weak predictive power for task performance, we demonstrate a greater increase in productivity when assigning participants to tasks based on a combination of these attributes. We acknowledge that this increase is modest compared with the case where participants are randomly assigned to tasks, which could offset the effort of implementing our attribute-based task assignment scheme. This study constitutes a first step toward understanding and capitalizing on individual differences in attributes to enhance productivity in collaborative citizen science.
An integrated platform for intuitive mathematical programming modeling using LaTeX
https://peerj.com/articles/cs-161
Published: 2018-09-10
Charalampos P. Triantafyllidis, Lazaros G. Papageorgiou
This paper presents a novel prototype platform that uses the same LaTeX mark-up language commonly used to typeset mathematical content as an input language for modeling optimization problems of various classes. The platform converts the LaTeX model into a formal Algebraic Modeling Language (AML) representation based on Pyomo through a parsing engine written in Python, and solves it either via the NEOS server or locally installed solvers, through a friendly Graphical User Interface (GUI). The distinct advantages of our approach are: (i) simplification and speed-up of the model design and development process, (ii) non-commercial character, (iii) cross-platform support, (iv) easier detection of typos and logic errors in model descriptions, and (v) minimal required working knowledge of programming and AMLs for mathematical programming modeling. Overall, this is a presentation of a complete workable scheme for using LaTeX in mathematical programming modeling, which assists in furthering our ability to reproduce and replicate scientific work.
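As a miniature of the parsing idea (not the platform's actual grammar, which handles full optimization models and emits Pyomo), the sketch below extracts coefficients from a LaTeX-style linear expression; the variable naming convention `x_i` is an assumption for the example.

```python
import re

def parse_linear(expr):
    """Toy parser for a LaTeX-style linear expression such as
    '2x_1 + 3x_2 - x_3', returning {variable: coefficient}. A hypothetical
    miniature of the paper's parsing engine, which targets Pyomo and
    supports far richer model syntax."""
    coeffs = {}
    # optional sign, optional numeric coefficient, then a variable x_<index>
    for sign, num, var in re.findall(r'([+-]?)\s*(\d*\.?\d*)\s*(x_\d+)', expr):
        c = float(num) if num else 1.0      # bare 'x_3' means coefficient 1
        if sign == '-':
            c = -c
        coeffs[var] = coeffs.get(var, 0.0) + c
    return coeffs
```

In the real platform the result of such a pass would populate a Pyomo objective or constraint rather than a plain dictionary.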
Interval Coded Scoring: a toolbox for interpretable scoring systems
https://peerj.com/articles/cs-150
Published: 2018-04-02
Lieven Billiet, Sabine Van Huffel, Vanya Van Belle
Over the last decades, clinical decision support systems have been gaining importance. They help clinicians make effective use of the overload of available information to obtain correct diagnoses and appropriate treatments. However, their power often comes at the cost of a black-box model that cannot be interpreted easily. This interpretability is of paramount importance in a medical setting, with regard to trust and (legal) responsibility. In contrast, existing medical scoring systems are easy to understand and use, but they are often a simplified rule-of-thumb summary of previous medical experience rather than a well-founded system based on available data. Interval Coded Scoring (ICS) connects these two approaches, exploiting the power of sparse optimization to derive scoring systems from training data. The presented toolbox interface makes this theory easily applicable to both small and large datasets. It contains two possible problem formulations, based on linear programming or the elastic net. Both allow the user to construct a model for a binary classification problem and to establish risk profiles that can be used for future diagnosis. All of this requires only a few lines of code. ICS differs from standard machine learning in that its model consists of interpretable main effects and interactions. Furthermore, expert knowledge can be incorporated because the training can be semi-automatic; this allows end users to make a trade-off between complexity and performance based on cross-validation results and expert knowledge. Additionally, the toolbox offers an accessible way to assess classification performance via accuracy and the ROC curve, whereas the calibration of the risk profile can be evaluated via a calibration curve. Finally, the colour-coded model visualization has particular appeal if one wants to apply ICS manually to new observations, as well as for validation by experts in the specific application domains. The validity and applicability of the toolbox are demonstrated by comparing it to standard machine learning approaches such as Naive Bayes and Support Vector Machines on several real-life datasets. These case studies on medical problems show its applicability as a decision support system. ICS performs similarly in terms of classification and calibration; its slightly lower performance is offset by its model simplicity, which makes it the method of choice when interpretability is a key issue.
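The spirit of such a scoring system can be sketched as follows: fit a sparsity-inducing (here L1-penalized) logistic model, then rescale and round its weights to small integer points. This is only a stand-in; ICS's actual linear-programming and elastic-net formulations, interval coding of features, and interaction terms are considerably richer, and the penalty, learning rate, and point scale below are illustrative choices.

```python
import numpy as np

def score_points(X, y, l1=0.1, lr=0.1, epochs=500, max_points=5):
    """Sketch of a sparse, interpretable scoring system: proximal gradient
    descent on L1-penalized logistic loss, then weights rescaled and
    rounded to integer 'points' a clinician could sum by hand."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-(X @ w)))          # predicted probabilities
        w -= lr * (X.T @ (p - y) / n)           # logistic-loss gradient step
        w = np.sign(w) * np.maximum(np.abs(w) - lr * l1, 0)  # soft-threshold
    scale = max_points / (np.abs(w).max() + 1e-12)
    return np.round(w * scale).astype(int)      # integer score points
```

The L1 soft-thresholding drives uninformative weights to exactly zero, so the final point table stays short, which is the interpretability property the toolbox is built around.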