PeerJ Computer Sciencehttps://peerj.com/articles/index.atom?journal=cs Articles published in PeerJ Computer ScienceParallelisation of equation-based simulation programs on heterogeneous computing systemshttps://peerj.com/articles/cs-1602018-08-132018-08-13Dragan D. Nikolić
Numerical solutions of equation-based simulations require computationally intensive tasks such as evaluation of model equations, linear algebra operations and solution of systems of linear equations. The focus in this work is on parallel evaluation of model equations on shared memory systems such as general purpose processors (multi-core CPUs and manycore devices), streaming processors (Graphics Processing Units and Field Programmable Gate Arrays) and heterogeneous systems. The current approaches for evaluation of model equations are reviewed and their capabilities and shortcomings analysed. Since stream computing differs from traditional computing in that the system processes a sequential stream of elements, equations must be transformed into a data structure suitable for both types. The postfix notation expression stacks are recognised as a platform and programming language independent method to describe, store in computer memory and evaluate general systems of differential and algebraic equations of any size. Each mathematical operation and its operands are described by a specially designed data structure, and every equation is transformed into an array of these structures (a Compute Stack). Compute Stacks are evaluated by a stack machine using a Last In First Out queue. The stack machine is implemented in the DAE Tools modelling software in the C99 language using two Application Programming Interface (APIs)/frameworks for parallelism. The Open Multi-Processing (OpenMP) API is used for parallelisation on general purpose processors, and the Open Computing Language (OpenCL) framework is used for parallelisation on streaming processors and heterogeneous systems. The performance of the sequential Compute Stack approach is compared to the direct C++ implementation and to the previous approach that uses evaluation trees. The new approach is 45% slower than the C++ implementation and more than five times faster than the previous one. The OpenMP and OpenCL implementations are tested on three medium-scale models using a multi-core CPU, a discrete GPU, an integrated GPU and heterogeneous computing setups. Execution times are compared and analysed and the advantages of the OpenCL implementation running on a discrete GPU and heterogeneous systems are discussed. It is found that the evaluation of model equations using the parallel OpenCL implementation running on a discrete GPU is up to twelve times faster than the sequential version while the overall simulation speed-up gained is more than three times.
Numerical solutions of equation-based simulations require computationally intensive tasks such as evaluation of model equations, linear algebra operations and solution of systems of linear equations. The focus in this work is on parallel evaluation of model equations on shared memory systems such as general purpose processors (multi-core CPUs and manycore devices), streaming processors (Graphics Processing Units and Field Programmable Gate Arrays) and heterogeneous systems. The current approaches for evaluation of model equations are reviewed and their capabilities and shortcomings analysed. Since stream computing differs from traditional computing in that the system processes a sequential stream of elements, equations must be transformed into a data structure suitable for both types. The postfix notation expression stacks are recognised as a platform and programming language independent method to describe, store in computer memory and evaluate general systems of differential and algebraic equations of any size. Each mathematical operation and its operands are described by a specially designed data structure, and every equation is transformed into an array of these structures (a Compute Stack). Compute Stacks are evaluated by a stack machine using a Last In First Out queue. The stack machine is implemented in the DAE Tools modelling software in the C99 language using two Application Programming Interface (APIs)/frameworks for parallelism. The Open Multi-Processing (OpenMP) API is used for parallelisation on general purpose processors, and the Open Computing Language (OpenCL) framework is used for parallelisation on streaming processors and heterogeneous systems. The performance of the sequential Compute Stack approach is compared to the direct C++ implementation and to the previous approach that uses evaluation trees. The new approach is 45% slower than the C++ implementation and more than five times faster than the previous one. The OpenMP and OpenCL implementations are tested on three medium-scale models using a multi-core CPU, a discrete GPU, an integrated GPU and heterogeneous computing setups. Execution times are compared and analysed and the advantages of the OpenCL implementation running on a discrete GPU and heterogeneous systems are discussed. It is found that the evaluation of model equations using the parallel OpenCL implementation running on a discrete GPU is up to twelve times faster than the sequential version while the overall simulation speed-up gained is more than three times.Temporal constrained objects for modelling neuronal dynamicshttps://peerj.com/articles/cs-1592018-07-232018-07-23Manjusha NairJinesh Manchan KannimoolaBharat JayaramanBipin NairShyam Diwakar
Background
Several new programming languages and technologies have emerged in the past few decades in order to ease the task of modelling complex systems. Modelling the dynamics of complex systems requires various levels of abstractions and reductive measures in representing the underlying behaviour. This also often requires making a trade-off between how realistic a model should be in order to address the scientific questions of interest and the computational tractability of the model.
Methods
In this paper, we propose a novel programming paradigm, called temporal constrained objects, which facilitates a principled approach to modelling complex dynamical systems. Temporal constrained objects are an extension of constrained objects with a focus on the analysis and prediction of the dynamic behaviour of a system. The structural aspects of a neuronal system are represented using objects, as in object-oriented languages, while the dynamic behaviour of neurons and synapses are modelled using declarative temporal constraints. Computation in this paradigm is a process of constraint satisfaction within a time-based simulation.
Results
We identified the feasibility and practicality in automatically mapping different kinds of neuron and synapse models to the constraints of temporal constrained objects. Simple neuronal networks were modelled by composing circuit components, implicitly satisfying the internal constraints of each component and interface constraints of the composition. Simulations show that temporal constrained objects provide significant conciseness in the formulation of these models. The underlying computational engine employed here automatically finds the solutions to the problems stated, reducing the code for modelling and simulation control. All examples reported in this paper have been programmed and successfully tested using the prototype language called TCOB. The code along with the programming environment are available at http://github.com/compneuro/TCOB_Neuron.
Discussion
Temporal constrained objects provide powerful capabilities for modelling the structural and dynamic aspects of neural systems. Capabilities of the constraint programming paradigm, such as declarative specification, the ability to express partial information and non-directionality, and capabilities of the object-oriented paradigm especially aggregation and inheritance, make this paradigm the right candidate for complex systems and computational modelling studies. With the advent of multi-core parallel computer architectures and techniques or parallel constraint-solving, the paradigm of temporal constrained objects lends itself to highly efficient execution which is necessary for modelling and simulation of large brain circuits.
Background
Several new programming languages and technologies have emerged in the past few decades in order to ease the task of modelling complex systems. Modelling the dynamics of complex systems requires various levels of abstractions and reductive measures in representing the underlying behaviour. This also often requires making a trade-off between how realistic a model should be in order to address the scientific questions of interest and the computational tractability of the model.
Methods
In this paper, we propose a novel programming paradigm, called temporal constrained objects, which facilitates a principled approach to modelling complex dynamical systems. Temporal constrained objects are an extension of constrained objects with a focus on the analysis and prediction of the dynamic behaviour of a system. The structural aspects of a neuronal system are represented using objects, as in object-oriented languages, while the dynamic behaviour of neurons and synapses are modelled using declarative temporal constraints. Computation in this paradigm is a process of constraint satisfaction within a time-based simulation.
Results
We identified the feasibility and practicality in automatically mapping different kinds of neuron and synapse models to the constraints of temporal constrained objects. Simple neuronal networks were modelled by composing circuit components, implicitly satisfying the internal constraints of each component and interface constraints of the composition. Simulations show that temporal constrained objects provide significant conciseness in the formulation of these models. The underlying computational engine employed here automatically finds the solutions to the problems stated, reducing the code for modelling and simulation control. All examples reported in this paper have been programmed and successfully tested using the prototype language called TCOB. The code along with the programming environment are available at http://github.com/compneuro/TCOB_Neuron.
Discussion
Temporal constrained objects provide powerful capabilities for modelling the structural and dynamic aspects of neural systems. Capabilities of the constraint programming paradigm, such as declarative specification, the ability to express partial information and non-directionality, and capabilities of the object-oriented paradigm especially aggregation and inheritance, make this paradigm the right candidate for complex systems and computational modelling studies. With the advent of multi-core parallel computer architectures and techniques or parallel constraint-solving, the paradigm of temporal constrained objects lends itself to highly efficient execution which is necessary for modelling and simulation of large brain circuits.Verifiability in computer-aided research: the role of digital scientific notations at the human-computer interfacehttps://peerj.com/articles/cs-1582018-07-232018-07-23Konrad Hinsen
Most of today’s scientific research relies on computers and software for processing scientific information. Examples of such computer-aided research are the analysis of experimental data or the simulation of phenomena based on theoretical models. With the rapid increase of computational power, scientific software has integrated more and more complex scientific knowledge in a black-box fashion. As a consequence, its users do not know, and do not even have a chance of finding out, which assumptions and approximations their computations are based on. This black-box nature of scientific software has made the verification of much computer-aided research close to impossible. The present work starts with an analysis of this situation from the point of view of human-computer interaction in scientific research. It identifies the key role of digital scientific notations at the human-computer interface, reviews the most popular ones in use today, and describes a proof-of-concept implementation of Leibniz, a language designed as a verifiable digital scientific notation for models formulated as mathematical equations.
Most of today’s scientific research relies on computers and software for processing scientific information. Examples of such computer-aided research are the analysis of experimental data or the simulation of phenomena based on theoretical models. With the rapid increase of computational power, scientific software has integrated more and more complex scientific knowledge in a black-box fashion. As a consequence, its users do not know, and do not even have a chance of finding out, which assumptions and approximations their computations are based on. This black-box nature of scientific software has made the verification of much computer-aided research close to impossible. The present work starts with an analysis of this situation from the point of view of human-computer interaction in scientific research. It identifies the key role of digital scientific notations at the human-computer interface, reviews the most popular ones in use today, and describes a proof-of-concept implementation of Leibniz, a language designed as a verifiable digital scientific notation for models formulated as mathematical equations.Combining active learning suggestionshttps://peerj.com/articles/cs-1572018-07-232018-07-23Alasdair TranCheng Soon OngChristian Wolf
We study the problem of combining active learning suggestions to identify informative training examples by empirically comparing methods on benchmark datasets. Many active learning heuristics for classification problems have been proposed to help us pick which instance to annotate next. But what is the optimal heuristic for a particular source of data? Motivated by the success of methods that combine predictors, we combine active learners with bandit algorithms and rank aggregation methods. We demonstrate that a combination of active learners outperforms passive learning in large benchmark datasets and removes the need to pick a particular active learner a priori. We discuss challenges to finding good rewards for bandit approaches and show that rank aggregation performs well.
We study the problem of combining active learning suggestions to identify informative training examples by empirically comparing methods on benchmark datasets. Many active learning heuristics for classification problems have been proposed to help us pick which instance to annotate next. But what is the optimal heuristic for a particular source of data? Motivated by the success of methods that combine predictors, we combine active learners with bandit algorithms and rank aggregation methods. We demonstrate that a combination of active learners outperforms passive learning in large benchmark datasets and removes the need to pick a particular active learner a priori. We discuss challenges to finding good rewards for bandit approaches and show that rank aggregation performs well.Comparison and benchmark of name-to-gender inference serviceshttps://peerj.com/articles/cs-1562018-07-162018-07-16Lucía SantamaríaHelena Mihaljević
The increased interest in analyzing and explaining gender inequalities in tech, media, and academia highlights the need for accurate inference methods to predict a person’s gender from their name. Several such services exist that provide access to large databases of names, often enriched with information from social media profiles, culture-specific rules, and insights from sociolinguistics. We compare and benchmark five name-to-gender inference services by applying them to the classification of a test data set consisting of 7,076 manually labeled names. The compiled names are analyzed and characterized according to their geographical and cultural origin. We define a series of performance metrics to quantify various types of classification errors, and define a parameter tuning procedure to search for optimal values of the services’ free parameters. Finally, we perform benchmarks of all services under study regarding several scenarios where a particular metric is to be optimized.
The increased interest in analyzing and explaining gender inequalities in tech, media, and academia highlights the need for accurate inference methods to predict a person’s gender from their name. Several such services exist that provide access to large databases of names, often enriched with information from social media profiles, culture-specific rules, and insights from sociolinguistics. We compare and benchmark five name-to-gender inference services by applying them to the classification of a test data set consisting of 7,076 manually labeled names. The compiled names are analyzed and characterized according to their geographical and cultural origin. We define a series of performance metrics to quantify various types of classification errors, and define a parameter tuning procedure to search for optimal values of the services’ free parameters. Finally, we perform benchmarks of all services under study regarding several scenarios where a particular metric is to be optimized.An algorithm for calculating top-dimensional bounding chainshttps://peerj.com/articles/cs-1532018-05-282018-05-28J. Frederico CarvalhoMikael Vejdemo-JohanssonDanica KragicFlorian T. Pokorny
We describe the Coefficient-Flow algorithm for calculating the bounding chain of an $(n-1)$-boundary on an $n$-manifold-like simplicial complex $S$. We prove its correctness and show that it has a computational time complexity of O(|S(n−1)|) (where S(n−1) is the set of $(n-1)$-faces of $S$). We estimate the big- $O$ coefficient which depends on the dimension of $S$ and the implementation. We present an implementation, experimentally evaluate the complexity of our algorithm, and compare its performance with that of solving the underlying linear system.
We describe the Coefficient-Flow algorithm for calculating the bounding chain of an $(n-1)$-boundary on an $n$-manifold-like simplicial complex $S$. We prove its correctness and show that it has a computational time complexity of O(|S(n−1)|) (where S(n−1) is the set of $(n-1)$-faces of $S$). We estimate the big- $O$ coefficient which depends on the dimension of $S$ and the implementation. We present an implementation, experimentally evaluate the complexity of our algorithm, and compare its performance with that of solving the underlying linear system.ClusterEnG: an interactive educational web resource for clustering and visualizing high-dimensional datahttps://peerj.com/articles/cs-1552018-05-212018-05-21Mohith ManjunathYi ZhangYeonsung KimSteve H. YeoOmar SobhNathan RussellChristian FollowellColleen BushellUmberto RavaioliJun S. Song
Background
Clustering is one of the most common techniques in data analysis and seeks to group together data points that are similar in some measure. Although there are many computer programs available for performing clustering, a single web resource that provides several state-of-the-art clustering methods, interactive visualizations and evaluation of clustering results is lacking.
Methods
ClusterEnG (acronym for Clustering Engine for Genomics) provides a web interface for clustering data and interactive visualizations including 3D views, data selection and zoom features. Eighteen clustering validation measures are also presented to aid the user in selecting a suitable algorithm for their dataset. ClusterEnG also aims at educating the user about the similarities and differences between various clustering algorithms and provides tutorials that demonstrate potential pitfalls of each algorithm.
Conclusions
The web resource will be particularly useful to scientists who are not conversant with computing but want to understand the structure of their data in an intuitive manner. The validation measures facilitate the process of choosing a suitable clustering algorithm among the available options. ClusterEnG is part of a bigger project called KnowEnG (Knowledge Engine for Genomics) and is available at http://education.knoweng.org/clustereng.
Background
Clustering is one of the most common techniques in data analysis and seeks to group together data points that are similar in some measure. Although there are many computer programs available for performing clustering, a single web resource that provides several state-of-the-art clustering methods, interactive visualizations and evaluation of clustering results is lacking.
Methods
ClusterEnG (acronym for Clustering Engine for Genomics) provides a web interface for clustering data and interactive visualizations including 3D views, data selection and zoom features. Eighteen clustering validation measures are also presented to aid the user in selecting a suitable algorithm for their dataset. ClusterEnG also aims at educating the user about the similarities and differences between various clustering algorithms and provides tutorials that demonstrate potential pitfalls of each algorithm.
Conclusions
The web resource will be particularly useful to scientists who are not conversant with computing but want to understand the structure of their data in an intuitive manner. The validation measures facilitate the process of choosing a suitable clustering algorithm among the available options. ClusterEnG is part of a bigger project called KnowEnG (Knowledge Engine for Genomics) and is available at http://education.knoweng.org/clustereng.Enhancing discovery in spatial data infrastructures using a search enginehttps://peerj.com/articles/cs-1522018-05-212018-05-21Paolo CortiAthanasios Tom KralidisBenjamin Lewis
A spatial data infrastructure (SDI) is a framework of geospatial data, metadata, users and tools intended to provide an efficient and flexible way to use spatial information. One of the key software components of an SDI is the catalogue service which is needed to discover, query and manage the metadata. Catalogue services in an SDI are typically based on the Open Geospatial Consortium (OGC) Catalogue Service for the Web (CSW) standard which defines common interfaces for accessing the metadata information. A search engine is a software system capable of supporting fast and reliable search, which may use ‘any means necessary’ to get users to the resources they need quickly and efficiently. These techniques may include full text search, natural language processing, weighted results, fuzzy tolerance results, faceting, hit highlighting, recommendations and many others. In this paper we present an example of a search engine being added to an SDI to improve search against large collections of geospatial datasets. The Centre for Geographic Analysis (CGA) at Harvard University re-engineered the search component of its public domain SDI (Harvard WorldMap) which is based on the GeoNode platform. A search engine was added to the SDI stack to enhance the CSW catalogue discovery abilities. It is now possible to discover spatial datasets from metadata by using the standard search operations of the catalogue and to take advantage of the new abilities of the search engine, to return relevant and reliable content to SDI users.
A spatial data infrastructure (SDI) is a framework of geospatial data, metadata, users and tools intended to provide an efficient and flexible way to use spatial information. One of the key software components of an SDI is the catalogue service which is needed to discover, query and manage the metadata. Catalogue services in an SDI are typically based on the Open Geospatial Consortium (OGC) Catalogue Service for the Web (CSW) standard which defines common interfaces for accessing the metadata information. A search engine is a software system capable of supporting fast and reliable search, which may use ‘any means necessary’ to get users to the resources they need quickly and efficiently. These techniques may include full text search, natural language processing, weighted results, fuzzy tolerance results, faceting, hit highlighting, recommendations and many others. In this paper we present an example of a search engine being added to an SDI to improve search against large collections of geospatial datasets. The Centre for Geographic Analysis (CGA) at Harvard University re-engineered the search component of its public domain SDI (Harvard WorldMap) which is based on the GeoNode platform. A search engine was added to the SDI stack to enhance the CSW catalogue discovery abilities. It is now possible to discover spatial datasets from metadata by using the standard search operations of the catalogue and to take advantage of the new abilities of the search engine, to return relevant and reliable content to SDI users.Supervised deep learning embeddings for the prediction of cervical cancer diagnosishttps://peerj.com/articles/cs-1542018-05-142018-05-14Kelwin FernandesDavide ChiccoJaime S. CardosoJessica Fernandes
Cervical cancer remains a significant cause of mortality all around the world, even if it can be prevented and cured by removing affected tissues in early stages. Providing universal and efficient access to cervical screening programs is a challenge that requires identifying vulnerable individuals in the population, among other steps. In this work, we present a computationally automated strategy for predicting the outcome of the patient biopsy, given risk patterns from individual medical records. We propose a machine learning technique that allows a joint and fully supervised optimization of dimensionality reduction and classification models. We also build a model able to highlight relevant properties in the low dimensional space, to ease the classification of patients. We instantiated the proposed approach with deep learning architectures, and achieved accurate prediction results (top area under the curve AUC = 0.6875) which outperform previously developed methods, such as denoising autoencoders. Additionally, we explored some clinical findings from the embedding spaces, and we validated them through the medical literature, making them reliable for physicians and biomedical researchers.
Cervical cancer remains a significant cause of mortality all around the world, even if it can be prevented and cured by removing affected tissues in early stages. Providing universal and efficient access to cervical screening programs is a challenge that requires identifying vulnerable individuals in the population, among other steps. In this work, we present a computationally automated strategy for predicting the outcome of the patient biopsy, given risk patterns from individual medical records. We propose a machine learning technique that allows a joint and fully supervised optimization of dimensionality reduction and classification models. We also build a model able to highlight relevant properties in the low dimensional space, to ease the classification of patients. We instantiated the proposed approach with deep learning architectures, and achieved accurate prediction results (top area under the curve AUC = 0.6875) which outperform previously developed methods, such as denoising autoencoders. Additionally, we explored some clinical findings from the embedding spaces, and we validated them through the medical literature, making them reliable for physicians and biomedical researchers.Computing the sparse matrix vector product using block-based kernels without zero padding on processors with AVX-512 instructionshttps://peerj.com/articles/cs-1512018-04-302018-04-30Bérenger BramasPavel Kus
The sparse matrix-vector product (SpMV) is a fundamental operation in many scientific applications from various fields. The High Performance Computing (HPC) community has therefore continuously invested a lot of effort to provide an efficient SpMV kernel on modern CPU architectures. Although it has been shown that block-based kernels help to achieve high performance, they are difficult to use in practice because of the zero padding they require. In the current paper, we propose new kernels using the AVX-512 instruction set, which makes it possible to use a blocking scheme without any zero padding in the matrix memory storage. We describe mask-based sparse matrix formats and their corresponding SpMV kernels highly optimized in assembly language. Considering that the optimal blocking size depends on the matrix, we also provide a method to predict the best kernel to be used utilizing a simple interpolation of results from previous executions. We compare the performance of our approach to that of the Intel MKL CSR kernel and the CSR5 open-source package on a set of standard benchmark matrices. We show that we can achieve significant improvements in many cases, both for sequential and for parallel executions. Finally, we provide the corresponding code in an open source library, called SPC5.
The sparse matrix-vector product (SpMV) is a fundamental operation in many scientific applications from various fields. The High Performance Computing (HPC) community has therefore continuously invested a lot of effort to provide an efficient SpMV kernel on modern CPU architectures. Although it has been shown that block-based kernels help to achieve high performance, they are difficult to use in practice because of the zero padding they require. In the current paper, we propose new kernels using the AVX-512 instruction set, which makes it possible to use a blocking scheme without any zero padding in the matrix memory storage. We describe mask-based sparse matrix formats and their corresponding SpMV kernels highly optimized in assembly language. Considering that the optimal blocking size depends on the matrix, we also provide a method to predict the best kernel to be used utilizing a simple interpolation of results from previous executions. We compare the performance of our approach to that of the Intel MKL CSR kernel and the CSR5 open-source package on a set of standard benchmark matrices. We show that we can achieve significant improvements in many cases, both for sequential and for parallel executions. Finally, we provide the corresponding code in an open source library, called SPC5.