PeerJ Computer Science Preprints: Scientific Computing and Simulation
https://peerj.com/preprints/index.atom?journal=cs&subject=11100
Scientific Computing and Simulation articles published in PeerJ Computer Science Preprints

Implementation of adaptive integration method for free energy calculations in molecular systems
https://peerj.com/preprints/27935
2019-09-04
Christopher A Mirabzadeh, F. Marty Ytreberg
Estimating free energy differences by computer simulation is useful for a wide variety of applications such as virtual screening for drug design and for understanding how amino acid mutations modify protein interactions. However, calculating free energy differences remains challenging and often requires extensive trial and error and very long simulation times in order to achieve converged results. Here, we present an implementation of the adaptive integration method (AIM). We tested our implementation on two molecular systems and compared results from AIM to those from a suite of standard methods. The model systems tested here include calculating the solvation free energy of methane and the free energy of mutating the peptide GAG to GVG. We show that AIM is more efficient than standard methods for these test cases; that is, AIM results converge to a higher level of accuracy and precision for a given simulation time.
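The abstract does not include the authors' code; as a rough, hypothetical illustration of the idea behind thermodynamic-integration methods like AIM — accumulating ⟨dU/dλ⟩ at a set of λ values and integrating over λ — here is a toy Metropolis sketch on a harmonic oscillator U(x;λ) = ½(1+λ)x², for which the exact answer at kT = 1 is ΔF = ½ ln 2. The fixed λ grid is a simplification: AIM additionally treats λ itself as a Monte Carlo variable and adapts sampling toward poorly converged λ values. All names and parameter values here are illustrative.

```python
import math
import random

def mean_dudl(lam, n_steps=6000, burn=1000, step=1.0, rng=None):
    """Metropolis estimate of <dU/dlambda> for U(x; lam) = 0.5*(1+lam)*x**2 at kT = 1.

    For this potential dU/dlambda = 0.5*x**2, which we average over sampled x."""
    rng = rng or random.Random(0)
    k = 1.0 + lam
    x, acc, n_kept = 0.0, 0.0, 0
    for i in range(n_steps):
        x_new = x + rng.uniform(-step, step)
        # Metropolis acceptance for the Boltzmann weight exp(-U)
        if rng.random() < math.exp(-0.5 * k * (x_new ** 2 - x ** 2)):
            x = x_new
        if i >= burn:                 # discard burn-in samples
            acc += 0.5 * x * x
            n_kept += 1
    return acc / n_kept

def delta_f(n_lambdas=11):
    """Trapezoidal integration of <dU/dlambda> over an evenly spaced lambda grid."""
    lams = [i / (n_lambdas - 1) for i in range(n_lambdas)]
    means = [mean_dudl(l, rng=random.Random(42 + i)) for i, l in enumerate(lams)]
    h = lams[1] - lams[0]
    return h * (sum(means) - 0.5 * (means[0] + means[-1]))
```

With kT = 1 the analytic result is ΔF = ½ ln(k₁/k₀) = ½ ln 2 ≈ 0.347, which the estimate approaches as sampling increases.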
12 Grand Challenges in Single-Cell Data Science
https://peerj.com/preprints/27885
2019-08-23
David Laehnemann, Johannes Köster, Ewa Szczurek, Davis J McCarthy, Stephanie C Hicks, Mark D Robinson, Catalina A Vallejos, Niko Beerenwinkel, Kieran R Campbell, Ahmed Mahfouz, Luca Pinello, Pavel Skums, Alexandros Stamatakis, Camille Stephan-Otto Attolini, Samuel Aparicio, Jasmijn Baaijens, Marleen Balvert, Buys de Barbanson, Antonio Cappuccio, Giacomo Corleone, Bas E Dutilh, Maria Florescu, Victor Guryev, Rens Holmer, Katharina Jahn, Thamar Jessurun Lobo, Emma M Keizer, Indu Khatri, Szymon M Kiełbasa, Jan O Korbel, Alexey M Kozlov, Tzu-Hao Kuo, Boudewijn PF Lelieveldt, Ion I Mandoiu, John C Marioni, Tobias Marschall, Felix Mölder, Amir Niknejad, Łukasz Rączkowski, Marcel Reinders, Jeroen de Ridder, Antoine-Emmanuel Saliba, Antonios Somarakis, Oliver Stegle, Fabian J Theis, Huan Yang, Alex Zelikovsky, Alice C McHardy, Benjamin J Raphael, Sohrab P Shah, Alexander Schönhuth
The recent upswing of microfluidics and combinatorial indexing strategies, further enhanced by very low sequencing costs, has turned single-cell sequencing into an empowering technology; analyzing thousands—or even millions—of cells per experimental run is becoming a routine assignment in laboratories worldwide. As a consequence, we are witnessing a data revolution in single-cell biology. Although some issues are similar in spirit to those experienced in bulk sequencing, many of the emerging data science problems are unique to single-cell analysis; together, they give rise to the new realm of 'Single-Cell Data Science'.
Here, we outline twelve challenges that will be central in bringing this new field forward. For each challenge, the current state of the art in terms of prior work is reviewed, and open problems are formulated, with an emphasis on the research goals that motivate them.
This compendium is meant to serve as a guideline for established researchers, newcomers and students alike, highlighting interesting and rewarding problems in 'Single-Cell Data Science' for the coming years.
Towards a quantitative model of epidemics during conflicts
https://peerj.com/preprints/27651
2019-08-04
Soumya Banerjee
Epidemics may both contribute to and arise as a result of conflict. The effects of conflict on infectious diseases are complex, and there have been confounding observations of both increases and decreases in disease outbreaks during and after conflicts. However, there is no unified mathematical model that explains all of these counterintuitive observations, and there is an urgent need for a quantitative framework for modelling conflicts and epidemics. We introduce a set of mathematical models to understand the role of conflicts in epidemics. Our mathematical framework has the potential to explain the counterintuitive observations and the complex role of human conflicts in epidemics. Our work suggests that aid and peacekeeping organizations should take an integrated approach that combines public health measures, socio-economic development, and peacekeeping in the conflict zone. Our approach exemplifies the role of non-linear thinking in complex systems like human societies. We view our work as a step towards a quantitative model of disease spread in conflicts.
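The abstract does not state the authors' model equations. As a generic, purely hypothetical illustration of how a conflict can be coupled to an epidemic model, the sketch below integrates a standard SIR system with forward Euler, multiplying the transmission rate by a factor during an assumed conflict window (representing, e.g., displacement and health-system disruption). All parameter values are made up for illustration.

```python
def sir_final_size(beta0=0.3, gamma=0.1, conflict=None, factor=2.0,
                   days=200.0, dt=0.05, i0=0.01):
    """Forward-Euler SIR integration.

    'conflict' is an optional (start_day, end_day) window during which the
    transmission rate beta0 is multiplied by 'factor'. Returns the fraction
    of the population ever infected (the final epidemic size)."""
    s, i, r = 1.0 - i0, i0, 0.0
    t = 0.0
    while t < days:
        beta = beta0
        if conflict and conflict[0] <= t <= conflict[1]:
            beta *= factor                  # conflict elevates transmission
        new_inf = beta * s * i * dt         # S -> I flow over one step
        new_rec = gamma * i * dt            # I -> R flow over one step
        s -= new_inf
        i += new_inf - new_rec
        r += new_rec
        t += dt
    return r
```

With these illustrative numbers the baseline reproduction number is β₀/γ = 3, and adding a conflict window enlarges the final epidemic size, matching the intuition that conflict can amplify outbreaks.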
Lean healthcare integrated with discrete event simulation and design of experiments: an emergency department expansion
https://peerj.com/preprints/27881
2019-08-01
Gustavo Teodoro Gabriel, Afonso Teberga Campos, Aline de Lima Magacho, Lucas Cavallieri Segismondi, Flávio Fraga Vilela, Jose Antonio de Queiroz, José Arnaldo Barra Montevechi
Background. Discrete Event Simulation (DES) and Lean Healthcare are efficient management tools that support the quality and efficiency of health services. Accordingly, the purpose of this study is to use Lean principles jointly with DES to plan the expansion of a Canadian emergency department and to meet the demand that comes from small closed care centers.
Methods. To this end, we used the modeling and simulation method. We simulated the emergency department in FlexSim Healthcare® software and, with the Design of Experiments (DoE), determined the optimal number of locations and resources for each shift.
Results. The results show that the ED cannot meet expected demand in the current state. Only 17.2% of the patients were completely treated, and the Length of Stay (LOS) was, on average, 2213.7 minutes, with a confidence interval of (2131.8 - 2295.6) minutes. However, after changing the decision variables, the proportion of treated patients increased to 95.7% (an increase of approximately 600%). Average LOS decreased to 461.2 minutes, with a confidence interval of (453.7 - 468.7) minutes, a reduction of about 79.0%. In addition, the study shows that emergency department staffing is balanced in accordance with Lean principles.
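The FlexSim model itself is not reproducible from the abstract; as a minimal, hypothetical sketch of the core discrete-event mechanism — patients queue for a shared pool of staff, and LOS falls as capacity rises — here is a multi-server queue using Python's heapq as the set of server-free times. Arrival and service rates are invented for illustration and are not the paper's data.

```python
import heapq
import random

def avg_los(n_patients, mean_interarrival, mean_service, servers, seed=1):
    """Average length of stay (wait + service) in a FIFO multi-server queue.

    Exponential interarrival and service times; the heap holds the time at
    which each server next becomes free."""
    rng = random.Random(seed)
    free_at = [0.0] * servers            # all servers free at t = 0
    heapq.heapify(free_at)
    t, total = 0.0, 0.0
    for _ in range(n_patients):
        t += rng.expovariate(1.0 / mean_interarrival)   # next arrival time
        earliest = heapq.heappop(free_at)               # soonest-free server
        start = max(t, earliest)                        # wait if all busy
        end = start + rng.expovariate(1.0 / mean_service)
        heapq.heappush(free_at, end)
        total += end - t                                # this patient's LOS
    return total / n_patients
```

Sweeping the `servers` parameter over shifts is the hand-rolled analogue of the DoE step in the paper: the same seeded arrival stream is replayed against different capacity levels, so LOS differences reflect capacity, not random noise.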
Improving the resolution of microscope by deconvolution after dense scan
https://peerj.com/preprints/27849
2019-07-29
Yaohua Xie
Super-resolution microscopes (such as STED) illuminate samples with a tiny spot and achieve very high resolution, but structures smaller than the spot cannot be resolved in this way. Therefore, we propose a technique to solve this problem, termed "Deconvolution after Dense Scan (DDS)". First, a preprocessing stage is introduced to eliminate the optical uncertainty of the peripheral areas around the sample's ROI (Region of Interest). Then, the ROI is scanned densely together with its peripheral areas. Finally, the high-resolution image is recovered by deconvolution. The proposed technique requires little modification of the apparatus and is implemented mainly in software. Simulation experiments show that the technique can further improve the resolution of super-resolution microscopes.
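The abstract does not specify which deconvolution algorithm DDS uses. As a generic, hedged illustration of the recovery step, here is 1-D Richardson–Lucy deconvolution with a known symmetric PSF — a standard choice for recovering sub-spot structure from scanned intensity data — on toy data with two point emitters.

```python
def conv_same(x, k):
    """'Same'-size 1-D convolution with zero boundary handling."""
    r, n = len(k) // 2, len(x)
    return [sum(x[i + j - r] * k[j] for j in range(len(k)) if 0 <= i + j - r < n)
            for i in range(n)]

def richardson_lucy(blurred, psf, iters=50):
    """Multiplicative Richardson-Lucy updates; psf assumed symmetric, normalized."""
    est = [1.0] * len(blurred)               # flat, non-negative start
    for _ in range(iters):
        pred = conv_same(est, psf)           # forward-blur current estimate
        ratio = [b / max(p, 1e-12) for b, p in zip(blurred, pred)]
        corr = conv_same(ratio, psf)         # symmetric psf: correlation == convolution
        est = [e * c for e, c in zip(est, corr)]
    return est
```

On noiseless data the iteration concentrates the blurred intensity back into narrow peaks while suppressing the gaps between them, which is the sharpening effect the paper exploits after the dense scan.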
Grounded Design and GIScience - A framework for informing the design of geographical information systems and spatial data infrastructures
https://peerj.com/preprints/27822
2019-06-25
Alexander Kmoch, Evelyn Uuemaa, Hermann Klug
Geographical Information Science (GIScience), also Geographical Information Science and Systems, is a multi-faceted research discipline comprising a wide variety of topics. Investigation into data management and interoperability of geographical and environmental data sets for scientific analysis, visualisation and modelling is an important driver of the Information Science aspect of GIScience, which underpins comprehensive Geographical Information Systems (GIS) and Spatial Data Infrastructure (SDI) research and development. In this article we present the 'Grounded Design' method, a fusion of Design Science Research (DSR) and Grounded Theory (GT), and show how it can act as a guiding principle to link GIScience, Computer Science and Earth Sciences into a converging GI systems development framework. We explain how this bottom-up research framework can yield holistic and integrated perspectives when designing GIS and SDI systems and software. This would allow GIScience academics and GIS and SDI practitioners alike to reliably draw from interdisciplinary knowledge to consistently design and innovate GI systems.
A new algorithm for band detection and pattern extraction on pulsed-field gel electrophoresis images
https://peerj.com/preprints/27771
2019-06-02
Mohammad Rezaei, Naser Zohorian, Nemat Soltani, Parviz Mohajeri
This paper presents a new approach for band detection and pattern recognition for molecule types. Although a few studies have examined band detection, there is still no automatic method that performs well in the presence of high noise. The band detection algorithm was designed in two parts: band location and lane pattern recognition. In order to improve band detection and remove undesirable bands, the shape and light intensity of the bands were used as features. One hundred lane images were selected for the training stage and 350 lane images for the testing stage to evaluate the proposed algorithm in a random fashion. All the images were prepared using PFGE BIORAD at the Microbiology Laboratory of Kermanshah University of Medical Sciences. An adaptive median filter with a filter size of 5x5 was selected as the optimal filter for removing noise. The results showed that the proposed algorithm has a 98.45% accuracy and is associated with fewer errors compared to other methods. The proposed algorithm has good accuracy for band detection in pulsed-field gel electrophoresis images. By considering the shape of the peaks caused by the bands in the vertical projection profile of the signal, this method can reduce band detection errors. To improve accuracy, we recommend that the designed algorithm be examined for other types of molecules as well.
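The authors' exact feature set and trained thresholds are not given in the abstract; as a minimal hypothetical sketch of the projection-profile step they describe, the function below sums each row of a lane image, smooths the profile with a 3-point moving average, and reports local maxima above a threshold as candidate band positions. The function name and threshold are illustrative, not the paper's.

```python
def band_positions(image, threshold):
    """Detect band rows from the vertical projection profile of a lane image.

    'image' is a list of rows (lists of pixel intensities)."""
    profile = [sum(row) for row in image]        # row-wise intensity sums
    n = len(profile)
    smoothed = [(profile[max(i - 1, 0)] + profile[i] + profile[min(i + 1, n - 1)]) / 3.0
                for i in range(n)]               # 3-point moving average
    return [i for i in range(1, n - 1)
            if smoothed[i] > smoothed[i - 1]
            and smoothed[i] >= smoothed[i + 1]
            and smoothed[i] > threshold]         # local maxima above threshold
```

A real pipeline would precede this with the adaptive median filtering the paper selects and follow it with the shape/intensity features used to reject spurious bands.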
GIS analysis of geological surfaces orientations: the qgSurf plugin for QGIS
https://peerj.com/preprints/27694
2019-04-30
Mauro Alberti
GIS techniques enable the quantitative analysis of geological structures. In particular, topographic traces of geological lineaments can be compared with the theoretical ones for geological planes, to determine the best-fitting theoretical planes. qgSurf, a Python plugin for QGIS, implements this kind of processing, in addition to the determination of the best-fit plane to a set of topographic points, the calculation of the distances between topographic traces and geological planes, and basic stereonet plotting. By applying these tools to a case study of a Cenozoic thrust lineament in the Southern Apennines (Calabria, Southern Italy), we deduce the approximate orientations of the lineament in different fault-delimited sectors and calculate the misfits between the theoretical orientations and the actual topographic traces.
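qgSurf's own implementation lives in the plugin source; as a standalone, hedged sketch of the core best-fit step, the code below fits z = a·x + b·y + c to topographic points by least squares via the 3×3 normal equations, from which a dip angle follows as arctan(√(a² + b²)). Function names are illustrative and are not qgSurf's API.

```python
import math

def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) points
    via the 3x3 normal equations, solved by Gaussian elimination."""
    sx = sy = sz = sxx = syy = sxy = sxz = syz = 0.0
    n = float(len(points))
    for x, y, z in points:
        sx += x; sy += y; sz += z
        sxx += x * x; syy += y * y; sxy += x * y
        sxz += x * z; syz += y * z
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    v = [sxz, syz, sz]
    for i in range(3):                       # forward elimination with pivoting
        p = max(range(i, 3), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        v[i], v[p] = v[p], v[i]
        for r in range(i + 1, 3):
            f = A[r][i] / A[i][i]
            for c in range(i, 3):
                A[r][c] -= f * A[i][c]
            v[r] -= f * v[i]
    sol = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):                      # back substitution
        sol[i] = (v[i] - sum(A[i][c] * sol[c] for c in range(i + 1, 3))) / A[i][i]
    return sol                               # [a, b, c]

def dip_angle_deg(a, b):
    """Dip angle of the plane z = a*x + b*y + c, in degrees."""
    return math.degrees(math.atan(math.hypot(a, b)))
```

In the plugin the points would come from a DEM sampled along a lineament trace; misfit against a hypothesized plane can then be measured as the vertical distance z − (a·x + b·y + c) at each point.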
Component-oriented acausal modeling of the dynamical systems in Python language on the example of the model of the sucker rod string
https://peerj.com/preprints/27612
2019-03-22
Volodymyr B Kopei, Oleh R Onysko, Vitalii G Panchuk
As a rule, specialized languages for acausal modeling of complex dynamical systems suffer from limited applicability, poor interoperability with third-party software packages, a high cost of learning, the complexity of implementing hybrid modeling and variable-structure models, and the difficulty of modification and improvement. To address these problems, we propose developing an easy-to-understand and easy-to-modify component-oriented acausal hybrid modeling system based on: (1) the general-purpose programming language Python; (2) the description of components by Python classes; (3) the description of component behavior by difference equations using the declarative tools of SymPy; (4) event generation using Python imperative constructs; (5) composing and solving the system of algebraic equations at each discrete time point of the simulation. Classes are developed that allow creating models in Python without the need to study and apply specialized modeling languages. These classes can also be used to automate the construction of the system of difference equations describing the behavior of the model in symbolic form. A basic set of mechanical components is developed — the 1D translational components "mass", "spring-damper" and "force". Using these components, models of the sucker rod string are developed and simulated, and the simulation results are compared with results obtained in the Modelica language. Replacing differential equations with difference equations simplifies the implementation of hybrid modeling and relaxes the requirements for the symbolic mathematics and equation-solving modules.
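The paper's component classes and SymPy machinery are not reproduced here; as a minimal stand-in for the key idea — describing a 1-D translational "mass" plus "spring-damper" element by difference equations instead of differential equations — the sketch below applies a semi-implicit Euler update to a single damped mass. Parameter values are hypothetical, and in the authors' framework each component would instead contribute symbolic difference equations that SymPy assembles and solves at every time step.

```python
def simulate(m=1.0, k=10.0, c=0.5, x0=1.0, v0=0.0, dt=0.01, steps=5000):
    """Difference-equation (semi-implicit Euler) update for m*x'' + c*x' + k*x = 0.

    Returns the list of positions x[0..steps]."""
    xs = [x0]
    x, v = x0, v0
    for _ in range(steps):
        v += dt * (-(k * x + c * v) / m)   # velocity update from current state
        x += dt * v                        # position update uses the NEW velocity
        xs.append(x)
    return xs
```

The semi-implicit (symplectic) variant is a common choice for such mechanical difference schemes because it stays stable for oscillatory elements at modest step sizes.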
Data based intervention approach for Complexity-Causality measure
https://peerj.com/preprints/27416
2018-12-07
Aditi Kathpalia, Nithin Nagaraj
Causality testing methods are being widely used in various disciplines of science. Model-free methods for causality estimation are very useful, as the underlying model generating the data is often unknown. However, existing model-free measures assume separability of cause and effect at the level of individual samples of measurements and, unlike model-based methods, do not perform any intervention to learn causal relationships. These measures can thus only capture causality that arises from the associational occurrence of 'cause' and 'effect' in well-separated samples. In real-world processes, 'cause' and 'effect' are often inherently inseparable, or become inseparable in the acquired measurements. We propose a novel measure that uses an adaptive interventional scheme to capture causality which is not merely associational. The scheme is based on characterizing the complexities associated with the dynamical evolution of processes on short windows of measurements. The formulated measure, Compression-Complexity Causality, is rigorously tested on simulated and real datasets and its performance is compared with that of existing measures such as Granger Causality and Transfer Entropy. The proposed measure is robust to the presence of noise, long-term memory, filtering and decimation, low temporal resolution (including aliasing), non-uniform sampling, finite-length signals and the presence of common driving variables. Our measure outperforms existing state-of-the-art measures, establishing itself as an effective tool for causality testing in real-world applications.
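Compression-Complexity Causality as defined by the authors is built on a dedicated dynamical complexity measure (effort-to-compress), which is not reproduced here. As a loose, purely illustrative stand-in for the compression-cost intuition — "does knowing a candidate cause make the effect cheaper to describe?" — the toy below uses zlib compressed length as the complexity proxy and compares the extra bytes a target sequence costs once a candidate driver is already in the compressor's window. This is not the authors' measure, only a sketch of the general compression-based idea.

```python
import random
import zlib

def clen(s: bytes) -> int:
    """Compressed length in bytes, a crude stand-in for sequence complexity."""
    return len(zlib.compress(s, 9))

def conditional_cost(target: bytes, context: bytes) -> int:
    """Proxy for C(target | context): extra compressed bytes needed for
    'target' once 'context' has already been seen by the compressor."""
    return clen(context + target) - clen(context)
```

If X drives Y (here, Y is simply X delayed by one step), Y should be far cheaper to describe given X than an independent sequence Z of the same length is.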