PeerJ Computer Science Preprints: Algorithms and Analysis of Algorithms
https://peerj.com/preprints/index.atom?journal=cs&subject=8200
Algorithms and Analysis of Algorithms articles published in PeerJ Computer Science Preprints

Skill ranking of researchers via hypergraph
https://peerj.com/preprints/27480
2019-01-12
Xiangjie Kong, Lei Liu, Shuo Yu, Andong Yang, Xiaomei Bai, Bo Xu
Researchers use various skills in their work, such as writing, data analysis, and experiment design. These research skills greatly influence the quality of their research outputs, as well as their scientific impact. Although many indicators have been proposed to quantify the impact of researchers, studies evaluating their scientific research skills are rare. In this paper, we analyze the factors affecting researchers' skill ranking and propose a new model based on hypergraph theory to evaluate scientific research skills. To validate our skill ranking model, we perform experiments on a PLoS One dataset and compare the ranks of researchers' skills with their papers' citation counts and h-indices. Finally, we analyze the patterns of how researchers' skill rankings increase over time. Our studies also show how researchers shift between different skills.
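The abstract does not specify the hypergraph model's equations, but the core idea of treating papers as hyperedges that connect researcher–skill contributions can be illustrated with a toy sketch (names and the citation-based weighting are illustrative assumptions, not the authors' model):

```python
from collections import defaultdict

def skill_scores(papers):
    """Toy per-(researcher, skill) ranking score.

    Each paper acts as a hyperedge linking (researcher, skill) contributions;
    here the paper's citation count is split evenly among the contributions
    (an illustrative weighting, not the paper's actual formula).
    """
    scores = defaultdict(float)
    for contributions, citations in papers:
        for researcher, skill in contributions:
            scores[(researcher, skill)] += citations / len(contributions)
    return scores
```

Sorting the resulting scores per skill would then yield one possible skill ranking of researchers.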
A local search algorithm for the constrained max cut problem on hypergraphs
https://peerj.com/preprints/27434
2018-12-18
Nasim Samei, Roberto Solis-Oba
In the constrained max k-cut problem on hypergraphs, we are given a weighted hypergraph H=(V, E), an integer k, and a set c of constraints. The goal is to divide the set V of vertices into k disjoint partitions in such a way that the sum of the weights of the hyperedges having at least two endpoints in different partitions is maximized and the partitions satisfy all the constraints in c. In this paper we present a local search algorithm for the constrained max k-cut problem on hypergraphs and show that it has approximation ratio 1-1/k for a variety of constraints c, such as the constraints defining the max Steiner k-cut problem, the max multiway cut problem, and the max k-cut problem. We also show that our local search algorithm can be used on the max k-cut problem with given sizes of parts and on the capacitated max k-cut problem, where it has approximation ratio 1-|Vmax|/|V|, where |Vmax| is the cardinality of the largest partition. In addition, we present a local search algorithm for the directed max k-cut problem that has approximation ratio (k-1)/(3k-2).
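A vertex-move local search of the kind described can be sketched as follows. This is a minimal illustrative version for the unconstrained max k-cut on hypergraphs; the paper's algorithm additionally maintains the constraints in c during each move:

```python
def cut_weight(hyperedges, assignment):
    """Sum the weights of hyperedges whose endpoints span >= 2 partitions."""
    total = 0.0
    for verts, w in hyperedges:
        if len({assignment[v] for v in verts}) >= 2:
            total += w
    return total

def local_search_max_kcut(vertices, hyperedges, k):
    """Repeatedly move a single vertex to the partition that most increases
    the cut weight, until no single move improves it (a local optimum)."""
    assignment = {v: i % k for i, v in enumerate(vertices)}  # arbitrary start
    improved = True
    while improved:
        improved = False
        for v in vertices:
            best_part = assignment[v]
            best_w = cut_weight(hyperedges, assignment)
            for p in range(k):
                assignment[v] = p
                w = cut_weight(hyperedges, assignment)
                if w > best_w:
                    best_part, best_w = p, w
                    improved = True
            assignment[v] = best_part
    return assignment, cut_weight(hyperedges, assignment)
```

Each pass strictly increases the cut weight, so the search terminates at a local optimum; the paper's analysis bounds how far such an optimum can be from the global one.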
An automatic seating plan algorithm
https://peerj.com/preprints/27420
2018-12-08
Kresten Lindorff-Larsen
An appropriate seating plan is an important prerequisite for any good party, be it a formal wedding or an informal dinner. Yet anyone who has designed a seating plan knows that it can prove frustratingly difficult to find a solution that satisfies the large number of associated formal, physical, and personal constraints. Here I present a flexible algorithm for automating the task. The algorithm matches guests to seats, taking into account constraints provided through a dissimilarity matrix calculated, for example, from answers to questions designed to classify personality, as well as other formal or informal constraints.
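Matching guests to seats under a dissimilarity matrix can be sketched as a swap-based local search. This is an illustrative sketch, not the paper's actual algorithm, and it assumes we want to minimize within-table dissimilarity (i.e., seat similar guests together); the opposite objective would work the same way with the comparison reversed:

```python
import itertools

def table_cost(tables, dissim):
    """Total pairwise dissimilarity between guests seated at the same table."""
    return sum(dissim[a][b]
               for table in tables
               for a, b in itertools.combinations(table, 2))

def improve_seating(tables, dissim):
    """Swap guests between tables while any single swap lowers the cost."""
    improved = True
    while improved:
        improved = False
        for t1, t2 in itertools.combinations(range(len(tables)), 2):
            for i in range(len(tables[t1])):
                for j in range(len(tables[t2])):
                    before = table_cost(tables, dissim)
                    tables[t1][i], tables[t2][j] = tables[t2][j], tables[t1][i]
                    if table_cost(tables, dissim) < before:
                        improved = True
                    else:  # swap did not help, undo it
                        tables[t1][i], tables[t2][j] = tables[t2][j], tables[t1][i]
    return tables
```

Hard constraints (e.g., two guests who must or must not sit together) could be encoded as very large or very negative entries in the dissimilarity matrix.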
LSTM neural network for textual ngrams
https://peerj.com/preprints/27377
2018-11-23
Shaun C. D'Souza
Cognitive neuroscience is the study of how the human brain functions in tasks like decision making, language, perception, and reasoning. Deep learning is a class of machine learning algorithms that use neural networks, which are designed to model the responses of neurons in the human brain. Learning can be supervised or unsupervised. Ngram token models are used extensively in language prediction. Ngrams are probabilistic models used to predict the next word or token; they are statistical models of word or token sequences, known as language models (LMs). Ngrams are essential in creating language prediction models. We are exploring a broader sandbox ecosystem for AI, specifically around deep learning applications on unstructured content on the web.
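The simplest ngram language model mentioned above, a count-based bigram predictor, can be sketched as follows (an illustrative baseline, not the paper's LSTM):

```python
from collections import Counter, defaultdict

def train_bigram_lm(tokens):
    """Count bigram frequencies: each word maps to a Counter of next words."""
    model = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        model[prev][nxt] += 1
    return model

def predict_next(model, word):
    """Predict the most frequent continuation of `word` (None if unseen)."""
    if word not in model:
        return None
    return model[word].most_common(1)[0][0]
```

An LSTM replaces these raw counts with a learned hidden state, letting the prediction condition on much longer context than a fixed-size ngram window.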
Eclipse CDT code analysis and unit testing
https://peerj.com/preprints/27350
2018-11-15
Shaun C. D'Souza
In this paper we look at the Eclipse IDE and its support for the CDT (C/C++ Development Tools). Eclipse is an open source IDE that supports a variety of programming languages through plugin functionality. Eclipse supports the standard GNU environment for compiling, building, and debugging applications. The CDT is a plugin that enables development of C/C++ applications in Eclipse, providing functionality including code browsing, syntax highlighting, and code completion. We verify a 50X improvement in LOC automation for fake class .cpp/.h and class test .cpp code generation.
Pattern recognition techniques for the identification of Activities of Daily Living using mobile device accelerometer
https://peerj.com/preprints/27225
2018-09-20
Ivan Miguel Pires, Nuno M. Garcia, Nuno Pombo, Francisco Flórez-Revuelta, Susanna Spinsante, Maria Canavarro Teixeira, Eftim Zdravevski
This paper focuses on the recognition of Activities of Daily Living (ADL) by applying pattern recognition techniques to data acquired by the accelerometer available in mobile devices. The recognition of ADL is composed of several stages, including data acquisition, data processing, and artificial intelligence methods. The artificial intelligence methods used are related to pattern recognition, and this study focuses on the use of Artificial Neural Networks (ANN). The data processing includes data cleaning and feature extraction techniques to define the inputs for the ANN. Due to the low processing power and memory of mobile devices, they should mainly be used to acquire the data, applying an ANN previously trained for the identification of the ADL. The main purpose of this paper is to present a new method based on ANN for the identification of a defined set of ADL with reliable accuracy. This paper also presents a comparison of different types of ANN in order to choose the type for the implementation of the final model. The results of this research show that the best accuracies are achieved with Deep Neural Networks (DNN), with an accuracy higher than 80%. The results obtained are similar to those of other studies, but we compared three types of ANN in order to discover the best method for obtaining these results with less memory, verifying that, after the generation of the model, the DNN method, when compared with the others, is also the fastest to obtain the results with better accuracy.
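The feature extraction stage that defines the ANN inputs typically computes simple statistics over a window of accelerometer samples. A minimal sketch (the feature set shown is a common illustrative choice, not necessarily the authors' exact set):

```python
import math

def extract_features(window):
    """Statistical features from one window of accelerometer magnitudes,
    of the kind commonly fed to an ANN classifier for ADL recognition."""
    n = len(window)
    mean = sum(window) / n
    var = sum((v - mean) ** 2 for v in window) / n  # population variance
    return {"mean": mean,
            "std": math.sqrt(var),
            "min": min(window),
            "max": max(window)}
```

Computing only such compact features on the device keeps the on-phone workload small, matching the paper's point that mobile devices should mainly acquire data and run an already-trained network.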
Resolving the optimal selection of a natural reserve using the particle swarm optimisation by applying transfer functions
https://peerj.com/preprints/26941
2018-05-29
Boris Almonacid
The optimal selection of a natural reserve (OSRN) is an optimisation problem with a binary domain. To solve this problem, the metaheuristic Particle Swarm Optimization (PSO) algorithm has been chosen. The PSO algorithm was designed to solve problems in real domains. Therefore, a transfer method has been applied that converts the real-valued equations of the PSO algorithm into binary results compatible with the OSRN problem. Four transfer functions have been tested in four case studies of the OSRN problem. According to the tests carried out, two of the four transfer functions are apt for solving the problem of optimal selection of a natural reserve.
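A transfer function maps a particle's real-valued velocity component to a probability of the corresponding bit being set. The sigmoid below is one common S-shaped transfer function; which four functions the paper tests is not stated in the abstract, so this is an illustrative example:

```python
import math
import random

def s_shaped_transfer(v):
    """S-shaped (sigmoid) transfer: maps a real velocity into [0, 1]."""
    return 1.0 / (1.0 + math.exp(-v))

def binarize(velocity, rng):
    """Binary position update: each bit is 1 with probability given by the
    transfer function applied to that velocity component."""
    return [1 if rng.random() < s_shaped_transfer(v) else 0 for v in velocity]
```

With this bridge in place, the rest of PSO (velocity updates from personal and global bests) runs unchanged in the real domain, while fitness is always evaluated on the binary positions.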
Estimating article influence scores for open access journals
https://peerj.com/preprints/26586
2018-03-01
Bree Norlander, Peter Li, Jevin D. West
Motivated by a desire to curb "predatory" publishing, we created FlourishOA, a one-stop shop for authors, publishers, funders, librarians, and policy makers to find high-quality, cost-effective Open Access (OA) journals. FlourishOA provides Article Processing Charge and Article Influence (AI) score data for OA journals. AI scores are retrieved from InCites Journal Citations Reports (JCR). However, the FlourishOA database contains thousands of journals not indexed in JCR. In order to provide users with more data, our team gathered five years of citation counts from the Microsoft Academic Graph database via Microsoft Cognitive Services Academic Knowledge API and used a log-transformed linear regression to predict over 2,500 additional 2015 AI scores.
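A log-transformed linear regression of the kind described can be sketched as follows. This single-predictor version is illustrative; the paper's actual predictors and fitted coefficients are not given in the abstract:

```python
import math

def fit_loglog(x, y):
    """Ordinary least squares on log-transformed data: log y = a + b * log x."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(lx, ly))
         / sum((xi - mx) ** 2 for xi in lx))
    a = my - b * mx
    return a, b

def predict(a, b, x):
    """Back-transform the linear prediction to the original scale."""
    return math.exp(a + b * math.log(x))
```

Fitting in log space tames the heavy-tailed distribution of citation counts, so one fitted line can predict scores across journals of very different sizes.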
A novel collaborative filtering algorithm by bit mining frequent itemsets
https://peerj.com/preprints/26444
2018-01-18
Loc Nguyen, Minh-Phung T. Do
Collaborative filtering (CF) is a popular technique in recommendation research. Concretely, the items recommended to a user are determined by surveying her/his communities. There are two main CF approaches: memory-based and model-based. I propose a new model-based CF algorithm that mines frequent itemsets from a rating database; items that belong to frequent itemsets are then recommended to the user. My CF algorithm gives an immediate response because the mining task is performed offline. I also propose the so-called Roller algorithm for improving the process of mining frequent itemsets. The Roller algorithm is built on the heuristic assumption that "the larger the support of an item, the more likely it is that this item will occur in some frequent itemset". It is modeled on the whitewashing task of rolling a roller over a wall, in a way that picks up frequent itemsets. Moreover, I provide enhanced techniques such as bit representation, bit matching, and bit mining in order to speed up the recommendation process. These techniques take advantage of bitwise operations (AND, NOT) to reduce storage space and make the algorithms run faster.
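The bit representation idea can be sketched as follows: each item gets one bitmask over the transactions, and the support of an itemset is the popcount of the AND of its items' masks (an illustrative sketch of the general technique, not the paper's exact data layout):

```python
def item_bitmasks(transactions, items):
    """One integer bitmask per item: bit t is set iff transaction t has the item."""
    masks = {}
    for item in items:
        m = 0
        for t, transaction in enumerate(transactions):
            if item in transaction:
                m |= 1 << t
        masks[item] = m
    return masks

def support(itemset, masks):
    """Support of an itemset: AND the item masks, then count the set bits."""
    items = list(itemset)
    m = masks[items[0]]
    for item in items[1:]:
        m &= masks[item]
    return bin(m).count("1")
```

One machine word covers 64 transactions per AND, which is why the bitwise formulation is both smaller in memory and faster than scanning transaction lists.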
Linear time-varying Luenberger observer applied to diabetes
https://peerj.com/preprints/3341
2017-10-12
Onofre Orozco López, Carlos Eduardo Castañeda Hernández, Agustín Rodríguez Herrero, Gema García Saéz, María Elena Hernando
We present a linear time-varying Luenberger observer (LTVLO) using compartmental models to estimate the unmeasurable states in patients with type 1 diabetes. The proposed LTVLO is based on linearization at an operating point of the virtual patient (VP), which yields a linear time-varying system. The LTVLO gains are obtained by selecting the asymptotic eigenvalues, provided observability holds. The unmeasurable variables are estimated using Ackermann's method. Additionally, a Lyapunov approach is used to prove the stability of the time-varying proposal. In order to evaluate the proposed methodology, we designed three experiments: A) a VP obtained with the Bergman minimal model; B) a VP obtained with the compartmental model presented by Hovorka in 2004; and C) a real patient data set. For experiments A) and B), a meal plan is applied to the VP, and the dynamic response of each model state is compared to the response of the corresponding variable of the time-varying observer. Once the observer is evaluated in experiment B), the proposal is applied in experiment C) to data extracted from real patients, and the unmeasurable state space variables are obtained with the LTVLO. The LTVLO methodology has the feature of being updated at each instant of time to estimate the states under a known structure. The results are obtained by simulation with Matlab™ and Simulink™. The LTVLO estimates the unmeasurable states of in silico patients with high accuracy by means of updating the Luenberger gains at each iteration. The accuracy of the estimated state space variables is validated through the fit parameter.
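The core observer update can be sketched in discrete time as x̂⁺ = A x̂ + L (y − C x̂). The toy version below uses a constant gain and an illustrative 2-state system, whereas the paper's LTVLO recomputes A and L at each instant; the matrices shown are examples, not a diabetes model:

```python
def matvec(A, x):
    """Multiply a small matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def luenberger_step(A, C, L, x_hat, y):
    """One observer update: x_hat' = A x_hat + L * (y - C x_hat).
    The correction term drives the estimate toward the measured output y."""
    y_hat = sum(c * xi for c, xi in zip(C, x_hat))
    Ax = matvec(A, x_hat)
    return [axi + li * (y - y_hat) for axi, li in zip(Ax, L)]
```

The estimation error obeys e⁺ = (A − LC) e, so placing the eigenvalues of A − LC inside the unit circle (e.g., via Ackermann's formula) makes the estimate converge to the true state.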