Comprehensive model optimization in pulp quality prediction: a machine learning approach
- Published
- Accepted
- Subject Areas
- Artificial Intelligence, Data Mining and Machine Learning
- Keywords
- Pulp and paper quality prediction, Adaptive neuro-fuzzy inference system, Fuzzy logic-based modelling, Feature selection, Model optimisation, Genetic algorithm-partial least squares
- Copyright
- © 2017 Huang et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2017. Comprehensive model optimization in pulp quality prediction: a machine learning approach. PeerJ Preprints 5:e2749v1 https://doi.org/10.7287/peerj.preprints.2749v1
Abstract
Feature selection is of great interest in machine learning because it is widely regarded as a way to build more efficient predictive models in several engineering domains. It is especially important in the pulp and paper transformation industry, where knowledge of the process is generally very limited. In this paper, we first compared the performance of a rule-based genetic algorithm with that of an adaptive neuro-fuzzy inference system; the latter was found to predict pulp quality more precisely. We then combined several data mining algorithms, such as genetic algorithm-partial least squares regression, with other statistical methods to explore the relevance of all the potential variables that could be used to predict the pulp ISO brightness, an important property that is usually linked to model performance and hence pulp quality prediction. A few highly relevant variables were thereby identified, and the full set of 79 variables obtained from a Chip Management System was trimmed down to an optimized combination of 3 inputs according to their relevance. Peroxide charge (P), average luminance (L) and hue (H) were chosen as the optimal subset to describe the ISO brightness of the pulp, and the model was simplified without losing much of its accuracy. Finally, we derived the number of membership functions for each variable to further refine the fuzzy logic-based prediction model. The error then reached 2.18%. The loss in accuracy was compensated by adjusting each variable to its fittest number of membership functions.
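To make the feature-selection idea concrete, the sketch below runs a small genetic algorithm over feature masks and scores each mask by validation error plus a per-feature penalty. It is only an illustration under stated assumptions and not the authors' implementation: ordinary least squares stands in for partial least squares regression, the data are synthetic (8 candidate variables instead of 79), and every name in it (`solve_least_squares`, `make_fitness`, `select_features`, `RELEVANT`, the penalty and mutation rates) is hypothetical.

```python
import random

def solve_least_squares(rows, targets):
    """Solve the normal equations (X^T X) w = X^T y by Gaussian elimination."""
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]; a tiny ridge term keeps it well conditioned.
    a = [[sum(r[i] * r[j] for r in rows) + (1e-8 if i == j else 0.0)
          for j in range(n)] + [sum(r[i] * t for r, t in zip(rows, targets))]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for k in range(col + 1, n):
            factor = a[k][col] / a[col][col]
            for j in range(col, n + 1):
                a[k][j] -= factor * a[col][j]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (a[i][n] - sum(a[i][j] * w[j] for j in range(i + 1, n))) / a[i][i]
    return w

def make_fitness(x_train, y_train, x_val, y_val, penalty=0.01):
    """Return fitness(mask): validation MSE plus a penalty per selected feature."""
    cache = {}
    def fitness(mask):
        if mask not in cache:
            cols = [i for i, keep in enumerate(mask) if keep]
            def design(x):
                # Intercept column followed by the selected features only.
                return [[1.0] + [row[i] for i in cols] for row in x]
            w = solve_least_squares(design(x_train), y_train)
            preds = [sum(wi * xi for wi, xi in zip(w, row)) for row in design(x_val)]
            mse = sum((p - t) ** 2 for p, t in zip(preds, y_val)) / len(y_val)
            cache[mask] = mse + penalty * len(cols)
        return cache[mask]
    return fitness

def select_features(fitness, n_features, rng, pop_size=20, generations=25, p_mut=0.1):
    """Evolve boolean feature masks; lower fitness is better."""
    pop = [tuple(rng.random() < 0.5 for _ in range(n_features)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        next_pop = pop[:2]  # elitism: carry over the two best masks unchanged
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            a = min(rng.sample(pop, 3), key=fitness)
            b = min(rng.sample(pop, 3), key=fitness)
            child = []
            for gene_a, gene_b in zip(a, b):
                gene = gene_a if rng.random() < 0.5 else gene_b  # uniform crossover
                if rng.random() < p_mut:                          # bit-flip mutation
                    gene = not gene
                child.append(gene)
            next_pop.append(tuple(child))
        pop = next_pop
    best = min(pop, key=fitness)
    return [i for i, keep in enumerate(best) if keep]

# Synthetic stand-in data: only variables 0, 3 and 6 influence the target.
rng = random.Random(0)
RELEVANT = {0: 2.0, 3: -1.5, 6: 1.0}

def make_data(n, n_features=8):
    x = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n)]
    y = [sum(c * row[i] for i, c in RELEVANT.items()) + rng.gauss(0, 0.1) for row in x]
    return x, y

x_train, y_train = make_data(60)
x_val, y_val = make_data(40)
fitness = make_fitness(x_train, y_train, x_val, y_val)
selected = select_features(fitness, 8, rng)
print("selected feature indices:", selected)
```

The per-feature penalty is what pushes the search toward a small subset: a mask is rewarded for dropping a variable only when doing so costs little validation accuracy, which mirrors the trade-off described above between simplifying the model and preserving its accuracy.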
Author Comment
This is a preprint submission to PeerJ Preprints.