PeerJ Computer Science: Artificial Intelligence
https://peerj.com/articles/index.atom?journal=cs&subject=8300
Artificial Intelligence articles published in PeerJ Computer Science

Visual resource extraction and artistic communication model design based on improved CycleGAN algorithm
https://peerj.com/articles/cs-1889
Published: 2024-03-18
Authors: Anyu Yang, Muhammad Kashif Hanif
Through the application of computer vision and deep learning methodologies, real-time style transfer of images becomes achievable. This process involves the fusion of diverse artistic elements into a single image, resulting in the creation of innovative pieces of art. This article centers its focus on image style transfer within the realm of art education and introduces an ATT-CycleGAN model enriched with an attention mechanism to enhance the quality and precision of style conversion. The framework enhances the generators within CycleGAN. First, images undergo encoder downsampling before entering the intermediate transformation model, where feature maps are acquired through four encoding residual blocks and subsequently input into an attention module. Channel attention is incorporated through multi-weight optimization achieved via global max-pooling and global average-pooling techniques. During the model’s training process, transfer learning techniques are employed to improve model parameter initialization, enhancing training efficiency. Experimental results demonstrate the superior performance of the proposed model in image style transfer across various categories. In comparison to the traditional CycleGAN model, it exhibits a notable increase in the structural similarity index measure (SSIM) and peak signal-to-noise ratio (PSNR) metrics: on the Places365 and selfie2anime datasets, SSIM improves by 3.19% and 1.31%, and PSNR by 10.16% and 5.02%, respectively. These findings provide valuable algorithmic support and crucial references for future research in the fields of art education, image segmentation, and style transfer.
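The channel-attention step described here (global max-pooling and global average-pooling feeding a shared weighting stage) can be sketched as below. This is a minimal NumPy illustration in the spirit of CBAM-style channel attention, not the actual ATT-CycleGAN module; the shared MLP weights `w1`/`w2` and its bottleneck size are assumptions of the sketch.

```python
import numpy as np

def channel_attention(feature_map, w1, w2):
    """CBAM-style channel attention over a (C, H, W) feature map.

    Global-average-pooled and global-max-pooled descriptors are passed
    through a shared two-layer MLP (weights w1, w2), summed, and squashed
    with a sigmoid to give one weight per channel.
    """
    avg_pool = feature_map.mean(axis=(1, 2))          # (C,)
    max_pool = feature_map.max(axis=(1, 2))           # (C,)

    def mlp(x):
        hidden = np.maximum(0.0, w1 @ x)              # ReLU bottleneck
        return w2 @ hidden

    weights = 1.0 / (1.0 + np.exp(-(mlp(avg_pool) + mlp(max_pool))))  # sigmoid
    return feature_map * weights[:, None, None]       # rescale each channel
```

Since the sigmoid weights lie in (0, 1), the module can only attenuate channels, never amplify them, which is the standard CBAM behavior.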
Architecting an enterprise financial management model: leveraging multi-head attention mechanism-transformer for user information transformation
https://peerj.com/articles/cs-1928
Published: 2024-03-15
Authors: Wan Yu, Habib Hamam
Financial management assumes a pivotal role as a fundamental information system contributing to enterprise development. Nonetheless, prevalent methodologies frequently encounter challenges in proficiently overseeing diverse information streams inherent to financial management. This study introduces an innovative paradigm for enterprise financial management centered on the transformation of user information signals. In its initial phases, the methodology augments the Transformer network and self-attention mechanism to extract features pertaining to both users and financial data, fostering a more cohesive integration of financial and user information. Subsequently, a reinforcement learning-based alignment method is implemented to reconcile disparities between financial and user information, thereby enhancing semantic alignment. Ultimately, a signal conversion technique employing generative adversarial networks is deployed to harness user information, elevating financial management efficacy and, consequently, optimizing overall financial operations. The empirical validation of this approach, achieving an impressive mAP score of 81.9%, not only outperforms existing methodologies but also underscores the tangible impact and enhanced execution prowess that this paradigm brings to financial management systems. As such, this work not only contributes to the state of the art but also holds promise for revolutionizing the landscape of enterprise financial management.
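The multi-head self-attention at the core of the Transformer feature extractor can be sketched as below. This is a minimal NumPy rendering of standard scaled dot-product multi-head attention; the paper's exact network configuration is not given here, so the projection matrices and head count are illustrative.

```python
import numpy as np

def multi_head_attention(x, wq, wk, wv, wo, num_heads):
    """Scaled dot-product multi-head self-attention.

    x: (seq_len, d_model); wq/wk/wv/wo: (d_model, d_model) projections.
    """
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    def split(m):  # (seq, d_model) -> (heads, seq, d_head)
        return m.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)   # (heads, seq, seq)
    scores -= scores.max(axis=-1, keepdims=True)          # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)              # row-wise softmax
    out = attn @ v                                        # (heads, seq, d_head)
    out = out.transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ wo
```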
An improved differential evolution algorithm for multi-modal multi-objective optimization
https://peerj.com/articles/cs-1839
Published: 2024-03-14
Authors: Dan Qu, Hualin Xiao, Huafei Chen, Hongyi Li
Multi-modal multi-objective problems (MMOPs) have gained much attention during the last decade. These problems have two or more global or local Pareto optimal sets (PSs), some of which map to the same Pareto front (PF). This article presents a new affinity propagation clustering (APC) method based on the multi-modal multi-objective differential evolution (MMODE) algorithm, called MMODE_AP, evaluated on the suite of CEC’2020 benchmark functions. First, two adaptive mutation strategies are adopted to balance exploration and exploitation and improve diversity in the evolution process. Then, affinity propagation clustering is adopted to define the crowding degree in decision space (DS) and objective space (OS). Meanwhile, the non-dominated sorting scheme incorporates a particular crowding distance to truncate the population during environmental selection, which yields well-distributed solutions in both DS and OS. Moreover, the local PF membership of each solution is defined, and a predefined parameter is introduced to maintain the local PSs and the solutions around the global PS. Finally, the proposed algorithm is run on the suite of CEC’2020 benchmark functions for comparison with several MMODE algorithms. According to the experimental results, the proposed MMODE_AP algorithm outperforms its competitors on about 20 of the benchmark functions in terms of the reciprocal of Pareto sets proximity (rPSP) and the inverted generational distance (IGD) in the decision (IGDX) and objective (IGDF) spaces. The proposed algorithm can efficiently achieve the two goals, i.e., convergence to the true local and global Pareto fronts along with better-distributed Pareto solutions on those fronts.
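The abstract does not spell out its particular crowding distance, but the standard NSGA-II crowding distance it builds on can be sketched as below (applicable equally in decision or objective space):

```python
import numpy as np

def crowding_distance(objs):
    """NSGA-II crowding distance for points in objective (or decision) space.

    objs: (n, m) array of n solutions with m objectives. Boundary solutions
    get infinite distance; interior ones accumulate the normalized side
    length of the cuboid formed by their nearest neighbours per objective.
    """
    n, m = objs.shape
    dist = np.zeros(n)
    for j in range(m):
        order = np.argsort(objs[:, j])
        span = objs[order[-1], j] - objs[order[0], j]
        dist[order[0]] = dist[order[-1]] = np.inf   # keep extreme solutions
        if span > 0:
            dist[order[1:-1]] += (objs[order[2:], j] - objs[order[:-2], j]) / span
    return dist
```

Truncating the population by keeping the solutions with the largest crowding distance is what preserves a well-spread set in both spaces.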
SUTrans-NET: a hybrid transformer approach to skin lesion segmentation
https://peerj.com/articles/cs-1935
Published: 2024-03-13
Authors: Yaqin Li, Tonghe Tian, Jing Hu, Cao Yuan
Melanoma is a malignant skin tumor that threatens human life and health. Early detection is essential for effective treatment. However, the low contrast between melanoma lesions and normal skin, together with irregularity in size and shape, makes skin lesions difficult to detect with the naked eye in the early stages, making skin lesion segmentation challenging. Traditional encoder-decoder architectures built on U-shaped convolutional neural networks (CNNs) have limitations in establishing long-term dependencies and global contextual connections, while the Transformer architecture is limited in its application to small medical datasets. To address these issues, we propose a new skin lesion segmentation network, SUTrans-NET, which combines CNN and Transformer branches in parallel to form a dual encoder, in which both branches perform dynamic interactive fusion of image information at each layer. At the same time, we introduce our multi-grouping SpatialGroupAttention (SGA) module to complement the spatial and texture information of the Transformer branch, and utilize the Focus idea of YOLOv5 to construct the Patch Embedding module in the Transformer to prevent the loss of pixel accuracy. In addition, we design a decoder with full-scale information fusion capability to fully fuse shallow and deep features from different stages of the encoder. The effectiveness of our method is demonstrated on the ISIC 2016, ISIC 2017, ISIC 2018 and PH2 datasets, and its advantages over existing methods are verified.
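The Focus idea borrowed from YOLOv5 is a lossless space-to-depth rearrangement: every pixel survives, so no accuracy is lost at the patch-embedding stage. A minimal NumPy sketch (assuming even height and width; the convolution that normally follows the slicing is omitted):

```python
import numpy as np

def focus(x):
    """YOLOv5-style Focus slicing: (C, H, W) -> (4C, H/2, W/2).

    Every 2x2 pixel neighbourhood is distributed across four channel
    groups, halving resolution without discarding any pixel values.
    """
    return np.concatenate(
        [x[:, ::2, ::2], x[:, 1::2, ::2], x[:, ::2, 1::2], x[:, 1::2, 1::2]],
        axis=0,
    )
```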
Heart failure survival prediction using novel transfer learning based probabilistic features
https://peerj.com/articles/cs-1894
Published: 2024-03-12
Authors: Azam Mehmood Qadri, Muhammad Shadab Alam Hashmi, Ali Raza, Syed Ali Jafar Zaidi, Atiq ur Rehman
Heart failure is a complex cardiovascular condition characterized by the heart’s inability to pump blood effectively, leading to a cascade of physiological changes. Predicting survival in heart failure patients is crucial for optimizing patient care and resource allocation. This research aims to develop a robust survival prediction model for heart failure patients using advanced machine learning techniques. We analyzed data from 299 hospitalized heart failure patients, addressing the issue of imbalanced data with the Synthetic Minority Oversampling Technique (SMOTE). Additionally, we proposed a novel transfer learning-based feature engineering approach that generates a new probabilistic feature set from patient data using ensemble trees. Nine fine-tuned machine learning models are built and compared to evaluate performance in patient survival prediction. Our novel transfer learning mechanism applied to the random forest model outperformed other models and state-of-the-art studies, achieving a remarkable accuracy of 0.975. All models underwent evaluation using 10-fold cross-validation and tuning through hyperparameter optimization. The findings of this study have the potential to advance the field of cardiovascular medicine by providing more accurate and personalized prognostic assessments for individuals with heart failure.
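The core SMOTE step, synthesizing minority-class samples by interpolating between a sample and one of its nearest minority-class neighbours, can be sketched as below. This is a minimal NumPy illustration rather than a production implementation such as imbalanced-learn's; the helper name and the default `k` are assumptions of the sketch.

```python
import numpy as np

def smote_oversample(minority, n_new, k=3, rng=None):
    """Minimal SMOTE: synthesize n_new minority samples by interpolating
    between a randomly picked sample and one of its k nearest neighbours."""
    if rng is None:
        rng = np.random.default_rng(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        d = np.linalg.norm(minority - minority[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]        # skip the point itself
        j = rng.choice(neighbours)
        gap = rng.random()                         # interpolation factor in [0, 1)
        synthetic.append(minority[i] + gap * (minority[j] - minority[i]))
    return np.array(synthetic)
```

Because each synthetic point is a convex combination of two real minority samples, it always lies inside the minority class's bounding box rather than in arbitrary feature-space regions.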
Efficient-gastro: optimized EfficientNet model for the detection of gastrointestinal disorders using transfer learning and wireless capsule endoscopy images
https://peerj.com/articles/cs-1902
Published: 2024-03-11
Authors: Shaha Al-Otaibi, Amjad Rehman, Muhammad Mujahid, Sarah Alotaibi, Tanzila Saba
Gastrointestinal diseases cause around two million deaths globally. Wireless capsule endoscopy is a recent advancement in medical imaging, but manual diagnosis is challenging due to the large number of images generated. This has led to research into computer-assisted methodologies for diagnosing these images. Endoscopy produces thousands of frames for each patient, making manual examination difficult, laborious, and error-prone. An automated approach is essential to speed up the diagnosis process, reduce costs, and potentially save lives. This study proposes transfer learning-based efficient deep learning methods for detecting gastrointestinal disorders from multiple modalities, aiming to detect gastrointestinal diseases with superior accuracy and reduce the efforts and costs of medical experts. The Kvasir eight-class dataset was used for the experiment, where endoscopic images were preprocessed and enriched with augmentation techniques. An EfficientNet model was optimized via transfer learning and fine-tuning, and the model was compared to the most widely used pre-trained deep learning models. The model’s efficacy was tested on another independent endoscopic dataset to prove its robustness and reliability.
Designing defensive techniques to handle adversarial attack on deep learning based model
https://peerj.com/articles/cs-1868
Published: 2024-03-08
Authors: Dhairya Vyas, Viral V. Kapadia
Adversarial attacks pose a significant challenge to deep neural networks used in image classification systems. Although deep learning has achieved impressive success in various tasks, it can easily be deceived by adversarial patches created by adding subtle yet deliberate distortions to natural images. These attacks are designed to remain hidden from both human and computer-based classifiers. Considering this, we propose novel model designs that enhance adversarial robustness by incorporating feature denoising blocks. Specifically, the proposed model utilizes Gaussian data augmentation (GDA) and spatial smoothing (SS) to denoise the features. These techniques are complementary and can be combined in a joint framework to achieve superior recognition accuracy under adversarial attack while also balancing other defenses. We tested the proposed approach on the ImageNet and CIFAR-10 datasets using 10-iteration projected gradient descent (PGD), the fast gradient sign method (FGSM), and DeepFool attacks. The proposed method achieved an accuracy of 95.62% in under four minutes, which is highly competitive compared to existing approaches. We also conducted a comparative analysis with existing methods.
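Of the attacks evaluated, FGSM is the simplest: one signed gradient step on the input (PGD iterates this step with projection). A toy NumPy sketch on a logistic-regression "network", illustrative rather than the paper's actual setup:

```python
import numpy as np

def fgsm_attack(x, y, w, b, eps):
    """Fast gradient sign method against a logistic-regression model.

    Perturbs input x by eps in the sign of the cross-entropy loss
    gradient with respect to x, which maximally increases the loss
    under an L-infinity budget of eps.
    """
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))   # predicted P(y = 1)
    grad_x = (p - y) * w                     # d(cross-entropy)/dx
    return x + eps * np.sign(grad_x)
```

Feature-denoising defenses such as GDA and spatial smoothing aim to wash out exactly these small, structured perturbations before they propagate through the network.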
Predicting Chinese stock market using XGBoost multi-objective optimization with optimal weighting
https://peerj.com/articles/cs-1931
Published: 2024-03-08
Authors: Jichen Liu
The application of artificial intelligence (AI) technology in various fields has been a recent research hotspot. As a representative AI technology, the application of machine learning models in economics and finance undoubtedly holds significant research value. This article proposes the Extreme Gradient Boosting Multi-Objective Optimization Model with Optimal Weights (OW-XGBoost) to comprehensively balance the returns and risks of investment portfolios. The model fuses labels with optimal weights to handle multi-objective tasks, effectively controlling the influence of various risk and return indicators on the model and thus improving its interpretability and generalization ability. In the experiments, we tested the model using China A-share data from October 2022 to April 2023 and conducted a series of robustness tests. The results indicate that: (1) OW-XGBoost outperforms the XGBoost Model with Yield as Label (YL-XGBoost) and the XGBoost Multi-Label Classification Model (MLC-XGBoost) in controlling risk or achieving returns. (2) OW-XGBoost performs better overall than the baseline models. (3) The robustness tests demonstrate that the model performs well under different market conditions, stock pools, and training set durations; it performs best in moderately fluctuating markets, in stock pools comprising high-market-value stocks, and with training set durations measured in months. The methodology and results of this study provide a new perspective and approach for fundamental quantitative investment and open new avenues for the integration of AI, machine learning, and quantitative financial research.
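The abstract does not give the label-fusion formula, so the sketch below assumes a simple linear fusion of z-scored return and risk indicators with a grid-searched weight; `fused_label`, `pick_alpha`, and the scoring hook are all hypothetical names introduced for illustration.

```python
import numpy as np

def fused_label(returns, risk, alpha):
    """Fuse a return indicator and a risk indicator into one regression
    label, z-scoring both so neither scale dominates."""
    def z(v):
        return (v - v.mean()) / (v.std() + 1e-12)
    return alpha * z(returns) - (1.0 - alpha) * z(risk)

def pick_alpha(returns, risk, score_fn, grid=np.linspace(0.0, 1.0, 11)):
    """Grid-search the fusion weight that maximizes a user-supplied
    portfolio score (e.g., a backtested risk-adjusted return)."""
    return max(grid, key=lambda a: score_fn(fused_label(returns, risk, a)))
```

The single fused label then trains a standard XGBoost regressor, so the multi-objective trade-off is resolved before model fitting rather than inside it.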
Electroencephalography (EEG) based epilepsy diagnosis via multiple feature space fusion using shared hidden space-driven multi-view learning
https://peerj.com/articles/cs-1874
Published: 2024-03-07
Authors: Xiujian Hu, Yicheng Xie, Hui Zhao, Guanglei Sheng, Khin Wee Lai, Yuanpeng Zhang
Epilepsy is a chronic, non-communicable disease caused by paroxysmal, abnormally synchronized electrical activity of brain neurons, and is one of the most common neurological diseases worldwide. Electroencephalography (EEG) is currently a crucial tool for epilepsy diagnosis. With the development of artificial intelligence, multi-view learning-based EEG analysis has become an important method for automatic epilepsy recognition, because EEG contains diverse types of features such as time-frequency, frequency-domain, and time-domain features. However, current multi-view learning still faces challenges, for example when the difference between samples of the same class from different views is greater than the difference between samples of different classes from the same view. In view of this, we propose a shared hidden space-driven multi-view learning algorithm. The algorithm uses kernel density estimation to construct a shared hidden space and combines it with the original space to obtain an expanded space for multi-view learning. By constructing the expanded space and exploiting the information of both the shared hidden space and the original space, the relevant information of samples within and across views can be fully utilized. Experimental results on an epilepsy dataset provided by the University of Bonn show that the proposed algorithm has promising performance, with an average classification accuracy of 0.9787, at least a 4% improvement over single-view methods.
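How a kernel density estimate can furnish shared hidden features that are concatenated with the original space is sketched below. This is an illustrative NumPy construction under the assumption of equal view dimensionality, not the paper's exact algorithm; the function names and the single-bandwidth Gaussian kernel are assumptions of the sketch.

```python
import numpy as np

def kde_hidden_feature(views, query, bandwidth=1.0):
    """Shared hidden representation for `query`: the Gaussian kernel
    density estimate of each view's sample set, evaluated at the query.

    views: list of (n_samples, d) arrays, one per view; query: (d,).
    Returns one density value per view, so all views contribute to a
    common low-dimensional descriptor.
    """
    hidden = []
    for v in views:
        diff = v - query
        dens = np.exp(-0.5 * np.sum(diff ** 2, axis=1) / bandwidth ** 2).mean()
        hidden.append(dens)
    return np.array(hidden)

def expanded_space(views, query):
    """Expanded representation = original features ++ shared hidden features."""
    return np.concatenate([query, kde_hidden_feature(views, query)])
```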
Design of smart citrus picking model based on Mask RCNN and adaptive threshold segmentation
https://peerj.com/articles/cs-1865
Published: 2024-03-04
Authors: Ziwei Guo, Yuanwu Shi, Ibrar Ahmad
Smart agriculture is steadily progressing towards automation and heightened efficacy. The rapid ascent of deep learning technology provides a robust foundation for this trajectory. Leveraging computer vision and deep learning techniques enables real-time monitoring and management within agriculture, facilitating swift detection of plant growth and autonomous assessment of ripeness. In response to the demands of smart agriculture, this article addresses automated citrus harvesting, presenting an ATT-MRCNN target detection model that integrates channel attention and spatial attention mechanisms for detecting and identifying citrus images. The framework applies Mask Region-based CNN (Mask R-CNN) to diverse citrus image categories, enhancing the model’s efficacy through the incorporation of attention mechanisms. During the model’s training phase, transfer learning is utilized to improve parameter initialization and training efficiency. Empirical results demonstrate that this method achieves a recognition rate surpassing the 95% threshold across the three sensory recognition tasks, providing invaluable algorithmic support and essential guidance for the imminent era of intelligent harvesting.
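The spatial-attention half of the channel-plus-spatial scheme can be sketched in NumPy as below (CBAM-style; the 7x7 convolution usually applied to the pooled maps is replaced by a plain sum for brevity, an assumption of this sketch, and the actual ATT-MRCNN module is not reproduced here):

```python
import numpy as np

def spatial_attention(feature_map):
    """CBAM-style spatial attention over a (C, H, W) feature map.

    Average- and max-pooling across channels give two (H, W) maps;
    their sum is squashed with a sigmoid into a mask in (0, 1) that
    reweights every spatial location across all channels.
    """
    avg_map = feature_map.mean(axis=0)                  # (H, W)
    max_map = feature_map.max(axis=0)                   # (H, W)
    mask = 1.0 / (1.0 + np.exp(-(avg_map + max_map)))   # sigmoid
    return feature_map * mask[None, :, :]
```

Pairing this with a channel-attention block, applied channel-first then spatially, is the usual way the two mechanisms are combined in detection backbones.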