Deep learning-based anomaly detection using one-dimensional convolutional neural networks (1D CNN) in machine centers (MCT) and computer numerical control (CNC) machines
- Academic Editor
- Paulo Jorge Coelho
- Subject Areas
- Human-Computer Interaction, Artificial Intelligence, Data Mining and Machine Learning, Data Science, Neural Networks
- Keywords
- Long short-term memory, Deep learning, 1D convolutional neural network, Machine learning, Computer numerical controls (CNCs), Machine center (MCT)
- Copyright
- © 2024 Athar et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
- Cite this article
- Athar et al. 2024. Deep learning-based anomaly detection using one-dimensional convolutional neural networks (1D CNN) in machine centers (MCT) and computer numerical control (CNC) machines. PeerJ Computer Science 10:e2389 https://doi.org/10.7717/peerj-cs.2389
Abstract
Computer numerical control (CNC) and machine center (MCT) machines are mechanical devices that manipulate different tools using computer programming as inputs. Predicting failures in CNC and MCT machines before they occur is crucial to reduce maintenance costs and increase productivity. This study centers on a novel deep learning-based model using a 1D convolutional neural network (CNN) for early fault detection in MCT machines. We collected sensor-based data from CNC/MCT machines and applied various preprocessing techniques to prepare the dataset. Our experimental results demonstrate that the 1D-CNN model achieves an accuracy of 91.57%, higher than traditional machine learning classifiers and other deep learning models, including Random Forest (RF) at 89.71%, multi-layer perceptron (MLP) at 87.45%, XGBoost at 89.67%, logistic regression (LR) at 75.93%, support vector machine (SVM) at 75.96%, K-nearest neighbors (KNN) at 82.93%, decision tree at 88.36%, naïve Bayes at 68.31%, long short-term memory (LSTM) at 90.80%, and a hybrid 1D CNN + LSTM model at 88.51%. Moreover, our proposed 1D CNN model outperformed all other mentioned models in precision, recall, and F1-score, with 91.87%, 91.57%, and 91.63%, respectively. These findings highlight the efficacy of the 1D CNN model in providing optimal performance on an MCT machine dataset, making it particularly suitable for small manufacturing companies seeking to automate early fault detection and classification in CNC and MCT machines. This approach enhances productivity and aids in proactive maintenance and safety measures, demonstrating its potential to transform the manufacturing industry.
Introduction
Managing failures in manufacturing processes is essential in the manufacturing industry. Manufacturing machines can cause losses in the manufacturing process, sometimes because of their high complexity or because of defects. An anomaly during production is detrimental to a company because it affects its overall success. In the case of a manufacturing machine's failure, an alternative solution such as a spare system must be available for emergencies. Such spares are very costly and require regular maintenance, even if they remain unused. Therefore, manufacturing industries seek cost-effective and efficient ways to handle machine failures. Predictive maintenance is one option used for this purpose; it provides a promising path to anticipating machine failures and avoiding anomalies in the manufacturing process (Barlow & Hunter, 1960; Hoppenstedt et al., 2018; Saxena et al., 2008).
MCT and CNC machines are used in manufacturing processes to machine different metals. These machines are operated through code written on a PC via a programming application, and various kinds of commands are generated through this code to perform different operations. Multiple data types are produced by these machines, such as temperature, vibration, power, spindle, and noise data. Many researchers have shown that, among all these data types, vibration, spindle, energy, and noise data are the main indicators of machine failure (Jiang et al., 2019). Figure 1 shows a picture of an MCT machine (Kim et al., 2022).
Figure 1: Overview of machine center machine.
Image source credit: vertical 5-Axis CNC machining centres - Mills CNC.
Moreover, the trend toward machine learning raises the question of whether manufacturing machine failure can be predicted, thus detecting anomalies in the manufacturing process. Although the manufacturing industry has witnessed some technical developments to address these defects and failures, their practical application remains challenging for many reasons (Hoppenstedt et al., 2018; Kammerer et al., 2019). Based on these considerations, our study presents a real-world case conducted in cooperation with a manufacturing company. The company uses computer numerical control (CNC) and machine center (MCT) machines to drill metals. Several defects can be observed during these manufacturing processes, reducing the devices' efficiency. Thus, for a company, defect detection and prevention remain of utmost importance as they increase the productivity and efficiency of its machines. However, defect detection in these machines remains a complex task that requires considerable time and effort. Thus far, many manufacturing companies have employed selected experts to detect anomalies based on their long-term experience. However, such experts are generally expensive and sometimes ineffective. Moreover, these experts require too much time to observe several machines. Based on sensor data collected from MCT machines, we aim to minimize manual decisions regarding defect detection and improve productivity (Feng et al., 2019; Sadaf, Athar & Azam, 2016; Schlechtendahl et al., 2015; Vodenčarević et al., 2011).
The company’s real-world data are related to an MCT machine that monitors records during the drilling of the metals. The process must be continuously monitored to reduce costly downtime. Continuous monitoring generates large amounts of sensor data related to several MCT machine components.
This study aimed to assess and predict the early failure of MCT machines. Training our proposed one-dimensional convolutional neural network (1D-CNN) based model on sensor-based data collected from a manufacturing machine so that it can predict defects is the core contribution of our research. Our experimental results demonstrate the 1D CNN's superior performance in anomaly detection for MCT machines. The model's ability to accurately identify anomalies, even with limited training data, underscores its practical value in real-world manufacturing settings. This innovation enhances the fault diagnosis process and contributes to the broader adoption of deep learning techniques in the industrial sector. In the past, researchers have proposed different techniques to predict early defects in these machines. Still, there are some drawbacks: some studies used only vibration data and others used only spindle values. In addition, some researchers did not apply feature engineering techniques. Moreover, there is a need for comprehensive evaluation metrics that go beyond accuracy to include precision, recall, F1-score, and statistical significance tests. Considering these factors, we have tried to fill the research gap by using a combination of three types of data obtained from the CNC and MCT machine, and we also applied feature engineering techniques to enhance the performance of our proposed model. This 1D-CNN-based deep-learning (Du, 2018; Lei, Pan & Huang, 2019) model assists small manufacturing companies in predicting anomalies in CNC and MCT machines early. Furthermore, we have conducted paired t-tests to statistically evaluate the performance differences between the proposed 1D CNN model and other traditional machine learning and deep learning models. This evaluation not only demonstrates our model's superior performance in terms of precision, recall, and F1-score but also confirms the statistical significance of the observed improvements, thereby reinforcing the reliability and robustness of our approach.
A summary of our work is provided below:
- Applying a state-of-the-art 1D CNN deep learning model to a custom sensor dataset from a private manufacturing company's CNC and MCT machine.
- We are among the first to use a combination of spindle, power, and vibration data, as it is easy for small companies to record spindle values in MCT and CNC machines.
- After applying different preprocessing techniques, we extracted seven features, a unique combination that improved the accuracy of our proposed model. To the best of our knowledge, this combination has not been used before in research on manufacturing machines.
- To address the class imbalance problem, we utilized the class weight technique, which helped balance the effective and defective classes.
- We trained a state-of-the-art 1D CNN deep learning model and compared its performance with eight machine learning classifiers, including Random Forest (RF), XGBoost, multi-layer perceptron (MLP), logistic regression (LR), K-nearest neighbors (KNN), support vector machine (SVM), decision tree, and naïve Bayes.
- Furthermore, the proposed model's performance was compared with other deep learning models, namely long short-term memory (LSTM) and a hybrid 1D CNN and LSTM (1D CNN + LSTM) model.
- Finally, we conducted paired t-tests to statistically evaluate the performance differences between the proposed 1D CNN model and all other models.
The remainder of this paper is organized as follows:
Related work is discussed in ‘Related Work’, and ‘Materials & Methods’ describes the methods and materials used. ‘Proposed Algorithm’ briefly explains the proposed algorithm. The results are given in ‘Results’, and finally, ‘Conclusions and Future Work’ provides conclusions and future directions.
Related Work
Various machine and deep learning algorithms have been employed with notable success in anomaly detection within manufacturing processes. However, the 1D-CNN utilized in our research presents distinct advantages over these methods, particularly in CNC and MCT machines and other manufacturing applications.
One-dimensional convolutional neural networks (1D CNNs) are particularly well-suited for analyzing sequential data, such as the time-series data generated by sensors in MCT and CNC machines. Unlike traditional machine learning methods, which often rely on handcrafted features, 1D CNNs automatically learn the temporal dependencies and patterns directly from the raw data. This capability is crucial for capturing the dynamic behavior of manufacturing processes, where anomalies may manifest as subtle temporal variations in the sensor readings (Hussain, Ali & Kim, 2022a; Yao et al., 2017).
Machine-learning algorithms detect anomalies using different datasets produced by manufacturing machines. Recently, Pittino et al. (2020) investigated anomaly detection methods for in-production manufacturing machines. They demonstrated that the nature of the available data, and whether it featured any anomaly, was vital for the choice of algorithm. In their study, automatic anomaly detection methods using statistical machine learning and control charts were developed and deployed effectively in an in-production environment. They also used support vector machine-based fault diagnostics for induction motors, based on vibration and current signals and critical fault detection features. These features were fed into the SVM as inputs to diagnose faults, and the obtained model was tested under several conditions and proven highly effective (Pittino et al., 2020). Chen et al. (2019) proposed a method for predicting transient stability in power systems using the XGBoost machine learning model. By simulating the New England 10-machine 39-bus power system with PSASP, the study generates time-domain data to train the model. The approach leverages feature selection techniques to enhance model performance and emphasizes interpretability through decision rules and feature importance scores. Their proposed XGBoost model outperforms other machine learning classifiers, including RF, DT, SVM, and neural networks (NN), in accuracy and efficiency (Chen et al., 2019). Using NASA hypertext transfer protocol (HTTP) log datasets, Henriques et al. (2020) combined XGBoost and K-means algorithms to detect anomalies in the dataset. The resulting model can be used for forensic and compliance auditing analyses in security management (Henriques et al., 2020). Gao, Ma & Yang (2002) used classifiers to assign anomaly labels to a set of k-length windows obtained from a training dataset; the classifier detects anomalous windows that are either ignored or assigned a defect label (Gao, Ma & Yang, 2002). For sequences of sensor data collected from an operational aircraft system, anomaly detection techniques were applied to handle a sequence-based problem formulation and identify faulty functional runs (Budalakoti et al., 2006; Budalakoti, Srivastava & Otey, 2008; Sriastava, 2005). Selvaraj & Min (2023) retrofitted a CNC machine to monitor energy consumption and detect defects in real time. Using data from a power meter, they extracted different features across the time, frequency, and time-frequency domains. They used five machine learning classifiers for anomaly detection: DT, RF, k-NN, SVM, and bagging classifiers. Their proposed model achieved high accuracy and was deployed using AWS for real-time monitoring (Selvaraj & Min, 2023).
Deep learning technology has risen quickly and achieved significant success in several domains, including object detection, intelligent robots, saliency detection, sound event detection for city safety, UAV blade fault diagnostics, and parking garage sound event detection. Researchers have used deep learning methods to detect product defects and have improved productivity and quality (Yang et al., 2020). Intelligent manufacturing, a cornerstone of Industry 4.0, heavily relies on machine vision-based surface defect detection to ensure product quality. CNNs play a pivotal role in this context by providing robust and automated solutions for detecting surface defects across diverse industries, thus supporting the goals of Industry 4.0 (Khanam et al., 2024). The research of Tao et al. (2018) makes two contributions: first, a multitask 1D-CNN system for wire defect detection, which combines object detection and object categorization networks; second, they suggested that deep learning should be applied in this area first. A real-world dataset from an industrial production plant shows how well the proposed approach identifies defects (Tao et al., 2018). Kim et al. (2022) proposed an abnormal data classification method for intelligent devices in CNC machines. They proposed a hybrid method that combined singular spectrum analysis (SSA) techniques, which were very efficient in decreasing the noise in time-series data, with a convolutional autoencoder (CAE), which performed very well on their time-series data (Kim et al., 2022). Jiang et al. (2022) presented a novel approach to contour-error prediction using deep learning and reinforcement learning. Gubernatorov et al. proposed a method to detect faults in CNC machine gearboxes using vibration data. They developed a CNN-based VGG model that distinguishes between healthy (OK) and faulty (NOK) processes during high-speed machine operations. Their research integrates the model into an edge-to-cloud architecture for automated fault detection and data annotation (Gubernatorov & Gavrilenkov, 2024). Chengyang, Sitong & Wansheng (2021) proposed a method in which the axial and radial thermal errors of horizontal and vertical spindles are modeled using a deep-learning convolutional neural network (CNN). Their model blends thermocouple data with thermal images to accurately reflect the temperature field of the spindle (Chengyang, Sitong & Wansheng, 2021). Zhao et al. (2021) proposed an improved target defect detection algorithm for steel surfaces. They improved the target detection accuracy on the NEU-DET dataset by using the R-CNN algorithm (Zhao et al., 2021).
From anomaly detection to natural language processing, deep learning has been effectively used in time-series modeling and prediction problems in various application domains. Recurrent neural networks (RNNs) were developed to address the limitations of conventional feed-forward neural networks for modeling sequences. However, LSTM networks have recently become popular for sequence and sequence-to-sequence learning because of the well-known vanishing gradient problem of RNNs (Brunelli et al., 2019). Jiang et al. (2022) used the internal data of CNC machines to track errors in the feed axis and modeled them with a nonlinear auto-regressive (NAR) LSTM. Masci et al. (2013) proposed a multiscale pooling technique that can detect flaws on steel surfaces and accepts images of various sizes as inputs. Their proposed method outperformed many traditional methods available in the market (Masci et al., 2013). Sharma, Chawla & Ram (2020) developed an automatic CNC program for machining multiple types of holes in CNC machines using support vector machines and restricted Boltzmann machines (RBM) with a deep belief network (DBN). Li, Bedi & Melek (2023) proposed an LSTM autoencoder-based method to detect anomalies in CNC machines, utilizing transfer learning for this purpose. Their initial trained model could distinguish between stable and unstable cutting conditions based on vibration data obtained from an accelerometer. They introduced additional layers and re-trained their model to enhance the detection accuracy. Experimental results validate the ability of their proposed model to detect anomalies, with performance metrics showing promise for industrial application in the automated manufacturing industry (Li, Bedi & Melek, 2023).
Table 1 provides a comprehensive comparative analysis of several research studies, summarizing the machine used, dataset, model, and contribution of each study.
Paper | Machine used | Dataset | Proposed model | Features used | Summary & Findings |
---|---|---|---|---|---|
Pittino et al. (2020) | Induction motors | Vibration and current signals | SVM | Average, standard deviation, maximum, minimum, kurtosis | The study demonstrates the effectiveness of combining control charts and anomaly classification algorithms to detect anomalies in the DEU. |
Chen et al. (2019) | 10-machine 39-bus power system | Generators' speeds, rotor angles, power | XGBoost | Rotor angle, kinetic energy | Utility of XGBoost in advancing transient stability prediction in power systems, offering a robust framework that balances computational efficiency with interpretability, thus enhancing its applicability in real-time operations and emergency response scenarios. |
Henriques et al. (2020) | NASA Kennedy Space Centre | NASA HTTP logs | XGBoost, K-means | IP address, page, operation, method, response | The proposed method combines k-means clustering and XGBoost classification to detect anomalies in large log datasets efficiently. It leverages parallel computing to handle big data scenarios, providing a scalable solution with interpretable results. |
Selvaraj & Min (2023) | CNC/MCT machines | Power consumption data | Decision tree, random forest, K-NN, SVM, bagging | Time-domain, frequency-domain, time-frequency-domain features | Fault detection in CNC machines by leveraging power consumption data and deploying advanced machine learning models. |
Kim et al. (2022) | CNC machine | Vibration data | Convolutional autoencoder | N/A | This paper demonstrates a robust approach to abnormal data classification in CNC machines, leveraging advanced deep learning techniques and efficient data preprocessing methods to achieve high accuracy in real-world applications. |
Gubernatorov & Gavrilenkov (2024) | CNC machine | Vibration data | VGG | Time-domain, frequency-domain features | Detects CNC machine gearbox faults using vibration data and a VGG model, leveraging raw time-domain and frequency-domain features. |
Brunelli et al. (2019) | CNC machines | CAD files containing information about the dimensions and features of the holes | Restricted Boltzmann machine (RBM), SVM | Tool characteristics | The study concludes that RBM with DBN is more suitable for large datasets and offers better accuracy and classification results, while SVM is more suitable for small datasets. |
Jiang et al. (2022) | CNC/MCT machine | Contour errors | NAR-LSTM, time-series Deep Q-Network | Reference positions of X & Y axes, velocity, jerk | Effectively integrates deep learning and reinforcement learning to model and compensate for errors in CNC machining. |
Li, Bedi & Melek (2023) | CNC machine | Vibration data | LSTM autoencoder | Frequency-domain features | Proposed an LSTM autoencoder network to detect anomalies in CNC machines. |
Materials & Methods
We followed different steps in this study. Each is explained below:
Study design and subjects
This was a cross-sectional, multicenter observational study carried out at Inje University. The institutional review board of Kyungnam University approved this study, and all the participants provided consent to participate.
Data collection and data preprocessing
We used datasets from a single MCT machine, identified as number 0K43A16211, provided by the manufacturing company. To ensure a controlled environment where the machine’s operational parameters and external conditions remained consistent, we utilized a one-day labeled dataset from July 7, 2021. Also, its representative nature, including defective and effective classes, made it sufficient for our model’s training and validation. It demonstrates the model’s capability to detect anomalies with limited data, reflecting real-world manufacturing scenarios where extensive historical data may not always be available.
The company collected the data using three sensors: current sensors for measuring power, tachometers for spindle speed, and accelerometers for vibration levels. The sensors recorded data at an average rate of 10 Hz, providing a detailed and continuous data stream. This sampling frequency allowed for a thorough and granular analysis of the machine's operational parameters.
Our dataset comprises 26,126 records, divided into effective (normal) and defective (abnormal) classes. Table 2 illustrates the division of the two classes.
Data processing and cleaning are essential steps in data science and play a vital role in increasing the overall performance of AI models. In the data extraction process, meaningless data were removed to achieve better results. In our dataset, unnecessary attributes not required for further processing were removed.
We used a one-hot (Potdar, Pardawala & Pai, 2017) representation to label the defective and effective classes: '0' represents the defective class, and '1' represents the effective class.
Our dataset contained some missing values, particularly in the power feature set. Incorrect handling of these missing values yields a less accurate model and can decrease the overall performance of machine-learning models (Rahm & Do, 2000). We used the column median to fill the missing values in our dataset. Finally, we used StandardScaler (Athar et al., 2021; Ghosh et al., 2021) to transform the values of each column to have a mean of 0 and a standard deviation of 1. Figure 2 shows the steps involved in the data preprocessing, and a minimal sketch of these steps is given below.
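The following sketch illustrates the median imputation, label encoding, and standardization steps described above; it assumes a pandas DataFrame with hypothetical column names ('power', 'spindle', 'vibration', 'label') rather than the company's actual schema.

```python
# A minimal preprocessing sketch, assuming hypothetical column names.
import pandas as pd
from sklearn.preprocessing import StandardScaler

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    sensor_cols = ["power", "spindle", "vibration"]            # assumed column names
    # Fill missing sensor readings (e.g., in the power feature) with the column median
    df[sensor_cols] = df[sensor_cols].fillna(df[sensor_cols].median())
    # Encode labels: 0 = defective, 1 = effective (string labels are an assumption)
    df["label"] = df["label"].map({"defective": 0, "effective": 1})
    # Standardize each sensor column to zero mean and unit standard deviation
    df[sensor_cols] = StandardScaler().fit_transform(df[sensor_cols])
    return df
```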
Experimental procedure
We processed and analyzed our dataset using a system with the following specifications: Windows 10 Pro-64-bit, 2.50 GHz Core i7-11700 processor, 16 GB RAM, Python, and TensorFlow 2.7.0, manufactured by HP and sourced from Gimhae, South Korea.
Statistical feature engineering
The model's performance depends on the features selected after data preprocessing. Identifying useful features and removing unnecessary ones from large amounts of data is difficult. Overfitting in a machine learning model can be avoided by selecting a small subset of high-quality features (Hussain et al., 2022b). Seven additional statistical features were computed for each of the three basic features: mean, standard deviation, minimum, maximum, energy measure, interquartile range, and skewness (Kankar, Sharma & Harsha, 2011; Rafiee, Rafiee & Tse, 2010). With these seven features, the 1D CNN model achieved its best performance, as already shown in the study by Hussain, Ali & Kim (2022a). A sketch of this computation is given below.
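As a rough illustration, the snippet below computes the seven per-window statistics for a single sensor signal; the window length of 30 mirrors the data format described later, and the energy-measure definition is an assumption.

```python
# Sketch of the statistical feature engineering step for one signal.
import numpy as np
import pandas as pd
from scipy.stats import skew, iqr

def window_features(signal: np.ndarray, window: int = 30) -> pd.DataFrame:
    rows = []
    for start in range(0, len(signal) - window + 1, window):
        w = signal[start:start + window]
        rows.append({
            "mean": w.mean(),
            "sd": w.std(),
            "min": w.min(),
            "max": w.max(),
            "em": np.sum(w ** 2) / len(w),   # energy measure (assumed definition)
            "ir": iqr(w),                     # interquartile range
            "sk": skew(w),                    # skewness
        })
    return pd.DataFrame(rows)
```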
Class weight technique
Different techniques are used for data balancing, including the class weight technique. Using this technique, a weighting scheme is incorporated into the loss function. Depending on the degree of imbalance, weights are assigned to both the majority and minority classes. A threshold value should be defined to maintain balance among the classes; this threshold helps increase or decrease the class weights and ultimately prevents the model from becoming biased toward any particular class (Hussain, Ali & Kim, 2022a). The class weight is expressed in Eq. (1):

$w_i = \dfrac{n\_instances}{n\_classes \times n\_instances_i}$ (1)

where
- $w_i$ = weight of class $i$, and $i$ = each class,
- $n\_instances$ = total number of instances in the whole dataset,
- $n\_classes$ = total number of classes,
- $n\_instances_i$ = total number of instances in class $i$.
Class | Total data | No. of values for power | No. of values for spindle | No. of values for vibration |
---|---|---|---|---|
Effective | 9,811 | 9,811 | 9,811 | 9,811 |
Defective | 16,315 | 16,315 | 16,315 | 16,315 |
Total | 26,126 | 26,126 | 26,126 | 26,126 |
Figure 2: Data preprocessing steps.
Table 3 shows the implementation of the class weight technique in our study. We calculated the class weights using Eq. (1) and assigned a weight of 1.33 to the minority class, i.e., the effective class. This gave more weight to the minority class than to the majority (defective) class. As a result, it helped address the data imbalance in our dataset and improved the performance of our model. A minimal sketch of this computation follows Table 3.
Class | Weight |
---|---|
Defective (0) | 0.66 |
Effective (1) | 1.33 |
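The sketch below shows one way to compute the weights in Eq. (1) from the label vector; passing the resulting dictionary to the training routine (e.g., Keras's class_weight argument) is an assumption about the training setup.

```python
# Class-weight computation following Eq. (1): w_i = n_instances / (n_classes * n_instances_i)
import numpy as np

def compute_class_weights(labels: np.ndarray) -> dict:
    classes, counts = np.unique(labels, return_counts=True)
    n_instances, n_classes = len(labels), len(classes)
    return {int(c): n_instances / (n_classes * n) for c, n in zip(classes, counts)}

# Example: weights = compute_class_weights(y_train); model.fit(..., class_weight=weights)
```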
Complete architecture of proposed methodology
A complete picture of our work is given in Fig. 3.
Figure 3: Overview of proposed architecture.
Created using PowerPoint.
Proposed Algorithm
We used three different types of sensor data to develop our prediction algorithm: power, spindle, and vibration. In addition, we applied the preprocessing techniques discussed in 'Materials & Methods' to prepare our data for the proposed algorithm. Using this algorithm, we predicted anomalies in the CNC and MCT machines. The function and training of our proposed model are explained by the pseudo-code given in Table 4.
(a) Function of 1D CNN Model:

    weight ← DEFINE_WEIGHTS
    biases ← DEFINE_BIASES
    x ← RESHAPE(X)
    conv5 ← DROPOUT(0.1)
    conv5 ← FLATTEN()
    conv5 ← DROPOUT(0.1)
    dense1 ← RELU_ACTIVATION_FUNC()
    dense2 ← RELU_ACTIVATION_FUNC()
    dense3 ← RELU_ACTIVATION_FUNC()
    RETURN output
    END FUNCTION

(b) Training of 1D CNN Model:

    FEATURES ← {"POWER", "SPINDLE", "VIBRATION", "POWER_MIN", "POWER_MAX", "POWER_IR", "POWER_SD", "POWER_MEAN", "POWER_SK", "POWER_EM", "SPINDLE_MIN", "SPINDLE_MAX", "SPINDLE_IR", "SPINDLE_SD", "SPINDLE_MEAN", "SPINDLE_SK", "SPINDLE_EM", "VIBRATION_MIN", "VIBRATION_MAX", "VIBRATION_IR", "VIBRATION_SD", "VIBRATION_MEAN", "VIBRATION_SK", "VIBRATION_EM"}
    CLASSES ← {"EFFECTIVE", "DEFECTIVE"}
    BATCH_SIZE ← 4
    LOSS ← binary_crossentropy, learning_rate ← 1e-07, EPOCHS ← 250
    1DCNN_MODEL.compile(LOSS, learning_rate)
    1DCNN_MODEL.train(TRAIN_DATA, EPOCHS, BATCH_SIZE, VALID_DATA)
Data format and measurements
We created a 24 × 30 = 720-element feature matrix for each window, where 24 denotes the total number of attributes and 30 is the number of rows in each window. Figure 4 illustrates this process, which was later used for model training and testing when predicting the results.
Figure 4: Window size and data scanning.
To ensure robustness and avoid overfitting, the entire dataset was divided into 80% for training, 10% for validation, and 10% for testing. This division was applied to all eight machine learning and three deep learning models. It simplifies the analysis and makes our model's performance metrics more straightforward to interpret. A sketch of the windowing and splitting procedure is given below.
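The following sketch illustrates, under stated assumptions, how the windowed dataset and the 80/10/10 split can be produced; the dummy arrays, the per-window labeling rule, and the random seed are illustrative choices, not the authors' exact procedure.

```python
# Windowing into (30 x 24) blocks and an 80/10/10 train/validation/test split.
import numpy as np
from sklearn.model_selection import train_test_split

def make_windows(features: np.ndarray, labels: np.ndarray, window: int = 30):
    """Segment the feature matrix into non-overlapping (window x n_features) blocks."""
    X, y = [], []
    for start in range(0, len(features) - window + 1, window):
        X.append(features[start:start + window])       # shape (30, 24)
        y.append(labels[start:start + window].max())    # window label rule is an assumption
    return np.asarray(X), np.asarray(y)

# Dummy data for illustration: 26,126 records with 24 engineered features
rng = np.random.default_rng(0)
feature_matrix = rng.normal(size=(26126, 24))
label_vector = rng.integers(0, 2, size=26126)

X, y = make_windows(feature_matrix, label_vector)
# 80% train, then split the remaining 20% equally into validation and test
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.2, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=42)
```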
Architecture of 1D CNN
In our study, we used deep learning techniques to predict the effectiveness of our sensor data. Among these deep learning techniques, the 1D CNN model is very efficient for a small dataset, as our results show.
Deep learning has an additional benefit over traditional AI in that it combines feature extraction and classification (Bibbò & Carotenuto, 2021). We can automatically extract the most relevant and essential features using these deep learning models. The 1D CNN is another highly effective artificial neural network (ANN) model used for feature extraction and classification.
Input layer
The data with dimensions (30 × 24) were fed into the input layer of the 1D CNN, where 30 is the window size and 24 represents the number of features.
Convolution layer
Convolutional operations were applied to the input data with a stride size of 1. The five convolutional layers use a kernel size of 4, with filter sizes of 4, 4, 4, 4, and 8, respectively.
Dropout
Dropout helps avoid overfitting the model. In our dropout layers of the 1D CNN model, we used a dropout value of 0.5.
Output
The performance of a CNN model is highly dependent on the activation function. In our model, we used the rectified linear unit (ReLU) as the activation function in two dense layers to obtain better results, and the sigmoid function as the activation function in the third dense layer. We also applied stochastic gradient descent (SGD) with a learning rate of 0.0001. Finally, binary cross-entropy was used as the loss function to measure the total loss between the predicted and actual values.
We conducted a series of preliminary experiments to fine-tune selected parameters and ensure that the chosen configuration offers a balance between model performance and training efficiency.
Table 5 gives a summary of these fine-tuned parameters.
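For concreteness, the sketch below builds a Keras model following the layer description in 'Architecture of 1D CNN' (five Conv1D layers with filters 4, 4, 4, 4, 8, kernel size 4, dropout 0.5, two ReLU dense layers, and a sigmoid output); the hidden dense-layer sizes and padding are assumptions, since they are not specified above.

```python
# A minimal Keras sketch of the described 1D CNN; not the authors' exact implementation.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_1d_cnn(window_size: int = 30, n_features: int = 24) -> tf.keras.Model:
    model = models.Sequential()
    model.add(layers.InputLayer(input_shape=(window_size, n_features)))
    # Five Conv1D layers with kernel size 4, stride 1, ReLU activation
    for filters in (4, 4, 4, 4, 8):
        model.add(layers.Conv1D(filters, kernel_size=4, strides=1,
                                padding="same", activation="relu"))
    model.add(layers.Dropout(0.5))                     # dropout to reduce overfitting
    model.add(layers.Flatten())
    model.add(layers.Dense(32, activation="relu"))     # hidden sizes are assumptions
    model.add(layers.Dense(16, activation="relu"))
    model.add(layers.Dense(1, activation="sigmoid"))   # effective (1) vs. defective (0)
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=1e-4),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

A model built this way could then be trained with the class weights from Eq. (1), for example via model.fit(X_train, y_train, epochs=250, batch_size=4, class_weight=weights, validation_data=(X_val, y_val)).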
Results
In this section, we discuss our study’s experimental results. We compared the performance of our proposed 1D CNN model with eight machine learning models—random forest, XGB, MLP, LR, SVM, KNN, DT, and naïve Bayes—and one deep learning model, the LSTM. Additionally, we evaluated a hybrid model combining 1D CNN and LSTM (1D CNN+LSTM).
Performance measures
We utilized various performance metrics to evaluate the models, including accuracy, precision, recall, F1-score, and the AUC-ROC curve. The following equations define these metrics in terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN):

$\text{Accuracy} = \dfrac{TP + TN}{TP + TN + FP + FN}$ (2)

$\text{Precision} = \dfrac{TP}{TP + FP}$ (3)

$\text{Recall} = \dfrac{TP}{TP + FN}$ (4)

$\text{F1-score} = \dfrac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$ (5)
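As a quick sketch, these metrics can be computed with scikit-learn; the label arrays below are illustrative only, and weighted averaging is an assumption about how the aggregate scores were obtained.

```python
# Evaluation metrics corresponding to Eqs. (2)-(5), computed with scikit-learn.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # example ground-truth labels
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]   # example predictions

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred, average="weighted"))
print("Recall   :", recall_score(y_true, y_pred, average="weighted"))
print("F1-score :", f1_score(y_true, y_pred, average="weighted"))
```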
Table 6 highlights the superior performance of our proposed 1D CNN model compared to ten other models, which include eight machine learning models and two deep learning models. This table provides a comprehensive comparative analysis of all the models evaluated.
Parameter | Value | Description |
---|---|---|
Initial learning rate | 0.0001 | Starting learning rate for training. |
Learning rate schedule | Exponential decay | Learning rate decay method. |
Optimizer | SGD with Nesterov momentum | Optimization algorithm used for training. |
Momentum | 0.9 | Momentum parameter for SGD. |
Epochs | 250 | Number of training iterations. |
Batch size | 4 | Number of samples per gradient update. |
Conv1D layers | 6 Layers | The number of convolutional layers. |
Filters | 4, 4, 4, 8, 8 | Number of filters in each Conv1D layer. |
Kernel size | 2 | Size of the convolutional kernels. |
Activation function | ReLU | Activation function for Conv1D layers. |
Dense layers | 3 Layers | Number of dense layers. |
Activation | ReLU, ReLU, Sigmoid | Activation functions for the dense layers. |
Loss function | Binary crossentropy | Loss function used for training. |
Model | Class | Precision (%) | Recall (%) | F1-Score (%) |
---|---|---|---|---|
Logistic regression | Defective (0) | 81.32 | 79.78 | 80.54 |
Logistic regression | Effective (1) | 67.39 | 69.52 | 68.44 |
Support vector classifier | Defective (0) | 81.14 | 80.15 | 80.64 |
Support vector classifier | Effective (1) | 67.63 | 69.01 | 68.31 |
K-Nearest neighbors | Defective (0) | 87.11 | 85.29 | 86.19 |
K-Nearest neighbors | Effective (1) | 76.35 | 79.00 | 77.66 |
Decision tree | Defective (0) | 90.49 | 90.93 | 90.71 |
Decision tree | Effective (1) | 84.79 | 84.10 | 84.44 |
Naïve Bayes | Defective (0) | 97.74 | 50.43 | 66.53 |
Naïve Bayes | Effective (1) | 54.32 | 98.06 | 69.91 |
Random forest | Defective (0) | 91.84 | 91.67 | 91.75 |
Random forest | Effective (1) | 86.18 | 86.44 | 86.31 |
XGBoost | Defective (0) | 92.99 | 90.26 | 91.60 |
XGBoost | Effective (1) | 84.55 | 88.69 | 86.57 |
MLP | Defective (0) | 90.15 | 89.71 | 89.93 |
MLP | Effective (1) | 83.01 | 83.69 | 83.35 |
LSTM | Defective (0) | 93.21 | 92.07 | 92.64 |
LSTM | Effective (1) | 86.87 | 88.66 | 87.76 |
Hybrid | Defective (0) | 96.53 | 84.76 | 90.26 |
Hybrid | Effective (1) | 78.63 | 94.85 | 85.98 |
1D-CNN (the proposed model) | Defective (0) | 95.51 | 90.85 | 93.13 |
1D-CNN (the proposed model) | Effective (1) | 85.71 | 92.78 | 89.11 |
Table 7 summarizes the average accuracy, precision, recall, and F1-score achieved by the 1D CNN and the ten other models. Our proposed model performed very well on the testing data, achieving the highest accuracy of 91.57%, with 91.87% precision, 91.57% recall, and a 91.63% F1-score, compared to the other machine learning and deep learning models.
Model | Precision (%) | Recall (%) | F1-Score (%) | Accuracy (%) |
---|---|---|---|---|
RF | 89.71 | 89.71 | 89.71 | 89.71 |
XGB | 89.82 | 89.67 | 89.71 | 89.67 |
MLP | 87.47 | 87.45 | 87.47 | 87.45 |
LR | 76.09 | 75.93 | 76.00 | 75.93 |
SVM | 76.07 | 75.96 | 76.01 | 75.96 |
K-NN | 83.07 | 82.93 | 82.99 | 82.93 |
DT | 88.35 | 88.36 | 88.36 | 88.36 |
NB | 81.44 | 68.31 | 67.80 | 68.31 |
LSTM | 90.85 | 90.80 | 90.82 | 90.80 |
1D CNN+ LSTM | 89.88 | 88.51 | 88.67 | 88.51 |
1D-CNN (the proposed model) | 91.87 | 91.57 | 91.63 | 91.57 |
Additionally, we present the confusion matrices, which are crucial for understanding the performance of the models. A confusion matrix includes four quantities: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). True positives and true negatives represent correctly classified instances, while false positives and false negatives represent type I and type II errors, respectively. For a detailed performance breakdown, the normalized confusion matrices of each model are presented in Table 8.
Model | |||
---|---|---|---|
Positive | Negative | ||
Random forest (RF) | Positive | 0.92 | 0.08 |
Negative | 0.14 | 0.86 | |
XGBoost | Positive | 0.90 | 0.10 |
Negative | 0.11 | 0.89 | |
MLP | Positive | 0.90 | 0.10 |
Negative | 0.16 | 0.84 | |
LR | Positive | 0.80 | 0.20 |
Negative | 0.30 | 0.70 | |
SVM | Positive | 0.80 | 0.20 |
Negative | 0.31 | 0.69 | |
K-NN | Positive | 0.85 | 0.15 |
Negative | 0.21 | 0.79 | |
DT | Positive | 0.91 | 0.09 |
Negative | 0.16 | 0.84 | |
NB | Positive | 0.50 | 0.50 |
Negative | 0.02 | 0.98 | |
LSTM | Positive | 0.92 | 0.08 |
Negative | 0.11 | 0.89 | |
Hybrid model | Positive | 0.85 | 0.15 |
Negative | 0.05 | 0.95 | |
1D CNN | Positive | 0.91 | 0.09 |
Negative | 0.07 | 0.93 |
Auc-Roc curve
The AUC-ROC curve is a crucial performance criterion for evaluating deep learning and machine learning models. A higher AUC indicates a more robust and superior model for differentiating between classes. Figure 5A presents a comparison of the 1D-CNN model with naïve Bayes, XGBoost, LR, LSTM, and a hybrid model of LSTM and 1D-CNN. Similarly, the AUC-ROC curve in Fig. 5B compares the 1D-CNN with the other remaining five models. These images clearly demonstrate that our proposed 1D-CNN model significantly outperforms the other ten models.
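A minimal sketch of how such an ROC comparison can be generated for one model is shown below, assuming the test labels and the sigmoid output probabilities are available; the plotting details are arbitrary choices.

```python
# ROC curve and AUC for a single binary classifier.
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

def plot_roc(y_test, y_prob, label="1D CNN"):
    # y_prob: predicted probabilities, e.g., model.predict(X_test).ravel()
    fpr, tpr, _ = roc_curve(y_test, y_prob)
    plt.plot(fpr, tpr, label=f"{label} (AUC = {roc_auc_score(y_test, y_prob):.3f})")
    plt.plot([0, 1], [0, 1], linestyle="--", color="grey")   # chance line
    plt.xlabel("False positive rate")
    plt.ylabel("True positive rate")
    plt.legend()
    plt.show()
```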
Paired t-tests
We conducted paired t-tests to assess the statistical significance of the differences in accuracy between the 1D-CNN model and each of the other models. Table 9 summarizes the results, reporting the t-statistic and p-value for each comparison. All comparisons show statistically significant differences (p < 0.05), indicating that the 1D-CNN model outperforms the other models in accuracy.
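The snippet below sketches such a paired comparison with SciPy; the repeated-run accuracy arrays are purely illustrative and do not reproduce the values in Table 9.

```python
# Paired t-test comparing per-run accuracies of the 1D CNN against another model.
from scipy.stats import ttest_rel

acc_1dcnn = [0.916, 0.914, 0.917, 0.915, 0.918]   # hypothetical repeated-run accuracies
acc_lstm  = [0.908, 0.906, 0.909, 0.907, 0.910]

t_stat, p_value = ttest_rel(acc_1dcnn, acc_lstm)
print(f"t-statistic = {t_stat:.4f}, p-value = {p_value:.4e}")
print("Significant (p < 0.05)" if p_value < 0.05 else "Not significant")
```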
Conclusions and Future Work
While several researchers have focused on manufacturing, research on MCT and CNC machines remains sparse. This study addresses this gap by applying a 1D-CNN deep learning model for anomaly detection in MCT/CNC machines. Our results demonstrate that the 1D-CNN model effectively predicts MCT machines’ operational status and defects, outperforming eight traditional machine learning models and one deep learning model. Our model achieved the highest accuracy and outperformed a hybrid 1D CNN and LSTM model, even with a small dataset.
Figure 5: AUC-ROC curve.
Model comparison | t-statistic | p-value | Statistical significance |
---|---|---|---|
1D-CNN vs LSTM | 9.4632 | 2.0721e−20 | Significant (p < 0.05) |
1D-CNN vs Hybrid (LSTM+ 1D CNN) | 36.8330 | 2.7106e−188 | Significant (p < 0.05) |
1D-CNN vs Naive Bayes | 377.6534 | 0.0000e+00 | Significant (p < 0.05) |
1D-CNN vs Random Forest | 353.1712 | 0.0000e+00 | Significant (p < 0.05) |
1D-CNN vs Logistic regression | 345.0091 | 0.0000e+00 | Significant (p < 0.05) |
1D-CNN vs Support vector classifier | 323.3016 | 0.0000e+00 | Significant (p < 0.05) |
1D-CNN vs K-Nearest neighbors | 331.7052 | 0.0000e+00 | Significant (p < 0.05) |
1D-CNN vs Decision tree | 313.1611 | 0.0000e+00 | Significant (p < 0.05) |
1D-CNN vs XGBoost | 349.6871 | 0.0000e+00 | Significant (p < 0.05) |
1D-CNN vs MLP | 342.2846 | 0.0000e+00 | Significant (p < 0.05) |
Although our study was limited to data collected for a single day, the 1D CNN’s strong performance suggests its potential for effective anomaly detection with limited data. Future work will use data from multiple days and varying operational conditions to further validate our findings to ensure the model’s robustness and adaptability. Exploring more advanced deep learning techniques may enhance MCT machine productivity.
Supplemental Information
Main Code File
This Jupyter notebook contains our code, in which we implemented the individual classifiers and the hybrid model.