An autonomous mixed data oversampling method for AIOT-based churn recognition and personalized recommendations using behavioral segmentation

PeerJ Computer Science


Introduction

  • Data availability assumption: It is assumed that sufficient and relevant data related to IoT device usage, service packages, and customer behavior are available for analysis.

  • AI and IoT integration assumption: The research assumes the successful integration of artificial intelligence (AI) and Internet of Things (IoT) technologies within the telecom sector, as these technologies are central to the study.

  • Churn and customer segmentation relevance assumption: The study assumes that churn recognition and customer segmentation are critical factors for customer retention in the telecom sector.

  • Bi-level optimization feasibility assumption: The proposed unified customer analytics platform assumes that it is feasible and beneficial to address churn recognition and customer segmentation as a bi-level optimization problem, improving overall accuracy.

Research motivation and contribution

Literature review

Churn prediction

Customer segmentation

Integrated churn prediction and customer segmentation

  • An oversampling method is presented that handles mixed data and is suited to the highly imbalanced class distributions of telecom datasets.

  • A factor analysis module is designed to govern customer behavioral segmentation, so the user need not prespecify the number of clusters.

  • Customer churn prediction and segmentation are integrated into a single solution to gain better insight into customer analytics datasets.

  • Personalized recommendations are generated for each group of customers to help decision makers manage services effectively for those customers.
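
To make the idea of oversampling mixed data concrete, the following is a minimal sketch in the spirit of SMOTE-NC, not the authors' method: numeric features are interpolated between a minority sample and one of its nearest minority neighbours, and categorical features take the most frequent value among those neighbours. The function name `oversample_mixed`, its parameters, and the example feature names are all hypothetical.

```python
import random
from collections import Counter


def oversample_mixed(minority_rows, numeric_keys, categorical_keys,
                     n_new, k=3, seed=0):
    """Create n_new synthetic minority rows from mixed-type data.

    Illustrative assumption: rows are dicts, numeric features are already
    scaled comparably, and a categorical mismatch adds 1 to the distance.
    """
    if len(minority_rows) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)

    def distance(a, b):
        # Squared Euclidean part for numeric features.
        numeric_part = sum((a[key] - b[key]) ** 2 for key in numeric_keys)
        # Simple mismatch count for categorical features.
        categorical_part = sum(1 for key in categorical_keys
                               if a[key] != b[key])
        return numeric_part + categorical_part

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_rows)
        # k nearest minority neighbours of the chosen base row.
        neighbours = sorted(
            (row for row in minority_rows if row is not base),
            key=lambda row: distance(base, row),
        )[:k]
        partner = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)

        new_row = {}
        for key in numeric_keys:
            # Point on the segment between base and partner.
            new_row[key] = base[key] + gap * (partner[key] - base[key])
        for key in categorical_keys:
            # Majority vote among the neighbours.
            votes = Counter(row[key] for row in neighbours)
            new_row[key] = votes.most_common(1)[0][0]
        synthetic.append(new_row)
    return synthetic


if __name__ == "__main__":
    churners = [
        {"tenure": 1.0, "charges": 70.0, "contract": "monthly"},
        {"tenure": 3.0, "charges": 80.0, "contract": "monthly"},
        {"tenure": 2.0, "charges": 75.0, "contract": "yearly"},
        {"tenure": 5.0, "charges": 90.0, "contract": "monthly"},
    ]
    for row in oversample_mixed(churners, ["tenure", "charges"],
                                ["contract"], n_new=3, k=2):
        print(row)
```

Because numeric values are interpolated, each synthetic value stays within the range spanned by the original minority samples, and categorical values are always ones already observed in the data.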

Materials and Methods

Datasets

Proposed methodology

Data preprocessing

Features selection

Data balancing with over-sampling methods

Machine learning models and evaluation metrics

Machine learning models

Evaluation metrics

  • i. Accuracy

  • ii. Precision

  • iii. Recall

  • iv. F1-measure

  • v. AUC

  • vi. Silhouette Score
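
As an illustration of how the classification metrics listed above are conventionally defined, the following sketch computes accuracy, precision, recall, F1-measure, and AUC for a binary churn label using only the standard library. The function names and the toy labels are assumptions made for this example, and the silhouette score, which evaluates clusterings rather than classifiers, is not covered here.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 from hard predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores, positive=1):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels: 1 = churned, 0 = retained (made up for illustration).
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
    scores = [0.9, 0.1, 0.4, 0.8, 0.2, 0.6, 0.7, 0.3]
    print(classification_metrics(y_true, y_pred))
    print(roc_auc(y_true, scores))
```

On highly imbalanced churn data, accuracy alone can look high even when few churners are detected, which is why precision, recall, F1-measure, and AUC are reported alongside it.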

Results and discussion

Results of Dataset 1

Results of Dataset 2

Results of Dataset 3

Average results analysis on all datasets

Bayesian analysis and customer segmentation using DBSCAN

Results analysis on dataset 1

Results analysis on dataset 2

Results analysis on dataset 3

Silhouette score analysis on all datasets

Conclusions

Supplemental Information

Additional Information and Declarations

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Ghulam Fatima conceived and designed the experiments, performed the experiments, performed the computation work, prepared figures and/or tables, and approved the final draft.

Salabat Khan conceived and designed the experiments, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Farhan Aadil analyzed the data, authored or reviewed drafts of the article, and approved the final draft.

Do Hyuen Kim analyzed the data, authored or reviewed drafts of the article, and approved the final draft.

Ghada Atteia analyzed the data, prepared figures and/or tables, and approved the final draft.

Maali Alabdulhafith analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Data Availability

The following information was supplied regarding data availability:

The code is available in the Supplemental File.

The data is available at Kaggle:

- IBM. (2018). Telco Customer Churn. https://www.kaggle.com/blastchar/telco-customer-churn

- Kaggle. (2018). Customer Churn Prediction 2020. https://www.kaggle.com/c/customer-churn-prediction-2020/data

- Cell2Cell. (2018). Telecom Churn (Cell2Cell). https://www.kaggle.com/jpacse/datasets-for-churn-telecom

Funding

The work is supported and funded by the Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2024R407) and the Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. Dr. Salabat is working for the National Research Foundation of Korea (NRF) under the Brain Pool Program (Grant No. 2022H1D3A2A02055024) and the Creative Research Project (ID: RS-2023-00248526). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
