Enhanced multi-class skin lesion classification using fine-tuned vision transformer and optimized classifiers
Abstract
Skin lesion classification is crucial for the early detection of skin cancer, a common, life-threatening disease worldwide. Despite growing interest in machine learning-based computer-aided diagnosis (ML-CAD) systems, diagnostic accuracy remains a major challenge because of severe class imbalance, visual similarity among lesion types, and the complexity of multi-class classification. This study introduces an enhanced ML-CAD framework for accurately classifying seven types of skin lesions. The framework comprises four phases: (i) image and metadata preprocessing, (ii) feature extraction and selection using a fine-tuned Vision Transformer (ViT), (iii) evaluation of individual and ensemble machine learning models, and (iv) hyperparameter tuning via Bayesian optimization. Experiments on the HAM10000 dataset using five-fold cross-validation demonstrate that hybrid models combining the fine-tuned ViT with optimized classifiers achieve 97.82% accuracy, improving upon existing approaches. These results indicate the potential of ML-CAD systems to deliver rapid, reliable, and accurate skin lesion classification, supporting dermatologists in clinical decision-making.
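The five-fold cross-validation protocol mentioned above can be illustrated with a minimal pure-Python sketch. It is not the study's implementation. Stratifying the folds by class is an assumption made here because of the stated class imbalance; the abstract does not say how the folds were formed. The names `stratified_k_fold`, `cross_validate`, `fit_majority`, and `predict_majority` are hypothetical, and the majority-class baseline is only a placeholder for the optimized classifiers. The feature vectors stand in for ViT embeddings.

```python
import random
from collections import Counter, defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose class mix mirrors the full set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0  # running counter so fold sizes stay balanced across classes
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin over the folds.
        for idx in indices:
            folds[offset % k].append(idx)
            offset += 1

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validate(features, labels, fit, predict, k=5, seed=0):
    """Return the mean test accuracy over k stratified folds."""
    scores = []
    for train_idx, test_idx in stratified_k_fold(labels, k, seed):
        model = fit(
            [features[i] for i in train_idx], [labels[i] for i in train_idx]
        )
        preds = predict(model, [features[i] for i in test_idx])
        scores.append(accuracy([labels[i] for i in test_idx], preds))
    return sum(scores) / len(scores)


# Placeholder classifier (assumption): always predicts the most frequent
# training label. A real pipeline would fit a tuned model on ViT features.
def fit_majority(train_features, train_labels):
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, test_features):
    return [model for _ in test_features]


if __name__ == "__main__":
    # Toy, imbalanced labels (illustrative only, not HAM10000 data).
    toy_labels = ["common"] * 70 + ["mid"] * 20 + ["rare"] * 10
    toy_features = [[0.0] for _ in toy_labels]  # stand-ins for embeddings
    mean_acc = cross_validate(
        toy_features, toy_labels, fit_majority, predict_majority, k=5
    )
    print(f"mean accuracy of majority baseline: {mean_acc:.3f}")
```

With stratified folds, every test fold contains some samples of each class, including the rarest, so per-fold scores are not distorted by a fold that happens to miss a minority lesion type.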