Unveiling poetic identity: A comparative evaluation of transformer models for Arabic poet classification
Abstract
Arabic poet classification presents unique challenges due to the morphological richness and stylistic complexity of classical and modern Arabic poetry. In this work, we conduct a comparative evaluation of transformer-based architectures to assess their effectiveness in this domain. We experiment with several pretrained Arabic models using two purpose-built datasets: FrequentPoets, which includes prolific poets with large verse corpora, and CrossEraPoets, which spans poets from different historical periods to capture stylistic and temporal variation. Our findings reveal that a domain-adapted model, AraPoemBERT, consistently outperforms the other approaches, achieving 73.11\% accuracy and a 73.00\% F1-score on FrequentPoets, and 77.06\% accuracy with a 77.04\% F1-score on CrossEraPoets. In contrast, the general-purpose GPT-4o model performs substantially worse, particularly on historical data. These results underscore the importance of domain-specific pretraining and highlight the limitations of generalized language models in poetry-related NLP tasks.
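To make the evaluation setup concrete, the following is a minimal sketch of fine-tuning a pretrained Arabic transformer for poet classification with Hugging Face Transformers. It is not the authors' released code: the checkpoint id, dataset file names, column names, class count, and the choice of weighted F1 averaging are all assumptions for illustration.

```python
import numpy as np
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical checkpoint id and class count; substitute the real values.
MODEL_NAME = "faisalq/AraPoemBERT"
NUM_POETS = 20

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=NUM_POETS)

# Assumed CSV layout: one verse per row, with columns "verse" and "poet_id".
dataset = load_dataset("csv", data_files={
    "train": "frequent_poets_train.csv",
    "test": "frequent_poets_test.csv"})

def tokenize(batch):
    # Truncate/pad verses to a fixed length so batches are uniform.
    return tokenizer(batch["verse"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)
dataset = dataset.rename_column("poet_id", "labels")

def compute_metrics(eval_pred):
    # Accuracy and F1, as reported in the abstract; the weighted
    # averaging scheme is an assumption.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": accuracy_score(labels, preds),
            "f1": f1_score(labels, preds, average="weighted")}

args = TrainingArguments(output_dir="poet-clf",
                         num_train_epochs=3,
                         per_device_train_batch_size=16,
                         evaluation_strategy="epoch")

trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"],
                  eval_dataset=dataset["test"],
                  compute_metrics=compute_metrics)
trainer.train()
print(trainer.evaluate())
```

The same pipeline applies to either dataset: swapping the CSV files for a CrossEraPoets split would reproduce the cross-era evaluation, and substituting another Arabic checkpoint (e.g., a general-purpose Arabic BERT) reproduces the baseline comparisons.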