MP-DEC: Multi-Perspective Deep Embedding Clustering for Protocol Data
Abstract
Protocol clustering plays a vital role in network security and data management: it enables the identification of anomalous traffic, the detection of potential threats, and the optimization of data transmission strategies. However, traditional clustering algorithms often perform poorly in unsupervised settings and typically consider only protocol type or only protocol format, neglecting the correlation between the two. To overcome these limitations, this paper proposes MP-DEC, a multi-perspective deep embedding clustering model for protocol data. The approach has three core components. First, a feature weighting mechanism quantifies how much each byte contributes to clustering by protocol type and by protocol format. Second, these weights guide a feature-enhanced autoencoder: they are integrated into the model's initialization and loss function so that it extracts highly discriminative embeddings. Finally, a novel joint optimization strategy within a deep embedding clustering framework uses these embeddings to separate protocol types and formats. Experimental results on two real-world datasets show that our method achieves a V-Measure of 91.8\% for protocol type clustering and a clustering purity of 93.6\% for protocol format clustering.
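
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how clustering purity and V-Measure are commonly computed from ground-truth labels and predicted cluster assignments. It is not the authors' evaluation code. The function names, the toy labels, and the use of the balanced (beta = 1) harmonic mean for V-Measure are assumptions made for illustration.

```python
from collections import Counter
from math import log


def purity(true_labels, cluster_ids):
    """Fraction of samples that belong to the majority class of their cluster."""
    pair_counts = Counter(zip(cluster_ids, true_labels))
    best = {}  # cluster id -> size of its largest single-class group
    for (cluster, _label), n in pair_counts.items():
        best[cluster] = max(best.get(cluster, 0), n)
    return sum(best.values()) / len(true_labels)


def _entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def _conditional_entropy(target, given):
    """Conditional entropy H(target | given) in nats."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _t), c in joint.items())


def v_measure(true_labels, cluster_ids):
    """Harmonic mean of homogeneity and completeness (beta = 1 is assumed)."""
    h_class = _entropy(true_labels)
    h_cluster = _entropy(cluster_ids)
    # Homogeneity: each cluster should contain members of a single class.
    homogeneity = 1.0 if h_class == 0 else (
        1.0 - _conditional_entropy(true_labels, cluster_ids) / h_class)
    # Completeness: all members of a class should land in the same cluster.
    completeness = 1.0 if h_cluster == 0 else (
        1.0 - _conditional_entropy(cluster_ids, true_labels) / h_cluster)
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


# Hypothetical toy data: six messages, three protocol types, three clusters.
true_types = ["dns", "dns", "http", "http", "http", "tls"]
predicted = [0, 0, 1, 1, 2, 2]
print("purity:", purity(true_types, predicted))
print("V-Measure:", v_measure(true_types, predicted))
```

Purity rewards clusters dominated by one class but does not penalize splitting a class across many clusters, whereas V-Measure also accounts for completeness, which is one plausible reason to report both.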