A multi prototype classification algorithm and its application to multi class diagnostics

Biomathematics Working Group, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Greifswald - Insel Riems, Germany
DOI
10.7287/peerj.preprints.1180v1
Subject Areas
Algorithms and Analysis of Algorithms, Data Mining and Machine Learning, Data Science
Keywords
Global distance-based classification, Greedy algorithm, Prototype classifier, ROC analysis, Multi class diagnostics
Copyright
© 2015 Ziller
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Ziller M. 2015. A multi prototype classification algorithm and its application to multi class diagnostics. PeerJ PrePrints 3:e1180v1

Abstract

This paper introduces a novel, universal distance-based classification procedure based on a simple geometric model. All objects are treated as points in a metric space, and each class is covered by hyperspheres of potentially different sizes, whose centres are referred to as prototypes. The radius of each hypersphere is optimised individually by a generalised ROC analysis. To solve the resulting discrete optimisation problem approximately, a greedy algorithm was developed and implemented in R. It runs in O(k²∙n²∙log(n)) time, where k is the number of prototypes to be selected and n is the number of training objects. Multi-class problems are handled with a one-against-all approach. When an object is recognised by more than one class, it is assigned to the class with the maximum positive predictive value. Objects not recognised as members of any class are assigned to an additional residual class. The performance of the classification system is demonstrated on several example data sets and compared with other methods.
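
The geometric idea in the abstract can be illustrated with a small Python sketch. It is not the author's R implementation and does not aim at the stated running time. Several details are assumptions made only for illustration: Euclidean distance as the metric, Youden's index (TPR minus FPR) as a stand-in for the generalised ROC criterion, training points as the only prototype candidates, and the hypothetical function names `best_radius`, `fit_class`, `fit` and `predict`.

```python
import math

def best_radius(center, pos, neg):
    """Return (radius, score) maximising TPR - FPR for a sphere at `center`.
    Assumption: Youden's index stands in for the paper's ROC criterion."""
    best_r, best_score = 0.0, -math.inf
    for r in sorted({math.dist(center, p) for p in pos}):
        tpr = sum(math.dist(center, p) <= r for p in pos) / len(pos)
        fpr = sum(math.dist(center, q) <= r for q in neg) / len(neg) if neg else 0.0
        if tpr - fpr > best_score:
            best_r, best_score = r, tpr - fpr
    return best_r, best_score

def fit_class(pos, neg, k):
    """Greedily pick at most k (prototype, radius) pairs for one class.
    Each step scores candidates on the positives not yet covered."""
    spheres, uncovered = [], list(pos)
    for _ in range(k):
        if not uncovered:
            break
        best = None
        for c in uncovered:
            r, score = best_radius(c, uncovered, neg)
            if best is None or score > best[2]:
                best = (c, r, score)
        c, r, _ = best
        spheres.append((c, r))
        uncovered = [p for p in uncovered if math.dist(c, p) > r]
    return spheres

def covers(spheres, x):
    """True if x lies inside at least one hypersphere."""
    return any(math.dist(c, x) <= r for c, r in spheres)

def fit(data, k):
    """One-against-all training; `data` maps class label -> list of points.
    Stores each class's spheres and its PPV on the training data."""
    model = {}
    for label, pos in data.items():
        neg = [p for other, pts in data.items() if other != label for p in pts]
        spheres = fit_class(pos, neg, k)
        tp = sum(covers(spheres, p) for p in pos)
        fp = sum(covers(spheres, q) for q in neg)
        model[label] = (spheres, tp / (tp + fp) if tp + fp else 0.0)
    return model

def predict(model, x, residual="residual"):
    """Class with the highest PPV among those covering x, else the residual class."""
    hits = [(ppv, label) for label, (spheres, ppv) in model.items()
            if covers(spheres, x)]
    return max(hits)[1] if hits else residual

# Toy data: two well-separated clusters in the plane.
toy = {
    "A": [(0, 0), (0, 1), (1, 0), (1, 1)],
    "B": [(5, 5), (5, 6), (6, 5), (6, 6)],
}
toy_model = fit(toy, k=2)
```

With this toy model, `predict(toy_model, (0.5, 0.5))` falls inside a sphere of class A, while a distant point such as `(20, 20)` is covered by no sphere and is assigned to the residual class.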

Author Comment

This is a submission to PeerJ Computer Science for review.