CRPClustering: An R Package for Bayesian Nonparametric Chinese Restaurant Process Clustering with Entropy

Data Science, Okada Algorithm Private Invention Research Laboratory, Shizuoka, Japan
DOI
10.7287/peerj.preprints.26533v2
Subject Areas
Data Mining and Machine Learning, Data Science, Software Engineering
Keywords
bayes, bayesian nonparametrics, R, CRAN, GitHub, Clustering, Entropy, MCMC, Chinese Restaurant Process, Gibbs sampling
Copyright
© 2018 Okada
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Okada M. 2018. CRPClustering: An R Package for Bayesian Nonparametric Chinese Restaurant Process Clustering with Entropy. PeerJ Preprints 6:e26533v2

Abstract

Clustering is a method for finding groups (clusters) in data, and many related methods have been studied for a long time. Bayesian nonparametrics is a branch of statistics that can handle models with infinitely many parameters. The Chinese restaurant process is used to construct the Dirichlet process. Clustering based on the Chinese restaurant process does not require the number of clusters to be chosen in advance; the algorithm adjusts it automatically. In addition to the clusters themselves, this package calculates the entropy as a measure of the ambiguity of the clusters.
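As a rough illustration of why the number of clusters does not need to be fixed in advance, the following minimal sketch draws a partition from a Chinese restaurant process prior and computes the Shannon entropy of the resulting cluster proportions. It is written in Python for illustration only and is not the package's R API. The function names `crp_assignments` and `cluster_entropy`, the concentration parameter `alpha`, and the choice of entropy over cluster proportions are all assumptions made for this example. The sketch samples from the prior only and does not perform the Gibbs sampling on data that the keywords mention.

```python
import math
import random


def crp_assignments(n_customers, alpha, rng):
    """Seat customers one at a time under a Chinese restaurant process prior.

    Hypothetical helper for illustration; not part of CRPClustering.
    """
    tables = []        # tables[k] = number of customers seated at table k
    assignments = []   # assignments[i] = table index of customer i
    for _ in range(n_customers):
        # An existing table k is chosen with weight tables[k];
        # a new table is opened with weight alpha.
        weights = tables + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(tables):
            tables.append(1)   # open a new table (a new cluster)
        else:
            tables[k] += 1     # join an existing table
        assignments.append(k)
    return assignments, tables


def cluster_entropy(tables):
    """Shannon entropy (in nats) of the cluster-size proportions."""
    total = sum(tables)
    return -sum((c / total) * math.log(c / total) for c in tables if c > 0)


rng = random.Random(0)  # fixed seed so the draw is reproducible
assignments, tables = crp_assignments(100, alpha=1.0, rng=rng)
entropy = cluster_entropy(tables)
```

In this sketch the number of tables grows as customers arrive, and a larger `alpha` tends to open more tables. The entropy is zero when every point falls in one cluster and reaches its maximum, the logarithm of the number of clusters, when the clusters are equally sized.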

Author Comment

I improved the functions and documentation.