Email George Cross - georgecross@acm.org

 

Cross, G. R. and Jain, A. K., "Measurement of Clustering Tendency," IFAC Symposium on Digital Control, special invited session on Pattern Recognition, New Delhi, India, 24-29, January 1982.

Abstract: Determining the structure of multi-dimensional data is an important problem in exploratory data analysis and pattern recognition. Clustering methods have been used extensively in this process. However, clustering algorithms will locate and specify clusters in data even if none are present. It is therefore appropriate to measure the clustering tendency or randomness of a data set before subjecting it to a clustering algorithm. Hopkins' method of testing for randomness is extended to high dimensions and is tested against data from clustered and hardcore processes along with the Fisher Iris data. As in two dimensions, it appears to be a powerful test for clustering tendency.
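
For readers unfamiliar with Hopkins' test, the following is a minimal sketch of one commonly used d-dimensional form of the statistic. It is not taken from the paper and may differ from the authors' exact formulation. The function names, the use of the data's bounding box as the sampling window, and the choice of raising distances to the power d are all assumptions of this sketch. Values near 0.5 are consistent with spatial randomness, while values near 1 suggest clustering.

```python
import math
import random


def nearest_distance(point, candidates):
    """Euclidean distance from point to the closest element of candidates."""
    return min(math.dist(point, c) for c in candidates)


def hopkins_statistic(data, m, rng):
    """Hopkins-style statistic for a list of equal-length coordinate tuples.

    data: list of d-dimensional points (tuples of floats)
    m:    number of probe points and of sampled data points (m < len(data))
    rng:  a random.Random instance, passed in for reproducibility
    """
    d = len(data[0])

    # Assumption: the axis-aligned bounding box serves as the sampling window.
    lows = [min(p[k] for p in data) for k in range(d)]
    highs = [max(p[k] for p in data) for k in range(d)]

    # u: distances from m uniformly placed probe points to the nearest data point.
    u_sum = 0.0
    for _ in range(m):
        probe = tuple(rng.uniform(lows[k], highs[k]) for k in range(d))
        u_sum += nearest_distance(probe, data) ** d

    # w: distances from m randomly chosen data points to their nearest
    # neighbour among the remaining data points.
    w_sum = 0.0
    for i in rng.sample(range(len(data)), m):
        others = data[:i] + data[i + 1:]
        w_sum += nearest_distance(data[i], others) ** d

    return u_sum / (u_sum + w_sum)
```

A hypothetical call would look like `hopkins_statistic(points, 15, random.Random(0))`, where `points` is a list of coordinate tuples. The brute-force nearest-neighbour search keeps the sketch dependency-free; a spatial index would be preferable for large data sets.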