BIROn - Birkbeck Institutional Research Online

Algorithms for additive clustering of rectangular data tables

Depril, D. and van Mechelen, I. and Mirkin, Boris (2008) Algorithms for additive clustering of rectangular data tables. Computational Statistics & Data Analysis 52 (11), pp. 4923-4938. ISSN 0167-9473.

Full text not available from this repository.
Official URL: http://dx.doi.org/10.1016/j.csda.2008.04.014

Abstract

The overlapping additive clustering model, or principal cluster model, is a model for two-way two-mode object-by-variable data that implies an overlapping clustering of the objects and a set of profiles (characteristic variable values for each cluster). The model values of the variables for an object are the sum of the profiles of the clusters to which it belongs. In the associated data analysis, the data matrix at hand is approximated by an overlapping additive clustering model of a prespecified rank by minimizing a least squares loss function. Recently, an algorithm has been proposed for this purpose: a sequential fitting strategy, also called the method of principal clusters (PCL). Theoretical and empirical evidence is presented that the PCL algorithm may have problems in revealing the true structure underlying a data set. As a way out, three new algorithms for fitting the principal cluster model to empirical data are presented: two of an alternating least squares (ALS) type, orthogonally combined with two different starting strategies, and one based on simulated annealing (SA). In a simulation study, all three new algorithms are shown to outperform the existing PCL algorithm. The number of objects that belong to more than one cluster (the overlap) is further found to have a considerable influence on the performance of the ALS algorithms, with low amounts of overlap requiring a different starting strategy than high amounts. Consequently, for the analysis of real data sets in practice, a hybrid approach is presented that consists of one of the ALS algorithms initialized by means of the two starting strategies under study.
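The model described in the abstract can be illustrated with a minimal sketch. It assumes the data matrix X (objects by variables) is approximated by the product A P, where A is a binary membership matrix (objects by clusters) and P holds one profile per cluster, so that each row of A P is the sum of the profiles of the clusters the object belongs to. The function names, and the choice to update memberships by enumerating all 2^K binary patterns, are illustrative assumptions that are practical only for a small number of clusters K; this is a generic alternating least squares loop, not the specific ALS, PCL or SA algorithms of the paper.

```python
import itertools
import numpy as np


def model_values(A, P):
    """Model values: each object's row is the sum of the profiles of its clusters."""
    return A @ P


def loss(X, A, P):
    """Least squares loss between the data and the model values."""
    return float(np.sum((X - model_values(A, P)) ** 2))


def update_profiles(X, A):
    """Least squares profiles for fixed memberships (hypothetical helper)."""
    return np.linalg.lstsq(A, X, rcond=None)[0]


def update_memberships(X, P):
    """Best binary membership pattern per object for fixed profiles.

    Assumption: exhaustive search over all 2^K patterns, feasible for small K only.
    """
    K = P.shape[0]
    patterns = np.array(list(itertools.product([0, 1], repeat=K)), dtype=float)
    candidates = patterns @ P  # model values for every pattern, shape (2^K, m)
    dist = ((X[:, None, :] - candidates[None, :, :]) ** 2).sum(axis=2)
    return patterns[dist.argmin(axis=1)]


def als(X, A0, max_iter=100, tol=1e-10):
    """Alternate the two conditional updates until the loss stops decreasing."""
    A = np.asarray(A0, dtype=float)
    P = update_profiles(X, A)
    prev = loss(X, A, P)
    for _ in range(max_iter):
        A = update_memberships(X, P)
        P = update_profiles(X, A)
        cur = loss(X, A, P)
        converged = prev - cur < tol
        prev = cur
        if converged:
            break
    return A, P, prev
```

Because each of the two updates minimizes the loss with the other matrix held fixed, the loss cannot increase from one iteration to the next, but the loop may stop in a local minimum that depends on the starting memberships A0. This dependence on the start is the kind of sensitivity the abstract refers to when it discusses different starting strategies.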

Item Type: Article
School or Research Centre: Birkbeck Schools and Research Centres > School of Business, Economics & Informatics > Computer Science and Information Systems
Depositing User: Administrator
Date Deposited: 02 Feb 2011 12:11
Last Modified: 17 Apr 2013 12:18
URI: http://eprints.bbk.ac.uk/id/eprint/1896
