BIROn - Birkbeck Institutional Research Online

    Algorithms for additive clustering of rectangular data tables

    Depril, D. and van Mechelen, I. and Mirkin, Boris (2008) Algorithms for additive clustering of rectangular data tables. Computational Statistics & Data Analysis 52 (11), pp. 4923-4938. ISSN 0167-9473.

    Full text not available from this repository.

    Abstract

    The overlapping additive clustering model, or principal cluster model, is a model for two-way two-mode object by variable data that implies an overlapping clustering of the objects and a set of profiles (characteristic variable values for each cluster). The model values of the variables of an object are the sum of the profiles of its corresponding clusters. In the associated data analysis, the data matrix at hand is approximated by an overlapping additive clustering model of a prespecified rank by minimizing a least squares loss function. Recently an algorithm has been proposed for this purpose. This algorithm is a sequential fitting strategy, also called the method of principal clusters (PCL). Theoretical and empirical evidence that the PCL algorithm may have problems in revealing the true structure underlying a data set will be presented. As a way out, three new algorithms to fit the principal cluster model to empirical data will be presented: two of an alternating least squares (ALS) type, orthogonally combined with two different starting strategies, and one based on simulated annealing (SA). In a simulation study it is demonstrated that all three new algorithms outperform the existing PCL algorithm. The number of objects that belong to more than one cluster (the overlap) is further found to have a considerable influence on the algorithmic performance of the ALS algorithms, with low levels of overlap requiring a different starting strategy than high ones. As a consequence, for the analysis of real data sets in practice, a hybrid approach will be presented consisting of one of the ALS algorithms initialized by means of the two starting strategies under study.
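
The model described in the abstract can be illustrated with a small sketch. It assumes a binary membership matrix (one row per object, one column per cluster) and one profile per cluster; the function names, the toy numbers, and the exhaustive membership search are illustrative assumptions, not the algorithms from the paper.

```python
from itertools import product


def model_values(memberships, profiles):
    """Each object's model row is the sum of the profiles of its clusters."""
    n_vars = len(profiles[0])
    return [
        [sum(a * p[j] for a, p in zip(row, profiles)) for j in range(n_vars)]
        for row in memberships
    ]


def loss(data, memberships, profiles):
    """Least squares loss: sum of squared differences between data and model."""
    fitted = model_values(memberships, profiles)
    return sum(
        (x - m) ** 2
        for data_row, fitted_row in zip(data, fitted)
        for x, m in zip(data_row, fitted_row)
    )


def best_membership_row(data_row, profiles):
    """For fixed profiles, try every 0/1 pattern for one object and keep
    the one with the smallest squared error (an assumed, brute-force step)."""
    k = len(profiles)
    return min(
        product((0, 1), repeat=k),
        key=lambda pattern: loss([data_row], [pattern], profiles),
    )


# Hypothetical toy example: 2 clusters, 3 variables, 4 objects.
profiles = [[1.0, 0.0, 2.0], [0.0, 3.0, 1.0]]
memberships = [[1, 0], [0, 1], [1, 1], [0, 0]]  # third object overlaps
data = model_values(memberships, profiles)       # noise-free data

print(loss(data, memberships, profiles))
print(best_membership_row([1.1, 2.9, 3.2], profiles))
```

The third object belongs to both clusters, so its model row is the sum of the two profiles. The exhaustive search over membership patterns grows as 2^K in the number of clusters K, so it is only practical for small ranks.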
