Learning to explore distillability and sparsability: a joint framework for model compression

Liu, Y. and Cao, J. and Li, B. and Hu, W. and Maybank, Stephen (2023) Learning to explore distillability and sparsability: a joint framework for model compression. IEEE Transactions on Pattern Analysis and Machine Intelligence 45 (3), pp. 3378-3395. ISSN 0162-8828.

LearningToExploreDistillability.pdf - Author's Accepted Manuscript

Official URL: https://doi.org/10.1109/TPAMI.2022.3185317

Abstract

Deep learning achieves excellent performance, usually at the expense of heavy computation. Recently, model compression has become a popular way of reducing the computation. Compression can be achieved using knowledge distillation or filter pruning. Knowledge distillation improves the accuracy of a lightweight network, while filter pruning removes redundant architecture in a cumbersome network. They are two different ways of obtaining model compression, but few methods consider both of them simultaneously. In this paper, we revisit model compression and define two attributes of a model: distillability and sparsability, which reflect how much useful knowledge can be distilled and how high a pruning ratio can be obtained, respectively. Guided by our observations and considering both accuracy and model size, a dynamic distillability and sparsability learning framework (DDSL) is introduced for model compression. DDSL consists of a teacher, a student and a dean. Knowledge is distilled from the teacher to guide the student. The dean controls the training process by dynamically adjusting the distillation supervision and the sparsability supervision in a meta-learning framework. An alternating direction method of multipliers (ADMM)-based knowledge distillation-with-pruning (KDP) joint optimization algorithm is proposed to train the model. Extensive experimental results show that DDSL outperforms 24 state-of-the-art methods, including both knowledge distillation and filter pruning methods.
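
To make the two ingredients named in the abstract concrete, the following is a minimal, generic sketch of a temperature-softened distillation loss and a group-wise shrinkage step of the kind commonly used for structured filter sparsity. It is not the DDSL or KDP algorithm from the paper: the function names, the temperature value, the threshold and the toy filters are all assumptions chosen for illustration, and the dean, the meta-learning schedule and the ADMM updates are not modelled.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    This is the classic soft-target loss; the temperature of 4.0 is an
    assumed default, not a value taken from the paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def group_soft_threshold(filters, threshold):
    """Shrink each filter (a flat list of weights) by its L2 norm.

    Filters whose norm is at most `threshold` become exactly zero, which
    is what makes them removable. This is the proximal operator of a
    group-lasso penalty, used here as a stand-in for a sparsity step.
    """
    shrunk = []
    for weights in filters:
        norm = math.sqrt(sum(w * w for w in weights))
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        shrunk.append([scale * w for w in weights])
    return shrunk


def pruned_ratio(filters):
    """Fraction of filters whose weights are all exactly zero."""
    zeroed = sum(1 for weights in filters if all(w == 0.0 for w in weights))
    return zeroed / len(filters)


if __name__ == "__main__":
    # Hypothetical toy values, for illustration only.
    print(distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
    sparse = group_soft_threshold([[3.0, 4.0], [0.1, 0.1]], threshold=1.0)
    print(sparse, pruned_ratio(sparse))
```

In a joint scheme, a training loop would weigh the distillation term against the sparsity step; how those weights are adjusted over time is exactly the part the abstract attributes to the dean, and it is deliberately left out of this sketch.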

Metadata

Item Type:	Article
Keyword(s) / Subject(s):	Knowledge distillation, Filter pruning, Structured sparsity pruning, Deep learning
School:	Birkbeck Faculties and Schools > Faculty of Science > School of Computing and Mathematical Sciences
Depositing User:	Steve Maybank
Date Deposited:	21 Jun 2022 13:48
Last Modified:	09 Aug 2023 12:53
URI:	https://eprints.bbk.ac.uk/id/eprint/48488
