Long-tail hashing

Chen, Y. and Hou, Y. and Leng, S. and Zhang, Q. and Lin, Z. and Zhang, Dell (2021) Long-tail hashing. In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, pp. 1328-1338. ISBN 9781450380379. (In Press)

Text
LTH_SIGIR.pdf - Author's Accepted Manuscript

Official URL: https://doi.org/10.1145/3404835.3462888

Abstract

Hashing, which represents data items as compact binary codes, has become an increasingly popular technique, e.g., for large-scale image retrieval, owing to its very fast search speed and its extremely economical memory consumption. However, existing hashing methods all try to learn binary codes from artificially balanced datasets, which are not commonly available in real-world scenarios. In this paper, we propose the Long-Tail Hashing Network (LTHNet), a novel two-stage deep hashing approach that addresses the problem of learning to hash for more realistic datasets where the data labels roughly exhibit a long-tail distribution. Specifically, the first stage learns relaxed embeddings of the given dataset, with its long-tail characteristic taken into account, via an end-to-end deep neural network; the second stage binarizes the obtained embeddings. A critical part of LTHNet is its dynamic meta-embedding module, extended with a determinantal point process, which can adaptively realize visual knowledge transfer between head and tail classes and thus enrich image representations for hashing. Our experiments show that LTHNet achieves dramatic performance improvements over all state-of-the-art competitors on long-tail datasets, with little or no sacrifice on balanced datasets. Further analyses reveal that, to our surprise, directly manipulating class weights in the loss function has little effect, whereas the extended dynamic meta-embedding module, the use of cross-entropy loss instead of square loss, and the relatively small batch size used for training all contribute to LTHNet's success.
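As a rough illustration of the second stage and of why binary codes are cheap to search, the sketch below binarizes real-valued embeddings and ranks items by Hamming distance. It is not LTHNet's implementation: the sign-thresholding rule, the function names (binarize, pack, hamming, search), and the toy embedding values are all assumptions made for this example, since the abstract does not specify how the embeddings are binarized.

```python
from typing import List, Sequence


def binarize(embedding: Sequence[float]) -> List[int]:
    """Map a relaxed (real-valued) embedding to bits by sign thresholding.

    Assumption: a simple threshold at zero; the paper's actual rule may differ.
    """
    return [1 if x >= 0.0 else 0 for x in embedding]


def pack(bits: Sequence[int]) -> int:
    """Pack a sequence of 0/1 bits into one integer so codes compare via XOR."""
    code = 0
    for b in bits:
        code = (code << 1) | b
    return code


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two packed codes differ."""
    return bin(a ^ b).count("1")


def search(query: Sequence[float],
           database: Sequence[Sequence[float]],
           k: int = 1) -> List[int]:
    """Return indices of the k database items nearest to the query in Hamming distance."""
    q = pack(binarize(query))
    codes = [pack(binarize(e)) for e in database]
    order = sorted(range(len(codes)), key=lambda i: hamming(q, codes[i]))
    return order[:k]


# Toy 4-dimensional embeddings (made-up values, not taken from the paper).
database = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.3, -0.1, 0.8],
    [0.7, -0.6, 0.2, 0.1],
]
query = [0.8, -0.1, 0.5, -0.3]

print(search(query, database, k=2))  # prints [0, 2]
```

Each item is stored as a single small integer and compared with one XOR plus a bit count, which is the source of the speed and memory advantages the abstract attributes to hashing.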

Metadata

Item Type:	Book Section
Keyword(s) / Subject(s):	learning to hash, long-tail datasets, memory network, large-scale multimedia retrieval
School:	Birkbeck Faculties and Schools > Faculty of Science > School of Computing and Mathematical Sciences
Research Centres and Institutes:	Birkbeck Knowledge Lab, Birkbeck Institute for Data Analytics
Depositing User:	Dell Zhang
Date Deposited:	19 Jul 2021 10:28
Last Modified:	29 Jul 2025 04:18
URI:	https://eprints.bbk.ac.uk/id/eprint/45119

Statistics

Downloads:	477
Hits:	197
